Learning Data Science – 4 untold truths

Last updated on April 27, 2021

Have you flirted with the idea of learning data science? You are not alone. This has been a really hot topic in the last few years and it will stay one in the upcoming few, for sure. Yet, very few people actually become data scientists.

Why?

Well, part of the problem is that many aspiring data scientists don’t know what to expect from this field. Or even worse, based on the many misleading (sometimes scammy) “how to become a data scientist” articles, they have false expectations. And when they hit the wall, they get demotivated and quit.

In this article, I want to show you four untold truths that you should know about learning data science – and I have never seen them written down anywhere else before.

Untold truth #1: Learning Data Science is Hard!

Learning data science is not easy.
It will take a lot of work, a lot of energy and a lot of time from you.

I have seen an ad recently in my Instagram feed that said:
“Take this course and master data science in 1 month!”

And I was like: what the fudge!?

I’ve been practicing data science for 6+ years now. I’ve held senior DS positions (in addition to teaching). But I wouldn’t say that I have mastered data science or analytics. I know for a fact that no one can master data science in 1 month. In fact, my personal estimate (based on students I worked with) is that going from zero to the junior level will take ~6-9 months.
(More about that in this free course: How to become a data scientist.)

Learning data science is hard!
A few online education platforms imply the opposite.

  • “Just change one word in this query. Run it! And boom, you’ve learned SQL…”
  • “Just watch this video course of the instructor running Python code, and you will know Python, too…”
  • “Just play around with this interactive chart and you will understand regression analysis immediately…”

Two years ago, I interviewed a guy for a junior DS position. He didn’t have any hands-on experience, but he learned SQL on a popular “just-type-your-code-into-the-browser” kind of online learning platform. (I won’t name the exact platform here. :-))

I gave him a computer with an SQL manager open – and a simple real-life task. He had to JOIN two SQL tables, then do a simple segmentation. He couldn’t solve the task! He ran into syntax errors, he couldn’t debug his code, he didn’t get the context, he couldn’t discover the data…
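To make that kind of task concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The users and purchases tables, their columns and the sample rows are all hypothetical, invented for illustration; they are not the actual interview data. It JOINs the two tables, then segments buyers and revenue by country.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE purchases (purchase_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
INSERT INTO users VALUES (1, 'US'), (2, 'US'), (3, 'DE');
INSERT INTO purchases VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 10.0), (13, 3, 7.5);
""")

# JOIN the two tables on user_id, then segment by country.
query = """
SELECT u.country,
       COUNT(DISTINCT u.user_id) AS buyers,
       SUM(p.amount) AS revenue
FROM users AS u
JOIN purchases AS p ON p.user_id = u.user_id
GROUP BY u.country
ORDER BY u.country;
"""
segments = conn.execute(query).fetchall()
for country, buyers, revenue in segments:
    print(country, buyers, revenue)
conn.close()
```

Typing this out yourself, breaking it and fixing the syntax errors is exactly the kind of practice that a "change one word in this query" exercise skips.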

And that’s when I realized that many of these online schools give people only the illusion of data science knowledge.

You want to have real data science knowledge

You want to have real data science knowledge.
But what does it take?

Well, first and foremost: (1) a lot of practicing (2) in true-to-life data environments.
Don’t try to skip forward: take the time and the energy and set up your own data server!

Yes, sometimes (well, quite often in the beginning) you will mistype a code-snippet, your computer will throw an error and it will be very annoying. But this is how it works! We make mistakes, we learn from them and next time we will do much better.

And also take the time to practice a lot!
When you practice, it’s okay to make stupid mistakes. For instance, it’s okay to accidentally mess up your previously built data pipelines and lose hours of work… (This happens from time to time with my students.) But again: we all do stupid things in real life data projects, too. At least, I did in my junior years — and it cost me a lot of extra work-hours. But I learned from that.

We make mistakes, we learn from them and we don’t make them again.

Note: How to practice? I shared a few ideas (and even more) in the above-mentioned free online course: How to become a data scientist?

Learning data science is not easy and it will take time. If you can’t accept this fact, then maybe this profession is not the best choice for you. But if you are okay with learning data science the hard way, this learning period of a few months will be one of your best long-term investments. (I’ll get back to this below.)

Do you like the article so far? If so, you’ll love this 6-week data science course on Data36: The Junior Data Scientist’s First Month. It’s a 6-week simulation of being a junior data scientist at a true-to-life startup. Go check it out here: https://data36.com/jds!

Untold truth #2: It’s not “Learning Data Science”, it’s “improving your Data Science skills”

The world changes really fast and it won’t get any slower.
And I seriously believe that if one wants to keep up with the pace, the only way to do it is by focusing on improving skills.

Why?
You might already have heard that according to researchers’ predictions, ~65% of today’s grade schoolers will hold jobs that don’t exist yet.

You might also have heard that the current estimated “half-life” of engineering-related information is ~4 years. So 50% of the things you learn today about IT will be outdated in ~4 years.

(Source: Shift Happens 2018)

What does it mean for you?
That the skills you acquire and improve are way more important than the actual information you learn.

It also means that “learning data science” is not about learning data science.

It’s about:

  • improving your coding skills.
  • improving your business skills.
  • improving your mathematical/statistical skills.
  • improving your data visualization, presentation, communication and other soft skills.

Learning data science is not about:

  • Learning a certain package of Python.
  • Learning the different industry benchmarks for this or that KPI.
  • Learning certain statistical models.
  • Learning how to use Google Data Studio or Tableau.

What seems important today might be irrelevant in 5 years
Mastering, for instance, the Scikit-learn library or Google Data Studio might seem important today… but I bet that there will be a better machine learning package and better data visualization software in 5 years.

Don’t get me wrong, I still think that today, you should learn these things because they are part of the current data science and analytics ecosystem and also part of the learning curve itself.

I’m saying that you should keep in mind that when you learn these (or any other) tools, the important thing is not to cram in every little syntax detail or which button is where in the specific software – but to understand the big picture. Why does this tool work the way it works? What’s the underlying logic? How does this function work in other similar tools? Once you get these, changing between tools (even between programming languages) will be easy as pie.

And you will be much more prepared for the ever-changing future.

So to future-proof your data science career: focus on your skills and not on the information you learn!

Untold truth #3: Because it’s hard, Learning Data Science is a great investment

Let’s talk about career perspectives, too!
Learning data science is a great short- and long-term investment.
I guess I don’t have to explain the short-term investment part.

Check out the LinkedIn Workforce Report for the US (August 2018)! It says:
“Demand for data scientists is off the charts … data science skills shortages are present in almost every large U.S. city. Nationally, we have a shortage of 151,717 people with data science skills, with particularly acute shortages in New York City, the San Francisco Bay Area, and Los Angeles.”

Also, based on Glassdoor’s research, Data Scientist was ranked as the best job three years in a row in the USA.

(Source: glassdoor.com)

Note: the above numbers apply to the US only – I don’t have hard data for the EU or any other parts of the world. But in my experience, in the EU we have the same trends.

High demand and a persistent shortage put data scientists in a really good position. It means:

  • Higher salary and better benefits
  • Better job security
  • Better work conditions (e.g. flexible hours, working from home, etc.)

Besides, data scientist is a well-respected job within the company (and in the outer world, too). You will be someone who your managers and colleagues want to listen to.

The point is: learning data science is a good short-term investment, for sure.

But is learning data science a good long-term investment, too?

My answer is yes and I have two reasons.

REASON #1:
Just look at the data! In 2018, the shortage of data scientists in the US was 151,717 people. This number was ~140,000 in 2011. So in 7 years, the market couldn’t produce enough new data scientists to fill the gap. (It even grew a bit.)

REASON #2:
This is something that I’ve already mentioned in the intro. Many people want to learn data science… yet, not too many of them become data scientists after all.
Why? Because learning data science is hard. It’s a combination of hard skills (like learning Python and SQL) and soft skills (like business skills or communication skills) and more.
This is an entry barrier that not many students can pass. They get fed up with statistics, or coding, or too many business decisions, and quit.

So the question is:

If yes, it will be one of the best career investments of your life.

Untold truth #4: Learning Data Science is not about learning Machine Learning, Deep Learning (or any other data buzzwords)

If you had to guess, what would you say is the most time-consuming part of the data scientist job?

Or in other words, what do you think you’ll need to work on the most when practicing data science and analytics for real?

Hint: it’s not Machine Learning.

The answer is…
.
.
.
…data cleaning.

Data scientists often say: “80 percent of data science is data cleaning. And 20 percent is complaining about data cleaning.”
Okay, obviously, that’s a joke.

But when you get into your first data science role, you will see for yourself: it’s not about doing machine learning and predictive analytics 24/7.

Because to be able to run a proper ML algorithm you have to complete many other steps first:

  • data collection
  • data formatting
  • data cleaning
  • transforming your data to the right format
  • discovering and understanding the data
  • running other data analytics projects
  • data visualization
  • automating the above steps
  • and so on…
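A few of the steps above (formatting, cleaning, transforming) can be sketched in a handful of lines of Python. This is only a minimal illustration using the standard library; the raw data, the column names and the cleaning rules (drop rows with a missing amount, drop exact duplicates) are assumptions made up for this example, and real projects will need their own rules.

```python
import csv
import io

# Hypothetical raw export, invented for illustration: stray whitespace,
# inconsistent casing, missing values and duplicated rows.
RAW = """user_id,country,amount
1, us ,20.0
2,US,
2,US,
3,de,7.5
3,de,7.5
"""

def clean_rows(raw_text):
    """Return de-duplicated rows with normalized country codes and numeric amounts."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        country = row["country"].strip().upper()  # formatting: trim and normalize case
        amount_text = row["amount"].strip()
        if not amount_text:
            continue  # cleaning: drop rows with a missing amount
        # transforming: convert text fields to the right types
        record = (int(row["user_id"]), country, float(amount_text))
        if record in seen:
            continue  # cleaning: drop exact duplicates
        seen.add(record)
        cleaned.append(record)
    return cleaned

rows = clean_rows(RAW)
print(rows)
```

Whether a row with a missing value should be dropped, filled in or flagged is a business decision, not a coding one, which is part of why these steps take so much time.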

And believe me when I say: when you are working with real data, these things are just as exciting as the machine learning and predictive analytics parts.

What’s important then?

When you are learning data science, you should not focus on polishing your ML skills. Instead you should focus on:

  • being fluent with Python and SQL
  • understanding the business logic behind simpler analytical methods
  • being familiar with the basics of statistics
  • practicing and experiencing the pain of working with a raw and uncleaned data set
  • learning how to automate
  • and so on…

These things will help you to become a better data scientist and eventually get your first job — not another deep learning or artificial intelligence course.

So to summarize:

  • Learning Python and SQL –» important
  • Learning about Deep Learning –» not important
  • Learning the basics of statistics –» important
  • Learning about Artificial Intelligence –» not important
  • Practicing data cleaning, data formatting and automation –» important
  • Understanding “artificial neural networks” –» not important

At least, at the junior level…
Later on (in 1 or 2 years), when your career moves forward, you will have to learn these above-mentioned, fancy machine learning methods on the job, anyway.

But for now: focus on the things that are important for your next step!

Conclusion

I know: being a data scientist, a machine learning guru, a master of deep learning… These all sound exciting. And you will get there eventually.
(I mean, if you want to. For instance, I get much, much more enjoyment out of working on simpler analytics projects that have a bigger impact on the business. E.g. a sophisticated segmentation project rather than a deep learning project.)

But think about everything that I’ve written above: accept that learning data science is hard, focus on your skills, consider it an investment and learn the basics first!

Recommended article: What is data science?

Cheers,
Tomi

63 Comments

  1. I 100 percent agree with this. I realised this when I got into datascience within one week

  2. I truly like your composing style, incredible data, thank you for posting.

  3. Anwar

    I was thinking something else about Data Science after going through your lines i understood i can go through it. Thanks for explaining clearly on minute parts

  4. Rohit Passi

    Hi

    I am a sales & marketing professional with a graduate degree in Computers from 15 years ago. My core job is handling the channel/team and analyzing markets in Excel.

    But now I am stuck in my career. I have always been eager to become a data analyst, but I have been out of my course for the past 15 years. So is it feasible for me to do a data science course right now, and can I compete with current data scientists or upgrade my career by doing this?

  5. Livingstone

    Informative article my friend, thank you. I am at a Junior level of understanding Data science and believe me, I am that guy who rushed faster to learn Machine learning and deep learning but this just changed my strategy

  6. Thank you so much, Tomi.

    I have never read this useful advice like your article since I made my mind to become a data scientist. It is absolutely convincing and full of proofs.

    Esmat

  7. Thanks a lot for writing such a great article. It really has lots of insights and valuable information.

  8. Abrar Ansari

    Dear Tomi,

    I hope this message finds you well. I am currently a college student very interested in the field of data science, however I am still new to the discipline and only understand the gist of it. As I jump from one google search to another, my goal is to find interdisciplinary potential in the field of data science; I am curious about the different applications of data science in the world of business, most especially as it pertains to the fields of accounting, finance, and MIS – the three fields I am strongly leaning towards. From what I am understanding, data science mostly pertains to the fields of marketing and supply chain management. If so, can you think of common (and therefore presumably lucrative) applications of data science in the fields of accounting, finance and MIS? As this may be a broad topic of conversation, so much as a lead would help me greatly.

    Thank you for your time and consideration,
    Abrar

  9. Shivam sharma

    Sir , I am a mechanical engineering student from India and i am interested in data science . I want to ask you that is there any scope for M.E student in data science ?

  10. Divyajayaram

    I liked the truth you reveal on this. Thanks a lot for you suggestions . for more waiting for you.

  11. I’m very happy to see the real and practical information about data science. Thank you

  12. Nura Wario Roba

    This has been of great help sure.Thanks,cheers.

  13. Thank you for your realistic, helpful and right explanation on this data scientist job.

  14. Great job, I love this topic & especially the way you have explained it is really awesome. Thanks for sharing this info..

  15. Being from a non-technical background, I was so surprised when I looked at your untold truths about data science above, but I understood them once I went through your post. It was so helpful for understanding the difference between going deeper into data science and practicing it. I would like to say thank you so much for your information, and I would like to know the best institutes that give real data science classes to join!

  16. Vamsi Sanapathi

    Thanks for the article. The content is very realistic.

  17. Geetha

    Thank you .. this information is really useful

  18. Thomas Okogun

    Thank you very much for telling us the TRUTH. I can personally feel the sincerity and deep intuition you put into this analysis of Data science.
    It’s like you finished writing on this topic conclusively and then dropped the paper for us to read . As in : this is it – the truth, take it or leave it

    Thanks a lot. I appreciate

    Thomas Okogun

  19. S N Raju U

    Hello,

    This is S N Raju U

    Just gone through your complete article. Thanks for such a valuable information.

    I am not a math student. Does lack of math background effect programming?

    I’ve 10 yrs of experience in HR field, which doesn’t add much value to this course. I know I should start from zero. I’m ready to do that.

    I’m very much interested in pursuing this course.

    Please suggest.

  20. Syed Jaan Basha

    Thank you for giving valuable information.

  21. Mathias Godwin (Gwin)

    Hey Tomi, I don’t know who you are but I assume you to be Godsent for telling me all these big secrets. I’m into data science but the field seems filled with lots of math and all the fancy neural networks algorithms and I thought maybe I may never make it up to becoming the man I want cus a lot of books recommend knowing all this big math (calculus, probability, statistics and linear algebra). Yeah I know I should know them fair enough but not spending too much time on math instead of data and decision making. Sorry for being too wordy 🙂 , Thanks for that awesome advice 🙂

  22. Sriram

    Hi.. I’m an aspirant of those details you explained and also a little fan of those fancy words you said.
    Can a data scientist be an AI/ML engineer in market if he had good skills??

  23. priyanka

    its useful

  24. Srinath

    Srinath here!

    This post really helped in finding out the difference between data scientists and data analysts, being from Civil Engineering background i learned lot more things and future is pretty clear.

    keep posted and Thanks a lot.

  25. sheba

    Hey Tomi,

    Glad that I bumped into your page about data science. I have no computer knowledge (IT). my undergraduate is in psychology but I’m very much open to acquiring new skills especially skills that will better my future. is it possible for someone like me with no computer knowledge but a will to study hard become a data scientist? Thank you,

    Cheers!

  26. Sonari Dhanusha

    Thank you for this sir. It will be very useful for every future data scientist.

  27. Lekan Akindele

    Thank You Tomi for this article.
    I am beginning my journey into the data science field.
    I am still in college my final year to be precise, studying Information and Communication Engineering.
    I just hope I’m making the right decision.

  28. Danielle

    Question. I am considering changing careers into data analysis from my current one (vet tech). There is a local State university offering a analytics boot camp course that is about 3 months long. What would be the differences in pay and finding employment if someone only has the cert but not a bachelors degree?

  29. Joanita Nsimbe

    Thank you for this great submission and advice. I now know where to start from and what skills to focus on for now.

  30. Riddhi Doshi

    This is the most precisely written article I read on data science. The work that you’re doing is awesome. I will definitely check out the course. Thank you.

  31. Hey Tomi,
    Very comprehensive and critical article ! The above stated steps are actionable and practical. However, I have a question : I am 20 right now but I intend to go into marketing in a span of 5 years or so and I was wondering how much of data science do I need to learn if my favoured position is that of a CMO ( which takes roughly 10 -13 yrs)?
    Secondly, do I need to get a degree for this and what are the accredited certifications of data science?Are MIT and Harvard free online courses enough?

    • hey Airaf,

      I think, for a CMO (or any related marketing positions) it’s nice to know what data science is and maybe pickup some basic coding and stats skills, too. But on the day-to-day job (as a CMO), you won’t really use these… you can still profit from it because you know what you can expect from the data scientists on your team (and probably, you’ll be able to communicate with them more efficiently, too).

      But I’d certainly not spend more than a few months with data science, right now, if your long-term goals are related more to leading marketing teams. But that’s only my 2 cents.

      Hope it helps though!
      Tomi

  32. Hi Tomi,

    I always had confusion around these buzz words 🙂 so started reading many blogs, articles etc but those never cleared my confusion.
    I accidentally find your blog during my regular search today. I read your blog and understood the differences between blockchain and DataScience. It’s simple and clear info what a beginner expects and is exactly matched for what I am searching for. Now I am feeling lite and got a direction where should I kick start to upgrade my skill. Thank you Thank you so much. Much Appreciated!!

    I am a software professional with 7 years of experience in Technical into the Middleware Integration platform. Would like to upgrade my skill into the Data Science stream and your post helped me where to start.

  33. Great insights… Helpful before starting the course.
