In this article, I’ll show you a few junior data scientist job interview questions… And also how to answer them.
Before we get started…
This article is part of a five-article series called: How to get a job in data science and analytics.
Here are all the articles:
- episode #1 — Intro: What is a data analyst/scientist and what skills do you need?
- episode #2 — What do you need to do before you apply? (resume/cover letter/website/GitHub help)
- episode #3 — How to apply and how to prepare for data science job interviews and how to ace the take-home assignment
- episode #4 — Common junior data science job interview questions and how to answer them — this article
- episode #5 — How do you negotiate? Should you negotiate? What is the career trajectory for someone in data science and analytics?
Now, it’s time to get through those job interview questions.
Here’s the list.
#1 The most common job interview question: Why are you interested in [said company]?
This is a question you’ll get for any role — not just for data science, of course…
And you just need to have a cohesive answer about why you’re interested in a specific company.
- Do you use their product?
- Are you excited about the chance to use their unique data to answer interesting business questions?
- Have you heard amazing things about their culture?
You don’t have to worry too much about this, and you don’t have to be an expert in the field your target company operates in. But it’s important that you have a general sense of their business and real answers for why it interests you.
Nothing’s more disappointing in an application process than a candidate who doesn’t even know where they applied and why.
Note by Tomi Mester: I remember when I applied for a data position in Sweden at iZettle. It’s a fintech startup… and I didn’t know too much about fintech at the time. But I was generally interested in it and I took 2-3 hours to research the company before I actually sent my application. So when it came to the question of why I was interested, I had a clear one-sentence answer.
General data science job interview questions about projects you’ve done
#2 Can you discuss a project you worked on that used messy data (or had to join various datasets together)?
This question is more specific to data science.
And usually, someone on the data science and analytics team will ask it. A data scientist/analyst spends a ton of time dealing with messy data and figuring out the best way to join/merge data tables. You need to be able to talk clearly about a time you had to handle two or more large datasets:
- What issues did you run into and how did you deal with them?
- What datasets did you have to begin with?
- What dataset were you interested in putting together?
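To make this concrete, here is a minimal Python sketch of the kind of check you might describe running before you join two datasets. Everything in it is hypothetical: the `users` and `orders` tables, the `user_id` key, and the `normalize_key` and `join_diagnostics` helpers are made up for illustration, and on a real project you would more likely use a library such as pandas than plain lists.

```python
from collections import Counter


def normalize_key(value):
    """Strip whitespace and lowercase a key so ' U2 ' and 'u2' match."""
    return str(value).strip().lower()


def join_diagnostics(left_rows, right_rows, key):
    """Report duplicate and unmatched join keys before merging two tables."""
    left_keys = [normalize_key(row[key]) for row in left_rows]
    right_keys = [normalize_key(row[key]) for row in right_rows]
    return {
        "duplicate_left_keys": sorted(k for k, n in Counter(left_keys).items() if n > 1),
        "duplicate_right_keys": sorted(k for k, n in Counter(right_keys).items() if n > 1),
        "only_in_left": sorted(set(left_keys) - set(right_keys)),
        "only_in_right": sorted(set(right_keys) - set(left_keys)),
    }


# Hypothetical, deliberately messy example data
users = [{"user_id": "u1"}, {"user_id": "U2 "}, {"user_id": "u3"}, {"user_id": "u3"}]
orders = [{"user_id": "u1"}, {"user_id": "u2"}, {"user_id": "u4"}]

report = join_diagnostics(users, orders, "user_id")
print(report)
```

Duplicate keys and keys that exist on only one side are exactly the kind of issues worth mentioning when you describe what you ran into and how you dealt with it.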
#3 Can you discuss a project you worked on that used statistical modeling?
Again, usually someone on the data science and analytics team will ask this question. You need to have a clear and concise answer for a time you developed a model to answer a business question.
- Why did you choose that specific model?
- What assumptions did you have to make?
- What questions were you trying to answer?
#4 Can you discuss a project you worked on that used data?
This is a variation/more general version of the previous two questions. Basically, you need to be able to talk intelligently about a time you used data to answer a question.
Ideally, you can discuss an example project in two ways. One, with a more technical interviewer who gets into the nitty-gritty of why you made certain decisions, what data issues you had, what programming languages/packages you used, etc. And two, with a non-technical interviewer, like a product manager or an executive:
- What question were you trying to answer?
- How did you arrive at your conclusion?
You need to be able to explain a data project, from planning to the communication of results, in an interesting and non-technical way.
Now, let me mention that if you haven’t done any projects yet, I really recommend that you boost your CV with one or two hobby projects. I wrote about this in two articles: Before you apply for a data science position and Data Science Projects for Boosting Your Resume.
#5 When was a time you had to explain your data science project to a non-technical audience?
This is pretty self-explanatory. The bulk of your job as a data scientist/analyst is taking complex analyses and explaining them in interesting and clear ways. Being able to talk about a time you effectively did this is critical during the interview process.
#6 When was a time you had to explain poor results from an analysis?
This happens quite a bit. As a data scientist/analyst, you will end up analyzing and evaluating a new product feature, initiative, marketing strategy, etc. that ends up failing. Giving bad news is definitely a challenge, so you need to demonstrate that you’re not afraid to be honest. You should also show that you understand how much time and effort goes into a new feature/initiative/strategy, and that being mindful of that is important. Your job isn’t to drop data bombs and call out people and teams. It’s to be a resource and to help the company reach its goals.
Specific data science job interview questions (about different real-life situations + statistics and analytics methods)
#7 The product team is interested in testing a new feature for returning users from the United States. Plan an experiment, describe your analysis, and discuss how you would communicate results.
Quick reminder: On Data36 you’ll find an awesome case study on an A/B test Tomi has done — as well as a complete course on A/B testing! Regardless, here are the most important things you want to discuss with this question during the interview:
a. Make sure to discuss what metrics are important to evaluate!
Discuss that it is a common mistake with A/B testing to not define what metrics you are trying to move before you start an experiment. Also, discuss that it is important to not just throw in every possible metric and evaluate everything. When you’re designing experiments, you need to know how to react when an experiment finishes. So you need to decide what success looks like.
b. Calculating the sample size
- Why is calculating the sample size important? You need to have an appropriate sample size for experiments in order to get enough power for valid results. You don’t want to undershoot—and have to run an experiment again.
- You can discuss Type I errors (false positives) and Type II errors (false negatives), and that the significance level and power you choose determine how often each can occur.
- You can discuss that when you’re designing an experiment, (in most cases) you can either run the experiment for a longer/shorter time, or show the feature to more/fewer users, in order to reach the required sample size.
- You can point out that when calculating the sample size, you should look only at historical metrics for returning users from the United States, because that is the group the experiment targets.
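If the interviewer asks how you would actually calculate the sample size, one common approach for a binary metric (like conversion) is the normal-approximation formula for comparing two proportions. Below is a minimal sketch of it, not the only valid method. The 10% baseline rate, the 2-percentage-point minimum detectable lift, the 1,000 eligible users per day, and the defaults of alpha = 0.05 and power = 0.80 are all assumptions for illustration, and the function names `sample_size_per_group` and `days_needed` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_lift, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls Type I errors
    z_beta = NormalDist().inv_cdf(power)           # controls Type II errors
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def days_needed(n_per_group, daily_eligible_users, traffic_share=1.0):
    """Days to fill both variants when `traffic_share` of eligible users enter the test."""
    return ceil(2 * n_per_group / (daily_eligible_users * traffic_share))


# Assumed inputs: 10% baseline, detect an absolute lift of 2 percentage points
n_per_group = sample_size_per_group(baseline_rate=0.10, min_detectable_lift=0.02)
print(n_per_group)
print(days_needed(n_per_group, daily_eligible_users=1000, traffic_share=0.5))
```

The second function reflects the trade-off from the list above: with a fixed target sample size, you either run the experiment longer or put a larger share of users into it.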
c. While the experiment is running
- You can discuss that it is important not to run general experiments during strange times or holidays. Data collected during a holiday is more likely to be noisy, and starting an experiment over a weekend is risky if any errors come up.
- Also regarding errors… oftentimes, the analytics team is responsible for high-level monitoring of experiments while they are running to make sure nothing is broken. This can involve reviewing error logs, raw logs, etc. Making a quick mention of that is a good idea as well.
- Do not suggest p-hacking or any variation of it, such as checking the results repeatedly and stopping as soon as they look significant. Instead, mention that it is important to wait until the experiment has finished before reviewing/analyzing results. (More about A/B testing statistics: here.)
d. Analyzing the experiment results
- Suggest filtering the analysis somehow. For example, the experiment was looking at returning U.S. users. How did various cohorts respond? Did users who have been around for 1+ years behave similarly to users who have only been around for 1-2 months?
- Obviously, you need to discuss which statistical tests to use to determine whether the results are significant. Is the data binary (then chi-square, proportion test, etc.) or continuous (t-test)? If this is confusing, spend 20-30 minutes reading through some basic statistics material online. (Or get this book.)
- Lastly, spend some time discussing that it is important to interpret the results—do the experiment results suggest anything about LTV? Or user behavior? Or potentially a product feature issue? You need to bring it back to what’s most important: using data and statistics to answer a business question.
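For the binary case mentioned above, here is a minimal sketch of a two-sided proportion test (a pooled two-proportion z-test) written with only the Python standard library. The conversion counts are made up for illustration and the function name `two_proportion_z_test` is hypothetical. In practice you would probably reach for a statistics library rather than hand-rolling this.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: control converts 400 of 4,000 users, variant 480 of 4,000
z, p_value = two_proportion_z_test(conv_a=400, n_a=4000, conv_b=480, n_b=4000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A p-value below the significance level you fixed before the experiment suggests the difference is unlikely to be chance alone, but you still need to interpret what it means for the business question.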
Again, if you want to dig deeper into A/B testing, take this course: A/B Test like a Pro!
#8 A dashboard created by the data team is showing a strange drop in a key metric. How would you go about investigating this?
There are a number of ways you can answer this. Below, I’ll highlight a couple of things you can discuss:
- There might actually just be an error in the data—like a possible change in how something is reported or other data engineering issue.
- Maybe the KPI is opt-ins, and there is a worrying drop in opt-ins over the past 48 hours. Was there a change in the opt-in or upsell screen? You can discuss talking with the product or marketing team to make sure there weren’t any changes you were not aware of before you dive more deeply into the data.
- Discuss how else you can filter the analysis. Is the drop in opt-ins for all users? Or only U.S. users on mobile? Is the drop just because of a holiday or seasonality? Is traffic from a specific source down, and could that be causing the dip?
These are the types of things to think about and discuss.
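The filtering idea from the list above can be sketched in a few lines of Python: compare the opt-in rate before and after the drop, one segment at a time. The rows, the column names (`period`, `country`, `device`, `visits`, `opt_ins`), and the `rate_by_segment` helper are all hypothetical, and the sketch assumes every segment has at least one visit in each period.

```python
from collections import defaultdict


def rate_by_segment(rows, segment_key):
    """Return {segment: {period: opt-in rate}} aggregated from raw counts."""
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [opt_ins, visits]
    for row in rows:
        cell = totals[row[segment_key]][row["period"]]
        cell[0] += row["opt_ins"]
        cell[1] += row["visits"]
    return {
        segment: {period: opt_ins / visits for period, (opt_ins, visits) in periods.items()}
        for segment, periods in totals.items()
    }


# Hypothetical aggregated data for the days before and after the drop
rows = [
    {"period": "before", "country": "US", "device": "mobile", "visits": 1000, "opt_ins": 100},
    {"period": "after", "country": "US", "device": "mobile", "visits": 1000, "opt_ins": 40},
    {"period": "before", "country": "US", "device": "desktop", "visits": 1000, "opt_ins": 100},
    {"period": "after", "country": "US", "device": "desktop", "visits": 1000, "opt_ins": 98},
    {"period": "before", "country": "DE", "device": "mobile", "visits": 500, "opt_ins": 50},
    {"period": "after", "country": "DE", "device": "mobile", "visits": 500, "opt_ins": 49},
]

for key in ("country", "device"):
    for segment, rates in rate_by_segment(rows, key).items():
        change = rates["after"] - rates["before"]
        print(f"{key}={segment}: {rates['before']:.1%} -> {rates['after']:.1%} ({change:+.1%})")
```

In this made-up data, the drop is concentrated in U.S. and mobile traffic, which would point you toward checking for a change that only affected that group. Slicing one dimension at a time is only a first pass; crossing dimensions (for example country and device together) narrows it down further.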
#9 Based on your experience with our company, suggest a feature change!
This evaluates a few things: one, that you can talk intelligently about the company and their product (if they’re product-based). And two, that you can talk intelligently about how to use data analysis to make feature improvements.
Below are a few things to consider when answering this question:
- It may go without saying, but avoid both extremes: claiming that there is nothing to improve (“I love your company and your product, I can’t think of anything I’d want to change!”) or insinuating that the product needs a lot of improvements.
- Once you bring up a potential feature change, make sure to discuss how you would test that feature and what metrics are important to consider.
- When asked this type of question, you have the opportunity to discuss what you like and dislike about the product and show that you’ve at least spent time using or thinking about the product.
#10 You have a table with the following schema: activity_date, device_type, source, content_type.
Answer the following questions:
- What type of dashboard can you develop with these metrics?
There are a ton of dashboard/visualization ideas you can discuss: from most popular content by device or by source, a daily chart of top content, visualizations to see where users are coming from, etc. In addition, is there a way to cluster by content type?
- How can we make sure to feature the most popular content?
Is certain content more popular based on the type of device a user is on, where they are coming from, or when they are visiting the site? You can discuss the possibility of featuring content based on these factors.
- What other data would be helpful to include with the above table?
Just a couple of ideas: geo/region_id/state, content_position, activity_starttime/activity_endtime, search_term…
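As a rough illustration of the "most popular content by device or by source" idea, here is a small Python sketch that counts content types per dimension. The sample rows are invented to match the schema from the question, and the `top_content_by` helper is hypothetical. In an interview you would more likely express this as a SQL GROUP BY, but the logic is the same.

```python
from collections import Counter, defaultdict

# Hypothetical rows that follow the schema from the question
activity = [
    {"activity_date": "2024-01-01", "device_type": "mobile", "source": "search", "content_type": "video"},
    {"activity_date": "2024-01-01", "device_type": "mobile", "source": "social", "content_type": "video"},
    {"activity_date": "2024-01-01", "device_type": "mobile", "source": "search", "content_type": "article"},
    {"activity_date": "2024-01-02", "device_type": "desktop", "source": "search", "content_type": "article"},
    {"activity_date": "2024-01-02", "device_type": "desktop", "source": "email", "content_type": "article"},
    {"activity_date": "2024-01-02", "device_type": "desktop", "source": "social", "content_type": "video"},
]


def top_content_by(rows, dimension, top_n=1):
    """Return the most frequent content types for each value of `dimension`."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[dimension]][row["content_type"]] += 1
    return {value: counter.most_common(top_n) for value, counter in counts.items()}


for dimension in ("device_type", "source", "activity_date"):
    print(dimension, top_content_by(activity, dimension))
```

Each call corresponds to one possible dashboard panel: top content per device, per traffic source, or per day.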
And a data science job interview question you shouldn’t forget about…
#11 “What questions do you have for me?”
Make sure you have at least a few questions prepared. If you’re talking to a product manager, ask them about the product roadmap—what are they working on? What are they especially excited about? If you’re talking to someone on the data team, ask them about how the team works, or data issues and problems they’ve faced.
Go on to the next episode!
Okay, we went through a lot of typical data science job interview questions for juniors.
This is not an exhaustive list, but I hope it will make you more comfortable and more prepared for your next onsite interview.
If you nailed this part, you should be at the final stage… Getting an offer. Which leads us to the next and final episode of this series: Negotiating the data science job salary.
Continue here: data scientist salary negotiation.
- If you want to learn more about how to become a data scientist, take Tomi Mester’s 50-minute video course: How to Become a Data Scientist. (It’s free!)
- Also check out the 6-week online course: The Junior Data Scientist’s First Month video course.