In the last 2 weeks I’ve introduced 9 common statistical bias types. If you went through them, you have already taken the first and very important step towards overcoming these issues and not letting yourself be biased: awareness. In this article I’ll share some more practical advice on how to prevent biased statistics in your data science and analytics projects, or just in everyday life.

If you missed the first two articles:
Statistical Bias Types part1
Statistical Bias Types part2

#1: Do not underestimate the amount of stupidity around you!

This is a non-scientific study, but it is worth a read, because it seems to be legit: the basic laws of human stupidity.
But here’s the one-sentence summary: just because someone says something, it’s not necessarily true. We tend to trust what people say to us, especially when they have a higher social status (e.g. celebrities) or hold a higher position in the company (e.g. the boss of the boss of your boss’ boss). But the thing is that people in higher positions are human beings as well. I’m not suggesting that they are stupid, but:

  1. They are usually just as biased as everyone else. (If not more.)
  2. They are not data analysts, so they don’t think twice about possibly fake statistics. (In most cases, at least.)
  3. They usually have second- (or third- or fourth-) hand information.
  4. And that’s not even mentioning their different personal ambitions and internal politics.

So dare to ask your boss whether his/her statements are based on data and hard facts, or maybe just on gossip, assumptions or opinions.

#2: Always ask about the research method!

Anytime you see statistics: learn about the research method first. Remember the different survey errors I mentioned in the previous articles? It feels like every second study available online is skewed by one or more of the statistical bias types. And they are online. Nobody criticizes them. Nobody asks questions. People are reading them, liking them, sharing them. Writing articles about them. And so fake stuff spreads around the world.
So you, dear fellow data-driven Friend: be critical, and whenever you read a study, always check whether the research was done right.

Note: never ever trust a study that doesn’t mention its research method at all.
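
To see why the research method matters so much, here is a minimal sketch in Python. It is a toy simulation, not a real study: the population, the 1 to 10 "satisfaction score", the function name simulate_survey and the response probabilities are all made up for illustration. It compares a survey where everybody has the same chance of being asked with one where happier people are (by assumption) more likely to answer.

```python
import random
import statistics


def simulate_survey(seed=42, population_size=100_000, sample_size=1_000):
    """Compare a random sample with a self-selected one (toy data only)."""
    rng = random.Random(seed)

    # Hypothetical population: satisfaction scores from 1 to 10, evenly spread.
    population = [rng.randint(1, 10) for _ in range(population_size)]

    # Method A: every person has the same chance of being asked.
    random_sample = rng.sample(population, sample_size)

    # Method B: people opt in, and (by assumption) happier people are more
    # likely to answer: a person with score s responds with probability s/100.
    self_selected = [s for s in population if rng.random() < s / 100]

    return {
        "true_mean": statistics.mean(population),
        "random_sample_mean": statistics.mean(random_sample),
        "self_selected_mean": statistics.mean(self_selected),
    }


if __name__ == "__main__":
    for name, value in simulate_survey().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the random sample should land close to the true average of about 5.5, while the self-selected survey should drift up towards roughly 7. Same population, same question, very different "result", and the only thing that changed is the research method.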

#3: Do your own analysis and research!

“If you want something done right, do it yourself!” – Charles-Guillaume Étienne

Nothing is more trustworthy than your own research… Of course, you have to be critical of yourself as well, because there is always the potential to make mistakes. But! At least you know this about yourself:

  1. you care about doing your research right (I know you do, otherwise you would not be reading this article :-))
  2. you have the basic statistical knowledge to do your research right (if you don’t, please learn it first! Book recommendation: Practical Statistics for Data Scientists)

If you are still not 100% sure about your results, that’s normal. In fact, that’s fantastic in a way, because being skeptical of yourself means that you really want to do first-class research. If that’s the case, don’t worry, just…

#4: Ask smarter people!

I’ve always had a few mentors along the way while doing my research. What I know about these mentors:

  1. they are smarter than me.
  2. they are very critical of what I do.

And this is just perfect for me, because when they say that my analysis/research looks correct, it means that it has met even their high standards, so my findings are most probably good.

Where to find these smarter people? Tough question. But, for instance, it can be a senior data professional at the company you are working for, or your favorite data professor at the university. If you are a younger data scientist who doesn’t have these kinds of connections, try to reach out to people on Twitter/LinkedIn and ask them to take a second look at your recent publication (or something). If you are lucky, somebody will help you out!

#5: Think!

Remember that I started my first article about Statistical Bias Types with the sentence: “Humans are stupid”. I wrote something similar in this article too. But I was just kidding. Humans are not stupid. Just too lazy, or more often too busy, to think.

And that’s the bottom line:
When you are doing research, when you are interpreting statistics, when you are working with data: spend time on actual thinking!

If you are aware of the different possible statistical biases and you spend enough time thinking things through, there is a very good chance that you will avoid the common traps that can bias your results. Spending 1 month on a research project and delivering accurate and actionable information is infinitely better than spending only 2 weeks (2 weeks of data analysis + 0 days of thinking) and delivering misleading results.

If your manager doesn’t support this idea, show them the whole article series, so they understand the possible disasters that statistical biases can cause! 😉

Conclusion

Everybody makes mistakes. I hope this article series will help you avoid the potential mistakes that come from statistical biases! Remember, the recipe is: be aware of the different statistical bias types, be critical and think!

If you have stories about statistical biases at your company or from your everyday life, let me and the other readers know in the comment section! Otherwise, subscribe to my Newsletter List for your weekly data science + analytics dose!

Cheers,
Tomi Mester