Why are people losing at the casinos? Why shouldn’t you buy a lottery ticket? How do you account for uncertainty when you invest a smaller or bigger amount of money? And what should you consider when you calculate the ROI of a data science project? Behind all these questions there is one powerful statistical concept: expected value!
In this article, I’ll show you:
- What expected value is
- The expected value formula
- Examples of applying and calculating it
- How to use it in your everyday life
- How to use it in your data science career
- A fun game to test whether you really get what expected value is
What is Expected Value?
Expected value is a theoretical value that shows the average return you’d get from an action if it were repeated infinitely many times. You can calculate it as the weighted average of all the possible outcome values, where the weight is the probability of the given outcome.
I know, I know… on the first read, this sounds complicated. But believe me, it’s not. Let me give you a simple example and everything will fall into place immediately.
A friendly game
You and your friend play a game. Your friend has a hat with 10 balls in it:
- 5 blue balls
- 4 yellow balls
- 1 red ball
You draw one ball from the hat. If you draw:
- a blue ball, you’ll win $0
- a yellow ball, you’ll win $2
- a red ball, you’ll win $10
Let’s calculate the expected value of this game!
Take all the possible outcomes and calculate their weighted average — where the weight is the probability of the given outcome. For your convenience, I put all the details into one table:
|the color of the ball|probability of drawing it|you get|weighted value|
|---|---|---|---|
|blue|0.5|$0|$0.00|
|yellow|0.4|$2|$0.80|
|red|0.1|$10|$1.00|
The calculation goes:
(0.5 * $0) + (0.4 * $2) + (0.1 * $10) = $1.80
So the expected value of this game is $1.80.
In other words, if you played it long enough, let’s say for 10,000 rounds, you’d end up with something pretty close to $18,000 (which is 10,000 * $1.80).
Obviously, if you played only one round, you’d get $10, $2 or $0… and not $1.80.
As I said: expected value is a theoretical value. But it shows itself on bigger sample sizes in practice, too.
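If you prefer to see the hat game as code, here is a minimal Python sketch of the same weighted average. The variable names are my own; the ball counts and payouts come straight from the game above.

```python
# Hat contents and payouts, taken from the game described above.
balls = {"blue": 5, "yellow": 4, "red": 1}
payout = {"blue": 0, "yellow": 2, "red": 10}

total_balls = sum(balls.values())

# Weighted value of each color = probability of drawing it * payout.
weighted = {
    color: (count / total_balls) * payout[color]
    for color, count in balls.items()
}

expected_value = sum(weighted.values())
print(f"Expected value per round: ${expected_value:.2f}")
print(f"Expected total over 10,000 rounds: ${10_000 * expected_value:,.0f}")
```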
A less friendly game
In the previous example you played with a friend. She didn’t ask you to risk your money. You could only win. How nice of her! In real life though, it’s more likely that you’ll have to pay a fee to get into the game.
Now that you know the expected value of this game ($1.80), you can immediately tell how much money you can risk to stay profitable in the long term.
Your expected value calculation changes like this:
(0.5 * 0) + (0.4 * 2) + (0.1 * 10) - entrance_fee = ...
The only new variable is the entrance fee, of course.
- If it’s exactly $1.80, your expected profit in this game is $0,
- if it’s less than $1.80, you make a profit,
- if it’s more than $1.80, you lose money
…in the long term.
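The three cases above can be sketched in a few lines of Python. The function name and the sample fees of $1.00 and $2.50 are my own illustrative choices; only the $1.80 break-even point comes from the game itself.

```python
def expected_profit(entrance_fee: float) -> float:
    """Expected profit per round of the hat game for a given entrance fee."""
    expected_winnings = (0.5 * 0) + (0.4 * 2) + (0.1 * 10)  # the $1.80 from before
    return expected_winnings - entrance_fee

# Sample fees below, at, and above the break-even point (illustrative values).
for fee in (1.00, 1.80, 2.50):
    profit = expected_profit(fee)
    print(f"fee ${fee:.2f} -> expected profit per round ${profit:+.2f}")
```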
As I said, the concept of expected value is so, so simple. And that’s why my mind is always blown when I see people ignore it in so many parts of their life.
The Expected Value Formula
The expected value formula is this:
E(x) = x1 * P(x1) + x2 * P(x2) + x3 * P(x3)…
- x is the outcome of the event
- P(x) is the probability of the event occurring
You can have as many xn * P(xn) terms in the equation as there are possible outcomes for the action you’re examining.
There is a short form for the expected value formula, too.
E(x) = ∑x * P(x)
The formula, by the way, shows the same thing you have seen in the examples before: it’s the weighted mean of the possible outcomes, where the weight is the probability of each event occurring.
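For readers who like proper notation, here is the same formula typeset in LaTeX, followed by the hat game plugged into it. The index i and the count n are my own notation, and the second condition simply states that the probabilities of all possible outcomes have to add up to 1.

```latex
E(X) = \sum_{i=1}^{n} x_i \, P(x_i), \qquad \text{where } \sum_{i=1}^{n} P(x_i) = 1

% The hat game from above, as an instance of the formula:
E(X) = 0 \cdot 0.5 + 2 \cdot 0.4 + 10 \cdot 0.1 = 1.80
```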
Examples of applying and calculating Expected Value
Let me give you a few more real-life examples to hammer home the concept and the math!
Example #1 – Coin flip
What is the fairest gamble in the world? Flipping a coin!
You have two outcomes: heads or tails. The probabilities of both are 50%.
Let’s say that you play 100 rounds with your friend. You risk $1 in each round. If it’s tails, you double your money; if it’s heads, you lose it.
Using the expected value formula:
($0 * 0.5) + ($2 * 0.5) = $1
The expected revenue from this game is $1. And you have to invest $1 in each round. So the expected value of your profit is $0. In other words, if you play this game long enough, you won’t lose or win any money.
Okay, so this is the theory. But does it work out in practice?
Let’s run a simulation to discover that!
Here’s a visual!
- The orange line represents the expected value in each round. This is the theoretical value. Luck is eliminated. Again, it’s always $0 because your investment ($1) equals your expected revenue ($1).
- The blue line is the real stack. It has a natural variance. It goes up and down, depending on whether you were lucky (you got tails) or unlucky (you got heads).
As you can see, the expected value was $0 – but you ended up with $5 after all. Yeah, this happens, you know, it’s called blind luck. But calculating the expected value helps rationalize that.
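If you want to rerun a simulation like this yourself, here is a minimal Python sketch. It is not the exact code behind the chart: the function name and the seed are my own choices, and your final stack will differ from the $5 mentioned above.

```python
import random

def simulate_coin_game(rounds: int, seed: int = 42) -> list[int]:
    """Return the stack after each round of the $1 coin-flip game."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    stack = 0
    history = []
    for _ in range(rounds):
        # Tails (probability 0.5): you win $1 net. Heads: you lose your $1.
        stack += 1 if rng.random() < 0.5 else -1
        history.append(stack)
    return history

history = simulate_coin_game(100)
print("Expected stack after 100 rounds: $0")
print(f"Simulated stack after 100 rounds: ${history[-1]}")
```

Change the seed (or drop it) and you will see the blue line end up somewhere else every time, while the expected value stays at $0.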
Speaking of luck…
Most people misinterpret the probability of improbable things. Here’s the same game, the same simulation, the same fair coin — but over 10,000 rounds this time. And look at that lucky run between round #3000 and #5000. This is natural variance in action, again. The word natural fits well in this situation because seeing a fluctuation like this in real life is totally normal.
Interestingly enough, it goes back to 0, after all.
That’s the law of large numbers at work (I loosely call it central tendency): the more you play, the closer your average result per round gets to the expected value.
Example #2 – Roulette (black vs. red)
I never play roulette.
Why? Because I know that the more I play, the higher the chance that I’ll lose. But how much exactly?
Just apply the expected value formula here, too.
Let’s say that you want to put $1 on black.
What’s your expected value?
- black: probability 18/37, you win $2
- red: probability 18/37, you get nothing
- zero: probability 1/37, you get nothing
Using the expected value formula:
($2 * 18/37) + ($0 * 18/37) + ($0 * 1/37) = $0.97
$0.97 is the expected revenue. Given that you invest $1, your expected profit is -$0.03… so in theory, you lose about 3 cents in each round.
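To see the unrounded numbers behind that $0.97, here is a small Python sketch using exact fractions. The variable names are mine; the probabilities and payouts are the ones listed above.

```python
from fractions import Fraction

stake = 1  # dollars placed on black
outcomes = [
    (Fraction(18, 37), 2),  # black: you get $2 back
    (Fraction(18, 37), 0),  # red: you get nothing
    (Fraction(1, 37), 0),   # zero: you get nothing
]

expected_revenue = sum(p * payout for p, payout in outcomes)
expected_profit = expected_revenue - stake

print(f"Expected revenue: {expected_revenue} = ${float(expected_revenue):.4f}")
print(f"Expected profit:  {expected_profit} = ${float(expected_profit):.4f}")
```

The exact expected profit is -1/37 of a dollar per round, which is where the "about 3 cents" comes from.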
Let’s see the 10,000-round simulation of this one! It’s really sobering:
- the orange line shows the expected value of your stack (theoretical value)
- the blue line shows the real value of your stack (luck and natural variance involved)
In this particular simulation, we were very lucky because we ended up above the expected value. Yet with a $200 loss.
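Here is a rough Python sketch of how you could run a handful of these roulette simulations yourself. It is not the code behind my charts: the seed is arbitrary, and treating pockets 1 to 18 as "black" is a simplification (only the count of 18 out of 37 matters for the odds).

```python
import random

def roulette_final_stack(rounds: int, rng: random.Random) -> int:
    """Bet $1 on black every round on a single-zero wheel; return the final stack."""
    stack = 0
    for _ in range(rounds):
        pocket = rng.randrange(37)  # 0..36; pockets 1..18 stand in for black
        stack += 1 if 1 <= pocket <= 18 else -1
    return stack

rounds = 10_000
rng = random.Random(2024)  # arbitrary seed, only for reproducibility
finals = [roulette_final_stack(rounds, rng) for _ in range(6)]
expected = -rounds / 37  # expected profit of -1/37 dollars per round

print(f"Expected stack after {rounds:,} rounds: ${expected:,.0f}")
for i, final in enumerate(finals, start=1):
    print(f"Simulation {i}: ${final:,}")
```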
Oh, and if you think I went with the example that serves my message, here are the next six simulations I ran right after this one:
Expected value and central tendency are powerful.
As they say: the house always wins. (Check out my new YouTube video on the topic: Why You Shouldn’t Go to Casinos… it’s available in podcast format, as well.)
Example #3 – “Risk-free” investments
There is no such thing as a risk-free investment.
There are low-risk investments and high-risk investments. And if you are smart enough, you can pick a low-risk investment with a high enough expected value.
But again, all investments involve some risk. And you should account for that before you put your money (or any other resources) into it.
Here’s a simple example:
Most European countries offer government bonds. They usually pay ~4% interest per year. And they are considered to be extremely secure investments. After all, countries don’t go bankrupt very often, right? (Sometimes they do though.)
But let’s run the numbers!
You want to invest €100,000 and you’d realize a 4% yield after one year.
If there were no risk at all, your expected value would be simply:
(€100,000 * 1.04 * 1) = €104,000
So it’s a €4,000 profit. Yay.
But you have to account for the potential risks, too!
Let’s say there’s a marginal chance that the country goes bankrupt and you lose all your money (again: it’s improbable but can happen). Set an extremely low probability for that: 0.01%.
Your expected value formula changes this way:
(€100,000 * 0 * 0.0001) + (€100,000 * 1.04 * 0.9999) = €103,989.6
Okay, it seems that we still have a very good expected value. Country bankruptcy is not a significant factor. Great!
Another risk is that you might need your money and take it out before the year is over. In that case, you’d lose the yield and usually, you’d have to pay a penalty, too. The usual penalty rate is ~2%. The unknown variable is the probability that you’ll have to take out your money. Let’s go with an estimated value: 20%.
So the EV calculation goes:
(€100,000 * 0 * 0.0001) + (€100,000 * 0.98 * 0.2) + (€100,000 * 1.04 * 0.7999) = €102,789.6
Still a positive value, although an expected profit of €2,789.6 is much lower than the original €4,000.
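The two bond calculations can be reproduced with a short Python sketch. The helper function and its name are my own; the probabilities and rates are the guesses used above, so treat the output as an illustration and not as investment math.

```python
def expected_value(scenarios):
    """Weighted average of payoffs; scenarios is a list of (probability, payoff)."""
    total_probability = sum(p for p, _ in scenarios)
    assert abs(total_probability - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(p * payoff for p, payoff in scenarios)

principal = 100_000  # euros

# Bankruptcy risk only.
with_default = expected_value([
    (0.0001, principal * 0),     # country goes bankrupt, everything is lost
    (0.9999, principal * 1.04),  # bond pays the 4% yield
])

# Bankruptcy risk plus early withdrawal.
with_early_exit = expected_value([
    (0.0001, principal * 0),     # bankruptcy
    (0.2, principal * 0.98),     # early withdrawal with a 2% penalty
    (0.7999, principal * 1.04),  # held for the full year
])

print(f"With bankruptcy risk: €{with_default:,.1f}")
print(f"Plus early withdrawal risk: €{with_early_exit:,.1f}")
```

The sanity check inside the function is there because it’s easy to forget to lower the "everything goes well" probability when you add a new risk.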
Are government bonds good or bad investments? I don’t care — this is not a money blog. 🙂 All I’m saying is that before any investment, you have to run your numbers, account for all possible outcomes — and calculate expected value to have a realistic picture.
The hard part: guessing probability (stock market, poker, etc.)
Applying the expected value formula is simple. Knowing all the variables in it is the hard part.
Especially the probability of the specific events. Even in that simpler bond-investment example above, I had to go with estimates and guesses — because I don’t have solid information on the likelihood of a country going bankrupt.
But that’s fine. Sometimes you have clear numbers and it’s easier to make the right call (e.g. not playing roulette). In other cases, you don’t. Regardless, in these cases, your goal is to collect as much information as you can and come up with estimates that are as realistic as possible.
Note: A good example can be playing poker. You know what’s in your hand. And that’s important information – you can already calculate your chances based on that. But you can improve your math if you can narrow down what could be in your opponents’ hands. Real poker pros know all these tricks and it’s not an accident that they win more than others.
How to use expected value in your everyday life
If you think expected value is a new concept or that you can use it in data science only, let me mention that the great Blaise Pascal tried to use it to argue whether it’s worth it to believe in God or not. 🙂 Well, that’s an extreme (and maybe not the best) application of the formula. But it shows very well that statistics also has its philosophical depths. For me, starting to apply expected value in my life was a true mindshift. I realized that nothing is certain, but most things have a high enough probability and reward to take a risk.
Take, for example, quitting your full-time job and starting your own company instead. Is it a good or a bad financial decision? You just have to estimate your outcomes and their probabilities.
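|you’ll make X-times less/more money than in your day job. X is:|probability|weighted value|
|---|---|---|
|0|0.10|0.00|
|0.5|0.20|0.10|
|1|0.30|0.30|
|2|0.25|0.50|
|3|0.09|0.27|
|10|0.05|0.50|
|100|0.01|1.00|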
Using the expected value formula:
(0.1 * 0) + (0.2 * 0.5) + (0.3 * 1) + (0.25 * 2) + (0.09 * 3) + (0.05 * 10) + (0.01 * 100) = 2.67
Again, I just came up with these numbers, they differ from person to person. And I know this is an oversimplification, too. But the point is: using expected value as a concept in your everyday life can help you to rationalize emotionally stressful and/or scary decisions.
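If you want to plug in your own guesses, here is a minimal Python sketch of the calculation above. The list holds the same made-up multiples and probabilities; swap them for your own estimates and check that the probabilities still add up to 1.

```python
# (income multiple X, probability) pairs from the calculation above.
outcomes = [
    (0, 0.10), (0.5, 0.20), (1, 0.30), (2, 0.25),
    (3, 0.09), (10, 0.05), (100, 0.01),
]

total_probability = sum(p for _, p in outcomes)
expected_multiple = sum(x * p for x, p in outcomes)

print(f"Probabilities sum to {total_probability:.2f}")
for x, p in outcomes:
    print(f"X = {x:>5}: probability {p:.2f}, weighted value {x * p:.2f}")
print(f"Expected income multiple: {expected_multiple:.2f}")
```

Printing the weighted value of each row also shows how much of the 2.67 comes from the two least likely outcomes.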
It can also help you to avoid bad decisions.
Note: Homework! Is it worth speeding on highways? Try to run the expected value calculation by yourself! (Hint: How much time do you save by driving at 150 km/h instead of 120 km/h? 10 minutes? 20 minutes? And on the other hand, what’s the probability that you’ll die and lose 20 or 30 years? What’s the expected value of speeding?)
Okay, so before we go too deep into these philosophical questions, let me answer a more data science related one, too…
How to use expected value in your data science career
When it comes to data science, you can take advantage of expected value in (at least) two ways.
First, you can use it directly in any situation where you are working with probability values. E.g:
- You classify users as potential buyers with 80% probability. Is it worth spending money on reaching out to them? The expected value formula can help you with the answer…
- You run an e-commerce store selling fast-moving consumer goods (FMCG). What’s better: running out of your top-selling product from time to time vs. overstocking it and accepting that you’ll have waste from time to time?
- Your new version in an A/B test reached only 90% statistical significance. Is it worth the risk to go with it, regardless? (Hint: usually it isn’t.)
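To make the first of these examples concrete, here is a small Python sketch for the potential-buyer case. The 80% comes from the example; the profit per sale and the cost per contact are placeholders I made up, so replace them with your own numbers.

```python
def expected_profit_per_contact(p_buy, profit_per_sale, cost_per_contact):
    """Expected profit of contacting one user who buys with probability p_buy."""
    return p_buy * profit_per_sale - cost_per_contact

# p_buy = 0.8 is from the example above; the money amounts are made-up placeholders.
ev = expected_profit_per_contact(p_buy=0.8, profit_per_sale=5.0, cost_per_contact=2.0)
print(f"Expected profit per contacted user: ${ev:.2f}")
```

With these placeholder numbers the outreach pays off on average. With a cost per contact above $4 it would not, even at an 80% purchase probability.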
And secondly, you can try to calculate whether it’s worth running a given data science project at all. What will be the return on the time you invest in that project? What’s the probability that you’ll get the results that you are aiming for? These are, of course, again questions whose answers need a lot of guesswork. But even with a ballpark estimate, you can rationalize your decisions and say yes or no to a project idea with more certainty.
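Since the success probability of a project is usually the shakiest guess, one option is to run the expected value over a range of guesses and see where the decision flips. Here is a rough Python sketch; the cost, the payoff and the list of probabilities are all hypothetical placeholders, and I assume a failed project returns nothing.

```python
def expected_project_value(p_success, payoff_if_success, project_cost):
    """Expected net value, assuming a failed project returns nothing."""
    return p_success * payoff_if_success - project_cost

# All of these numbers are hypothetical placeholders.
project_cost = 20_000
payoff_if_success = 80_000
break_even = project_cost / payoff_if_success  # success probability where EV = 0

print(f"Break-even success probability: {break_even:.0%}")
for p in (0.1, 0.25, 0.4, 0.6):  # from a pessimistic to an optimistic guess
    ev = expected_project_value(p, payoff_if_success, project_cost)
    print(f"p = {p:.0%}: expected net value {ev:+,.0f}")
```

If even your pessimistic guess sits above the break-even probability, you don’t need a more precise estimate to say yes.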
Test yourself! How rational of an investor are you?
Applying the concept of expected value in a simpler money decision should be easy. But I learned that it isn’t for everyone. It takes time and experience to get good at it. So I created a little online game to help you practice. Check it out and figure out how good of an investor you are. (At the end of the game you’ll see where you are ranking compared to all other players.)
Check it out here: https://bestbet.data36.com/
I know, folks, not everything has to be rationalized, formalized and calculated. But the concept of expected value will come in handy so many times in your life and in your career! Especially when you have to make big decisions. So use it like this:
- List all the different outcomes of a decision,
- Get (or try to estimate) the probability of each of these outcomes,
- Run the expected value formula as you learned it,
- And make a better decision!
- If you want to learn more about how to become a data scientist, take my 50-minute video course: How to Become a Data Scientist. (It’s free!)
- Also check out my 6-week online course: The Junior Data Scientist’s First Month video course.