This is a follow-up to https://colorblindprogramming.com/round-probabilities-before. Last year I stopped after discovering that the only correct way to calculate the odds is to look at the probability trees. This year I took it one step further and wrote a script that calculates the correct probabilities. I intend to reuse this script for future draws, and a year is a long time for my memory, so I am adding some notes here.
The incorrect approach: the big-bowl
The first approach last year was to calculate all the possible pairs, eliminate the invalid ones and then calculate the associated percentages for each pair. In hindsight, this approach was wrong, because it doesn’t replicate the actual draw. It would only be accurate if the whole draw consisted of a single pick from a very big bowl containing all the valid options. This is not how the actual draw works, so even if the final numbers were pretty close to the correct ones, it was not the correct approach.
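To make the big-bowl idea concrete, here is a minimal Python sketch. It is not last year's script: it assumes that "all the valid options" means all complete valid draws, and the teams (U1..U3, S1..S3) and their allowed opponents are made-up toy data.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Hypothetical toy data: the seeded teams each unseeded team may face.
ALLOWED = {
    "U1": {"S1", "S2", "S3"},
    "U2": {"S1", "S2"},
    "U3": {"S2", "S3"},
}


def valid_draws(allowed):
    """Return every complete draw in which all pairs are allowed."""
    unseeded = sorted(allowed)
    seeded = sorted(set().union(*allowed.values()))
    draws = []
    for perm in permutations(seeded, len(unseeded)):
        if all(s in allowed[u] for u, s in zip(unseeded, perm)):
            draws.append(tuple(zip(unseeded, perm)))
    return draws


def big_bowl_odds(allowed):
    """Share of the valid complete draws that contain each pair."""
    draws = valid_draws(allowed)
    counts = Counter(pair for draw in draws for pair in draw)
    return {pair: Fraction(n, len(draws)) for pair, n in counts.items()}


odds = big_bowl_odds(ALLOWED)
print(odds[("U1", "S1")])  # 1/3: one of the three valid complete draws
```

Every complete draw gets the same weight here, which is exactly the assumption that the real, step-by-step draw does not satisfy.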
The correct approach, using conditional probabilities
The correct way to look at this is by understanding that we are talking about dependent events. Each draw depends on the actual result of the previous draw. It’s identical to this process, beautifully explained on MathIsFun.com:
So how do we actually do it?
There are two approaches. The first one is a bit more complicated: it means building the tree above for the 16 teams and 16 steps (each team pick is a step). It has the advantage of producing exact results, but it is more difficult to implement. The second one consists of simulating the draw process and repeating it many times. I found this approach easier; here is the pseudo-code of the draw process:
for each unseeded team
    if there is a mandatory draw (starting from the 5th unseeded team)
        then automatically create the pair and add it to the draw
    otherwise, pick a random unseeded team
        get the list of available seeded teams
        randomly pick a seeded team from the list above
        add pair to the draw
Repeating this process a few million times produces millions of possible draws. Counting how often each pair appears in them gives the percentages.
But there are 2 catches:
1. Checking both sides of the draw. Have a look at step 2 above, checking if there is a mandatory draw. Let’s say you are left with 4 unseeded teams and 4 seeded teams. It’s not enough to look at the options of the unseeded teams; you also need to look the other way around. Example:
Unseeded teams: Liverpool, United, Schalke, Lyon
Seeded teams: PSG, City, Real, Barcelona
Liverpool has 2 options, United 3, Schalke 4 and Lyon 2. But if you randomly pick Schalke and pair it with any of PSG, Real or Barcelona, you leave an impossible draw for City, which cannot be drawn against any of the 3 teams left. So the solution is to count the number of options for both the unseeded and the seeded teams. If any team has a single option, make that pair.
2. Go back if needed. Even with the above safety mechanism in place things can still go wrong. Example:
Unseeded teams: Roma, Liverpool, Schalke, Lyon
Seeded teams: Porto, Barcelona, PSG, City
Options for the unseeded teams: Roma -4, Liverpool -2, Schalke -3, Lyon -2.
Options for the seeded teams: Porto -3, Barcelona -4, PSG -2, City -2.
The safety mechanism above (counting the number of options for both seeded and unseeded teams) tells us that everything is fine. So we go ahead and pair Roma with Porto. We are now left with:
Unseeded: Liverpool -1, Schalke -3, Lyon -1
Seeded: Barcelona -3, PSG -1, City -1
The problem is that both PSG and City have a single option, and that option is Schalke. This leads to an impossible draw, so the solution in this case is to go back one step and pick another draw instead of Roma v Porto. According to my calculations this could happen in about 0.4% of cases, and I am really curious how UEFA would handle it if it happened on stage. In the scenario above, if Roma was selected as the unseeded team, I expect that the computer would only allow PSG and City as seeded opponents, but I am really curious to hear the hosts’ explanation of this constraint (since both Porto and Barcelona are, at first sight, also valid options for Roma) 🙂
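Both catches can be handled with a single look-ahead check. The Python sketch below is not my actual script, just one way to do it. Instead of going back a step after hitting a dead end, it only offers a seeded opponent if the remaining teams can still be paired, which should amount to the same thing. The exact opponent sets are assumptions chosen to be consistent with the two examples above (Schalke is taken to have three options in the second example, in line with the counts for the seeded teams), and the function names are made up.

```python
import random
from collections import Counter

# Assumed opponent sets for the two examples above.
SEEDED_1 = ["PSG", "City", "Real", "Barcelona"]
EXAMPLE_1 = {
    "Liverpool": {"Real", "Barcelona"},
    "United": {"PSG", "Real", "Barcelona"},
    "Schalke": {"PSG", "City", "Real", "Barcelona"},
    "Lyon": {"Real", "Barcelona"},
}
SEEDED_2 = ["Porto", "Barcelona", "PSG", "City"]
EXAMPLE_2 = {
    "Roma": {"Porto", "Barcelona", "PSG", "City"},
    "Liverpool": {"Porto", "Barcelona"},
    "Schalke": {"Barcelona", "PSG", "City"},
    "Lyon": {"Porto", "Barcelona"},
}


def completable(unseeded, seeded, allowed):
    """True if every remaining unseeded team can still get a legal opponent."""
    if not unseeded:
        return True
    first, rest = unseeded[0], unseeded[1:]
    return any(
        completable(rest, [t for t in seeded if t != s], allowed)
        for s in seeded
        if s in allowed[first]
    )


def safe_options(team, unseeded, seeded, allowed):
    """Seeded opponents for `team` that do not lead to an impossible draw."""
    rest = [t for t in unseeded if t != team]
    return [
        s for s in seeded
        if s in allowed[team]
        and completable(rest, [t for t in seeded if t != s], allowed)
    ]


def simulate_draw(allowed, seeded, rng):
    """Run one full draw: random unseeded team, then a random safe opponent."""
    unseeded, seeded, draw = list(allowed), list(seeded), []
    while unseeded:
        team = rng.choice(unseeded)
        opponent = rng.choice(safe_options(team, unseeded, seeded, allowed))
        draw.append((team, opponent))
        unseeded.remove(team)
        seeded.remove(opponent)
    return draw


def estimate_odds(allowed, seeded, runs, seed=0):
    """Repeat the draw `runs` times and return the frequency of each pair."""
    rng = random.Random(seed)
    counts = Counter(
        pair for _ in range(runs) for pair in simulate_draw(allowed, seeded, rng)
    )
    return {pair: n / runs for pair, n in counts.items()}


print(safe_options("Schalke", list(EXAMPLE_1), SEEDED_1, EXAMPLE_1))  # ['City']
print(safe_options("Roma", list(EXAMPLE_2), SEEDED_2, EXAMPLE_2))  # ['PSG', 'City']
```

With these assumed sets, the check forces Schalke v City in the first example and only allows PSG or City for Roma in the second one. A mandatory pair is simply the case where the list of safe options has a single entry.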
Using the algorithm above, I ran the simulation 2 million times. These are the results:
Checking the results
The nice thing about being both a geek and a football lover is that you get to know smart people at the intersection of science and football. Two of them are Julien Guyon and Emmanuel Syrmoudis. They also spent time thinking about this topic. Julien came up with a great explanation of the draw process and probabilities, while Emmanuel went one step further and actually created an interactive draw simulator.
My results come pretty close to theirs, so I’m quite confident that my method is decent enough. I plan to reuse it next year and, perhaps, also try to build the actual probability tree to get the exact percentages.
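For reference, this is roughly what walking the probability tree could look like in Python. It is only a sketch on made-up toy data (teams U1..U3 and S1..S3), and it assumes that at every step the unseeded team is picked uniformly, followed by a uniform pick among the seeded teams that still leave a completable draw.

```python
from fractions import Fraction

# Hypothetical toy data: the seeded teams each unseeded team may face.
ALLOWED = {
    "U1": {"S1", "S2", "S3"},
    "U2": {"S1", "S2"},
    "U3": {"S2", "S3"},
}


def completable(unseeded, seeded, allowed):
    """True if the remaining teams can still all be paired legally."""
    if not unseeded:
        return True
    first, rest = unseeded[0], unseeded[1:]
    return any(
        completable(rest, tuple(t for t in seeded if t != s), allowed)
        for s in seeded
        if s in allowed[first]
    )


def exact_odds(unseeded, seeded, allowed):
    """Walk the whole probability tree and return P(pair) for every pair."""
    odds = {}

    def walk(unseeded, seeded, prob):
        for u in unseeded:
            rest_u = tuple(t for t in unseeded if t != u)
            options = [
                s for s in seeded
                if s in allowed[u]
                and completable(rest_u, tuple(t for t in seeded if t != s), allowed)
            ]
            for s in options:
                # Chance of reaching this node, picking u, then picking s.
                p = prob / len(unseeded) / len(options)
                odds[(u, s)] = odds.get((u, s), 0) + p
                walk(rest_u, tuple(t for t in seeded if t != s), p)

    walk(tuple(unseeded), tuple(seeded), Fraction(1))
    return odds


odds = exact_odds(list(ALLOWED), ["S1", "S2", "S3"], ALLOWED)
print(odds[("U1", "S1")])  # 13/36
```

On this toy data U1 v S1 comes out at 13/36, while treating the three valid complete draws as equally likely would give 1/3. For 16 teams the tree is far bigger, so a real version would probably need to cache the states it has already visited.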
TL;DR: The Model S is a great car and it offers you an extraordinary driving experience. Tesla customer service in Europe (more specifically in Belgium) is dreadful. Tesla doesn’t take security seriously and the Model S doesn’t seem to be very mature yet. I am still loving my Model S.
The statements above are not mutually exclusive. I’m loving it, but I don’t recommend buying one. Not for the moment at least. Customer care is part of the experience of owning a car, and Tesla does it badly here in Europe.
More details about my statements above:
if you plan to call customer service, expect waiting times in the region of 1-2 hours. Yes, you read that right. Freaking hours, on the phone.
if you get in touch with someone from support, there’s no guarantee that they will actually do something to help. Recent example: I called to report a problem with the left mirror. After an hour of waiting, I was told to send an email to the technical team, and that they would get back to me. That was 2 weeks ago. Nobody called.
the Model S is not mature enough. As a technical guy, I am used to technical issues. I see my Model S as a computer on wheels, so a few non-safety related bugs are tolerable. Resetting your car to fix the air conditioning flow or the internet connectivity is fine for me. But when these things start to happen on a regular basis, things can get annoying. Especially when Tesla doesn’t seem to care about it.
the Model S cars produced before June 2018 have a known vulnerability that can lead to the car being stolen with minimal effort. The solution is simple: upgrade the chip on the key fobs and re-link them with your car. Tesla fixed this problem for the cars produced after June 2018, but is asking existing owners (pre-June 2018) to pay for the fix out of their own pocket (about 250 EUR). The alternative recommended by Tesla is to disable Passive Entry. Because that’s the normal thing to do after you sell a $100k+ car with a security hole in it: ask the customer to disable a feature he has already paid for. But hey, they take security seriously…
All that being said, I still love to drive my Model S. But I don’t recommend anyone buying one. There are other electric car producers out there. Look for one that actually gives a s*it about you. Unfortunately Tesla is not one of them. Yet.