CL draws reloaded

Written on 14 March 2019, 04:25pm

You have 8 teams. They will be drawn against each other, giving 4 pairs in total.

Question 1: how many distinct pair sets are possible?

105. I got to this number after running a large number of simulations. Then I did a little bit of research and I also found the formula:

(2k)! / (2^k · k!), with k=4, which gives 8! / (16 · 24) = 105
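The count can also be checked by brute force. The sketch below is a minimal Python illustration, not the original simulation: the hypothetical helper `pairings` enumerates every way of splitting the teams into unordered pairs, and the result is compared against the standard closed form for splitting 2k items into k pairs, (2k)! / (2^k · k!), assumed here to be the formula the post refers to.

```python
from math import factorial


def pairings(teams):
    """Yield every way to split an even-sized list of teams into unordered pairs."""
    if not teams:
        yield []
        return
    first, rest = teams[0], teams[1:]
    # The first team must play someone: try each possible partner in turn.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


k = 4
teams = list(range(2 * k))  # 8 anonymous teams
count_by_enumeration = sum(1 for _ in pairings(teams))
count_by_formula = factorial(2 * k) // (2 ** k * factorial(k))
print(count_by_enumeration, count_by_formula)  # 105 105
```

Fixing the first team and iterating over its partners avoids counting the same pair set twice, which is why no deduplication step is needed.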

Question 2: if 4 of the 8 teams are from England, what is the probability that all 4 of them will be drawn against each other (two all-English pairs)?

Again, after analyzing the 105 distinct pair sets, I found that only 9 of them contain two all-English pairs. The full breakdown is:

  • two English pairs: 9/105 or 8.57%
  • exactly one English pair: 72/105 or 68.57%
  • no English pair: 24/105 or 22.86%
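These three numbers can be reproduced by enumerating the 105 pair sets and counting the all-English pairs in each. This is a minimal sketch that assumes every pair set is equally likely (an open draw with no restrictions); the team labels `E1`..`E4` and `X1`..`X4` and the helper names are made up for illustration.

```python
from collections import Counter


def pairings(teams):
    """Yield every way to split an even-sized list of teams into unordered pairs."""
    if not teams:
        yield []
        return
    first, rest = teams[0], teams[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def english_pairs(draw):
    """Number of pairs in which both teams are English (label starts with 'E')."""
    return sum(1 for a, b in draw if a.startswith("E") and b.startswith("E"))


teams = ["E1", "E2", "E3", "E4", "X1", "X2", "X3", "X4"]
counts = Counter(english_pairs(draw) for draw in pairings(teams))
total = sum(counts.values())

for n in (2, 1, 0):
    print(f"{n} English pair(s): {counts[n]}/{total} = {100 * counts[n] / total:.2f}%")
```

The same counts follow by hand: 3 × 3 = 9 ways for two English pairs, 6 × 12 = 72 for exactly one, and 4! = 24 for none.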

Football is just business

Written on 1 March 2019, 05:08pm

I recently wrote a post on a Romanian sports site about trying to apply an IT process to find the root cause of a football problem. I will try to make a summary here since the post that I wrote is in Romanian.

It’s about Liverpool FC and it essentially starts from 4 facts about Premier League (PL) football:

  1. Modern football is a business. There are revenue streams (media, commercial, match-day), expenses (squad, facilities, etc), assets (the players and the staff) and risk management governing the entire process. In order to survive, a business needs to turn a profit.
  2. For the Top 6 PL teams, the main source of revenue is participation in the Champions League (CL). The difference between finishing 1st and 6th in the PL is a few million pounds (basically peanuts), while missing out on the Top 4 (which ensures CL participation) could have a significant financial impact. Example – Liverpool reaching the CL final last season meant that their profits tripled compared to the previous year.
  3. Every business has a vision, and a strategy for implementing the vision. The vision means the desired future position, while the strategy is a long-term plan to implement the vision. To implement the overall strategic plan, a shorter-term, tactical plan might be needed – easier to monitor and coordinate.
  4. Higher ambitions mean higher risks. It all depends on the risk appetite of the business.

With these facts in mind, I try to make a root cause analysis of why Liverpool seems to have lost pace recently. Many supporters see the recent transfer window as a missed opportunity. Despite several injuries, in January 2019 the club was still in 1st place with a lead of several points. You would expect a club to strengthen from a position of strength. It didn’t, and the takeaway is that the vision of the club is different from the vision of the supporters. While the supporters aim to win the PL, the club’s vision is to maintain sustainable growth and sound financial management. This can be done by remaining in the Top 4 (virtually achieved at this stage of the season) and staying in the CL for as long as possible. Aiming for first place would involve bringing in new players, which introduces additional costs and risks. Higher ambitions mean higher risks, which are not necessarily accepted by the business.

In these conditions, winning the PL would be simply a happy side-effect.

In the end, I touch on two more things. First, the realization above made me take a more relaxed approach to supporting Liverpool FC. I understand that my expectations are not necessarily aligned with the club’s priorities, and therefore I have to manage these expectations. For the first time in the last few years, last weekend I decided to skip watching a Liverpool game and enjoy some time with the family:

Time to chill

The second point is that I am getting a little bit annoyed with the extreme optimists among the supporters on social media. This tweet is spot on:

Nobody is denying the progress Liverpool made since Klopp took over. I fully support him and I think he’s a perfect match for the club. In any other season, having 66 points after 27 games would virtually guarantee you winning the league.
But that doesn’t mean that we can pretend things are going great. Liverpool won a single game out of the last 5, and it’s definitely not the right time to celebrate being top of the league, a mere point ahead of second place, when all the bookmakers and predictive models predict that Liverpool will finish second. Hypothetical statements such as ‘if you had known this at the beginning of the season…‘ don’t go anywhere and tend to focus on the past rather than the future.
This is political correctness applied to football.

Conclusion: doing a root cause analysis using the 5 whys to find why Liverpool is no longer the favorite to win the PL leads to the following results:

  1. Why? Because the squad is not good enough to fight the financial giant currently in 2nd place
  2. Why? Because there are big differences between the starting XI and the bench
  3. Why? Because the club did not bring in enough players during the last 2 transfer windows
  4. Why? Because it was not needed; finishing in Top 4 is virtually guaranteed
  5. Why? Because football is a business and trophies are just a caprice of the supporters

UEFA CL draw probabilities – 2018 edition

Written on 19 December 2018, 06:44pm

This is a follow-up to https://colorblindprogramming.com/round-probabilities-before. Last year I stopped after discovering that the only correct way to calculate the odds is to look at the probability trees. This year I took this one step further and created a script that calculates the correct probabilities. I intend to reuse this script for future draws, and a year is a long time for my memory, so I am adding some notes here.

The incorrect approach: the big-bowl

The first approach last year was to calculate all the possible pairs, eliminate the invalid ones and then calculate the associated percentages for each pair. In hindsight, this approach was obviously wrong, because it doesn’t replicate the actual draw. This approach would only be accurate if the draw consisted of a single draw – from a very big bowl of all the valid options. This is obviously not how the actual draw works, so even if the final numbers were pretty close to the correct ones, it was not the correct approach.  

The correct approach, using conditional probabilities

The correct way to look at this is by understanding that we are talking about dependent events. Each draw depends on the actual result of the previous draw. It’s identical to this process, beautifully explained on MathIsFun.com:

So how do we actually do it?

There are two approaches:
The first one is a bit more complicated and implies creating the tree above for the 16 teams and 16 steps (each team pick is a step). It has the advantage of producing exact results, but it’s a bit more difficult to implement.
The second one consists of simulating the draw process and repeating it many times. I found this approach easier. Here is the pseudo-code of the draw process:

  1. for each unseeded team
  2. if there is a mandatory draw (starting from the 5th unseeded team)
    1. then automatically create the pair and add it to the draw
  3. otherwise, pick a random unseeded team
    1. get the list of available seeded teams
    2. randomly pick a seeded team from the list above
    3. add pair to the draw
  4. end
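A minimal Python sketch of such a simulation follows. It is not the original script, and it swaps in a different safeguard: instead of detecting mandatory pairs, every candidate opponent is filtered through a hypothetical `completable` helper that checks by backtracking whether the remaining teams can still all be paired. The team data (country and group) is filled in from the 2018/19 group stage as an assumption for illustration, and names such as `simulate_draw` and `estimate` are made up here.

```python
import random
from collections import Counter

# (name, country, group): assumed 2018/19 round-of-16 line-up, for illustration only.
SEEDED = [
    ("Dortmund", "GER", "A"), ("Barcelona", "ESP", "B"), ("PSG", "FRA", "C"),
    ("Porto", "POR", "D"), ("Bayern", "GER", "E"), ("City", "ENG", "F"),
    ("Real", "ESP", "G"), ("Juventus", "ITA", "H"),
]
UNSEEDED = [
    ("Atletico", "ESP", "A"), ("Tottenham", "ENG", "B"), ("Liverpool", "ENG", "C"),
    ("Schalke", "GER", "D"), ("Ajax", "NED", "E"), ("Lyon", "FRA", "F"),
    ("Roma", "ITA", "G"), ("United", "ENG", "H"),
]


def allowed(a, b):
    """Two teams may meet only if they differ in both country and group."""
    return a[1] != b[1] and a[2] != b[2]


def completable(unseeded, seeded):
    """True if every remaining unseeded team can still get a valid opponent."""
    if not unseeded:
        return True
    team, rest = unseeded[0], unseeded[1:]
    return any(
        allowed(team, s) and completable(rest, [t for t in seeded if t != s])
        for s in seeded
    )


def simulate_draw(rng):
    """Simulate one draw: random unseeded team, then a random admissible opponent."""
    unseeded, seeded = list(UNSEEDED), list(SEEDED)
    draw = []
    while unseeded:
        team = rng.choice(unseeded)
        unseeded.remove(team)
        # Keep only opponents that do not leave the other teams in a dead end.
        options = [
            s for s in seeded
            if allowed(team, s) and completable(unseeded, [t for t in seeded if t != s])
        ]
        opponent = rng.choice(options)
        seeded.remove(opponent)
        draw.append((team[0], opponent[0]))
    return draw


def estimate(n_draws, seed=0):
    """Estimate pair probabilities as relative frequencies over n_draws simulations."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(n_draws):
        tally.update(simulate_draw(rng))
    return {pair: count / n_draws for pair, count in tally.items()}


probabilities = estimate(2_000)
for (team, opponent), p in sorted(probabilities.items()):
    if team == "Liverpool":
        print(f"{team} v {opponent}: {p:.1%}")
```

Because the feasibility check runs before a pair is committed, this variant never needs to undo a pick; the price is a small backtracking search per candidate, which is cheap for 8 pairs.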

Repeating this process a few million times produces millions of simulated draws, and the percentage for each pair is simply how often it appears in them.

But there are 2 catches:
1. Checking both sides of the draw. Have a look at step 2 above, checking if there is a mandatory draw: let’s say you are left with 4 unseeded teams and 4 seeded teams. It’s not enough to look at the options of the unseeded teams, you also need to look the other way around. Example:
Unseeded teams: Liverpool, United, Schalke, Lyon
Seeded teams: PSG, City, Real, Barcelona
Liverpool has 2 options, United 3, Schalke 4 and Lyon 2. But if you randomly pick Schalke and pair it with any of PSG, Real or Barcelona, then you leave an impossible draw for City, which cannot be drawn against any of the 3 teams left (Liverpool and United are English, and Lyon comes from City’s group). So the solution is to count the number of options for both unseeded and seeded teams. If a team has a single option, pick it.
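The two-sided check can be sketched in a few lines of Python. The country and group of each team are assumptions filled in from the 2018/19 group stage (the text above only gives the option counts), and `options` and `forced_pairs` are hypothetical helper names.

```python
# (name, country, group): assumed 2018/19 data, for illustration only.
unseeded = [("Liverpool", "ENG", "C"), ("United", "ENG", "H"),
            ("Schalke", "GER", "D"), ("Lyon", "FRA", "F")]
seeded = [("PSG", "FRA", "C"), ("City", "ENG", "F"),
          ("Real", "ESP", "G"), ("Barcelona", "ESP", "B")]


def allowed(a, b):
    """Two teams may meet only if they differ in both country and group."""
    return a[1] != b[1] and a[2] != b[2]


def options(teams, opponents):
    """Map each team name to the names of the opponents it may still face."""
    return {t[0]: [o[0] for o in opponents if allowed(t, o)] for t in teams}


def forced_pairs(unseeded, seeded):
    """Pairs (unseeded, seeded) that are mandatory, looking from both sides."""
    forced = []
    for team, opts in options(unseeded, seeded).items():
        if len(opts) == 1:
            forced.append((team, opts[0]))
    for team, opts in options(seeded, unseeded).items():
        if len(opts) == 1 and (opts[0], team) not in forced:
            forced.append((opts[0], team))
    return forced


print({name: len(opts) for name, opts in options(unseeded, seeded).items()})
# {'Liverpool': 2, 'United': 3, 'Schalke': 4, 'Lyon': 2}
print(forced_pairs(unseeded, seeded))
# [('Schalke', 'City')]
```

Looking only at the unseeded side, nothing seems forced; the mandatory Schalke v City pair only shows up when the seeded teams are counted as well.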

2. Go back if needed. Even with the above safety mechanism in place things can still go wrong. Example:
Unseeded teams: Roma, Liverpool, Schalke, Lyon
Seeded teams: Porto, Barcelona, PSG, City
Options for the unseeded teams: Roma -4, Liverpool -2, Schalke -3, Lyon -2.
Options for the seeded teams: Porto -3, Barcelona -4, PSG -2, City -2.
The safety mechanism above (counting the number of options for both seeded and unseeded teams) tells us that everything is fine. So we go ahead and pair Roma with Porto. We are now left with:
Unseeded: Liverpool -1, Schalke -3, Lyon -1
Seeded: Barcelona -3, PSG -1, City -1.
The problem is that both PSG and City have a single option, and it is the same one: Schalke. This leads to an impossible draw, so the solution in this case is to go back one step and pick another pair instead of Roma v Porto.
According to my calculations this could happen in about 0.4% of cases, and I am really curious how UEFA would handle it if it happened on stage. In the scenario above, if Roma was selected as the unseeded team, I expect that the computer would only allow PSG and City as possible opponents, but I am really curious to hear the hosts’ explanation of this constraint (since both Porto and Barcelona are, at first sight, also valid options for Roma) 🙂
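One way to avoid stepping back is to look ahead before committing to a pair. The Python sketch below does this for the Roma example; it is an alternative to the go-back approach, not the original script. The country and group values are assumptions taken from the 2018/19 group stage, and `completable` and `admissible_opponents` are hypothetical helpers.

```python
# (name, country, group): assumed 2018/19 data, for illustration only.
roma = ("Roma", "ITA", "G")
others = [("Liverpool", "ENG", "C"), ("Schalke", "GER", "D"), ("Lyon", "FRA", "F")]
seeded = [("Porto", "POR", "D"), ("Barcelona", "ESP", "B"),
          ("PSG", "FRA", "C"), ("City", "ENG", "F")]


def allowed(a, b):
    """Two teams may meet only if they differ in both country and group."""
    return a[1] != b[1] and a[2] != b[2]


def completable(unseeded, seeded):
    """True if every remaining unseeded team can still get a valid opponent."""
    if not unseeded:
        return True
    team, rest = unseeded[0], unseeded[1:]
    return any(
        allowed(team, s) and completable(rest, [t for t in seeded if t != s])
        for s in seeded
    )


def admissible_opponents(team, other_unseeded, seeded):
    """Opponents for `team` that still leave a valid draw for everyone else."""
    return [
        s[0] for s in seeded
        if allowed(team, s)
        and completable(other_unseeded, [t for t in seeded if t != s])
    ]


# Roma v Porto looks valid in isolation but leaves the others stuck.
print(completable(others, [t for t in seeded if t[0] != "Porto"]))  # False
print(admissible_opponents(roma, others, seeded))  # ['PSG', 'City']
```

The look-ahead rules out both Porto and Barcelona for Roma, which matches the expectation above that only PSG and City would be offered.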

Using the algorithm above, I ran the simulation 2 million times. These are the results:

Checking the results

The nice thing about being both a geek and a football lover is that you get to know smart people at the intersection of science and football. Two of them are Julien Guyon and Emmanuel Syrmoudis. They also spent time thinking about this topic. Julien came up with a great explanation of the draw process and probabilities, while Emmanuel went one step further and actually created an interactive draw simulator.

My results come pretty close to theirs, so I’m quite confident that my method is decent enough. I plan to reuse it next year and, perhaps, also try to create the actual probability tree to get the exact percentages.
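As a starting point for that probability tree, here is a small Python sketch on a made-up 3 v 3 example (teams `A`, `B`, `C` against `X`, `Y`, `Z`, with an invented `ALLOWED` table, not real draw data). It walks every branch of the step-by-step draw with exact fractions and compares the result with the big-bowl approach of picking uniformly among all valid complete draws. The rule that an opponent is only offered if the draw can still be completed is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical constraints: which seeded teams each unseeded team may face.
ALLOWED = {"A": {"X", "Y"}, "B": {"X", "Y", "Z"}, "C": {"Y", "Z"}}


def without(items, item):
    return tuple(i for i in items if i != item)


def completable(unseeded, seeded):
    """True if every remaining unseeded team can still get a valid opponent."""
    if not unseeded:
        return True
    team, rest = unseeded[0], unseeded[1:]
    return any(s in ALLOWED[team] and completable(rest, without(seeded, s))
               for s in seeded)


def tree_probabilities(unseeded, seeded):
    """Exact pair probabilities of the sequential draw, summed over all branches."""
    probs = {}

    def walk(unseeded, seeded, weight):
        for team in unseeded:  # each remaining team is equally likely to be drawn
            rest = without(unseeded, team)
            opts = [s for s in seeded
                    if s in ALLOWED[team] and completable(rest, without(seeded, s))]
            for s in opts:  # each admissible opponent is equally likely
                w = weight / (len(unseeded) * len(opts))
                probs[(team, s)] = probs.get((team, s), 0) + w
                walk(rest, without(seeded, s), w)

    walk(tuple(unseeded), tuple(seeded), Fraction(1))
    return probs


def big_bowl_probabilities(unseeded, seeded):
    """Pair probabilities if one complete valid draw were picked uniformly."""
    valid = [list(zip(unseeded, perm)) for perm in permutations(seeded)
             if all(s in ALLOWED[u] for u, s in zip(unseeded, perm))]
    probs = {}
    for draw in valid:
        for pair in draw:
            probs[pair] = probs.get(pair, 0) + Fraction(1, len(valid))
    return probs


tree = tree_probabilities("ABC", "XYZ")
bowl = big_bowl_probabilities("ABC", "XYZ")
print(tree[("A", "X")], bowl[("A", "X")])  # 23/36 2/3
```

Even on this tiny example the two methods disagree (23/36 versus 24/36 for A v X), which is the same effect that made the big-bowl numbers close but not correct. For the full 16-team draw the same walk would need memoization over the sets of remaining teams to stay fast.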