The fun part of analysis, at least to me, is to make predictions. Since the new season starts next week, I'll try to predict the final standings at the end of the season with my algorithm.
Most prediction algorithms out there evaluate a team's playing strength based on its performance in previous seasons. Because the team is the atomic unit in these models, they can't easily take new transfers into account. Goalimpact evaluates players and thus can, in principle, account for team changes due to transfers. However, this causes other headaches. Most teams have 22 or more players to choose from, but some of them, often even many, will get only a few minutes of playing time in a season. A team's playing strength mainly depends on a subset of the squad, maybe 15 or 16 players.
If I'm going to predict team results without knowing the XI that actually plays, I have to guess which players will take part in the game. In this case I even need to guess which players will influence a team the most over the whole season. This can get very subjective quickly. My usual way around this issue is to use minute-weighted average values from past games. This works quite well during a season, but I can't calculate it before the season has even started. Newly bought players obviously haven't had any playing time yet and thus would get a weight of zero. My prediction would be based on a distorted estimate of the team composition.
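To make the zero-weight problem concrete, here is a minimal sketch of a minute-weighted team average. The `Player` class, the player names and all numbers are made up for illustration; this is not the actual Goalimpact code.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    goalimpact: float  # hypothetical player rating
    minutes: int       # minutes played for this team in past games


def minute_weighted_average(players):
    """Average rating, weighting each player by minutes played."""
    total_minutes = sum(p.minutes for p in players)
    if total_minutes == 0:
        raise ValueError("no playing time recorded for this squad")
    return sum(p.goalimpact * p.minutes for p in players) / total_minutes


squad = [
    Player("Regular A", 110.0, 3000),
    Player("Regular B", 100.0, 2000),
    Player("New signing", 130.0, 0),  # no minutes for the new club yet
]

# The new signing has weight zero, so the result is the same with or without him.
print(minute_weighted_average(squad))      # 106.0
print(minute_weighted_average(squad[:2]))  # 106.0
```

However strong the new signing is, he leaves the team value unchanged until he has collected minutes for his new club.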
An alternative approach I considered was to use the starting eleven predicted by LigaInsider. They provide quite accurate predictions for each Bundesliga match day, for example a predicted starting XI for Werder Bremen.
However, this has other disadvantages. The estimate is for the next match day only, so it may or may not be a good prediction of the main XI for the entire season. The main XI will be vague to some extent this early in a season in any case. Probably not even the coach knows for sure which players will get how much playing time over the season. Coaches are likely to have a rough idea and a fixed core of six to eight players, but too many things can't be projected. So even though LigaInsider is doing a great job, no single XI they pick could be correct for the whole season. Actually, they don't even try to do this. As they pick the likely players for the next match only, some players are excluded because they suffer from a minor illness. A prediction of the season's XI might well still include them.
To get around the need to pick players, in the following prediction I simply use the average of all players that have been nominated for the first team as of now. Doing so causes a downward bias in the estimates of the teams' Goalimpacts, because the players who actually play are in most cases the ones with higher Goalimpacts. The hope would be that the bias is about equal for all teams, but this is not the case. Some teams have a strong core but weaker players otherwise. Other teams, in contrast, have Goalimpacts that are rather evenly distributed over all 22 players. So, unfortunately, I'll have a bias due to this averaging, but I think it is still the best way to avoid introducing arbitrary selections of players. And, I admit, it has the charm of being easily done.
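A small sketch of why this bias differs between teams. The two squads and their values are invented for illustration, and taking the eleven highest-rated players as the "core" is my simplifying assumption, not something the algorithm does.

```python
def squad_average(values):
    """Plain average over every nominated player."""
    return sum(values) / len(values)


def core_average(values, core_size=11):
    """Average over the core_size highest-rated players only."""
    core = sorted(values, reverse=True)[:core_size]
    return sum(core) / len(core)


# Two hypothetical 22-player squads with the same plain average.
top_heavy = [120.0] * 11 + [90.0] * 11  # strong core, weaker bench
balanced = [105.0] * 22                 # evenly distributed ratings

for name, squad in [("top-heavy", top_heavy), ("balanced", balanced)]:
    bias = squad_average(squad) - core_average(squad)
    print(f"{name}: squad {squad_average(squad):.1f}, "
          f"core {core_average(squad):.1f}, bias {bias:+.1f}")
```

Both squads get the same value of 105 from plain averaging, although the top-heavy team fields a much stronger eleven. Averaging therefore understates top-heavy teams and flatters evenly built ones relative to them.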
So this is the table with the predicted final standings for the Bundesliga this season.
| No. | Team | Goalimpact | Points | Goal Diff | Bwin Rank | ClubElo | Euro Club Index | Last Year |
|---|---|---|---|---|---|---|---|---|
| 1 | Bayern München | 139.8 | 84.7 | +64.8 | 1 | 1 | 1 | 1 |
| 2 | Borussia Dortmund | 119.8 | 60.2 | +23.1 | 2 | 2 | 2 | 2 |
| 3 | FC Schalke 04 | 119.0 | 59.2 | +21.3 | 3 | 4 | 4 | 4 |
| 4 | Bayer Leverkusen | 113.8 | 52.9 | +10.6 | 4 | 3 | 3 | 3 |
| 5 | VfL Wolfsburg | 112.3 | 50.9 | +7.3 | 5 | 7 | 8 | 11 |
| 6 | VfB Stuttgart | 107.5 | 45.0 | -2.8 | 6 | 13 | 7 | 12 |
| 7 | Hannover 96 | 106.1 | 43.4 | -5.6 | 10 | 8 | 6 | 9 |
| 8 | 1. FSV Mainz 05 | 105.7 | 42.9 | -6.4 | 13 | 11 | 11 | 13 |
| 9 | Bor. Mönchengladbach | 105.6 | 42.7 | -6.7 | 6 | 6 | 5 | 8 |
| 10 | Hertha BSC | 105.4 | 42.5 | -7.1 | 12 | 14 | 13 | (17) |
| 11 | 1899 Hoffenheim | 105.3 | 42.4 | -7.3 | 13 | 16 | 16 | 16 |
| 12 | Eintracht Braunschweig | 105.0 | 42.0 | -7.9 | 18 | 18 | 18 | (18) |
| 13 | SC Freiburg | 104.6 | 41.5 | -8.8 | 13 | 5 | 9 | 5 |
| 14 | Hamburger SV | 103.6 | 40.3 | -10.8 | 8 | 10 | 10 | 7 |
| 15 | 1. FC Nürnberg | 103.5 | 40.2 | -11.0 | 16 | 9 | 12 | 10 |
| 16 | Werder Bremen | 101.2 | 37.4 | -15.8 | 11 | 17 | 15 | 14 |
| 17 | Eintracht Frankfurt | 100.7 | 36.8 | -16.8 | 9 | 12 | 14 | 6 |
| 18 | FC Augsburg | 99.2 | 35.0 | -19.9 | 17 | 15 | 17 | 15 |
For comparison, I added the rank implied by the Bwin odds and the current rank according to ClubElo and the Euro Club Index. The first four teams are the same in all predictions. This doesn't come as a surprise, as they are also the first four of last season. The only deviation is that Bwin and Goalimpact put Schalke above Leverkusen, while ClubElo and the Euro Club Index keep last season's order. But opinions diverge a lot on many of the other ranks.
Goalimpact predicts Wolfsburg to finish 5th and Stuttgart 6th. Interestingly, this is identical to Bwin's prediction, although both teams were nowhere close to such a good rank in the previous season. The Euro Club Index has a similar rank for both, but it sees Hannover and Mönchengladbach as stronger and thus puts Stuttgart and Wolfsburg on 7 and 8. ClubElo shares the view of a strong Wolfsburg, albeit on rank 7, but predicts Stuttgart to finish even below last year's disappointing rank 12.
All three statistical measures see Hannover finishing slightly higher than last year's rank 9, on ranks 6 to 8, but Bwin puts them a rank lower, on 10. Similarly, all statistics-based predictions see Mainz heading for a better season than last year's rank 13. Goalimpact is the most optimistic with rank 8; the others put Mainz on 11. Bwin sees no improvement over last year.
The prediction for newly promoted teams is particularly difficult, because they played few games, if any, against the other teams last season. The difference between the leagues is significant, and many promoted teams face relegation again the very next season. This is, in fact, the prediction for Eintracht Braunschweig. ClubElo, the Euro Club Index, and Bwin see them as a clear number 18. If you look at the score values and odds, they are predicted to finish last by quite a margin. Goalimpact is more optimistic and ranks them 12th. Their first eleven is not outstanding here either, but the other players are not much worse than the team's stars. It might be that Goalimpact is biased upwards here. The other freshly promoted team, Hertha BSC Berlin, is predicted to be safe in the middle of the table by all sources. They should end up between rank 10 (GI) and 14 (ClubElo).
Looking at the lower end of the table, Goalimpact predicts Bremen, Frankfurt and Augsburg as the relegated teams. Frankfurt in particular is disputed by the other approaches. They all predict a lower rank than last year's rank 6, too, but they see Frankfurt ending up in the no man's land between ranks 9 and 14. Bremen is seen as a relegation candidate by the club-based algorithms, too. Bwin is much more optimistic here and predicts rank 11. Augsburg is a likely relegation team in all rankings. ClubElo offers the last spark of hope by predicting Augsburg to repeat last year's rank 15. 1899 Hoffenheim is predicted to be relegated by both of the club-based approaches. Goalimpact and Bwin, in contrast, both predict a final rank in the middle of the table (11-13).
We will only know with hindsight which prediction was closest to reality. However, we can already take a short look at how the predictions relate to each other by looking at their correlations.
| | Goalimpact | Bwin Rank | ClubElo | Euro Club Index | Last Year |
|---|---|---|---|---|---|
| Goalimpact | 100% | 78% | 69% | 83% | 50% |
| Bwin Rank | | 100% | 75% | 87% | 75% |
| ClubElo | | | 100% | 92% | 91% |
| Euro Club Index | | | | 100% | 82% |
| Last Year | | | | | 100% |
We can see that the two club-based measures are very highly correlated (92%) and also show comparably high correlations with last year's ranks (91% and 82%). The lower the correlation with last year's final rank, the braver (but not necessarily better) the prediction. ClubElo's 91% makes it close to the naive estimate that everything stays as it was. Bwin (75%) and Goalimpact (50%) were bolder in moving away from last year's standings. Whether that was too bold, we will know a year from now.
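For anyone who wants to check these numbers, here is a minimal sketch that recomputes the correlations with last year's ranks from the rank columns of the standings table. It assumes the figures are plain Pearson correlations of the ranks, which is my reading and not something stated above; the helper `pearson` is written out here only to keep the example self-contained. The Bwin column is left out because its implied ranks contain ties.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Rank columns copied from the standings table, in table order.
goalimpact = list(range(1, 19))
clubelo = [1, 2, 4, 3, 7, 13, 8, 11, 6, 14, 16, 18, 5, 10, 9, 17, 12, 15]
euro_club_index = [1, 2, 4, 3, 8, 7, 6, 11, 5, 13, 16, 18, 9, 10, 12, 15, 14, 17]
last_year = [1, 2, 4, 3, 11, 12, 9, 13, 8, 17, 16, 18, 5, 7, 10, 14, 6, 15]

for name, ranks in [("Goalimpact", goalimpact),
                    ("ClubElo", clubelo),
                    ("Euro Club Index", euro_club_index)]:
    print(f"{name} vs. last year: {pearson(ranks, last_year):.0%}")
```

Under that assumption the three values come out at about 50%, 91% and 82%, in line with the last column of the correlation table.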