Much better than Elo, Glicko, and TrueSkill

Theodoros Kiriakopoulos · July 25, 2021

A team or player with 20 wins in 20 games is better than one with 40 wins and 20 losses in 60 games, even though both have wins-losses=20. Likewise, a team or player with 200 wins in 200 games is better than one with 1200 wins and 1000 losses in 2200 games, even though both have wins-losses=200. Therefore the correct rating is not wins-losses but (wins-losses)/(number of games)=(w-l)/g, which ranks players in the same order as (wins+0.5draws)/(number of games)=(w+0.5d)/g, because (w+0.5d)/g=0.5+0.5(w-l)/g. But a team or player with (1+0.5*0)/1=1 is almost certainly much weaker than one with (90+0.5*0)/100=0.9. The (w+0.5d)/g rating therefore needs two modifications: one that corrects this small-sample mistake, and another that takes into account the (w+0.5d)/g of the opponents faced, the (w+0.5d)/g of those opponents' opponents, and so on.
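As a minimal Python sketch of the two quantities discussed above (the function names `score_rate` and `win_loss_margin` are illustrative, not taken from any rating library), the following computes (w+0.5d)/g and (w-l)/g for the records mentioned and checks that one is a linear rescaling of the other:

```python
def score_rate(wins, draws, losses):
    """Return (w + 0.5d) / g for a win-draw-loss record."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def win_loss_margin(wins, draws, losses):
    """Return (w - l) / g for a win-draw-loss record."""
    games = wins + draws + losses
    return (wins - losses) / games


# Records from the post, written as (wins, draws, losses).
records = {
    "20 wins in 20 games": (20, 0, 0),
    "40 wins, 20 losses": (40, 0, 20),
    "1 win in 1 game": (1, 0, 0),
    "90 wins, 10 losses": (90, 0, 10),
}

for label, record in records.items():
    rate = score_rate(*record)
    margin = win_loss_margin(*record)
    # The two measures always order players the same way.
    assert abs(rate - (0.5 + 0.5 * margin)) < 1e-12
    print(f"{label}: score rate {rate:.3f}, margin {margin:.3f}")
```

The single-game record scores higher than the 90-10 record here, which is exactly the small-sample problem that the first modification is meant to address.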

Suppose a player chooses to play only against players rated 1500 in Elo or Glicko, and wins every game. If the rating were 1500+7(wins-losses), his rating would keep rising the more games he played. This is a wrong inflation: his rating after 10000 games should be very close to his rating after 1000 games, not (1500+7(10000-0))/(1500+7(1000-0))≈8.4 times higher. Does Elo create such an inflation? To answer this, you need the Elo formula

f(n)=f(n-1)+14(1-1/(10^-((f(n-1)-1500)/400)+1)), f(0)=1500<=>

f(n)=f(n-1)+14(1-1/(10^((1500-f(n-1))/400)+1)),f(0)=1500

where f(n) is the rating after the current game and f(n-1) is the rating before it. If f(10000) is sufficiently higher than f(1000), then Elo has a wrong inflation. Unfortunately, calculators cannot calculate f(10000) and f(1000). However, one can look at a similar formula

f(n)=f(n-1)+14(1-(1-0.5/(0.01(f(n-1)-1500)+1))), f(0)=1500,

which the recursive calculator at

www.calcul.com/show/calculator/recursive?

can evaluate. It gives

f(1000) = 2588.1687720803807

f(10000) = 5143.33386212574

Here THERE IS a wrong inflation.
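Both recurrences can also be iterated directly in a few lines of code, which removes the dependence on an online calculator. The following is a minimal Python sketch under the assumptions stated in the post (every game is a win, the opponent is always rated 1500, K=14); the function names `elo_streak` and `surrogate_streak` are made up for this illustration:

```python
def elo_streak(games, k=14.0, opponent=1500.0, start=1500.0):
    """Rating after `games` straight wins under the Elo recurrence above."""
    rating = start
    for _ in range(games):
        # Expected score of the player against the fixed-rating opponent.
        expected = 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))
        rating += k * (1.0 - expected)
    return rating


def surrogate_streak(games, start=1500.0):
    """Rating after `games` straight wins under the 'similar formula'."""
    rating = start
    for _ in range(games):
        rating += 14.0 * (1.0 - (1.0 - 0.5 / (0.01 * (rating - 1500.0) + 1.0)))
    return rating


for n in (1000, 10000):
    print(n, round(elo_streak(n), 1), round(surrogate_streak(n), 1))
```

Printing both columns side by side lets a reader compare how f(1000) and f(10000) differ under each formula, rather than inferring the behaviour of one from the other.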

Elo and Glicko ratings must be replaced with the “expected score” against a player of average strength, e.g. the one that Sonas has estimated:

https://en.chessbase.com//post/sonas-overall-review-of-the-fide-rating-system-220813/37

which of course is the TRUE rating. But this has already been done! For example, Sonas's table shows that the expected score for a 100-point rating difference is 0.6, so the expected score of a player rated 1542+100=1642 against the average player rated 1542 is 0.6. The same had been done before Sonas's research, using the relation that Elo claimed between rating difference and expected score. Yes, but the new, better rating would get rid of the 1500+- scale and say how to find the “expected score”=(w+0.5d)/g against a player of average strength using only his past score=(w+0.5d)/g, the past score=(w+0.5d)/g of the opponents he faced, the past score=(w+0.5d)/g of those opponents' opponents, and so on. THIS solution should be used instead of Elo and Glicko. Have I found it? Yes, and if my solution is not correct enough, then SOMEONE must find one that is.

**swansont** · July 25, 2021

1 hour ago, Theodoros Kiriakopoulos said:

Suppose a player chooses to play only against players rated Elo or Glicko 1500

What?

Sensei · July 27, 2021

On 7/25/2021 at 10:23 AM, swansont said:

What?

https://en.m.wikipedia.org/wiki/Glicko_rating_system

On 7/25/2021 at 9:11 AM, Theodoros Kiriakopoulos said:

A team or player with 20 wins in 20 games is better than one with 40 wins and 20 losses in 60 games, even though both have wins-losses=20. Likewise, a team or player with 200 wins in 200 games is better than one with 1200 wins and 1000 losses in 2200 games.

You should say "A team or a player who has 20 wins in 20 games may be better than a team or a player who has 40 wins and 20 losses in 60 games"...

...because it matters whom they played and how much experience they gained.

On 7/25/2021 at 9:11 AM, Theodoros Kiriakopoulos said:

Suppose a player chooses to play only against players rated 1500 in Elo or Glicko, and wins every game. If the rating were 1500+7(wins-losses), his rating would keep rising the more games he played. This is a wrong inflation: his rating after 10000 games should be very close to his rating after 1000 games, not (1500+7(10000-0))/(1500+7(1000-0))≈8.4 times higher. Does Elo create such an inflation?

I found an Elo calculator for you:

https://www.omnicalculator.com/sports/elo

Enter 1500 for both players. The result is 1510 vs 1490.

Enter 1510 for the 1st player and 1500 again for the 2nd. The result is 1519.7 vs 1490.3.

Enter 1519.7 for the 1st player and 1500 for the 2nd. The result is 1529.1 vs 1490.6.

(repeat procedure)

Clearly this is not what you claimed. The larger the rating difference between the players, the less the higher-rated player gains and the less the lower-rated player loses (when the results match the ratings).

The conclusion is that you did not pay enough attention when studying rating algorithms.

I suggest using an OpenOffice spreadsheet or Excel or, if you are familiar with one, a programming language.
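For readers who prefer a programming language, the "repeat procedure" step can be sketched in Python as below. This is only an illustration under assumptions: K=20 (the value Omni Calculator appears to use), the opponent reset to 1500 before every game, and every game won by the 1st player. The helper names `expected_score` and `repeat_wins` are made up for this example.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def repeat_wins(steps, k=20.0, start=1500.0, opponent=1500.0):
    """Return [(winner_rating, loser_rating), ...] after each straight win.

    The loser is reset to `opponent` before every game, mirroring the
    manual procedure of re-entering 1500 for the 2nd player.
    """
    history = []
    rating = start
    for _ in range(steps):
        gain = k * (1.0 - expected_score(rating, opponent))
        history.append((rating + gain, opponent - gain))
        rating += gain
    return history


for winner, loser in repeat_wins(3):
    print(round(winner, 1), "vs", round(loser, 1))
```

The first three rows should land close to the calculator results quoted above; small differences can arise because the manual procedure re-enters ratings rounded to one decimal place.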

Edited July 27, 2021 by Sensei

John Cuthber · July 27, 2021

Fred is the worst player on the team. To be honest, it was a mistake recruiting him.

We only put him in to play when there's no choice - we really only do that when we are up against a team who we are sure we can beat.

So Fred only gets to play in matches where he is actually likely to win even though he's a bit rubbish.

So, he's usually on the winning side.

What is Fred's score under the OP's ranking method?

Edited July 27, 2021 by John Cuthber

Sensei · July 27, 2021

@Theodoros Kiriakopoulos

Okay. I created an Elo calculator spreadsheet for you, using the algorithm description from this website:

https://metinmediamath.wordpress.com/2013/11/27/how-to-calculate-the-elo-rating-including-example/

OpenOffice Spreadsheet file:

Elo Calculator.ods

I used a constant K-factor of 20, as that appears to be what Omni Calculator uses. You can change it in the J column.

There also seems to be a question of how to round fractions: floor them, ceil them, or round values >=.5 up.

**swansont** · July 27, 2021

1 hour ago, Sensei said:

https://en.m.wikipedia.org/wiki/Glicko_rating_system

Thank you

___

So this is about a rating system, and somewhat related to applied math.

Well, it is probably true that any rating system is flawed, for the reasons and examples that have been pointed out (they all assume something about the competition, and usually can be gamed).

Theodoros Kiriakopoulos · July 27, 2021

Perhaps this will wake you up. Suppose that you have to place bets on soccer, one bet per game. For example, you have been offered a bonus that would leave you a profit with average luck even if you placed your bets randomly, but you want to maximize your profit: you hope that you have a good enough probability estimate, and you bet only at odds for which this estimate promises an expected profit even without the bonus. You have 2 tables that show the estimated win-draw-loss probabilities for each game. One table shows the estimates when 2 teams with a given Elo or Glicko rating difference face each other, and the other shows the estimates when 2 teams with a given (w+0.5d)/g rating difference face each other.

Suppose that both tables were constructed from the data of 4 countries, with 7 championships (7 years) in each country, i.e. 4*7=28 championships. I have constructed the (w+0.5d)/g table as follows: when a team that earned (w+0.5d)/g=0.65 in its other 37 games of one of the 28 championships played at home against a team with 0.45 and won, I added 1 to the first of the 3 boxes of the cell "0.6 up to 0.7 home vs 0.4 up to 0.5 away". Adding more ones to each box as the data grew, this cell finally shows 168 wins for the home team, 45 draws, and 28 wins for the away team.

The sample in many cells is small, but by adding the values of the appropriate cells you get another table. It shows the estimated win-draw-loss probabilities when 2 teams with a given (w+0.5d)/g rating difference (e.g. 0.65-0.45=0.2 or 0.55-0.35=0.2) face each other. Its sample is much larger, but it refers to a game played at a neutral stadium. The fix is easy: add 0.05 to the rating of the team that will play at home and subtract 0.05 from the rating of the team that will play away (this is concluded somehow), i.e. those 2 teams are treated as having a rating difference of 0.3 rather than 0.2 (the 0.65 team plays at home). The cell for a 0.3 rating difference shows 519-169-85. Thus the estimated probability that a team which gathered 0.65 in the previous games of the present championship (or also of the previous one) wins at home against a team that gathered 0.45 is 519/(519+169+85)≈0.67. Which of the 2 tables would you trust with your money, the Elo (or Glicko) table or the (w+0.5d)/g table?
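The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the names `neutral_difference`, `outcome_probabilities` and `HOME_EDGE` are hypothetical, and the only data used are the 0.05 home adjustment and the 519-169-85 cell quoted in the post.

```python
HOME_EDGE = 0.05  # home-advantage adjustment quoted in the post


def neutral_difference(home_score, away_score, home_edge=HOME_EDGE):
    """Rating difference after adding the edge at home and subtracting it away."""
    return (home_score + home_edge) - (away_score - home_edge)


def outcome_probabilities(home_wins, draws, away_wins):
    """Turn the three counts of a table cell into relative frequencies."""
    total = home_wins + draws + away_wins
    return home_wins / total, draws / total, away_wins / total


# The 0.65 team plays at home against the 0.45 team.
difference = neutral_difference(0.65, 0.45)
# Counts quoted for the cell with a 0.3 rating difference.
p_home, p_draw, p_away = outcome_probabilities(519, 169, 85)

print(round(difference, 2))
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3))
```

A bet would then be placed only when the offered odds multiplied by the matching probability exceed 1, which is the expected-profit condition described at the start of the post.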

Edited July 27, 2021 by Theodoros Kiriakopoulos

**swansont** · July 27, 2021

40 minutes ago, Theodoros Kiriakopoulos said:

Perhaps this will wake you up. Suppose that you have to place bets on soccer, one bet per game. For example, you have been offered a bonus that would leave you a profit with average luck even if you placed your bets randomly, but you want to maximize your profit: you hope that you have a good enough probability estimate, and you bet only at odds for which this estimate promises an expected profit even without the bonus. You have 2 tables that show the estimated win-draw-loss probabilities for each game. One table shows the estimates when 2 teams with a given Elo or Glicko rating difference face each other

Are these rating systems purported to be used for betting purposes? And for soccer? Sensei's link says Glicko is for "games of skill, such as chess and Go"

You use the tools that are best suited to the job. It's much less useful to complain that your hammer sucks at tightening a bolt.

Sensei · July 28, 2021

On 7/25/2021 at 9:11 AM, Theodoros Kiriakopoulos said:

Does Elo create such an inflation? To answer this, you need the Elo formula

f(n)=f(n-1)+14(1-1/(10^-((f(n-1)-1500)/400)+1)), f(0)=1500<=>

f(n)=f(n-1)+14(1-1/(10^((1500-f(n-1))/400)+1)),f(0)=1500

If we have inputs:

Ra, Rb - initial ratings

K - K-factor; typical values are 16, 20, and 32. Check the Wikipedia article on Elo for more details.

W - result: 1 if A wins, 0.5 for a draw, 0 if B wins.

Elo rating calculation algorithm is:

Qa=10^(Ra/400)

Qb=10^(Rb/400)

Ea=Qa/(Qa+Qb)

Eb=Qb/(Qa+Qb)

Ra'=Ra+K*(W-Ea)

Rb'=Rb+K*((1-W)-Eb)

Ra', Rb' - output ratings

So, if A wins with Ra=Rb=1500 and K=20, then Qa=Qb and Ea=Eb=0.5, so A gains 10 points and B loses 10 points.

But if Ra=2000, Rb=1500, and K=20, A gains just 1.1 points and B loses 1.1 points.

If Ra=2500, Rb=1500, and K=20, A gains just 0.1 points and B loses 0.1 points.

Going from Ra=1500 to 2000 takes well over a hundred wins in a row, and we can clearly see that the higher a player's rating, the fewer points each win adds. Looks good to me.
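The algorithm above translates almost line for line into Python. The sketch below is an illustration, not a reference implementation: the function names `elo_update` and `wins_needed` are made up, and the win count assumes that every opponent is rated exactly 1500 and that K=20.

```python
def elo_update(ra, rb, w, k=20.0):
    """Return (Ra', Rb') after one game; w is 1, 0.5 or 0 from A's side."""
    qa = 10.0 ** (ra / 400.0)
    qb = 10.0 ** (rb / 400.0)
    ea = qa / (qa + qb)  # expected score of A
    eb = qb / (qa + qb)  # expected score of B
    return ra + k * (w - ea), rb + k * ((1.0 - w) - eb)


def wins_needed(target, start=1500.0, opponent=1500.0, k=20.0):
    """Count straight wins against `opponent`-rated players to reach `target`."""
    rating, wins = start, 0
    while rating < target:
        rating, _ = elo_update(rating, opponent, 1.0, k)
        wins += 1
    return wins


for ra in (1500.0, 2000.0, 2500.0):
    new_a, new_b = elo_update(ra, 1500.0, 1.0)
    print(ra, round(new_a - ra, 1), round(new_b - 1500.0, 1))

print(wins_needed(2000.0))
```

Changing `k` or `opponent` shows how sensitive the length of the required winning streak is to those two choices.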

Edited July 28, 2021 by Sensei

Theodoros Kiriakopoulos · July 29, 2021

One soccer championship has 15 teams, and another has 30. The strongest team of the first championship and the strongest team of the second are of the same real strength (suppose God revealed this to you), and both attained the same (w+0.5d)/g=0.8. The distribution of real strengths and the distribution of (w+0.5d)/g among the teams are also the same in the 2 championships. The next game in each championship is the strongest team against a team of average strength. According to my data, the "expected score" of the strongest team against a team of average strength is ~0.7815 in both of these games. Now, the strongest team of the second championship has a higher Elo and Glicko rating than the strongest team of the first, whereas the 2 average-strength teams in these 2 games have the same Elo and Glicko rating. So, according to Elo and Glicko, the strongest team of the second championship has a greater "expected score" in its next game than the strongest team of the first championship has in its own, just because it has played more games. Isn't that so? Well, this is simply WRONG.

And regarding the chess example I gave, it does not "look good" to me at all. You have NOT calculated the rating after 1000 consecutive wins and the rating after 10000 consecutive wins, whereas I HAVE calculated these 2 ratings for a rating system very similar to Elo, and wrong inflation appears again.

If you do not know what "expected score" is, read Wikipedia's article on Elo rating.

Theodoros Kiriakopoulos · July 30, 2021

You should answer: of course Elo gives more points the more games you play, since after 1 game and 1 win you have e.g. 1507 and after 2 games and 2 wins you have 1514, but you SHOULD get more points, because a player with 2 wins in 2 games is most probably stronger than a player with 1 win in 1 game. So you are saying that Elo calculated the correct solution to this by choosing the number 1500 and K=?. Suppose so. First, why then did I find wrong inflation in the rating system very similar to Elo, with the example of 1000 vs 10000 consecutive wins? Second, why then did chess sites abandon Elo for Glicko, which adds (for a win) and subtracts (for a loss) huge numbers of points in your very first games and then about 7 points on average (7 at lichess.org)? Glicko appears to be a wrong solution again, because in your first games you might get lucky or unlucky, i.e. a player of truly average strength might have e.g. 9 wins or 9 losses in his first 10 games.
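How often such a lucky or unlucky start happens can be estimated with a short Python sketch. It rests on simplifying assumptions that are not in the post: no draws, independent games, and a win probability of exactly 0.5 per game for the average player. The function name `prob_exact_wins` is made up for this illustration.

```python
from math import comb


def prob_exact_wins(games, wins, p_win=0.5):
    """Binomial probability of exactly `wins` wins in `games` independent games."""
    return comb(games, wins) * p_win**wins * (1.0 - p_win) ** (games - wins)


# Exactly 9 wins in the first 10 games (by symmetry, the same as 9 losses).
p_nine = prob_exact_wins(10, 9)
# 9 or more wins in the first 10 games.
p_nine_or_more = p_nine + prob_exact_wins(10, 10)

print(p_nine, p_nine_or_more)
```

Under these assumptions the result is the chance that a single average player starts 9-1; across a large pool of new players, some such starts are to be expected.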

Edited July 30, 2021 by Theodoros Kiriakopoulos
