Partial wins - accounting for luck, other factors · vivekjoshy/openskill.py · Discussion #104

jkant
Aug 6, 2023

I am interested in the TrueSkill/OpenSkill ranking system. In particular, the team-based approach is very useful for what I am trying to do.

But I am confused about its treatment of game outcomes. Whereas Elo or Glicko let you provide outcome scores anywhere within the 0-1 range (which is important for my use case), TrueSkill seems to account only for wins, losses, and draws, and on top of that has its own handling of the likelihood of draws.

Is it possible to simply deal in continuous, "partial" outcomes using this rating system, as it is with Elo or Glicko?
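To make concrete what I mean by a "partial" outcome, here is a minimal sketch of the standard Elo update, where the actual score can be any value in the 0-1 range. The function names and the K-factor of 32 are my own choices for illustration, and this is plain Elo, not anything from TrueSkill or OpenSkill.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0):
    """Return the updated (rating_a, rating_b) pair.

    outcome_a may be any value in [0, 1]: 1.0 is a full win for A, 0.0 a
    full loss, and anything in between is a "partial" result.
    k is an assumed K-factor chosen for illustration.
    """
    if not 0.0 <= outcome_a <= 1.0:
        raise ValueError("outcome_a must be in [0, 1]")
    expected_a = elo_expected(rating_a, rating_b)
    delta = k * (outcome_a - expected_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


print(elo_update(1500.0, 1500.0, 1.0))   # full win:    (1516.0, 1484.0)
print(elo_update(1500.0, 1500.0, 0.75))  # partial win: (1508.0, 1492.0)
```

The partial win moves the ratings half as far as the full win, which is the behaviour I am hoping to reproduce.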

Replies: 2 comments 1 reply

vivekjoshy
Aug 7, 2023
Maintainer

Please see #29 .

0 replies

jkant
Aug 7, 2023
Author

I have actually gotten some interesting results, different from the binary W/L ratings, by passing fractional numbers as ranks. The results are not very predictive, but then again, the TrueSkill/OpenSkill model with non-fractional ranks has so far been less predictive than Glicko or Elo for my use case in general, so I'm not sure whether the fractional ranks are doing what I would like them to do.

As far as data goes, the example given in the linked thread should work well. I understand the response that goals in an 11 v 11 game may not be a good metric because of positional roles, but in a team vs. team context, the final score should be more predictive than a simple win/loss (e.g., Team A beating Team B 7-0 is more predictive of strength than Team A beating Team B 2-1).
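As a rough illustration of the 7-0 versus 2-1 point, a final score could be mapped to an outcome in the 0-1 range like this. The `goal_share` function and its +1 smoothing are purely my own assumption for the sake of the example; neither TrueSkill nor OpenSkill defines such a mapping.

```python
def goal_share(goals_for: int, goals_against: int) -> float:
    """Hypothetical mapping from a final score to an outcome in [0, 1].

    Uses the share of goals scored, with +1 smoothing on each side so that
    a narrow 1-0 result is not treated as a total blowout. This is an
    illustrative assumption, not part of any rating library.
    """
    if goals_for < 0 or goals_against < 0:
        raise ValueError("goal counts must be non-negative")
    return (goals_for + 1) / (goals_for + goals_against + 2)


blowout = goal_share(7, 0)  # 8/9, close to a full win
narrow = goal_share(2, 1)   # 3/5, only slightly better than a draw
print(blowout, narrow)
```

Both results are wins for Team A, but the two outcome values are far apart, which is the information a plain win/loss record throws away.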

In a multiplayer FFA context, kills/deaths should be more predictive than ordinal rank. For example, with Player 1 at 11 kills and 0 deaths, Player 2 at 1 kill and 6 deaths, and Player 3 at 1 kill and 7 deaths, the kill/death figures put Player 1 far ahead of Players 2 and 3, whereas giving their ranks as 1, 2, and 3 would not give a good picture of relative skill.
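The gap is easy to see if the same three players are reduced to ordinal ranks on one hand and to a kill/death differential on the other. This is only a sketch: the differential and the min-max scaling are assumptions I picked for illustration, and the helper names are hypothetical.

```python
def ordinal_ranks(scores: dict) -> dict:
    """Rank players 1, 2, 3, ... by descending score (ties not handled)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def min_max_scaled(scores: dict) -> dict:
    """Scale scores linearly so the worst player is 0.0 and the best is 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Everyone performed identically; treat it as an all-way draw.
        return {name: 0.5 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


# (kills, deaths) for the three players in the example above.
stats = {"Player 1": (11, 0), "Player 2": (1, 6), "Player 3": (1, 7)}

# Assumed performance measure: kills minus deaths.
differential = {name: kills - deaths for name, (kills, deaths) in stats.items()}

print(differential)                  # {'Player 1': 11, 'Player 2': -5, 'Player 3': -6}
print(ordinal_ranks(differential))   # {'Player 1': 1, 'Player 2': 2, 'Player 3': 3}
print(min_max_scaled(differential))
```

The ranks are evenly spaced, while the scaled values put Player 2 and Player 3 almost on top of each other and Player 1 far ahead.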

1 reply

vivekjoshy Aug 9, 2023
Maintainer

> As far as data goes, the example given in the linked thread should work well.

I don't see any datasets in that thread. We need large sets of data with lots of matches to make sure any modifications to the algorithms don't overfit. The data should contain per-player statistics like hours played or number of kills.
