Possibility for parameter for how ratings and win chances adjust for uneven teams #130

spookybear0 · 2024-03-09T21:01:09Z

I'm using openskill for a game where sometimes we have teams of, for example, 6 vs 7. When making teams we put the better players on the team with the lesser amount of players. Openskill estimates are way off results when dealing with uneven teams. It seems that it values extra players much more than the specific game I'm using for it does.

Does anyone have any insight on how to tune a parameter that makes team disparity less important?

Thanks!

vivekjoshy · 2024-03-10T10:48:32Z

Duplicate #29

spookybear0 · 2024-03-10T18:05:48Z

It's not a duplicate, this issue is about teams of different amounts, not about player performance during a game.

spookybear0 · 2024-03-10T22:02:15Z

I decided to multiply the mu of each player on the team with fewer players by a variable factor (I used 1.1) for the win_percent function. It seems to be working fine.

vivekjoshy · 2024-04-08T17:36:56Z

Is your issue solved?

spookybear0 · 2024-04-08T20:28:18Z

Not exactly, I used a temporary fix that isn't very effective. If it's not possible (or no one is willing to implement it) this issue can be closed.

vivekjoshy · 2024-04-09T09:13:17Z

Openskill estimates are way off results when dealing with uneven teams. It seems that it values extra players much more than the specific game I'm using for it does.

Can you provide a reproducible example?

philihp · 2024-04-17T02:14:43Z

Yeah, a concrete example would be helpful here. If you don't like the result of one of the functions, then what should the result be, and why should it be that?

spookybear0 · 2024-04-17T02:37:09Z

I apologize, I've been a little busy. I'll get some examples ready. Thanks for the help.

spookybear0 · 2024-04-17T03:29:08Z

model = PlackettLuce()

async def get_games_with_unbalanced_teams() -> None:
    games = await SM5Game.filter(ranked=True).all()

    print("Getting win chances for unbalanced games. Close game defined as: difference <= 5000 points")

    for game in games:
        red_entity_starts = await game.get_team_entity_starts(Team.RED)
        green_entity_starts = await game.get_team_entity_starts(Team.GREEN)
        red_players = []
        green_players = []

        if not (abs(len(red_entity_starts) - len(green_entity_starts))) > 0:
            continue

        for player in red_entity_starts:
            red_players.append(await Player.filter(entity_id=player.entity_id).first())

        for player in green_entity_starts:
            green_players.append(await Player.filter(entity_id=player.entity_id).first())

        win_chance = get_win_chance(red_players, green_players) # wraps PlackettLuce.predict_win([team1, team2])
        
        score_diff = abs(await game.get_team_score(Team.RED) - await game.get_team_score(Team.GREEN))

        #if score_diff > 5000:
        #    continue

        print(
f"""Win chance for {game.id}: ({(win_chance[0]*100):.2f}%, {(win_chance[1]*100):.2f}%) \
red: {await game.get_team_score(Team.RED)}, \
green: {await game.get_team_score(Team.GREEN)}, \
difference: {score_diff}, \
close: {score_diff <= 5000}, \
team_lengths: {len(red_players)}, {len(green_players)}\
"""
)

Here's an example from my code grabbing all games with uneven teams and displaying their win chances, scores, and team sizes.

Output for close games only. The win percent chance is way off, even though these are only a few examples, even the games that weren't close still have wildly inaccurate win chances. It seems that when the team sizes change, it isn't able to predict the outcome anymore, though the amount each player adds to a team varies by game, so if a solution is implemented it should probably be one where the amount a player contributes to a team is variable (possibly exponentially).

Getting win chances for unbalanced games. Close game defined as: difference <= 5000 points
Win chance for 68: (5.29%, 94.71%) red: 35792, green: 34511, difference: 1281, close: True, team_lengths: 6, 7
Win chance for 114: (88.29%, 11.71%) red: 34134, green: 35412, difference: 1278, close: True, team_lengths: 7, 6
Win chance for 138: (10.62%, 89.38%) red: 31350, green: 27411, difference: 3939, close: True, team_lengths: 5, 6
Win chance for 139: (89.38%, 10.62%) red: 30512, green: 29690, difference: 822, close: True, team_lengths: 6, 5
Win chance for 142: (78.77%, 21.23%) red: 36393, green: 41052, difference: 4659, close: True, team_lengths: 7, 6

Here's an example that includes all games, not only the close ones.

Getting win chances for unbalanced games. Close game defined as: difference <= 5000 points
Win chance for 20: (11.24%, 88.76%) red: 43232, green: 33092, difference: 10140, close: False, team_lengths: 6, 7
Win chance for 21: (78.39%, 21.61%) red: 21950, green: 35752, difference: 13802, close: False, team_lengths: 7, 6
Win chance for 35: (15.78%, 84.22%) red: 37389, green: 27611, difference: 9778, close: False, team_lengths: 5, 6
Win chance for 39: (5.30%, 94.70%) red: 25529, green: 36191, difference: 10662, close: False, team_lengths: 5, 6
Win chance for 40: (5.30%, 94.70%) red: 35550, green: 21289, difference: 14261, close: False, team_lengths: 5, 6
Win chance for 49: (4.73%, 95.27%) red: 15169, green: 30812, difference: 15643, close: False, team_lengths: 5, 6
Win chance for 68: (5.29%, 94.71%) red: 35792, green: 34511, difference: 1281, close: True, team_lengths: 6, 7
Win chance for 84: (14.19%, 85.81%) red: 42752, green: 37593, difference: 5159, close: False, team_lengths: 6, 7
Win chance for 85: (16.37%, 83.63%) red: 36072, green: 23250, difference: 12822, close: False, team_lengths: 6, 7
Win chance for 86: (78.54%, 21.46%) red: 25251, green: 31810, difference: 6559, close: False, team_lengths: 6, 5
Win chance for 114: (88.29%, 11.71%) red: 34134, green: 35412, difference: 1278, close: True, team_lengths: 7, 6
Win chance for 128: (90.52%, 9.48%) red: 19208, green: 32490, difference: 13282, close: False, team_lengths: 6, 5
Win chance for 132: (95.74%, 4.26%) red: 48572, green: 31368, difference: 17204, close: False, team_lengths: 7, 6
Win chance for 138: (10.62%, 89.38%) red: 31350, green: 27411, difference: 3939, close: True, team_lengths: 5, 6
Win chance for 139: (89.38%, 10.62%) red: 30512, green: 29690, difference: 822, close: True, team_lengths: 6, 5
Win chance for 140: (94.89%, 5.11%) red: 32794, green: 18447, difference: 14347, close: False, team_lengths: 7, 6
Win chance for 141: (26.01%, 73.99%) red: 34812, green: 23032, difference: 11780, close: False, team_lengths: 6, 7
Win chance for 142: (78.77%, 21.23%) red: 36393, green: 41052, difference: 4659, close: True, team_lengths: 7, 6

My ratings are defined well with small sigma values due to amount of games and it has been proven to work well with even sized teams.

Let me know if more info is needed. Thanks guys.

philihp · 2024-04-17T04:24:43Z

The win percent chance is way off

What should the win percent chance be, and why should it be that and not anything else?

spookybear0 · 2024-04-17T04:45:09Z

Since these games were so close, the teams were more even and that should be reflected in the win chances. It should be much closer to 50:50 for games that were that close (although the score doesn't always reflect the win chance, it often does)

I've provided some control data below for what the win chances look like for close evenly matched teams. I can also provide data for evenly matched games that aren't close as well if that's needed. As you can see in the data, the win chances are much more reasonable for how close the score is (removing outliers), so it only makes sense for that to be the case for teams of uneven sizes. This is a pretty big difference compared to unevenly matched teams.

Getting win chances for balanced games. Close game defined as: difference <= 5000 points
Win chance for 10: (39.97%, 60.03%) red: 34892, green: 33791, difference: 1101, close: True, team_lengths: 7, 7
Win chance for 12: (60.03%, 39.97%) red: 37213, green: 35872, difference: 1341, close: True, team_lengths: 7, 7
Win chance for 13: (34.57%, 65.43%) red: 30909, green: 29929, difference: 980, close: True, team_lengths: 5, 5
Win chance for 19: (68.61%, 31.39%) red: 33131, green: 31551, difference: 1580, close: True, team_lengths: 6, 6
Win chance for 25: (62.64%, 37.36%) red: 28510, green: 26110, difference: 2400, close: True, team_lengths: 5, 5
Win chance for 30: (42.55%, 57.45%) red: 36732, green: 35091, difference: 1641, close: True, team_lengths: 6, 6
Win chance for 32: (55.20%, 44.80%) red: 30290, green: 30270, difference: 20, close: True, team_lengths: 6, 6
Win chance for 41: (28.73%, 71.27%) red: 30170, green: 33270, difference: 3100, close: True, team_lengths: 5, 5
Win chance for 47: (73.28%, 26.72%) red: 28110, green: 25130, difference: 2980, close: True, team_lengths: 5, 5
Win chance for 48: (56.88%, 43.12%) red: 25248, green: 27170, difference: 1922, close: True, team_lengths: 5, 5
Win chance for 53: (49.90%, 50.10%) red: 36651, green: 37532, difference: 881, close: True, team_lengths: 6, 6
Win chance for 55: (29.17%, 70.83%) red: 33772, green: 31232, difference: 2540, close: True, team_lengths: 6, 6
Win chance for 64: (68.01%, 31.99%) red: 36932, green: 34854, difference: 2078, close: True, team_lengths: 7, 7
Win chance for 76: (49.53%, 50.47%) red: 23510, green: 27610, difference: 4100, close: True, team_lengths: 5, 5
Win chance for 77: (47.55%, 52.45%) red: 37652, green: 33592, difference: 4060, close: True, team_lengths: 6, 6
Win chance for 92: (73.08%, 26.92%) red: 38894, green: 34652, difference: 4242, close: True, team_lengths: 7, 7
Win chance for 95: (44.52%, 55.48%) red: 40593, green: 40434, difference: 159, close: True, team_lengths: 7, 7
Win chance for 100: (47.35%, 52.65%) red: 40792, green: 36472, difference: 4320, close: True, team_lengths: 6, 6
Win chance for 107: (46.89%, 53.11%) red: 24510, green: 26009, difference: 1499, close: True, team_lengths: 5, 5
Win chance for 110: (47.75%, 52.25%) red: 37152, green: 35832, difference: 1320, close: True, team_lengths: 6, 6
Win chance for 111: (47.75%, 52.25%) red: 36692, green: 33312, difference: 3380, close: True, team_lengths: 6, 6
Win chance for 112: (62.95%, 37.05%) red: 28329, green: 24929, difference: 3400, close: True, team_lengths: 5, 5
Win chance for 113: (33.16%, 66.84%) red: 23689, green: 27289, difference: 3600, close: True, team_lengths: 5, 5
Win chance for 116: (50.05%, 49.95%) red: 40192, green: 38654, difference: 1538, close: True, team_lengths: 7, 7
Win chance for 123: (43.48%, 56.52%) red: 31290, green: 31012, difference: 278, close: True, team_lengths: 6, 6
Win chance for 126: (56.52%, 43.48%) red: 37651, green: 32849, difference: 4802, close: True, team_lengths: 6, 6
Win chance for 130: (61.16%, 38.84%) red: 28871, green: 31930, difference: 3059, close: True, team_lengths: 6, 6

Thanks again.

philihp · 2024-04-17T05:51:40Z

I can see what you're saying, but I feel like the best path forward here would be if you wrote your own predict_win function, and then we look at how we can generalize that with a parameter. It's all open source; there are no secrets, everything's there for you to fork 😅

spookybear0 · 2024-04-17T06:00:18Z

Of course, I just don't have any of the required experience in this field of math.

This is the solution I'm using now (which is definitely not mathematically supported but works somewhat decently). This is built towards my use case (since it only supports two teams). But it could(?) be a good place to start. I do want to point out that this isn't a great solution as it doesn't solve the problem entirely, though I'm sure there's a better solution than what I'm doing.

UNEVEN_TEAM_FACTOR = 0.09

class CustomPlackettLuce(PlackettLuce):
    def predict_win(self, teams: List[List[PlackettLuceRating]]) -> List[Union[int, float]]:
        # Check Arguments
        self._check_teams(teams)

        n = len(teams)

        # uneven team adjustment is only implemented for 2 teams

        # 2 Player Case
        if n == 2:
            # CUSTOM ADDITION
            if len(teams[0]) > len(teams[1]):
                logger.debug("Adjusting team ratings for uneven team count (team 1 has more players)")
                # team 1 has more players than team 2
                for player in teams[1]:
                    # multiply by 1 + 0.1 * the difference in player count
                    player.mu *= 1 + UNEVEN_TEAM_FACTOR * abs(len(teams[0]) - len(teams[1]))
            elif len(teams[0]) < len(teams[1]):
                logger.debug("Adjusting team ratings for uneven team count (team 2 has more players)")
                # team 2 has more players than team 1
                for player in teams[0]:
                    # multiply by 1 + 0.1 * the difference in player count
                    player.mu *= 1 + UNEVEN_TEAM_FACTOR * abs(len(teams[0]) - len(teams[1]))

            total_player_count = len(teams[0]) + len(teams[1])
            teams_ratings = self._calculate_team_ratings(teams)
            a = teams_ratings[0]
            b = teams_ratings[1]

            result = phi_major(
                (a.mu - b.mu)
                / math.sqrt(
                    total_player_count * self.beta**2
                    + a.sigma_squared
                    + b.sigma_squared
                )
            )

            return [result, 1 - result]

        return PlackettLuce.predict_win(self, teams)

vivekjoshy · 2024-04-17T11:05:58Z

Is this implementation able to actually predict the outcome of real matches data? I can see why in some games, teams with fewer players might win in let's say traditional games where an extra player counts. If you could provide some match data where the predict_win function is failing for close matches, then it would be very helpful. It would also aid in making a new parameter let's call it team_parity that allows the equality of uneven teams. Apriori, it seems feasible to do this. But in reality, we won't know until we test it on real data.

spookybear0 · 2024-04-18T05:36:50Z

The implementation in my comment before is shaky at predicting the outcome for uneven teams, that's why I'm looking for a better solution. It's more of a hacky fix than a real solution. The normal implementation, on the other hand, works great for matches of even teams but fails on games with uneven player amounts.

Here's additional data for close games with team size differences only including ones that failed to predict the winner, I don't have many instances of games with uneven teams so this doesn't have much data.

Here's the data with the CustomPlackettLuce class.

Win chance for 68: (24.69%, 75.31%) red: 35792, green: 34511, difference: 1281, close: True, team_lengths: 6, 7
Win chance for 114: (56.36%, 43.64%) red: 34134, green: 35412, difference: 1278, close: True, team_lengths: 7, 6
Win chance for 138: (38.55%, 61.45%) red: 31350, green: 27411, difference: 3939, close: True, team_lengths: 5, 6

And the same data using the normal PlackettLuce model

Win chance for 68: (5.29%, 94.71%) red: 35792, green: 34511, difference: 1281, close: True, team_lengths: 6, 7
Win chance for 114: (88.29%, 11.71%) red: 34134, green: 35412, difference: 1278, close: True, team_lengths: 7, 6
Win chance for 138: (10.62%, 89.38%) red: 31350, green: 27411, difference: 3939, close: True, team_lengths: 5, 6
Win chance for 142: (78.77%, 21.23%) red: 36393, green: 41052, difference: 4659, close: True, team_lengths: 7, 6

Plus here's the data for even teams to compare (is not affected by the custom model)
Even teams

Win chance for 10: (39.97%, 60.03%) red: 34892, green: 33791, difference: 1101, close: True, team_lengths: 7, 7
Win chance for 13: (34.57%, 65.43%) red: 30909, green: 29929, difference: 980, close: True, team_lengths: 5, 5
Win chance for 30: (42.55%, 57.45%) red: 36732, green: 35091, difference: 1641, close: True, team_lengths: 6, 6
Win chance for 48: (56.88%, 43.12%) red: 25248, green: 27170, difference: 1922, close: True, team_lengths: 5, 5
Win chance for 55: (29.17%, 70.83%) red: 33772, green: 31232, difference: 2540, close: True, team_lengths: 6, 6
Win chance for 77: (47.55%, 52.45%) red: 37652, green: 33592, difference: 4060, close: True, team_lengths: 6, 6
Win chance for 95: (44.52%, 55.48%) red: 40593, green: 40434, difference: 159, close: True, team_lengths: 7, 7
Win chance for 100: (47.35%, 52.65%) red: 40792, green: 36472, difference: 4320, close: True, team_lengths: 6, 6
Win chance for 110: (47.75%, 52.25%) red: 37152, green: 35832, difference: 1320, close: True, team_lengths: 6, 6
Win chance for 111: (47.75%, 52.25%) red: 36692, green: 33312, difference: 3380, close: True, team_lengths: 6, 6
Win chance for 123: (43.48%, 56.52%) red: 31290, green: 31012, difference: 278, close: True, team_lengths: 6, 6
Win chance for 130: (61.16%, 38.84%) red: 28871, green: 31930, difference: 3059, close: True, team_lengths: 6, 6

I have lots of data for this and I can provide more if needed (or if it needs to be displayed differently). I'm also available if you would like to test any additions on my codebase, I'd be happy to help out with that part.

vivekjoshy · 2024-04-18T20:46:43Z

A few instances of data are unfortunately not sufficient. We require match counts in the tens of thousands (the bigger the better) from real and actually played games to verify the effectiveness of such changes. To show that changes work, we use these open-source datasets to perform data analysis and measure performance. You can see the datasets used in /benchmark as a starting point.

vivekjoshy added the enhancement New feature or request label Nov 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possibility for parameter for how ratings and win chances adjust for uneven teams #130

Possibility for parameter for how ratings and win chances adjust for uneven teams #130

spookybear0 commented Mar 9, 2024

vivekjoshy commented Mar 10, 2024

spookybear0 commented Mar 10, 2024

spookybear0 commented Mar 10, 2024

vivekjoshy commented Apr 8, 2024

spookybear0 commented Apr 8, 2024

vivekjoshy commented Apr 9, 2024

philihp commented Apr 17, 2024 •

edited

Loading

spookybear0 commented Apr 17, 2024

spookybear0 commented Apr 17, 2024

philihp commented Apr 17, 2024

spookybear0 commented Apr 17, 2024

philihp commented Apr 17, 2024

spookybear0 commented Apr 17, 2024 •

edited

Loading

vivekjoshy commented Apr 17, 2024

spookybear0 commented Apr 18, 2024 •

edited

Loading

vivekjoshy commented Apr 18, 2024

Possibility for parameter for how ratings and win chances adjust for uneven teams #130

Possibility for parameter for how ratings and win chances adjust for uneven teams #130

Comments

spookybear0 commented Mar 9, 2024

vivekjoshy commented Mar 10, 2024

spookybear0 commented Mar 10, 2024

spookybear0 commented Mar 10, 2024

vivekjoshy commented Apr 8, 2024

spookybear0 commented Apr 8, 2024

vivekjoshy commented Apr 9, 2024

philihp commented Apr 17, 2024 • edited Loading

spookybear0 commented Apr 17, 2024

spookybear0 commented Apr 17, 2024

philihp commented Apr 17, 2024

spookybear0 commented Apr 17, 2024

philihp commented Apr 17, 2024

spookybear0 commented Apr 17, 2024 • edited Loading

vivekjoshy commented Apr 17, 2024

spookybear0 commented Apr 18, 2024 • edited Loading

vivekjoshy commented Apr 18, 2024

philihp commented Apr 17, 2024 •

edited

Loading

spookybear0 commented Apr 17, 2024 •

edited

Loading

spookybear0 commented Apr 18, 2024 •

edited

Loading