elo-ratings-math

@mcclowes/elo-elo

Explains the mathematical principles behind Elo rating systems, including expected score calculation, rating updates, and the K-factor. Use when implementing or understanding competitive rating systems.

Elo Ratings Mathematics

Overview

The Elo rating system is a method for calculating the relative skill levels of players in competitor-versus-competitor games. Originally developed by Arpad Elo for chess, it's now used in many competitive contexts including sports, video games, and online platforms.

Core Mathematical Principles

1. Expected Score Formula

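The expected score for a player is the average score they are predicted to earn per game, given the rating difference between the two players. In games without draws it equals the win probability; where draws are possible it equals the win probability plus half the draw probability.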

Formula:

E_A = 1 / (1 + 10^((R_B - R_A) / 400))

Where:

  • E_A = Expected score for player A (between 0 and 1)
  • R_A = Current rating of player A
  • R_B = Current rating of player B
  • 10^x = 10 raised to the power of x

Interpretation:

  • E_A close to 1 means player A is an overwhelming favorite (the formula approaches 1 but never reaches it)
  • E_A = 0.5 means both players are equally matched
  • E_A close to 0 means player A is an overwhelming underdog (the formula approaches 0 but never reaches it)

Example: If player A has rating 1600 and player B has rating 1400:

E_A = 1 / (1 + 10^((1400 - 1600) / 400))
E_A = 1 / (1 + 10^(-200 / 400))
E_A = 1 / (1 + 10^(-0.5))
E_A = 1 / (1 + 0.316)
E_A ≈ 0.76

Player A is expected to score 0.76 points per game on average, which corresponds to a 76% chance of winning when draws are not possible.
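The calculation above can be sketched in a few lines of Python. The function name `expected_score` and its parameter names are illustrative choices, not part of any standard library.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B on the 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Reproduce the worked example: A is rated 1600, B is rated 1400.
e_a = expected_score(1600, 1400)
e_b = expected_score(1400, 1600)

print(round(e_a, 2))  # 0.76
print(round(e_b, 2))  # 0.24
```

The two expected scores always sum to 1, so only one of them needs to be computed in practice.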

2. Rating Update Formula

After a game, ratings are updated based on the actual outcome compared to the expected outcome.

Formula:

R'_A = R_A + K × (S_A - E_A)

Where:

  • R'_A = New rating for player A
  • R_A = Old rating for player A
  • K = K-factor (determines rating volatility)
  • S_A = Actual score (1 for win, 0.5 for draw, 0 for loss)
  • E_A = Expected score (from formula above)

The Update Difference:

ΔR_A = K × (S_A - E_A)

This difference represents:

  • Positive value: Player performed better than expected (rating increases)
  • Negative value: Player performed worse than expected (rating decreases)
  • Zero: Player performed exactly as expected (no rating change)
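As a minimal sketch of the update rule, assuming a default K of 32 and the hypothetical helper names `expected_score` and `update_rating`:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, opponent_rating: float,
                  score: float, k: float = 32.0) -> float:
    """Return the new rating after one game.

    score is 1.0 for a win, 0.5 for a draw, and 0.0 for a loss.
    """
    delta = k * (score - expected_score(rating, opponent_rating))
    return rating + delta


# Two equally rated players: a win gains K/2, a draw changes nothing.
print(update_rating(1500, 1500, 1.0))  # 1516.0
print(update_rating(1500, 1500, 0.5))  # 1500.0
print(update_rating(1500, 1500, 0.0))  # 1484.0
```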

3. The K-Factor

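The K-factor controls how much ratings can change after each game: K is the largest possible rating change from a single game, reached only when the result is almost entirely unexpected.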

Common K-factor values:

  • K = 32: High volatility, used for beginners or provisional ratings
  • K = 24: Medium volatility, used for intermediate players
  • K = 16: Low volatility, used for established/expert players
  • K = 10: Very stable, used for top-level players

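Adaptive K-factor example (simplified from the FIDE chess system; the rules are checked in order, and the full regulations contain further conditions, for example for junior players):

K = 40  if games_played < 30
K = 20  else if rating < 2400
K = 10  otherwise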
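The tiered rule can be written as an ordered chain of checks. This is a sketch of the simplified three-tier rule shown here only; the function name `fide_style_k_factor` is hypothetical, and the real FIDE regulations include conditions that this sketch ignores.

```python
def fide_style_k_factor(games_played: int, rating: float) -> int:
    """Pick a K-factor using a simplified three-tier rule.

    The checks are ordered: the new-player tier takes priority
    over the rating-based tiers.
    """
    if games_played < 30:
        return 40  # new player, rating still provisional
    if rating < 2400:
        return 20  # established player below 2400
    return 10      # established player at 2400 or above


print(fide_style_k_factor(games_played=10, rating=2500))   # 40
print(fide_style_k_factor(games_played=100, rating=2000))  # 20
print(fide_style_k_factor(games_played=100, rating=2400))  # 10
```

Ordering the checks explicitly avoids the ambiguity of a new player who is also rated below 2400.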

4. Rating Difference and Win Probability

The relationship between rating difference and expected win probability:

Rating Difference   Expected Score   Win Probability
0                   0.50             50%
50                  0.57             57%
100                 0.64             64%
200                 0.76             76%
300                 0.85             85%
400                 0.91             91%
500                 0.95             95%
600                 0.97             97%

Formula for any rating difference:

Win_Probability = 1 / (1 + 10^(-ΔR / 400))

Where ΔR = R_A - R_B
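The table values can be regenerated directly from the formula. The sketch below uses a hypothetical `win_probability` helper that takes the rating difference ΔR as its only argument.

```python
def win_probability(delta_r: float) -> float:
    """Expected score for the player who is delta_r points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-delta_r / 400.0))


differences = (0, 50, 100, 200, 300, 400, 500, 600)
table = [(d, round(win_probability(d), 2)) for d in differences]

for diff, prob in table:
    print(f"{diff:>4}  {prob:.2f}")
```

A negative difference gives the underdog's expected score, since `win_probability(-d)` equals `1 - win_probability(d)`.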

5. Two-Player Zero-Sum Property

In a two-player game where both players use the same K-factor, the rating changes are equal and opposite:

ΔR_A = -ΔR_B

This is because:

E_A + E_B = 1
S_A + S_B = 1 (for every outcome: 1 + 0 for a decisive game, 0.5 + 0.5 for a draw)

Therefore:

ΔR_A = K × (S_A - E_A)
ΔR_B = K × (S_B - E_B) = K × ((1 - S_A) - (1 - E_A)) = -K × (S_A - E_A) = -ΔR_A
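The cancellation can also be checked numerically. The sketch below computes both deltas independently for a win, a draw, and a loss, assuming both players share the same K; `rating_deltas` is a hypothetical helper name.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_deltas(r_a: float, r_b: float, s_a: float,
                  k: float = 32.0) -> tuple[float, float]:
    """Return (delta_A, delta_B), each computed from its own formula."""
    delta_a = k * (s_a - expected_score(r_a, r_b))
    delta_b = k * ((1.0 - s_a) - expected_score(r_b, r_a))
    return delta_a, delta_b


# The deltas cancel for every outcome, up to floating-point error.
for s_a in (1.0, 0.5, 0.0):
    d_a, d_b = rating_deltas(1650, 1480, s_a)
    print(s_a, round(d_a, 3), round(d_b, 3), abs(d_a + d_b) < 1e-9)
```

If the two players use different K-factors, the deltas no longer cancel and the pool's total rating drifts.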

Comprehensive Example

Scenario: Player A (rating 1800) plays Player B (rating 1700), K = 32

Step 1: Calculate Expected Scores

E_A = 1 / (1 + 10^((1700 - 1800) / 400))
E_A = 1 / (1 + 10^(-0.25))
E_A = 1 / (1 + 0.562)
E_A ≈ 0.64

E_B = 1 - E_A ≈ 0.36

Step 2: Actual Outcome - Player B Wins (upset!)

S_A = 0 (loss)
S_B = 1 (win)

Step 3: Calculate Rating Changes

ΔR_A = 32 × (0 - 0.64) = 32 × (-0.64) = -20.48 ≈ -20
ΔR_B = 32 × (1 - 0.36) = 32 × (0.64) = 20.48 ≈ +20

Step 4: New Ratings

R'_A = 1800 + (-20) = 1780
R'_B = 1700 + 20 = 1720

Player B gained 20 points for the upset victory, while player A lost 20 points.
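The four steps can be reproduced end to end as follows. Variable names are illustrative, and rounding to whole points is applied only at the end, as in the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


K = 32
r_a, r_b = 1800, 1700

# Step 1: expected scores
e_a = expected_score(r_a, r_b)
e_b = 1.0 - e_a

# Step 2: actual outcome, player B wins
s_a, s_b = 0.0, 1.0

# Step 3: rating changes, rounded to whole points
delta_a = round(K * (s_a - e_a))
delta_b = round(K * (s_b - e_b))

# Step 4: new ratings
new_a = r_a + delta_a
new_b = r_b + delta_b

print(round(e_a, 2), round(e_b, 2))  # 0.64 0.36
print(delta_a, delta_b)              # -20 20
print(new_a, new_b)                  # 1780 1720
```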

Multi-Player Extensions

For games with more than two players, the Elo system can be extended:

Pairwise Comparison Method: Each player's rating change is the sum of their changes against all opponents:

ΔR_i = K × Σ(S_ij - E_ij)

Where:

  • i = player being rated
  • j = each opponent
  • S_ij = actual score against opponent j
  • E_ij = expected score against opponent j
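A minimal sketch of the pairwise method, assuming results are given as finishing ranks (1 = first place, equal ranks = tie) and that S_ij is 1, 0.5, or 0 depending on whether player i placed ahead of, level with, or behind player j. The function name `multiplayer_deltas` and the rank-based input format are assumptions made for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def multiplayer_deltas(ratings: list[float], ranks: list[int],
                       k: float = 32.0) -> list[float]:
    """Rating change per player, summed over every pairwise matchup."""
    deltas = []
    for i, (r_i, rank_i) in enumerate(zip(ratings, ranks)):
        total = 0.0
        for j, (r_j, rank_j) in enumerate(zip(ratings, ranks)):
            if i == j:
                continue
            if rank_i < rank_j:
                s_ij = 1.0   # i finished ahead of j
            elif rank_i == rank_j:
                s_ij = 0.5   # tie
            else:
                s_ij = 0.0   # i finished behind j
            total += s_ij - expected_score(r_i, r_j)
        deltas.append(k * total)
    return deltas


# Three equally rated players finishing 1st, 2nd, 3rd.
print(multiplayer_deltas([1500, 1500, 1500], [1, 2, 3]))  # [32.0, 0.0, -32.0]
```

Because each player's change is a sum over all opponents, swings grow with the number of players. Dividing K by the number of opponents is one possible way to keep changes comparable to the two-player case.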

Mathematical Properties

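1. Conservation of Rating Points: In a closed system with only two-player games, the total rating points remain constant, provided both players in each game use the same K-factor and no floors, bonuses, or decay are applied.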

2. Logistic Distribution: The expected score formula is a base-10 logistic function of the rating difference, so expected scores change smoothly and always stay strictly between 0 and 1.

3. Rating Scale Calibration: The choice of 400 in the formula means a 400-point difference corresponds to a 10:1 odds ratio (91% vs 9% win probability).

4. Convergence: Over many games, ratings tend toward players' true skill levels. A larger K-factor converges faster but leaves ratings fluctuating more around that level.
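Properties 2 and 3 can be verified numerically. The sketch below checks that the expected score matches a natural-base logistic function with scale 400 / ln(10), that a 400-point gap gives 10:1 odds, and it inverts the formula to recover the rating gap for a given expected score. All function names are hypothetical.

```python
import math


def expected_score(delta_r: float) -> float:
    """Expected score for a player who is delta_r points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-delta_r / 400.0))


def logistic(x: float) -> float:
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-x))


def odds(delta_r: float) -> float:
    """Odds in favor of the higher-rated player: E / (1 - E)."""
    e = expected_score(delta_r)
    return e / (1.0 - e)


def rating_difference_for(p: float) -> float:
    """Rating gap that yields expected score p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))


scale = 400.0 / math.log(10)  # about 173.7 rating points
print(abs(expected_score(150) - logistic(150 / scale)) < 1e-12)  # True
print(round(odds(400), 6))                                       # 10.0
print(round(rating_difference_for(0.76)))                        # 200
```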

Implementation Considerations

When implementing Elo ratings:

  1. Initial Ratings: Typically start players at 1200, 1500, or 1600
  2. Minimum Ratings: Consider setting a floor (e.g., 100) to prevent negative ratings
  3. Rating Inflation/Deflation: Monitor average ratings over time
  4. Provisional Periods: Use higher K-factors for new players
  5. Inactivity Decay: Consider rating decay for inactive players
  6. Draw Handling: Use S = 0.5 for both players in draws
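Several of these considerations can be combined in one small sketch. Every constant below (starting rating 1500, floor 100, a 30-game provisional period, K of 32 and 16) is an assumed value chosen for illustration, and the `Player` class and `record_game` function are hypothetical names. Inactivity decay and inflation monitoring are left out.

```python
from dataclasses import dataclass

INITIAL_RATING = 1500.0   # assumed starting rating
RATING_FLOOR = 100.0      # assumed minimum rating
PROVISIONAL_GAMES = 30    # assumed length of the provisional period
K_PROVISIONAL = 32.0      # assumed K for new players
K_ESTABLISHED = 16.0      # assumed K after the provisional period


@dataclass
class Player:
    rating: float = INITIAL_RATING
    games_played: int = 0

    @property
    def k_factor(self) -> float:
        if self.games_played < PROVISIONAL_GAMES:
            return K_PROVISIONAL
        return K_ESTABLISHED


def expected_score(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def record_game(a: Player, b: Player, score_a: float) -> None:
    """Update both players in place; score_a is 1.0, 0.5 (draw), or 0.0."""
    e_a = expected_score(a.rating, b.rating)
    # Compute both changes from the pre-game ratings before mutating.
    delta_a = a.k_factor * (score_a - e_a)
    delta_b = b.k_factor * ((1.0 - score_a) - (1.0 - e_a))
    a.rating = max(RATING_FLOOR, a.rating + delta_a)
    b.rating = max(RATING_FLOOR, b.rating + delta_b)
    a.games_played += 1
    b.games_played += 1


alice, bob = Player(), Player()
record_game(alice, bob, 0.5)       # draw between equal players
print(alice.rating, bob.rating)    # 1500.0 1500.0
```

Note that a floor and per-player K-factors both break strict conservation of rating points, which is one reason to monitor the pool's average rating over time.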

Extensions and Variants

Glicko and Glicko-2: Both add a rating deviation (RD) to account for uncertainty in a player's rating; Glicko-2 additionally tracks a volatility parameter:

RD² = rating variance (higher = more uncertain)

TrueSkill: Microsoft's system using Bayesian inference with skill mean (μ) and skill standard deviation (σ).

Elo with Home Advantage: Add a constant to the home player's rating in expected score calculation:

E_home = 1 / (1 + 10^((R_away - (R_home + H)) / 400))

Where H is the home advantage (typically 30-100 points).
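A short sketch of the home-advantage variant, with a hypothetical function name and an assumed default of H = 50 points:

```python
def expected_score_home(r_home: float, r_away: float,
                        home_advantage: float = 50.0) -> float:
    """Expected score for the home side, with H added to its rating."""
    effective_home = r_home + home_advantage
    return 1.0 / (1.0 + 10.0 ** ((r_away - effective_home) / 400.0))


# Equal ratings: the home side is favored only because of H.
print(expected_score_home(1500, 1500, home_advantage=0))         # 0.5
print(round(expected_score_home(1500, 1500, home_advantage=100), 2))  # 0.64
```

The bonus H affects only the expected score; the stored ratings themselves are updated with the usual formula.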

References

  • Elo, A. E. (1978). The Rating of Chessplayers, Past and Present
  • FIDE Handbook: Rating Regulations
  • Glickman, M. E. (1999). "Parameter estimation in large dynamic paired comparison experiments"