Rating Systems
  1. Rating
  2. Ranking
  3. Scoring
  4. Matchmaking
  5. Algorithmic Ranking

Rating Systems

Considerations

These are taken from the sports world, but they generalize to other domains as well.

  • Home Advantage
  • Strength of Schedule
  • Quality of wins (margin of victory)
  • In-game information (for example, time of possession, individual statistics, and lead changes)
  • Team composition: are all teams created equal?
  • Cold start: account for early-season matches, where less data is available

Simple

Win/Loss

Players are ranked solely by their win-loss record: the more wins, the higher the rank, with ties broken by other criteria such as head-to-head results. This is easy to compute and explain, but it may not reflect relative skill, especially when players have not faced similar levels of competition.
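As a rough illustration, a win-count ranking can be sketched in a few lines. The `rank_by_wins` function and the sample results are hypothetical, and ties are broken alphabetically here purely to keep the output deterministic; a real system would substitute head-to-head results or another criterion.

```python
from collections import Counter

def rank_by_wins(results):
    """Rank players by number of wins.

    results: list of (winner, loser) pairs.
    """
    wins = Counter()
    players = set()
    for winner, loser in results:
        wins[winner] += 1
        players.update((winner, loser))
    # Most wins first; alphabetical order is a stand-in tie-breaker (assumption).
    return sorted(players, key=lambda p: (-wins[p], p))

# Hypothetical results: ann beats bob and cat; bob and cat split two matches.
results = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "bob")]
ranking = rank_by_wins(results)
```

Note that bob and cat end up tied on wins even though they may have faced very different opposition, which is the weakness described above.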

Also see: A Natural Generalization of the Win-Loss Rating System.

Dynamic Rating Systems

Basically “skill point trading”. The primary examples are chess rating systems. Players gain or lose points after each match, and the size of the change depends on the opponent’s skill level.

Elo

Not just for chess! It’s used in many professional sports.

Here’s a great article from 538 on how they built their NFL Elo ratings: https://fivethirtyeight.com/features/introducing-nfl-elo-ratings/

Check out the World Football Elo Ratings at https://www.eloratings.net/
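To make the “point trading” idea concrete, here is a minimal sketch of the textbook Elo update. The 400-point scale and `k=32` are conventional chess-style values used here as assumptions; real systems tune these values and often add adjustments such as home advantage or margin of victory. The function names are illustrative only.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b).

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for an A loss.
    k (assumed 32 here) controls how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses: points are traded, not created.
    return rating_a + delta, rating_b - delta

# Two evenly rated players; A wins.
new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
```

With equal ratings the expected score is 0.5, so the winner takes half of `k` from the loser. Beating a much stronger opponent moves more points than beating a weaker one.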

Glicko-2

Probabilistic Rating Systems

Microsoft TrueSkill/TrueSkill2

From the abstract of the TrueSkill2 paper (linked below):

“Online multiplayer games, such as Gears of War and Halo, use skill-based matchmaking to give players fair and enjoyable matches. They depend on a skill rating system to infer accurate player skills from historical data. TrueSkill is a popular and effective skill rating system, working from only the winner and loser of each game. This paper presents an extension to TrueSkill that incorporates additional information that is readily available in online shooters, such as player experience, membership in a squad, the number of kills a player scored, tendency to quit, and skill in other game modes. This extension, which we call TrueSkill2, is shown to significantly improve the accuracy of skill ratings computed from Halo 5 matches. TrueSkill2 predicts historical match outcomes with 68% accuracy, compared to 52% accuracy for TrueSkill.”

https://www.microsoft.com/en-us/research/uploads/prod/2018/03/trueskill2.pdf

Performance Rating Systems

Instead of focusing on wins and losses, these systems evaluate teams and players based on their individual contributions to their team’s performance.

  • Slugging
  • Batting Averages
  • On Base Percentage
  • K/D Ratio
  • DPM/HPM (Damage/Healing/Whatever Per Minute)
  • Player Efficiency Rating (PER)
  • Soccer: Expected Goals (xG)
  • Box Plus/Minus (BPM)
  • Total Quarterback Rating (QBR)
  • Hockey: Goals Above Replacement (GAR)
  • Hockey: Corsi and Fenwick Ratings

Combine several of these metrics to estimate an overall player rating.

Scoring Systems

Point Systems

Players earn points based on their performance in tournaments or matches. Points can be weighted based on the prestige or difficulty of the event.

  • Tennis
  • Golf

Predictive Performance

Pythagorean expectation

Pythagorean expectation, or Pythagorean projection, estimates a team’s winning percentage from the number of points it has scored and allowed. The points scored, raised to some exponent, form the numerator. The denominator is that same value plus the points allowed, raised to the same exponent.

The following uses a hardcoded exponent of 2.

$$\text{Win Ratio} = \frac{1}{1+(\text{runs allowed}/\text{runs scored})^2}$$
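The same calculation as a small function, written in the scored-over-total form, which is algebraically equal to the formula above. The function name and the sample run totals are illustrative assumptions, and the sketch assumes at least one of the two totals is positive.

```python
def pythagorean_win_ratio(scored, allowed, exponent=2.0):
    """Expected win ratio from points scored and allowed.

    Assumes scored and allowed are non-negative and not both zero.
    """
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)

# Hypothetical team: 800 runs scored, 700 runs allowed.
ratio = pythagorean_win_ratio(800, 700)
```

A team that scores exactly as much as it allows comes out at 0.5 regardless of the exponent.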

The exponent can be tuned to the specifics of the use case. In baseball, the most widely known variant is the Pythagenport formula developed by Clay Davenport of Baseball Prospectus:

$$\text{Exponent} = 1.50 \log_{10}\left(\frac{R+RA}{G}\right) + 0.45$$

He concluded that the exponent should be calculated for each team from its runs scored (R), runs allowed (RA), and games played (G). By not reducing the exponent to a single number for all teams in a season, Davenport was able to report a 3.991 root-mean-square error, as opposed to a 4.126 root-mean-square error for an exponent of 2.
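A short sketch of the Pythagenport exponent, applied to a made-up season. It assumes the logarithm is base 10; the function name and the sample totals are illustrative.

```python
import math

def pythagenport_exponent(runs_scored, runs_allowed, games):
    """Exponent = 1.50 * log10((R + RA) / G) + 0.45 (base 10 assumed)."""
    return 1.50 * math.log10((runs_scored + runs_allowed) / games) + 0.45

# Hypothetical season: 800 runs scored, 700 allowed, over 162 games.
r, ra, g = 800, 700, 162
x = pythagenport_exponent(r, ra, g)  # roughly 1.90 for these inputs

# Plug the team-specific exponent into the Pythagorean formula.
win_ratio = r ** x / (r ** x + ra ** x)
```

Because the exponent depends on runs per game, higher-scoring environments get a larger exponent and lower-scoring ones a smaller exponent, rather than a fixed 2 for everyone.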

https://en.wikipedia.org/wiki/Sports_rating_system#Pythagorean

Other Systems

Michelin Star Restaurant

Consumer Credit Score

Match Making

MMR (Match Making Rating)

Rate players, then build matches from players with similar MMR. Many online games use this approach.
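One very simple way to turn ratings into matches is to sort the queue by MMR and pair neighbors. This is only a sketch under that assumption; the function name and the sample queue are hypothetical, and production matchmakers typically also weigh wait time, latency, and party size.

```python
def pair_by_mmr(players):
    """Pair players with the closest ratings.

    players: dict mapping player name -> MMR.
    Returns a list of (a, b) pairs; with an odd number of players,
    the highest-rated one is left waiting.
    """
    ordered = sorted(players, key=players.get)
    return [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered) - 1, 2)]

# Hypothetical queue of five players.
queue = {"ann": 1510, "bob": 1200, "cat": 1490, "dan": 1250, "eve": 1800}
matches = pair_by_mmr(queue)
```

In this queue, bob plays dan and cat plays ann, while eve stays in the queue until another player arrives.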

Domains

Online Gaming

Games typically seem to use a modified version of Elo and “skin” it into a branded tier/leveling system. Players then grind against these scoring systems.

Valorant and RR/MMR

Valorant has RR (“Rank Rating”), which is Elo-based.

Resources

Whitepapers

Misc

An interesting table of contents for a fictitious book that ChatGPT came up with. It might make a good prompt in the future.

I. Introduction to Rating and Ranking Systems
   A. Overview of Rating and Ranking Concepts
   B. Importance and Applications of Rating and Ranking Systems
   C. Historical Development of Rating and Ranking Systems

II. Fundamentals of Rating Systems
   A. Basic Principles of Rating Systems
   B. Types of Rating Systems
      1. Point Systems
      2. Performance Rating Systems
      3. Probabilistic Rating Systems
   C. Components of Rating Systems
      1. Rating Metrics
      2. Match Outcome Determinants
      3. Player Activity Factors
   D. Limitations and Challenges of Rating Systems

III. Traditional Rating Systems
   A. Elo Rating System
      1. History and Development
      2. Mathematical Formulation
      3. Applications in Various Fields
   B. Glicko and Glicko-2 Rating Systems
      1. Features and Advantages
      2. Comparison with Elo and Other Systems
   C. Other Classic Rating Systems
      1. Chess Rating Systems
      2. Sports Rating Systems (e.g., FIFA rankings, ATP rankings)

IV. Modern Rating Systems
   A. TrueSkill Rating System
      1. Overview and Background
      2. Probabilistic Modeling Approach
      3. Applications in Online Gaming and Beyond
   B. Performance-Based Rating Systems
      1. Metrics and Criteria
      2. Examples from Various Sports and Games
   C. Advanced Probabilistic Models
      1. Bayesian Inference Techniques
      2. Machine Learning Approaches

V. Implementation and Application of Rating Systems
   A. Design Considerations and Best Practices
   B. Challenges in Implementing Rating Systems
   C. Case Studies and Real-World Examples
      1. Rating Systems in Sports Leagues
      2. Rating Systems in Online Gaming Platforms
      3. Rating Systems in Business and Finance

VI. Future Trends and Innovations in Rating Systems
   A. Emerging Technologies and Methodologies
   B. Integration with Artificial Intelligence and Data Analytics
   C. Ethical and Social Implications of Rating Systems

VII. Conclusion and Outlook
   A. Summary of Key Concepts and Findings
   B. Future Directions and Research Opportunities
   C. Final Thoughts on the Evolution of Rating and Ranking Systems
