Similarity Scores: an invention by sabermetric pioneer Bill James, “Similarity Scores” are a quick and easy way to compare two players, and they are available on Baseball Reference for most players in history. He introduced a version of this in his 1986 Baseball Abstract and then refined the system in his 1994 book The Politics of Glory, which to me is the greatest book about the Hall of Fame ever written.
The Similarity Score concept is that if two players had absolutely identical statistics for one season (or even for an entire career), their similarity score would be 1,000. Of course the odds of this happening with any reasonable sample size are infinitesimal. Therefore, the system deducts points in proportion to the size of each statistical difference. So, if Player A has 15 more hits than Player B, one point is deducted. If Player A has 30 more hits than Player B, two points are deducted, and so on.
For batters, 13 statistical categories are used. The number in parentheses after each category is the difference required to deduct one point. For instance, for every difference of 20 in total games played, one point is deducted. The categories for position players are Games (20), At Bats (75), Runs (10), Hits (15), Doubles (5), Triples (4), Home Runs (2), RBI (10), Walks (25), Strikeouts (150), Stolen Bases (20), Batting Average (.001), Slugging Percentage (.002).
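To make the arithmetic concrete, here is a minimal Python sketch of the batting deductions using the thresholds listed above. The names `BATTING_STEPS`, `batting_deduction` and `batting_similarity` are my own, and the sketch assumes partial differences earn fractional points (a 7-hit gap costs 7/15 of a point); the description above doesn't say whether James rounds, so treat that as an assumption.

```python
# One point is deducted per "step" of difference in each category.
# Thresholds are the ones listed in the text above.
BATTING_STEPS = {
    "G": 20, "AB": 75, "R": 10, "H": 15, "2B": 5, "3B": 4, "HR": 2,
    "RBI": 10, "BB": 25, "SO": 150, "SB": 20, "BA": 0.001, "SLG": 0.002,
}


def batting_deduction(a, b, steps=BATTING_STEPS):
    """Total points deducted between two stat lines (dicts keyed like steps).

    Assumption: fractional points are allowed rather than rounded.
    """
    return sum(abs(a[cat] - b[cat]) / step for cat, step in steps.items())


def batting_similarity(a, b):
    """Start from 1,000 and subtract the category deductions only."""
    return 1000 - batting_deduction(a, b)


# Hypothetical stat lines: identical except Player A has 30 more hits.
player_b = {"G": 150, "AB": 550, "R": 80, "H": 150, "2B": 30, "3B": 3,
            "HR": 20, "RBI": 85, "BB": 60, "SO": 100, "SB": 10,
            "BA": 0.273, "SLG": 0.450}
player_a = dict(player_b, H=180)

print(batting_deduction(player_a, player_b))   # 30 hits / 15 per point
print(batting_similarity(player_a, player_b))
```

The example holds batting average and slugging fixed purely to isolate the hits category; with real stat lines those would move too.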
There’s a positional value adjustment as well, with each position assigned a value:
- 240 – Catcher
- 168 – Shortstop
- 132 – Second Base
- 84 – Third Base
- 48 – Outfield (James distinguishes among the three outfield positions, but Baseball Reference doesn’t have that data incorporated at the moment)
- 12 – First Base
- 0 – DH
If one player is a catcher and the other is a shortstop, 72 points are deducted (the difference between 240 for catcher and 168 for shortstop). If you’re wondering why a catcher is considered “most similar” to a shortstop (as opposed to a first baseman, which is where the best-hitting catchers wind up), let me explain. It’s about positional value on the defensive spectrum.
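The positional adjustment is just the absolute difference between two values from the list above. A short Python sketch, with `POSITION_VALUE` and `position_deduction` as names I made up for illustration:

```python
# Positional values from the list above (outfield treated as one group,
# as Baseball Reference does).
POSITION_VALUE = {
    "C": 240, "SS": 168, "2B": 132, "3B": 84, "OF": 48, "1B": 12, "DH": 0,
}


def position_deduction(pos_a, pos_b):
    """Points deducted for the gap between two players' primary positions."""
    return abs(POSITION_VALUE[pos_a] - POSITION_VALUE[pos_b])


print(position_deduction("C", "SS"))  # catcher vs. shortstop
print(position_deduction("C", "1B"))  # catcher vs. first baseman
```

Run this and the catcher-to-first-base gap comes out far larger than catcher-to-shortstop, which is the defensive-spectrum point in numbers.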
The example of the “most similar” players James could find was the pair of 1970s first basemen Andre Thornton and John Mayberry, whose Similarity Score to each other was 964.8. Look at their numbers:
Player | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | BA | SLG |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
John Mayberry | 1620 | 5447 | 733 | 1379 | 211 | 19 | 255 | 879 | 881 | 810 | 20 | .253 | .439 |
Andre Thornton | 1565 | 5291 | 792 | 1342 | 244 | 22 | 253 | 895 | 876 | 851 | 48 | .254 | .452 |
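For the curious, here is a rough Python sketch that breaks the table above into per-category deductions. It assumes fractional points and covers only the 13 batting categories; it ignores the positional adjustment and the other tweaks in the full method, so don't expect the total to land exactly on the published 964.8. The variable names are mine.

```python
# Column order matches the table above.
CATEGORIES = ["G", "AB", "R", "H", "2B", "3B", "HR",
              "RBI", "BB", "SO", "SB", "BA", "SLG"]
# Difference needed to deduct one point, per category.
STEPS = [20, 75, 10, 15, 5, 4, 2, 10, 25, 150, 20, 0.001, 0.002]

mayberry = [1620, 5447, 733, 1379, 211, 19, 255, 879, 881, 810, 20, 0.253, 0.439]
thornton = [1565, 5291, 792, 1342, 244, 22, 253, 895, 876, 851, 48, 0.254, 0.452]

# Assumption: partial differences count as fractional points.
breakdown = {
    cat: abs(m - t) / step
    for cat, step, m, t in zip(CATEGORIES, STEPS, mayberry, thornton)
}
total = sum(breakdown.values())

for cat, points in breakdown.items():
    print(f"{cat:>4}: {points:5.2f}")
print(f"category deductions: {total:.2f}")
print(f"score before other adjustments: {1000 - total:.2f}")
```

The biggest gaps come from doubles, slugging and runs; nearly everything else costs under three points.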
There are also 13 categories for pitchers, again with the difference required to deduct one point in parentheses: Wins (1), Losses (2), Winning Percentage (.002), ERA (.02), Games Pitched (10), Starts (20), Complete Games (20), Innings Pitched (50), Hits Allowed (50), Strikeouts (30), Walks (10), Shutouts (5), Saves (3).
If two pitchers throw with different hands, there’s an additional deduction: 10 points for starters, 25 points for relievers.
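The pitching side works the same way, with the handedness penalty layered on top. Below is a minimal Python sketch under a few assumptions of mine: fractional points are allowed, the caller decides whether the pair counts as starters or relievers (the text above doesn't say how mixed cases are handled), and the names `PITCHING_STEPS`, `HAND_PENALTY` and `pitching_similarity` are purely illustrative.

```python
# Difference needed to deduct one point, per pitching category.
PITCHING_STEPS = {
    "W": 1, "L": 2, "WPCT": 0.002, "ERA": 0.02, "G": 10, "GS": 20,
    "CG": 20, "IP": 50, "H": 50, "SO": 30, "BB": 10, "SHO": 5, "SV": 3,
}
# Extra deduction when the two pitchers throw with different hands.
HAND_PENALTY = {"starter": 10, "reliever": 25}


def pitching_similarity(a, b, role="starter"):
    """1,000 minus category deductions, minus the handedness penalty.

    `role` is supplied by the caller (an assumption of this sketch).
    """
    score = 1000 - sum(
        abs(a[cat] - b[cat]) / step for cat, step in PITCHING_STEPS.items()
    )
    if a["throws"] != b["throws"]:
        score -= HAND_PENALTY[role]
    return score


# Hypothetical career lines, identical except where noted.
righty = {"W": 200, "L": 150, "WPCT": 0.571, "ERA": 3.50, "G": 450,
          "GS": 430, "CG": 60, "IP": 3000, "H": 2900, "SO": 2000,
          "BB": 900, "SHO": 20, "SV": 2, "throws": "R"}
lefty = dict(righty, throws="L")
three_more_wins = dict(righty, W=203)

print(pitching_similarity(righty, lefty))                   # starters
print(pitching_similarity(righty, lefty, role="reliever"))  # relievers
print(pitching_similarity(righty, three_more_wins))
```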
There are other tweaks for both batters and pitchers; the full rundown can be found on Baseball Reference’s site.