Before there were such things as Dan Szymborski’s ZiPS (SZymborski Projection System) for baseball players, there was a charming little thing called Similarity Scores, created by sabermetric pioneer Bill James. If you look at baseball-reference.com, you will find a description of the formula used for similarity scores and a credit to James’ book The Politics of Glory; however, I first read about them in The Bill James Baseball Abstract 1986, in which he introduced the concept, having worked on it over the previous year. This was also the book that introduced me to the term sabermetrics, among other things.
If you didn’t read the formula in the link, the brief version is that 1000 is a perfect match between two players. Points are then deducted from there based on differences in position, a bunch of counting statistics (e.g., hits) and a few ratios (e.g., batting average, ERA). Obviously, the higher the number, the better the match.
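For the code-minded, here is a minimal sketch of that "start at 1000 and subtract" idea. The stat names, the per-point units, and the `position_penalty` argument are illustrative assumptions of mine, not the published formula, so treat the numbers as placeholders.

```python
# Sketch of a similarity score: start at 1000, deduct points for differences.
# POINT_PER is a hypothetical table: one point is deducted for each
# "unit" of difference in that stat. These values are assumptions.
POINT_PER = {
    "hits": 15,            # counting stat
    "home_runs": 2,        # counting stat
    "batting_avg": 0.001,  # ratio stat
}


def similarity_score(a, b, position_penalty=0.0):
    """Return a similarity score for two players given as stat dicts."""
    score = 1000.0
    for stat, unit in POINT_PER.items():
        score -= abs(a[stat] - b[stat]) / unit
    # position_penalty stands in for the positional adjustment (assumed).
    return score - position_penalty


# Two made-up players, not real stat lines.
player_a = {"hits": 2000, "home_runs": 300, "batting_avg": 0.280}
player_b = {"hits": 1850, "home_runs": 320, "batting_avg": 0.280}

print(similarity_score(player_a, player_a))  # identical careers
print(similarity_score(player_a, player_b))  # some points deducted
```

The more two careers differ, the more points come off, which is why a higher score means a closer match.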
At baseball-reference.com, they calculate the similarity scores on each sufficiently experienced player’s page, giving the top-10 career similarities, plus the top-10 at each age of the player’s career, near the bottom of the page.
For example, if you scroll toward the bottom of ex-Dodger Shawn Green’s player page, you’ll find that his number one match in career similarity score is ex-Dodger Reggie Smith at 945.0. This is what similarity scores (using this formula) seem to work best for: evaluating player production over an entire career. You’ll also see that at age 29, after his last great season (2002), the 29-year-old he was most similar to was Dale Murphy, with a similarity score of 942.0.
Similarity scores don’t make for a particularly great projection system. Among other things, the stats used aren’t the best for that purpose, and the similarities don’t account for trends (e.g., whether a player is improving or not). Still, I’d bet that Bill James used similarity studies with different stats in play to fine-tune his later projection system.
However, that doesn’t stop the age-based similarity score summaries at baseball-reference.com from being a fair amount of fun for a nerd like me to peruse and to compare current players to players throughout baseball history.
Here I took a look at the current-age similarity scores for the pitchers who are candidates for the Dodgers’ 2020 starting rotation at this time. (Note that Dustin May and Tony Gonsolin did not have similarity scores calculated on their pages.)
For the names on Clayton Kershaw’s list that aren’t Hall of Fame pitchers, one has HOF-worthy statistics (Clemens), two are solid “hall of very good” pitchers (Gooden, Santana) and the other is from the deadball era (White).
Kershaw’s top two career similarity scores are Max Scherzer, 914.3, and Sandy Koufax, 912.9. For each age from 26 to 31, his top similarity score has been either Tom Seaver or Pedro Martinez.
Walker Buehler doesn’t have a very long career for comparison yet, but the names there are pretty promising, topped by a two-time Cy Young winner and including a Marvel Superhero. His age-23 list includes both a “Highball Wilson” and a “Buster Brown”.
Old friend Pedro Astacio is ninth on his list, but I seriously doubt Buehler will be traded for Eric Young III.
Julio Urias has seen his short career slowed by injury and a changing usage pattern, so it’s no surprise that his list is a bit odd. His top match, Willie Hernandez, won a Cy Young award as a reliever in 1984 at age 29 for the World Series champion Detroit Tigers.
Kenta Maeda was a 28-year-old rookie, which has a significant impact on his list of mostly late bloomers. He also has a 961.0 career similarity score with Zack Wheeler.
Alex Wood also had his list affected by injury stints, but it does include some pitchers you’d have been happy to have in your rotation at times. He and former teammate Hyun-Jin Ryu have a 945.1 career similarity score.
Wood’s top match is a two-time old friend. Pete Richert was signed by the Dodgers in 1958 and played in parts of three seasons in Los Angeles before being included in the Frank Howard / Claude Osteen trade. He returned for the 1972 season as part of the trade that brought Frank Robinson to his only season in LA. After two more years, Richert was swapped for 1969 Mets hero Tommie Agee, but the latter was released at the end of spring training for what would have been his age-31 season, ending Agee’s playing career.
Jimmy Nelson has lost even more time to injury and, fittingly, his number one match is an old friend from last season who never took the mound for the Dodgers, Tom Koehler. Nelson also has 967.3 and 964.6 career similarity scores with old friends James McDonald and Eric Stults.
Number 10 on Nelson’s list is Fred Sanford, who pitched immediately after World War II for several seasons, dummy.
Finally, Ross Stripling also has a hodgepodge list thanks in part to being used both as a starter and out of the bullpen. That list includes Brian Holton, the proud possessor of a 1988 World Series ring.
In the end, this doesn’t mean much, but it’s 869 words and a fresh comments section.