While the all-time player draft started off as mainly a laugh, something to do purely for fun, everyone eventually wanted to find a way to have our teams compete against each other. While simulating games was not really feasible, I decided some sort of evaluation method was appropriate. However, I had no idea what to use. I did not want something that would take a great deal of time to go through because this was mainly for fun, but I thought it should be complete enough to stand up to some kind of criticism.
My first thought was simply to use Bill James' now fairly well-known win shares. However, using career win shares did not seem logical to me because it tends to reward players who just keep playing forever significantly more than those who have short but high-impact careers. I then thought of using the same method that James did in his New Historical Baseball Abstract, one of my favorite baseball books ever. But I saw three problems with this option: 1. That's boring; 2. There are a few elements to his method that I do not completely agree with; 3. Incorporating modern players would have been tedious and difficult.
To cut to the chase a bit, I decided to make up my own formula for evaluating players. Baseball-reference.com posts a stat called OPS+ for every player ever. It is one of my favorite statistics for quickly evaluating hitters. What it does is take OPS (on-base percentage + slugging percentage) and contextualize it. OPS+ compares a player's OPS to the league average for when he played and also factors in park effects. It then puts the number on a scale where 100 is league average. It is a great little tool for getting a general impression of a player's hitting ability compared to league average over a season or over a career.
I decided OPS+ would be the backbone of my formula, but also believed that it was too blunt even for this fun exercise. OPS+ does not take into consideration many things that are important for a real-life baseball team, most importantly: playing time, defense, base-running and personality. So my goal was to create a formula that used career OPS+ and then also brought those elements into the equation, with the end result being a single easy-to-understand number. Here is what I did.
The Method
The first problem with OPS+ I decided to tackle was the playing time issue. Willie Mays and Dick Allen have the same career OPS+ (156) but Allen was not Mays' equal as a hitter because Allen had 7,134 plate appearances and Mays had 12,493. So, how do we figure out exactly what the difference is? I started off by setting a baseline for expected career plate appearances of superstars. The number I chose for position players was 12,000 plate appearances (but 10,000 for catchers). This number is extremely hard to reach even for the game's best ever, but some all-time players have actually gone well past 12,000.
For players who failed to reach the baseline (almost everyone) I took the number of plate appearances by which they fell short (in Allen's case it is 4,866) and credited them with a league average plate appearance for each. What this means is that we have converted Allen's career into one consisting of 7,134 plate appearances with an OPS+ of 156 and 4,866 plate appearances with an OPS+ of 100. We will call this number "Career Adjusted OPS+" or "caOPS+" for short. In Allen's case his caOPS+ is 133.3. Willie Mays gets a small bonus for going over the baseline of 12,000 career plate appearances and ends up with a caOPS+ of 158.3. This is the number we will use to value a player's hitting, and as is evident we have now begun to separate a super-duper star (Mays) from a truly great player (Allen), but we are not done yet.
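For anyone who wants to check the arithmetic, here is a minimal Python sketch of the caOPS+ step. The function name and argument names are my own illustrative choices, and it uses a career OPS+ of 156 for both players, which is the value that reproduces the 133.3 and 158.3 figures quoted above.

```python
def career_adjusted_ops_plus(pa, ops_plus, baseline_pa=12_000):
    """Blend actual plate appearances with league-average (OPS+ 100) filler.

    The function and argument names are illustrative, not part of the
    original write-up.
    """
    # Negative when the player went past the baseline, which is what
    # produces the small bonus for players like Mays.
    filler_pa = baseline_pa - pa
    return (pa * ops_plus + filler_pa * 100) / baseline_pa


allen_caops = career_adjusted_ops_plus(7_134, 156)
mays_caops = career_adjusted_ops_plus(12_493, 156)
print(round(allen_caops, 1), round(mays_caops, 1))  # 133.3 158.3
```

Because the shortfall goes negative once a player passes the baseline, the same expression handles both the penalty for short careers and the bonus for very long ones.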
The next thing we need to factor in is defense. I was not going to invent a new statistic for defense, so I decided merely to use the best tools currently at my disposal: John Dewan's Fielding Bible, Bill James' win shares and fangraphs.com's various metrics. After compiling these, I gave each player a letter grade for defensive ability. Note that the letter grade applies only to the position he played; an A first baseman did not add the same value with the glove as an A shortstop, and I will get to how I worked that out shortly. I assigned each letter grade a point value that would simply be added to a player's caOPS+. An A+ would be worth 10 points, an A 9, an A- 8, all the way down to a D being worth 1 and an F worth 0.
Continuing our Mays vs. Allen comparison, Mays was graded as an A+ centerfielder, while Allen (who was drafted as an LF) received a B- grade. However, we all know that centerfielders have a harder job than left-fielders. Because of this I decided to weight defensive grades for more demanding positions. In keeping with my idea of quick but effective analysis, my solution to this problem was simple. Depending on the position played, I multiplied the number value of the player's defense (taken from the letter grade) by a small amount. For catchers it was multiplied by 2, for shortstops it was 1.67, for centerfielders and second basemen it was 1.5 and for third basemen it was 1.33. Every other position was multiplied by 1.0. In applying this method to Willie Mays' caOPS+ he gets a +15 (10 for being A+, multiplied by 1.5 for being a centerfielder) and his new number is 173.3. We will call this "Career Adjusted Fielding OPS+" or "cafOPS+" for short. Doing the same to Dick Allen he gets a +5 (5 for being a B- fielder, multiplied by 1.0 because he was a left-fielder) and his cafOPS+ is 138.3.
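The fielding step is just a lookup and a multiplication, which can be sketched in Python as below. The grade points and position multipliers are the ones described in this post; the position abbreviations used as dictionary keys and the function name are assumptions made for the example.

```python
# Point value for each letter grade, as described in the post.
GRADE_POINTS = {
    "A+": 10, "A": 9, "A-": 8,
    "B+": 7, "B": 6, "B-": 5,
    "C+": 4, "C": 3, "C-": 2,
    "D": 1, "F": 0,
}

# Multipliers for the more demanding positions. The abbreviations are an
# assumption for this sketch; any position not listed counts as 1.0.
POSITION_FACTOR = {"C": 2.0, "SS": 1.67, "CF": 1.5, "2B": 1.5, "3B": 1.33}


def fielding_bonus(grade, position):
    """Points added to caOPS+ for a fielding grade at a given position."""
    return GRADE_POINTS[grade] * POSITION_FACTOR.get(position, 1.0)


mays_cafops = 158.3 + fielding_bonus("A+", "CF")   # 158.3 + 15
allen_cafops = 133.3 + fielding_bonus("B-", "LF")  # 133.3 + 5
print(round(mays_cafops, 1), round(allen_cafops, 1))  # 173.3 138.3
```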
The next problem to tackle is the element of base-running. Again I chose to go with a quick approach to evaluating base-running. Similar to defense, I simply added on a number (0-5) based on how good of a base-runner they are. A 0 would be Ernie Lombardi or Frank Thomas while a 5 would be Honus Wagner or Tim Raines. The number the player was credited with came from looking at their stolen bases, stolen base percentage, triples and a few other factors. Mays received a rating of 4, based on his leading the league in steals 4 times, stealing a good percentage and hitting many triples. Allen rates as a 2 because he was not a liability on the bases, but was not a legitimate asset either. After adding these numbers to their cafOPS+, Mays rates at 177.3 and Allen as 140.3.
At this point I was done with my pure statistical analysis of a player's career value. I call this number "Adjusted career value" or "ACV" for short. Mays' 177.3 ACV is one of the 10 highest of all-time, while Allen's 140.3 is somewhere in the range of 45th-75th best all-time among position players. I think most people would subjectively think those evaluations are fairly accurate.
For reference purposes, here is what the formula I used actually looks like when written out:
ACV = (((P*OPS+)+((bP-P)*100))/bP)+(fG*fF)+rS
ACV is Adjusted Career Value
P is Actual player's plate appearances
OPS+ is Player's career OPS+
bP is Baseline expected plate appearances
fG is Fielding grade
fF is Fielding factor
rS is Baserunning score
bP = 12,000 for all position players except catcher, for catchers bP is 10,000
fG = 10 for A+, 9 for A, 8 for A-, 7 for B+, 6 for B, 5 for B-, 4 for C+, 3 for C, 2 for C-, 1 for D, 0 for F (fG letter grades are assigned based on source of choice)
fF = 2 for catchers, 1.67 for shortstops, 1.5 for centerfielders and second basemen, 1.33 for third basemen, 1.0 for all other positions
rS = A number value of 0-5 based on subjective grading
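As a rough check, the whole formula can be typed into Python almost symbol for symbol. This is only a sketch: the function and parameter names are mine, and the inputs for Mays and Allen are the ones worked through above (career OPS+ of 156, which is what yields the quoted caOPS+ figures).

```python
def acv(p, ops_plus, f_g, f_f, r_s, b_p=12_000):
    """Adjusted Career Value for a position player.

    p: actual plate appearances, b_p: baseline plate appearances,
    f_g: fielding grade points, f_f: fielding factor, r_s: baserunning score.
    Parameter names mirror the symbols in the formula; they are illustrative.
    """
    hitting = (p * ops_plus + (b_p - p) * 100) / b_p
    return hitting + f_g * f_f + r_s


# Mays: A+ centerfielder (10 x 1.5), baserunning 4.
mays_acv = acv(12_493, 156, f_g=10, f_f=1.5, r_s=4)
# Allen: B- left-fielder (5 x 1.0), baserunning 2.
allen_acv = acv(7_134, 156, f_g=5, f_f=1.0, r_s=2)
print(round(mays_acv, 1), round(allen_acv, 1))  # 177.3 140.3
```

For a catcher, pass b_p=10_000 and f_f=2.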
Before I go into how pitchers were evaluated, I just want to note a few quick things. The 12,000 plate appearance baseline was used for all positions except catchers, who were instead evaluated against a 10,000 plate appearance baseline. Also, players who missed time for reasons beyond their control (war, segregation or whatever) were given credit for the time they missed. To quote Bill James, "I think this is more fair to do, than to not do." Players who missed time were not given the equivalent of a complete season for each year they missed, but they were given a subjective amount of credit that I deemed appropriate.
The method for evaluating starting pitchers was extremely similar to that of position players. The statistic I used as the backbone for pitcher evaluations was ERA+ which does for ERA the exact same thing that OPS+ does for OPS. I then used essentially the same formula that I used for hitters, setting a baseline number (in this case innings pitched) and crediting pitchers with league average numbers for every inning they fell short of the baseline.
I will once again use two players to illustrate how this works, in this case Roger Clemens and Johan Santana, who have the same career ERA+ (143). I should note that I am merely looking at Clemens from a purely statistical standpoint and not making any adjustments for anything illegal he may have done to help his career. Santana's career obviously has not had the same amount of value as Clemens' because he has only pitched 1,709 innings compared to Clemens' 4,916.
The baseline number of innings pitched I used for starters was 5,000. Again, this is an achievable number, but not easily achievable, which is exactly what I am looking for. Clemens will be put through the formula first, being credited with 4,916 innings with an ERA+ of 143 combined with 84 innings (the difference between the baseline and his career innings pitched) of an ERA+ of 100, and we get a number of 142.3. Since we do not have to worry about fielding or baserunning with pitchers, this number is his final, complete "Adjusted career value." Doing the same to Santana we find that he has an ACV of 114.7. Clemens' figure comes out as one of the ten best ever, and while Santana's is very good, he clearly has a ways to go before he establishes himself as an all-time great.
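The starter calculation can be reproduced with a few lines of Python. This is a sketch with names of my own choosing; the innings and ERA+ inputs are the ones given above.

```python
def starter_acv(innings, era_plus, baseline_innings=5_000):
    """ACV for a starting pitcher: career ERA+ blended with league-average
    (ERA+ 100) innings up to the baseline. Names are illustrative."""
    filler_innings = baseline_innings - innings
    return (innings * era_plus + filler_innings * 100) / baseline_innings


clemens_acv = starter_acv(4_916, 143)
santana_acv = starter_acv(1_709, 143)
print(round(clemens_acv, 1), round(santana_acv, 1))  # 142.3 114.7
```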
Relief pitchers were a bit trickier. I used the same formula, but set the baseline at 1,500 innings pitched. For career relievers, this number is extremely difficult to achieve; Trevor Hoffman, for example, has 1,042 career innings pitched. The formula works extremely well for its intended purpose when analyzing pitchers who have under 1,500 career innings, but it does not work so well for pitchers over that amount. Unlike with hitters and starters who exceed the baseline, just letting the formula give them credit for going over would not work here.
I wanted to develop a method by which starting pitchers could be evaluated evenly with relief pitchers. Again I wanted to keep it simple, so for every 250 innings pitched over the 1,500 baseline, I credited a player with a +1 to his career ERA+. For example, Dizzy Dean rates as a 111.8 ACV as a starter, but as a reliever he has a 131.8 ACV. The point is to estimate how a starter would be expected to perform if used as a reliever, which makes it possible to compare him to full-time relievers. The downside is that relievers cannot be compared to starters simply by looking at their ACVs.
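Here is one way the reliever rule could be written in Python. Two things in it are assumptions rather than anything stated above: the +1 per 250 innings is pro-rated for partial blocks of innings, and the two pitchers in the example are made up purely to show both branches.

```python
def reliever_acv(innings, era_plus, baseline_innings=1_500, bonus_step=250):
    """ACV on the reliever scale. Names and pro-rating are assumptions."""
    if innings <= baseline_innings:
        # Under the baseline: same blend with league-average innings
        # that is used for hitters and starters.
        filler = baseline_innings - innings
        return (innings * era_plus + filler * 100) / baseline_innings
    # Over the baseline: career ERA+ plus 1 point per 250 extra innings
    # (fractional credit for partial blocks is an assumption).
    return era_plus + (innings - baseline_innings) / bonus_step


# Hypothetical pitchers, not real career lines.
short_career = reliever_acv(1_000, 140)
long_career = reliever_acv(2_000, 130)
print(round(short_career, 1), round(long_career, 1))  # 126.7 132.0
```

A pitcher sitting exactly on the 1,500 inning baseline gets his career ERA+ from either branch, so the two rules meet without a jump.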
For some relievers I jumped through a few hoops to give them a more accurate evaluation. Dennis Eckersley is perhaps the best example because his career ERA+ is not spectacular, but he was clearly dominant when used as a relief pitcher. I did this by weighting his seasons as a reliever heavier than his seasons as a starter and finding a new career ERA+ for him based on that.
If desired, players' ACVs can be subjectively adjusted for things that I have not taken into consideration, for example a player's impact in the clubhouse and leadership. I have no doubt Willie Mays had more of a positive influence on his team than Dick Allen purely by being in the dugout, but exactly how to weight those things is essentially impossible to determine. In exceptional cases I think it would be fine to give bonus credit or to dock points from a player's ACV based on this if you want an even more complete evaluation.
The goal was to find a single number that represented a player's career value to his team. Any such statistic is a blunt instrument and this method is no different. It is a unique way to look at players, and I think it holds up to logical, objective assessment fairly well. As mentioned, I am calling this new statistic "adjusted career value" (ACV), and it can be easily found by anyone with a calculator and a spreadsheet. I think it is elegant in its simplicity and completeness. While I do not think it is a groundbreaking method of player analysis, developing ACV was a fun and rewarding process.
The next entry will be the actual breakdown of all the teams and the ratings for the individual players.
Friday, October 9, 2009