Player Value Calculator
Want to calculate Player Value on your own? Then look no further!
To get started, make sure you have R and RStudio installed on your computer. If you don't already, a simple Google search should guide you to the necessary steps. Make sure to also install the Lahman, dplyr, and ggplot2 packages.
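As a minimal sketch, the three packages can be checked and installed from the R console like this. The variable names `needed` and `missingPkgs` are just illustrative, and the `interactive()` guard is my own addition so that nothing is downloaded unless you are running the code yourself in R or RStudio.

```r
# Packages named in the instructions above
needed <- c("Lahman", "dplyr", "ggplot2")

# Which of them are not yet installed in this R library?
missingPkgs <- setdiff(needed, rownames(installed.packages()))

# Install only when run by hand, and only what is actually missing
if (interactive() && length(missingPkgs) > 0) {
  install.packages(missingPkgs)
}
```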
Then, download the files below:
- totalplayerseasons.csv.zip (zip, 8252 kb)
- posplayers.r (r, 34 kb)
The TotalPlayerSeasons zipped file holds a csv that contains essentially all of the position player data in history. Unzip it, and the R file will read it in to calculate Player Value. The data in this file can be replicated in R, since it is based on the data in the Lahman package, but why make things harder than they need to be? I essentially just loaded the Batting, Fielding, and Appearances tables, joined them, and then grouped by individual player seasons.
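If you are curious how that replication might look, here is a rough sketch in base R. It is not the code that produced the csv: the function name `build_player_seasons`, the handful of stat columns, and the choice to sum each table per player season before joining (to avoid duplicated rows) are all my assumptions. The column names follow the Lahman tables, and the toy data frames stand in for them so the example runs without the package.

```r
# Collapse Lahman-style tables to one row per player season (base R only).
# The selected stat columns are illustrative, not the full set in the csv.
build_player_seasons <- function(batting, fielding, appearances) {
  keys <- c("playerID", "yearID")
  # Sum over stints, teams, and positions within each player season
  bat <- aggregate(cbind(AB, H, HR, BB) ~ playerID + yearID, data = batting, FUN = sum)
  fld <- aggregate(cbind(PO, A, E) ~ playerID + yearID, data = fielding, FUN = sum)
  app <- aggregate(G_all ~ playerID + yearID, data = appearances, FUN = sum)
  # Join the three season-level summaries on player and year
  out <- merge(bat, fld, by = keys, all.x = TRUE)
  out <- merge(out, app, by = keys, all.x = TRUE)
  out[order(out$playerID, out$yearID), ]
}

# Toy stand-ins for Lahman::Batting, Lahman::Fielding, Lahman::Appearances
batting <- data.frame(playerID = c("a", "a", "b"), yearID = 2010, stint = c(1, 2, 1),
                      AB = c(100, 50, 400), H = c(30, 10, 120),
                      HR = c(5, 1, 20), BB = c(10, 5, 40))
fielding <- data.frame(playerID = c("a", "a", "b"), yearID = 2010,
                       POS = c("1B", "LF", "SS"),
                       PO = c(200, 20, 150), A = c(10, 1, 300), E = c(2, 0, 12))
appearances <- data.frame(playerID = c("a", "a", "b"), yearID = 2010,
                          teamID = c("X", "Y", "Z"), G_all = c(30, 15, 140))

seasons <- build_player_seasons(batting, fielding, appearances)
print(seasons)
```

With the Lahman package installed, the same function should accept `Lahman::Batting`, `Lahman::Fielding`, and `Lahman::Appearances` in place of the toy tables.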
The PosPlayersR file is where the magic of calculating Player Value happens. This file works for the 1973 through 2021 seasons, starting with the year the DH was introduced. It is also just for position players, not pitchers. Using the file requires just 4 simple steps:
- Update the yearSet variable at the top of the file's code to the year that you want to calculate Player Value for. It is set to 2010 by default, since that's the year I used for my Player Value rollout example.
- Update the baselineLevel variable to the desired percentile, as a decimal. This is the level that you consider "bad". How poorly does a player have to rank among his positional peers for you to consider him a bad player? By default the value is set to .25, since I believe that a player who is in the bottom 25% of his position, which is about the 7 to 8 worst starters, is a bad player.
- Update the file path for the total variable to reflect where you've saved the TotalPlayerSeasons.csv file.
- Select all of the code and hit Run.
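For orientation, the first three steps amount to something like the sketch below. The `yearSet`, `baselineLevel`, and `total` names come from the file itself; the `csvPath` variable, the example path, and the sanity checks are my own illustrative additions, so the actual top of posplayers.r may look different.

```r
yearSet <- 2010        # season to calculate Player Value for
baselineLevel <- 0.25  # percentile treated as "bad", written as a decimal

# Sanity checks based on the ranges described above
stopifnot(yearSet >= 1973, yearSet <= 2021)
stopifnot(baselineLevel > 0, baselineLevel < 1)

# Hypothetical path: replace with wherever you unzipped the csv
csvPath <- "~/Downloads/TotalPlayerSeasons.csv"

if (file.exists(csvPath)) {
  total <- read.csv(csvPath)
} else {
  message("File not found, update csvPath: ", csvPath)
}
```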
Below are different file examples from 2010 that represent the different potential baseline levels you could use.
- 2010playersbaseline10.csv (csv, 1431 kb)
- 2010playersbaseline16.csv (csv, 1488 kb)
- 2010playersbaseline25.csv (csv, 1476 kb)
- 2010playersbaseline33.csv (csv, 1509 kb)
- 2010playersbaseline50.csv (csv, 1504 kb)
The baseline10 file uses the 10th percentile, meaning a player must be among the bottom 10%, indicating one of the 3 worst starters at his position, to be in the negatives and thus considered bad.
The baseline16 file uses the 16.67th percentile, meaning a player must be among the bottom 1/6, indicating one of the 5 worst starters at his position, to be in the negatives and thus considered bad.
The baseline25 file uses the 25th percentile, meaning a player must be among the bottom 25%, indicating one of the 7 to 8 worst starters at his position, to be in the negatives and thus considered bad.
The baseline33 file uses the 33.33rd percentile, meaning a player must be among the bottom 1/3, indicating one of the 10 worst starters at his position, to be in the negatives and thus considered bad.
The baseline50 file uses the 50th percentile, meaning a player must be among the bottom 50%, indicating one of the 15 worst starters at his position, to be in the negatives and thus considered bad.
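The "worst starters" counts above are just the baseline percentile multiplied by the number of starters at a position. Here is that arithmetic as a quick sketch, assuming 30 starters per position (one per MLB team); the variable names are illustrative.

```r
# Assumption: one starter per position for each of the 30 MLB teams
startersPerPosition <- 30

baselines <- c(baseline10 = 0.10, baseline16 = 1/6, baseline25 = 0.25,
               baseline33 = 1/3, baseline50 = 0.50)

# Number of starters at a position who fall below each baseline
worstStarters <- baselines * startersPerPosition
print(round(worstStarters, 1))
```

The 25% baseline lands on 7.5, which is why it is described as the 7 to 8 worst starters.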
The choice of baseline may somewhat affect the ranking results for each player, but it will mainly just shift all players up or down by some amount. A higher baseline like the median can be useful to see if a player is above or below average, but the reality is that being average, or even slightly below average, can be valuable. WAR assumes that any starter will be better than any bench player or minor league guy, and develops its mathematically-backed-into and somewhat arbitrary "replacement level" to make it very difficult for a player to be deeply in the negatives. For Player Value, I prefer the idea that the very worst starters who are consistently playing poorly should be graded negatively, to indicate that it is time to give another player a shot.
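To see the "shift" idea in isolation, here is a deliberately simplified sketch: five made-up raw scores are measured against the 25th and 50th percentile baselines. The `valueAgainst` helper and the numbers are hypothetical, and this is not how posplayers.r computes Player Value; in this stripped-down form the ordering never changes, whereas the real calculation can move rankings somewhat.

```r
# Made-up raw scores for five players at one position
rawValue <- c(A = -10, B = 0, C = 5, D = 20, E = 40)

# Hypothetical helper: score relative to the chosen percentile of the group
valueAgainst <- function(raw, baselineLevel) {
  raw - quantile(raw, probs = baselineLevel, names = FALSE)
}

value25 <- valueAgainst(rawValue, 0.25)
value50 <- valueAgainst(rawValue, 0.50)

# Every player moves by the same amount, so the ordering is unchanged
print(rbind(value25, value50))
print(value25 - value50)
```

Raising the baseline only increases how many players end up below zero.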
After you've run the R file, view the batters table to see the results. You can choose to write the table out as a csv or an Excel file, or just view the results within R. The file will also produce a series of plots of Player Value by team, position, league, and division like the one below:
Since Baserunning Value is pretty insignificant for most players, it's excluded from the plot. The vertical axis is Batting Value and the horizontal axis is Fielding Value, so the sum of these essentially makes up Player Value. Any player above the black line is adding value, in this case assuming a 25% baseline. Players above the light green line are among the top 5% in Player Value across the league. Players above the dark green line are among the top 1% in Player Value across the league. Players below the light red line are among the bottom 5% in Player Value across the league. Players below the dark red line are among the bottom 1% in Player Value across the league.
As we can see for the NL Central in 2010, Votto and Pujols were among the top 1%, reflecting their MVP-caliber seasons. Carlos Lee of the Astros was among the bottom 1%.
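A plot in the same spirit can be sketched with ggplot2. This is not the plotting code from posplayers.r: the data are simulated, the column names are hypothetical, and I am assuming the reference lines are diagonals with slope -1, since a fixed Player Value means Batting Value plus Fielding Value is constant.

```r
set.seed(1)

# Simulated players; column names are hypothetical, not the script's own
players <- data.frame(FieldingValue = rnorm(200, mean = 0, sd = 5),
                      BattingValue  = rnorm(200, mean = 0, sd = 15))
players$PlayerValue <- players$BattingValue + players$FieldingValue

# League-wide cutoffs for the bottom 1%, bottom 5%, top 5%, and top 1%
cuts <- quantile(players$PlayerValue, probs = c(0.01, 0.05, 0.95, 0.99))

# Batting = cutoff - Fielding, so each line has slope -1 and intercept = cutoff
refLines <- data.frame(intercept = c(0, unname(cuts)),
                       slope = -1,
                       colour = c("black", "darkred", "salmon",
                                  "lightgreen", "darkgreen"))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(players, aes(x = FieldingValue, y = BattingValue)) +
    geom_point() +
    geom_abline(data = refLines,
                aes(intercept = intercept, slope = slope, colour = colour)) +
    scale_colour_identity() +
    labs(x = "Fielding Value", y = "Batting Value")
  print(p)
}
```

The black line sits where Player Value equals zero, so points above it are adding value relative to whatever baseline was used to compute the scores.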