Sabermetrics Baseball: My RBI Decimal Formula
I love Sabermetrics, and not just in baseball, but in all sports! I thought it would be a good idea for all of you to see a passion of mine, which would be writing analytical formulas for sports.
If you follow hockey, you may have seen the sabermetric formulas I have published on FanSided’s Too Many Men on the Site, a series I call “Project Helium.” Now I think it is time to publish a baseball sabermetrics formula. I realize that virtually every baseball formula has already been written, but there is always room for more analytical formulas in the world of advanced metrics.
The next phase of baseball sabermetrics is to value the RBI, which I am going to do in this article. Keep in mind that this is a very complex formula, so grab a chair and dig in.
I call the actual sabermetrics formula RBI Decimal because it involves many numbers to the right of the decimal. I will explain more about this later in the article. Considering the way I have this set up, it is going to be a bit challenging to have anything but a computer track it.
Overall, I think it makes sense to value the RBI in a different way in baseball sabermetrics.
Counting each RBI, along with every failed attempt in the same situations, should make for interesting results. Note that if a player fails in one of these situations, he loses points in RBI Decimal.
The biggest key to the RBI Decimal sabermetrics formula is that it doesn’t count every RBI situation. There are three situations, each split into two parts of the game. If a player gets an RBI in any other situation, it doesn’t register in RBI Decimal. Also, if someone hits a home run in one of the situations listed below, only the RBIs for the men on base are counted.
RBI Decimal – Success Variables
Now, I am going to explain the variables listed below. Keep in mind how complex some of the numbers are. The situations are numbered 6 through 1, from least to most difficult and important.
Note: The quality of the pitcher is valued via Wins Above Replacement (WAR), and the asterisk indicates which variables match each other.
Man on 2nd base only, two outs (innings 1-6)
If succeeds (6)
*(a)Vs below average pitcher (1.0102) **(b)vs average pitcher (1.02) ***(c)vs above average pitcher (1.0303)
- *A1:(+0.002)
- *A2:+(-0.0002)
- **B1:(+0.00201)
- **B2:+(-0.0001)
- ***C1:(+0.003)
- ***C2:+(-0.00001)
If there is a “1” next to a variable listed directly above (C1, for example), a below-average or mediocre hitter was hitting behind the batter at the plate. On the flip side, if there is a “2” next to the variable (A2, for example), an average or better hitter was behind the batter at the plate. I did this because protection is critical when it comes to hitting.
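As a minimal sketch, situation #6 could be stored as two lookup tables. This assumes the modifier is simply added to the base value for the matching pitcher tier; the names `BASE_6`, `MODIFIER_6`, and `score_success_6` are made up for illustration and are not part of the formula itself.

```python
from decimal import Decimal

# Base values for success situation #6 (man on 2nd only, two outs, innings 1-6),
# keyed by pitcher quality.
BASE_6 = {
    "below": Decimal("1.0102"),   # a
    "average": Decimal("1.02"),   # b
    "above": Decimal("1.0303"),   # c
}

# Protection modifiers keyed by (pitcher quality, hitter behind the batter).
# "weak"   = suffix 1: below-average or mediocre hitter behind the batter.
# "strong" = suffix 2: average or better hitter behind the batter.
MODIFIER_6 = {
    ("below", "weak"): Decimal("0.002"),         # A1
    ("below", "strong"): Decimal("-0.0002"),     # A2
    ("average", "weak"): Decimal("0.00201"),     # B1
    ("average", "strong"): Decimal("-0.0001"),   # B2
    ("above", "weak"): Decimal("0.003"),         # C1
    ("above", "strong"): Decimal("-0.00001"),    # C2
}

def score_success_6(pitcher: str, next_hitter: str) -> Decimal:
    """Assumed reading: base value plus the matching protection modifier."""
    return BASE_6[pitcher] + MODIFIER_6[(pitcher, next_hitter)]

print(score_success_6("above", "weak"))    # c + C1
print(score_success_6("below", "strong"))  # a + A2
```

Using `Decimal` with string literals keeps the long decimals exact, which matters when values differ only in the fifth or sixth place.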
Man on 2nd base only, two outs (inning 7 or later)
If succeeds (5)
*(d)Vs below average pitcher (1.02031) **(e)vs average pitcher (1.031) ***(f)vs above average pitcher (1.0302)
- *D1:(+0.003)
- *D2:+(-0.000301)
- **E1:(+0.003021)
- **E2:+(-0.000102)
- ***F1:(+0.00301)
- ***F2:+(-0.0000101)
This part of the formula works no differently than the last part (#6).
Man on 1st base only, two outs (innings 1-6)
If succeeds (4)
(g)Vs Below average pitcher (1.102) (h)vs average pitcher (1.10306) (i)vs above average pitcher (1.10341)
I do not believe there is a need for any extra variables because there is not an empty base. Same as #2.
Man on 3rd (and potentially other bases), one out or less (vs regular defense) (innings 1-6)
If succeeds (3)
*(j)Vs below average pitcher (1.113) **(k)vs average pitcher (1.11501) ***(l)vs above average pitcher (1.1233)
- *J1:+(-0.002002)
- **K1:+(-0.001031)
- ***L1:+(-0.00101)
All of the variables above have to do with one issue: two or more infielders playing in. If that is the case, add the proper variables together.
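The infield-in adjustment for situation #3 can be sketched as follows. This assumes the adjustment is added to the base value only when two or more infielders are playing in; `BASE_3`, `INFIELD_IN_3`, and `score_success_3` are hypothetical names used for illustration.

```python
from decimal import Decimal

# Success situation #3: man on 3rd, one out or less, innings 1-6.
BASE_3 = {
    "below": Decimal("1.113"),      # j
    "average": Decimal("1.11501"),  # k
    "above": Decimal("1.1233"),     # l
}

# Adjustment applied when two or more infielders are playing in.
INFIELD_IN_3 = {
    "below": Decimal("-0.002002"),    # J1
    "average": Decimal("-0.001031"),  # K1
    "above": Decimal("-0.00101"),     # L1
}

def score_success_3(pitcher: str, infield_in: bool) -> Decimal:
    """Base value, reduced by the matching variable if the infield is in."""
    score = BASE_3[pitcher]
    if infield_in:
        score += INFIELD_IN_3[pitcher]
    return score

print(score_success_3("below", infield_in=True))   # j + J1
print(score_success_3("above", infield_in=False))  # l only
```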
Man on 1st base, two outs (inning 7 or later)
If succeeds (2)
(m)Vs below average pitcher (1.1383) (n)vs average pitcher (1.13832) (o)vs above average pitcher (1.14833)
Man on 3rd (and potentially other bases), one out or less (vs regular defense) (innings 7 or later)
If succeeds (1)
*(p)Vs below average pitcher (1.1494) **(q)vs average pitcher (1.149833) ***(r)vs above average pitcher (1.1583)
- *P1:+(-0.00303)
- *P2:+(-0.00301)
- **Q1:+(-0.00202)
- **Q2:+(-0.002001)
- ***R1:+(-0.0013)
- ***R2:+(-0.0011)
This formula has some similarities to #3, but it isn’t the same thing. While variable “1” is the same (two or more infielders in), “2” is different: it applies when the entire outfield is playing in. If the infield is up as well, add both “1” and “2” together.
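Here is a small sketch of how the two adjustments might stack in situation #1, using only the below-average-pitcher tier. It assumes each adjustment is added independently when its defensive alignment is in effect; the names `BASE_P`, `P1`, `P2`, and `score_success_p` are illustrative.

```python
from decimal import Decimal

# Success situation #1 versus a below-average pitcher (variable p).
BASE_P = Decimal("1.1494")
P1 = Decimal("-0.00303")  # two or more infielders playing in
P2 = Decimal("-0.00301")  # entire outfield playing in

def score_success_p(infield_in: bool, outfield_in: bool) -> Decimal:
    """Add each adjustment whose defensive alignment applies."""
    score = BASE_P
    if infield_in:
        score += P1
    if outfield_in:
        score += P2
    return score

print(score_success_p(infield_in=True, outfield_in=True))   # p + P1 + P2
print(score_success_p(infield_in=True, outfield_in=False))  # p + P1
```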
Now that I have explained all six success situations for the RBI Decimal sabermetrics formula, I would like to explain why I put them in the order I did. Keep in mind it is not just decided by difficulty, but importance as well.
The first two (#6 and #5) have a man on second base only. I don’t think it is as crucial or as difficult to get a man in from second as it is from first or third. If you are wondering why I put first ahead of second, it is simple: scoring a man from first is more difficult, especially with two outs.
Another key point is that in any of the situations, innings 7 or later are always more vital than innings 1-6. Situations #4 and #2 are the same apart from the point in the game, and the same can be said for #3 and #1. I decided that a man on 1st base is less valuable than a man on 3rd base for one reason: if a team doesn’t get a man in from third, it is a big mistake. Mistakes cost a team games more than good plays.
RBI Decimal – Failure Variables
Now, on to situations where a player fails.
Man on 1st base, two outs (innings 1-6)
If fails (6)
(s)Vs below average pitcher (-0.001036) (t)vs average pitcher (-0.001023) (u)vs above average pitcher (-0.00018)
I don’t look at this situation any differently than #4 and #2 for success. It is hard to put a negative value on this because there isn’t any way to punish a player for anything outside of what he does at the plate.
Man on 1st base, two outs (7th inning or later)
If fails (5)
(v)Vs below average pitcher (-0.0020211) (w)vs average pitcher (-0.002021) (x)vs above average pitcher (-0.00202)
See #6 for the reason why there are no variables to change the formula.
Man on 2nd base only, two outs (innings 1-6)
If fails (4)
*(y)Vs below average pitcher (-0.0020611) **(z)vs average pitcher (-0.002061) ***(aa)vs above average pitcher (-0.00206)
- *Y1:(+0.0002)
- *Y2:+(-0.0002)
- **Z1:(+0.00020101)
- **Z2:+(-0.0001)
- ***AA1:(+0.00020102)
- ***AA2:+(-0.00001)
The variables serve the exact same purpose as success formula #6 above.
Man on 3rd base (or more), one out (innings 1-6)
If fails (3)
*(bb)Vs below average pitcher (-0.0030812) **(cc)vs average pitcher (-0.0030621) ***(dd)vs above average pitcher (-0.00304001)
- *BB1:+(-0.000306)
- **CC1:+(-0.000303)
- ***DD1:+(-0.000301)
See #3 from success to see what purpose the variables serve.
Man on 2nd base only, two outs (7th inning or later)
If fails (2)
*(ee)Vs below average pitcher (-0.0030913) **(ff)vs average pitcher (-0.00306301) ***(gg)vs above average pitcher (-0.0030408)
- *EE1:(+0.000301)
- *EE2:+(-0.0003)
- **FF1:(+0.0003102)
- **FF2:+(-0.0002003)
- ***GG1:(+0.0003201)
- ***GG2:+(-0.00003001)
No different than #5 from success. You will have to refer to #6 from success to see what purpose the variables serve.
Man on 3rd base (or more), one out (7th inning or later)
If fails (1)
*(hh)Vs below average pitcher (-0.0040833) **(ii)vs average pitcher (-0.0040602) ***(jj)vs above average pitcher (-0.00403031)
- *HH1:+(-0.000408)
- *HH2:+(-0.00002001)
- **II1:+(-0.00040601)
- **II2:+(-0.000020001)
- ***JJ1:+(-0.0004032)
- ***JJ2:+(-0.00002)
This formula is a bit different than the #1 from success, for one main reason. Since the situation is the 7th inning or later, I had to put a second number in for each variable. Just as above, “1” refers to a situation where there are two or more infielders in. Meanwhile, “2” refers to a situation when the entire outfield is playing in (and infield as well). If there is a situation where “2” is needed, make sure to add “1” and “2” together.
Now, I am guessing you are wondering how to calculate this formula. It is quite simple, honestly. For each and every situation in which a player (or team) comes up to bat, just add the outcomes together. If he succeeds, make sure to add the proper parts of the formula together. Keep in mind that even though a player may succeed in a situation, he may also lose points for the way he did it.
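To make the adding-up concrete, here is a rough sketch with three hypothetical plate appearances. The values come from the lists above, but the particular sequence of outcomes is invented for illustration.

```python
from decimal import Decimal

# Three hypothetical plate appearances, looked up from the lists above.
outcomes = [
    Decimal("1.0303") + Decimal("0.003"),          # success #6: c + C1
    Decimal("-0.002021"),                          # failure #5: w
    Decimal("-0.0040833") + Decimal("-0.000408"),  # failure #1: hh + HH1
]

# The running RBI Decimal total is just the sum of every outcome.
total = sum(outcomes, Decimal("0"))
print(total)  # 1.0267877
```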
I want you to understand one thing. You may be wondering why I take so little numerically from a hitter when he fails, but there is a reason for that. A good hitter will fail 7 out of 10 times, so it wouldn’t make sense to take away as much every time he fails as he gains when he succeeds.
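A quick illustration of that asymmetry, assuming a hitter who faces an average pitcher ten times with a man on 1st and two outs in innings 1-6, succeeding three times and failing seven. The 3-for-10 split is a made-up example, not real data.

```python
from decimal import Decimal

SUCCESS_H = Decimal("1.10306")    # success #4 vs average pitcher (h)
FAILURE_T = Decimal("-0.001023")  # failure #6 vs average pitcher (t)

# Three successes and seven failures in the same situation.
total = 3 * SUCCESS_H + 7 * FAILURE_T
print(total)  # 3.302019
```

Seven failures cost about 0.007 in total, so the hitter’s score is still driven almost entirely by his three successes.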
The next key is how far to the right of the decimal I will go when calculating it. I decided that it would make sense to round to the nearest ten-thousandth, because if I went as far as some of the actual variables go, the number would get too long.
Overall, I think it makes sense to value the RBI in a different way in baseball sabermetrics. I know some may wonder why I am not valuing every RBI, but it makes sense to isolate the most critical situations to see who is best in them. I also think it makes sense to punish players for failure, because it may show who comes up often in these situations and doesn’t succeed.
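Rounding to four decimal places could look like the sketch below. Two details are assumptions here, since the article does not specify them: ties round half up, and rounding is applied to a final value rather than to each variable. The function name `round_rbi_decimal` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_rbi_decimal(value: Decimal) -> Decimal:
    """Round to the nearest ten-thousandth (four decimal places)."""
    return value.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)

print(round_rbi_decimal(Decimal("1.0267877")))   # 1.0268
print(round_rbi_decimal(Decimal("-0.0044913")))  # -0.0045
```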
The whole purpose of doing this is to evaluate players in a complex way, which is why the decimal point was brought in. If I just used basic numbers, it might not work, but the decimal point makes it a bit more complicated.
Keep in mind that changes can be made, as it would be hard to test this sabermetrics formula as of right now. Stick around as you may see more formulas such as this one in the future.