Archives
You are currently viewing archive for July 2013
Posted By Dr. Stan, the Stats Man


The idea for this posting came about after I stumbled upon the “Hall of 100” on the ESPN website. This list was published on December 4, 2012. ESPN summoned sports writers and on-air personalities to compile a list of the top 100 baseball players of all time. They were given a list of more than 300 players who met certain milestones and were asked to rank them 1 through 100, considering only their play between the lines.

I wanted to compare the ESPN top ten players to my list of top ten players, as they appeared in Chapter 18 of my book Sandlot Stats. ESPN’s top ten list was based on the subjective opinion of their chosen committee of experts. However, they were encouraged to use advanced metrics. My top 10 was based on nine quantitative statistics: AVG (batting average), OBP (on-base pct.), SLG (slugging pct.), OPS (on-base plus slugging), BRA (OBP*SLG), HRA (home run average), H (number of hits), HR (number of home runs), and Runs Created for their team, [(H+BB)*TB]/[AB+BB]. Also, credit was given for winning a Triple Crown, a Career Triple Crown, and ranking in the top 10 in either Bill James’ Black-Ink or Gray-Ink Test. In Chapter 18, you can read about my 26 finalists and their total points. Like the ESPN list, I only looked at what the players did between the lines. Since my list only considered position players, I only chose ESPN’s top ten position players. There was one other difference. The ESPN list considered hitting, fielding, and base-running, whereas my list only considered hitting. Therefore, Rickey Henderson, who finished number 11 on the ESPN list, did not make my list of 26 finalists. My list was based on player accomplishments before 2009 (when Chapter 18 was written).
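For readers who want to compute a few of the statistics named above, here is a minimal Python sketch. The function names and the sample stat line are hypothetical and chosen only for illustration. The on-base percentage in the example uses the simplified form (H+BB)/(AB+BB), which matches the Runs Created expression but ignores hit-by-pitch and sacrifice flies, so it will differ slightly from official OBP.

```python
def ops(obp: float, slg: float) -> float:
    """On-base plus slugging: OBP + SLG."""
    return obp + slg


def bra(obp: float, slg: float) -> float:
    """Batter's run average as defined in the post: OBP * SLG."""
    return obp * slg


def runs_created(h: int, bb: int, tb: int, ab: int) -> float:
    """Runs Created as written in the post: [(H+BB)*TB] / [AB+BB]."""
    return (h + bb) * tb / (ab + bb)


if __name__ == "__main__":
    # Hypothetical season line, not a real player's statistics.
    h, bb, tb, ab = 180, 70, 320, 550

    obp = (h + bb) / (ab + bb)  # simplified OBP (assumption: no HBP or SF)
    slg = tb / ab               # slugging percentage

    print(f"OBP = {obp:.3f}, SLG = {slg:.3f}")
    print(f"OPS = {ops(obp, slg):.3f}")
    print(f"BRA = {bra(obp, slg):.3f}")
    print(f"RC  = {runs_created(h, bb, tb, ab):.1f}")
```

With the simplified OBP, Runs Created is just OBP multiplied by total bases, which is why the formula rewards both getting on base and hitting for power.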

Here are the two lists.

[Image: All-time Top Hitters]

 

Notice how similar the two lists are. This shows how important hitting is in the evaluation of position players. Of course, both lists have “The Babe” as number 1. I can understand the difference in rank 2 between the two lists. Willie Mays was a five-tool player in the important position of center field, whereas Ted Williams was an adequate left fielder. In fact, the Yankees turned down a proposed trade of Ted Williams for Joe DiMaggio because they considered the center field position much more valuable than the left field position. Taking into account fielding and running, I can see why ESPN put Mays in front of Williams.

Except for the order of the 21 players on the two lists, the only five players not on both lists are Nap Lajoie, Honus Wagner, Rogers Hornsby, Mickey Mantle, and Albert Pujols. Wagner and Mantle are on the ESPN top 10 list but not on my list. However, Wagner ranks 12th and Mantle 13th on my list of 26 players. The actual difference between the 8th-ranked Lajoie and the 13th-ranked Mantle is a total of five points in my scoring system. Excluding pitchers, Hornsby ranked 12th, Pujols ranked 15th, and Lajoie ranked 34th on the ESPN list. My major beef with the ESPN list is the extreme difference in rank between Lajoie (rank 34) and Wagner (rank 9). Both players played in the same era (1896-1917) and both were infielders. Lajoie’s career AVG was .338 compared to Wagner’s .327. I gave the edge to Lajoie because in 1901 he was a Triple Crown winner. I guess ESPN liked the fact that Wagner played shortstop while Lajoie played second base. I could have easily called it a tie between the two players. As I mentioned before, my all-time favorite player was Mantle. In my opinion, if it weren’t for his reckless lifestyle and unfortunate knee injury, he would have been in the top 5 on both lists.


 
Posted By Dr. Stan, the Stats Man

Bill James introduced his formula for predicting a team’s expected winning percentage (Expected W%) for a season. The formula is Expected W% = (Runs Scored)^2 / [(Runs Scored)^2 + (Runs Allowed)^2] and is called the Pythagorean Theorem of Baseball. Bill James’ rationale for his theorem was that the number of runs a team scored (RS) compared to the number of runs allowed (RA) is a better indication of a team’s performance than their actual win-loss record. Of course, this reasoning is the antithesis of Bill Parcells’ quotation “You are what your win-loss record says you are.” Let’s say a team is 45-37 at midseason but based on James’ formula their Expected W% is at or below .500. It might be predicted that as the season moves toward the end their win-loss record will also move in the wrong direction.
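To make the 45-37 scenario concrete, here is a small Python sketch of the Pythagorean formula. The function name and the run totals (350 scored, 360 allowed) are hypothetical, picked only so that the team has a winning record but has been outscored.

```python
def pythagorean_wpct(rs: float, ra: float, exponent: float = 2.0) -> float:
    """Expected W% = RS^k / (RS^k + RA^k), with k = 2 in the classic form."""
    if rs < 0 or ra < 0 or (rs == 0 and ra == 0):
        raise ValueError("runs must be non-negative and not both zero")
    return rs ** exponent / (rs ** exponent + ra ** exponent)


if __name__ == "__main__":
    # Hypothetical midseason team: 45-37 record, but outscored 360 to 350.
    wins, losses = 45, 37
    rs, ra = 350, 360

    actual = wins / (wins + losses)
    expected = pythagorean_wpct(rs, ra)

    print(f"Actual W%:   {actual:.3f}")
    print(f"Expected W%: {expected:.3f}")
    if expected < 0.5 < actual:
        print("The record looks better than the run totals support.")
```

A team that scores exactly as many runs as it allows always comes out at .500 under this formula, whatever its actual record.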

My current research, done with the help of Kevin Faggella, a math major at QU, introduces an alternative to Bill James’ formula that accomplishes the same goal. My formula is Expected W% = .000683*(RS – RA) + .500 and is called the Linear Theorem of Baseball. This formula was developed by applying the statistical techniques of linear regression and correlation analysis to the sample of MLB seasons 1998-2012. For those interested in learning these important mathematical tools and seeing the derivation of these theorems, go to Chapter 5 of Sandlot Stats. In fact, my Expected W% formula would have correctly predicted the fate of the 2005 Washington Nationals. On July 5, 2005, the Washington Nationals were in first place with a record of 51-32, having RS = 340 and RA = 340. According to my formula, their Expected W% = .000683*(RS – RA) + .500 = .500. This clearly sent a message about how their season would end. In fact, their final record for 2005 was 81-81 with RS = 639 and RA = 673.
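The Nationals example can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and the slope .000683 and intercept .500 are taken directly from the formula above rather than re-derived from the 1998-2012 data.

```python
def linear_wpct(rs: float, ra: float,
                slope: float = 0.000683, intercept: float = 0.500) -> float:
    """Expected W% = slope * (RS - RA) + intercept, per the Linear Theorem."""
    return slope * (rs - ra) + intercept


if __name__ == "__main__":
    # 2005 Washington Nationals on July 5: 51-32 with RS = RA = 340.
    actual = 51 / (51 + 32)
    expected = linear_wpct(340, 340)
    print(f"July 5 actual W%:   {actual:.3f}")
    print(f"July 5 expected W%: {expected:.3f}")

    # The same formula applied to the final 2005 totals, RS = 639 and RA = 673.
    print(f"Full-season expected W%: {linear_wpct(639, 673):.3f}")
```

The gap between the July 5 actual and expected percentages is the warning sign the paragraph describes: a first-place record built on a run differential of zero.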

Let us now look at the midpoint of the 2013 season and use my formula to make predictions on which teams will make the playoffs. Using 90 wins as the milestone for a team to either win their division or become a wild card, this equates to a final record of 90-72 and a winning percentage of (90/162) = .556. Using my formula, we have .556 = .000683*(RS – RA) + .500. Solving, we get (RS – RA) = 82 (rounded). Of course, one might choose 95 wins or some other amount instead of 90. Based on the closing records before the 2013 All-Star game and using my formula, I created a table of all the teams whose Expected W% = .000683*(RS – RA) + .500 is now greater than .500. I also included the Dodgers because they finished the first half winning 18 of their last 23 games.

[Image: A New Sabermetrics Formula to Predict Postseason Teams]
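Solving the linear formula for the run differential is easy to automate if you want to try 95 wins or another target. The sketch below uses a hypothetical function name and an optional flag that rounds the winning percentage to three decimals first, as the calculation above does. That rounding matters: it is the difference between a target of 82 and a target of 81.

```python
def run_diff_for_wins(wins: int, games: int = 162,
                      slope: float = 0.000683, intercept: float = 0.500,
                      round_pct: bool = False) -> float:
    """Invert Expected W% = slope*(RS - RA) + intercept for (RS - RA)."""
    pct = wins / games
    if round_pct:
        pct = round(pct, 3)  # e.g. 90/162 = .5556 becomes .556
    return (pct - intercept) / slope


if __name__ == "__main__":
    for target in (90, 95):
        rounded = run_diff_for_wins(target, round_pct=True)
        exact = run_diff_for_wins(target)
        print(f"{target} wins: RS - RA = {rounded:.1f} "
              f"(W% rounded to 3 places), {exact:.1f} (unrounded)")
```

Either way, a 90-win season under this formula corresponds to outscoring opponents by roughly 80 runs over 162 games.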

Using this table along with other data, these are my playoff predictions for 2013. First, the division winners are for the AL-East Boston, for the AL-Central Detroit, for the AL-West Oakland, for the NL-East Atlanta, for the NL-Central St. Louis, and for the NL-West the Dodgers. My two wild-card choices are for the AL Tampa Bay and Baltimore or Texas (a tossup) and for the NL Pittsburgh and Cincinnati. I pick Detroit for the AL pennant winner and St. Louis for the NL pennant winner. The last time Detroit won a World Series was in 1984 under the leadership of Sparky Anderson. Detroit’s current manager Jim Leyland managed the 2013 All-Star game to win, and the AL won. Yes, with the help of home field advantage the Detroit Tigers will win the 2013 World Series. Sadly, I predict my beloved Yankees will be on vacation during the playoffs.


 
Posted By Dr. Stan, the Stats Man
The All-Star game marks the halfway point in the season. Honestly, for the past few years I had very little interest in the game. As a Yankee fan, this year I really looked forward to seeing Mariano Rivera pitch in his last All-Star game. I was also intrigued by the thought of the Mets’ young pitching phenom Matt Harvey facing one of the best hitting lineups ever in an All-Star game. I enjoyed the pregame ceremony when the People Magazine military heroes were honored and lined up on the first and third base foul lines, where in past games the players would line up during their introductions. Harvey, after giving up a double to Trout on his first pitch and then hitting Cano on the side of his knee with a fastball, was untouchable, striking out three in two innings. Yes, Harvey is the real deal and could be the next Tom Seaver for the Mets. Tom Seaver threw out the first ball to the Mets’ current franchise player David Wright, making it a Mets love fest at Citi Field. Pitching clearly dominated the game, with the NL getting only three hits. But for me the highlight of the game was what happened in the bottom of the eighth inning. Mariano entered the game to his song “Enter Sandman.” When he reached the mound only the umpires were on the field, as all the AL players remained in the dugout. Fans, players, and media people clapped as one for this role model of what a professional baseball player should be. The AL won 3-0 and will have home field advantage in the World Series.
 
There were many player highlights in the first half of 2013. On the fielding side there were great leaping and diving catches in the outfield. But the fielding play that stands out in my mind as the best occurred against my Yankees. Third baseman Manny Machado stabbed at a ball hit behind third base by the Yankees’ Cruz and deflected it into foul territory. Reaching out with his bare hand, he grabbed the rolling ball and in one motion threw it across his body to first base, nipping Cruz. Maybe the Orioles have their next Brooks Robinson. As an encore, the young Machado made two sparkling plays in the All-Star game. He can hit also and leads MLB with 39 doubles. The Orioles also are seeing the emergence of a new superstar in Chris Davis. Davis is currently batting .315 with 37 home runs (tying the AL record for most HRs before an All-Star game) and 93 RBI. His power may stop Miguel Cabrera from repeating his Triple Crown. Cabrera is currently batting .362 with 30 home runs and 95 RBI. Cabrera’s 2013 numbers are far ahead of his Triple Crown pace of 2012. Miguel Cabrera is the best hitter in MLB today. For the first time in All-Star history two players, Chris Davis and Miguel Cabrera, had 30+ home runs and 90+ RBI before the All-Star game. The emergence of the Cuban import Yasiel Puig as a five-tool player energized the floundering LA Dodgers, who are now a serious threat to win the NL West division. After only a little over a month in the big leagues, Puig is batting .391 with an OPS of 1.038. Even though Puig is getting all the publicity, don’t overlook the return of Hanley Ramirez to the Dodgers. His numbers in 39 games are amazing. He is batting .386. By the way, his OPS of 1.137 exceeds both Davis’ OPS of 1.109 and Cabrera’s OPS of 1.132. As the late Mel Allen, the original voice of the Yankees, would say, “How about that.” Finally, on May 21, the five-tool player Mike Trout became the youngest player in American League history to hit for the cycle.
 
Moving on to pitching, the first half of 2013 produced two no-hitters. On July 2, Homer Bailey of Cincinnati pitched his second career no-hitter, beating Tim Lincecum and the Giants. He duplicated Nolan Ryan’s feat of throwing two consecutive MLB no-hitters, with no other pitcher throwing one in between. Amazingly, 11 days later Tim Lincecum threw a no-hitter against the Padres. Surprisingly, Bailey has a 5-8 record and Lincecum has a 5-9 record in the first half.

 
Posted By Dr. Stan, the Stats Man
Martin Cobern is VP, R&D of APS Technology Inc., in Wallingford, CT. He has known Stan for over 30 years and was one of many who read and commented on his book, Sandlot Stats. He grew up as a Dodger fan in the Bronx, until the great Western betrayal left him without a team. When the Mets arrived, featuring neighborhood hero Ed Kranepool, he found a new one. He maintained his loyalty despite moves around the world. Then, both his daughters went to school in Boston and introduced him to the despair and joy of the Red Sox Nation. His two teams, the Mets and the Red Sox, have one thing in common – a passionate dislike of Stan’s beloved Yankees.
 
“WITHOUT DATA….
… you’re just another person with opinions.” The entire concept and premise of this site is that data analysis can give us a deeper insight into the game of baseball, and that the study of baseball statistics can be a useful introduction to the basic concepts of data analysis. Both are true … provided we have the data to analyze.
For a century or so, these data have been provided by thousands of baseball writers and fans using a 3 ½” stub of a pencil.   Using a variety of arcane marks, these devoted scribes have recorded every play for posterity. This is the ore that baseball statisticians mine. I, for one, can’t watch a game without scoring. Baseball is a game filled with pregnant pauses. What better way to fill them than by putting the events in the context of the overall progress of the game?
This cherished tradition is now, like most activities involving pencils, being challenged by the advance of technology.   As noted in the New York Times:
Hand-held electronic devices allow users not only to follow a game, but also to download several scorekeeping applications. The apps can do what paper scorecards and the eraserless pencil cannot: update statistics and correct mistakes, among other features.
I realize that it took enormous effort for the Elias Sports Bureau and others to locate and transcribe thousands of sometimes illegible, totally inconsistent scorecards, and correct the inevitable errors (if they could). Their work is now much simpler, with real-time updates following every pitch. This is also a godsend for baseball statisticians like Stan.
But baseball is, above all, a traditional game, with its roots in the games played by Civil War soldiers (and prisoners of war) 150 years ago. I have been forced to accept artificial turf, the designated hitter, nighttime World Series games and pitchers specializing in single innings. I’ll be damned if I am going to give up my pencil!
Martin E. Cobern

 
Posted By Dr. Stan, the Stats Man
July 2013 marks the first anniversary of my blog. First, I would like to thank each one of you for reading my postings and emailing suggestions for new postings. For those of you who have just started reading my postings, please use the archive feature to look at past postings. I am happy to announce that I have passed 50,000 views for my more than 50 postings.
I would also like to take this opportunity to thank all of you who purchased my book, “Sandlot Stats: Learning Statistics with Baseball”, published by Johns Hopkins University Press. I thank you and have appreciated and enjoyed all of your comments. “Sandlot Stats” serves as the textbook for teaching my Baseball and Statistics course at Quinnipiac University. Since the basic mathematics needed is covered within the book, anybody can use this book as a tool for learning the important subject matter of statistics. The first 15 chapters teach the subject matter of probability and statistics. The last three chapters apply the statistics covered in the first 15 chapters to baseball research.
My last blog talked about the probability of a current player batting .400. This is one of the topics covered in “Sandlot Stats”. For those of you not familiar with my book, the two leading characters throughout the book are Henry Aaron and Barry Bonds. The supporting characters are Ted Williams and Joe DiMaggio. You can read reviews of the book and hear interviews on my website www.sandlotstats.com. This website also contains a wealth of baseball history and trivia. I hope you will check it out.
For the rest of my anniversary posting, I would like to talk about the wonderful baseball people and friends whose stories gave me inspiration for many of my postings. My first thank you goes to Tara, my wife of 45 years, the designer and programmer of my web site and publisher of all my blogs. She has also put Sandlot Stats on Facebook and Pinterest, so look for it there, too.
Throughout this year, I have been invited to speak at mathematics conventions in Boston and San Diego, at the West Point Military Academy, at California State University at LA and at Amity High School in CT. The talks I gave ranged from my baseball research to the teaching of statistics with baseball and the history of baseball. If you would like me to talk to your group about baseball, just email me. As my friends will tell you, I love to talk. Thank you to Father Gabriel Costa for the wonderful day you provided my wife and me at West Point. I really enjoyed meeting the Cadets and speaking to them on the topic of assigning probabilities to various batting streaks, including Joe DiMaggio’s 56-game hitting streak. Thank you to Nikolai Yakovenko for coming to Quinnipiac University to speak to our faculty and students. Thank you to the former major leaguer Rico Brogna for coming to my baseball and statistics class to talk to my students about the use of sabermetrics in baseball scouting. Rico, I really enjoyed spending the entire day talking baseball and hearing your baseball stories.
Then there were the many friends and casual associations which led to some of my blogs. One such casual meeting occurred with a woman outside a store in Naples, FL. It turned out her father’s uncle was the pitcher who was on the mound when Babe Ruth pointed to centerfield. I even learned more about Dale Long from his grandson, who was one of my students this year. Thank you to my many other Quinnipiac baseball students who wrote blogs about how their lives have been affected by baseball.
Finally, thank you to my friends and colleagues, whom I will not name for fear of leaving someone out, for all their support and encouragement. The blog will continue….

 


 