Data deluge: MLB rolls out Statcast analytics on Tuesday, multiplying metrics
Which outfielder sprints the fastest and runs the longest to snag line drives into the gap? Which shortstop is best at throwing from the grass to nip the runner at first? Which catcher gets the ball to second base the quickest?
A new era in analytics starts Tuesday when Major League Baseball rolls out its Statcast tracking technology during the MLB Network's broadcast of the St. Louis Cardinals' game at the Washington Nationals.
Real-time access will expand quickly to Fox, ESPN and Turner, then to regional sports networks within about six weeks.
By June, fans should be able to look up leaderboards for hitters' exit velocity, fielders' route efficiency, speed and distance, and pitchers' spin rates and arm extension.
Cameras and sensors installed at each ballpark capture 120,000 bits per second. Henry Chadwick, who invented the box score in 1859, would be flabbergasted.
"Fans are ready for a deeper dive into what makes this game go," Bob Bowman, MLB's president of business and media, said Monday.
All the equipment is in place at the 30 big league ballparks to gather information even the HAL 9000 could not compute.
ChyronHego developed high-definition arrays of three cameras apiece, each placed behind third base 15 meters apart, which capture 30 samples per second of stereoscopic video.
Trackman created a redesigned 3D Doppler radar with a panel containing multiple sensors that captures 2,000 samples of data per second.
Hardware was built specifically for MLB. Joe Inzerillo, executive vice president and chief technology officer of MLB Advanced Media, said the cost was tens of millions of dollars.
"You can say that was the fifth-fastest run to first, that was the ninth-fastest catch, the best route efficiency this season," Inzerillo said.
"A decade from now, we'll be looking back and saying that was the highest-fourth-decimal point route efficiency that's ever been captured in baseball."
Code was written by a pair of Brazilians with Ph.D. degrees: Claudio Silva, a 45-year-old professor of computer science and engineering and data science at New York University, and Carlos Diedrich, a 36-year-old computer graphics researcher at Modelo who is a consultant for BAM.
"We'll have much better tools to study collections of games rather than individual plays," Silva said. "The next phase of this is trying to study collections of games and what that means for strategy."
Teams already have access to the data, and MLB thinks the biggest impact will come from the defensive metrics.
"We're going to be able to settle some age-old disputes," MLB Network President Rob McGlarry said. "They always said (Joe) DiMaggio never had to dive for a ball — they didn't use that term back then, but presumably because his route efficiency was so good."
MLB made a major advance in 2006, when Sportvision's PITCHf/x started measuring the velocities and trajectories of pitches. The sport started testing Statcast prototypes during the second half of 2013 at Citi Field, Miller Park and Target Field, and also at the 2014 All-Star Game, League Championship Series and World Series, plus the spring training home of Arizona and Colorado.
Last year, some of the data was used a day later. For instance, Kansas City's Eric Hosmer was thrown out at first in the third inning of World Series Game 7 on a spectacular double play started by diving second baseman Joe Panik, who flipped the ball to shortstop Brandon Crawford with his glove.
Originally called safe by first base umpire Eric Cooper, Hosmer was ruled out after a video review. Statcast determined Hosmer's speed toward first dropped from 20.9 mph to 12.9 mph when he dove into first, and that if he had run through the base he would have been safe by 0.1 seconds — about a foot — rather than being out by 0.02 seconds.
So far this year, data showed Atlanta shortstop Andrelton Simmons needed just 0.11 seconds for his first step on an April 10 grounder by the New York Mets' Travis d'Arnaud and made a 68.5 mph throw across his body from the outfield grass in time for the out at first.
Toronto left fielder Kevin Pillar had a 97.9 percent route efficiency when he climbed the left-field wall at Rogers Centre last Wednesday to rob Tampa Bay's Tim Beckham of a home run, covering 81.3 feet at a top speed of 15.2 mph.
Houston right fielder George Springer was even better on April 12, with a 99.1 percent route efficiency when he stole a walkoff grand slam from Texas' Leonys Martin, covering 93.7 feet at up to 17.7 mph.
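The article does not spell out how route efficiency is computed. A common description is the straight-line distance from a fielder's starting point to the catch, divided by the distance he actually covered. The Python sketch below works from that assumption; the function names and the sample coordinates are hypothetical, not MLB's code or data. It also back-calculates the straight-line distance implied by the Pillar and Springer figures quoted above.

```python
import math

def path_length(points):
    """Total distance along a tracked path of (x, y) positions, in feet."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def route_efficiency(points):
    """Assumed definition: direct start-to-finish distance / distance covered."""
    covered = path_length(points)
    if covered == 0:
        return 1.0  # fielder never moved; treat as a perfect route
    direct = math.dist(points[0], points[-1])
    return direct / covered

def implied_direct_distance(covered_ft, efficiency_pct):
    """Straight-line distance implied by a covered distance and an efficiency."""
    return covered_ft * efficiency_pct / 100.0

# Hypothetical tracked positions (feet): one straight run, one with a detour.
straight_route = [(0.0, 0.0), (30.0, 40.0), (60.0, 80.0)]
bent_route = [(0.0, 0.0), (30.0, 0.0), (30.0, 40.0)]

straight_eff = route_efficiency(straight_route)  # direct 100 ft, covered 100 ft
bent_eff = route_efficiency(bent_route)          # direct 50 ft, covered 70 ft

# Figures reported in the article, under the assumed definition.
pillar_direct = implied_direct_distance(81.3, 97.9)
springer_direct = implied_direct_distance(93.7, 99.1)

print(f"straight route: {straight_eff:.1%}, bent route: {bent_eff:.1%}")
print(f"Pillar implied direct distance: {pillar_direct:.1f} ft")
print(f"Springer implied direct distance: {springer_direct:.1f} ft")
```

Under this reading, Pillar's 81.3 feet at 97.9 percent would correspond to a direct path of roughly 79.6 feet, and Springer's 93.7 feet at 99.1 percent to roughly 92.9 feet, meaning each fielder ran less than two feet farther than the shortest possible route.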
All this information could be of interest to video game developers — "certainly can make them even more realistic," Bowman said.
Because older footage was shot at lower frame rates, Inzerillo said, using the technology to analyze players from previous eras would be laborious and subject to a higher error rate.
Silva wants to see how teams will use all this information in evaluating their players and deciding which ones to sign and trade for. Players can be compared with a theoretical star who is tops in every category, a baseball R2-D2 and C-3PO.
"They're going to need new skill sets in their back offices," he said. "How much data do you really need on a player to model that player? How much predictive power does it have?"
More data leads to more answers but also more questions.