The Shep Diagram and Data in the Digital Age

I originally published this on Beyond The Boxscore, here.

One of the cooler things to emerge this season is the “Shep Diagram,” a series of animated images or video clips layered into one. It began about a week ago when a Reddit user named DShep created the image below. The image was picked up by Deadspin, FanGraphs, and a bunch of other places because it’s incredible, but also because it shows how incredible Yu Darvish is, due in part to his consistent release point.



(If that doesn’t animate, click here.)

Just a day or two ago, MLB Network created its own Shep Diagram for Justin Verlander (watch it here), showing how four of his pitches can look the same coming out of his hand. This method of displaying information is relatively new, but it is taking off fast, in large part because it feels useful almost immediately.

We haven’t seen data displayed like this much in the past. Sure, we have heat maps, PITCHf/x data, and so on, but it’s rare that we use actual game footage in this manner. Take, for example, this 2010 New York Times interactive piece on Mariano Rivera’s effectiveness. It shows a computer-generated representation of over 2,000 pitches, and it was (and still is) an incredible way to look at the game. As the Shep Diagram demonstrates, though, it’s just the tip of the iceberg.

Fifteen years ago, we were just at the beginning of the era in which digital publishing tools and broadband access enabled the growth of user-generated content. The people who gained access to those tools brought literacy and a basic understanding of math, and because of that, we have a wealth of ways to explore baseball via words, spreadsheets, and formulas. As the tools and skill sets for digital video editing become better and more mainstream, hopefully we’ll see more things like the Shep Diagram above. A picture is worth 1,000 words; these things are worth ten times that.

Originally published on May 2, 2013