Thursday, October 25, 2007

Welcome World Series fans!

The big story of Game 1, to me at least, is the continuing run barrage of the Red Sox. Last night, they scored double-digit runs for the third straight game (post-season game-by-game log):

vs. Cleveland (Game 6) 12-2
vs. Cleveland (Game 7) 11-2
vs. Colorado (Game 1) 13-1 

I did a posting in late August about how the Red Sox had accomplished the extremely rare feat during the regular season of scoring double-digit runs in all games of a four-game series (against the Chicago White Sox). Thus, it seems the Red Sox are now up to their old tricks!

The pitching of Boston's Josh Beckett shouldn't be overlooked, either. As this ESPN.com game summary from last night notes:

Beckett also lowered his career postseason ERA to 1.73, placing him third behind Mariano Rivera (0.77 ERA) and Christy Mathewson (1.15 ERA) among pitchers who have thrown at least 70 postseason innings.

Tuesday, October 23, 2007

Welcome to visitors who've found their way here via Carl Bialik's "Numbers Guy" blog for the Wall Street Journal. I invite you to browse through my write-ups and the links section on the right. Feel free to add comments to my postings, if you'd like.

If you have no idea what I'm talking about, click here.

Sunday, October 21, 2007

The Tennessee Titans at Houston Texans game completed earlier this afternoon had two major streakiness story lines. Houston, trailing 32-7 entering the fourth quarter, went on a 29-3 burst in the final period to take a 36-35 lead with 57 seconds remaining. Tennessee moved the ball down the field in the closing moments, however, to set up kicker Rob Bironas for a game-winning 29-yard field goal as time ran out (ESPN.com game recap).

The other streaky element was the "hot foot" of Bironas. His winning kick was his eighth successful field goal of the game, which sets a new NFL record (he had no misses). The yardage distances of the field goals in the order in which they occurred are as follows:

52, 25, 21, 30, 28, 43, 29, 29

Bironas's career statistics from various distances are shown below; career stats offer a bigger sample size than just those from 2007. The figures appear to predate the Houston game: shortly after the game, his distance-specific stats for this season hadn't been updated, so I doubt his career ones had been.

20-29 yards 21/23 (.91)
30-39 yards 18/19 (.95)
40-49 yards 11/18 (.61)
50+ yards 3/7 (.43)

To estimate the probability of Bironas's making all eight field-goal attempts he took, given that he would be receiving these opportunities, we multiply the component probabilities together:

(.43) (.91) (.91) (.95) (.91) (.61) (.91) (.91)

which yields .155. If we also factored in the likelihood of an NFL team having so many drives stall in fairly close proximity to the goal line, the probability of Bironas's accomplishment would probably get even smaller.
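
For readers who want to check the arithmetic, here is a minimal Python sketch of the multiplication. The names (CAREER, KICKS, make_rate) are my own labels for illustration, the upper bound of 99 yards on the "50+" bucket is arbitrary, and the rates are rounded to two decimals to match the table above. It assumes independence between kicks, as discussed below.

```python
from math import prod

# Career (made, attempted) by distance bucket, copied from the table above.
CAREER = {
    (20, 29): (21, 23),
    (30, 39): (18, 19),
    (40, 49): (11, 18),
    (50, 99): (3, 7),  # "50+" bucket; 99 is an arbitrary upper bound
}

# Distances of the eight field goals, in the order they were kicked.
KICKS = [52, 25, 21, 30, 28, 43, 29, 29]

def make_rate(distance):
    """Career success rate for the bucket containing `distance`,
    rounded to two decimals as in the table."""
    for (low, high), (made, tried) in CAREER.items():
        if low <= distance <= high:
            return round(made / tried, 2)
    raise ValueError(f"no bucket for {distance} yards")

# Multiply the eight per-kick probabilities (independence assumed).
p_all_eight = prod(make_rate(d) for d in KICKS)
print(round(p_all_eight, 3))  # 0.155
```

Using the unrounded fractions instead of the two-decimal rates would shift the answer slightly, but not enough to change the overall picture.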

A couple of cautions are in order about this analysis. First, it is the unusual nature of the feat (or in this case, foot) that drew me to conduct the analysis; I did not seek a random cross-section of games. Second, the equation I used assumes independence of observations, i.e., that the outcome of any one kick had no impact on the next.

The independence assumption is typically associated with sequences of coin flips and dice rolls, which, unlike humans, cannot experience momentum and other associated psychological states. However, having conducted numerous analyses over the years for this website, I consider the independence assumption to hold pretty well for athletic performances, too.

As for Houston's team-comeback element, which unfortunately for Texans' fans did not hold up, I would direct you to my statistical analysis of a relatively recent, similar comeback by Texas Tech (where I'm on the faculty) against Minnesota in last December's Insight Bowl.

Monday, October 15, 2007

The Colorado Rockies have just swept the Arizona Diamondbacks in the National League Championship Series, four games to none. Building upon a sweep of Philadelphia in the opening round (which has a three-out-of-five format), Colorado is 7-0 in the post-season and, factoring in the close of the regular season, has won an amazing 21 of its last 22 games.

Since the advent of the three-round/wild-card play-off system in 1995, the most dominant post-season performance by a World Series champion is shared by the 2005 Chicago White Sox and 1999 New York Yankees, each with an 11-1 record. The Rockies will thus seek to become the first team to go 11-0.

The baseball media naturally have been abuzz with Rocky talk, including comparisons to other hot teams down the stretch in baseball history. Another team I heard about tonight was the 1977 edition of the Kansas City Royals. From August 31 to both games of a September 25 double-header, inclusive, the Royals won 24 out of 25. Ultimately, however, Kansas City lost a heartbreaking American League Championship Series to the Yankees.

Friday, October 12, 2007

Just a few brief notes on the Major League Baseball play-offs:

With their opening-game win over Arizona in the National League Championship Series, the Colorado Rockies have now won 18 of their last 19 games (which also includes a three-game sweep over Philadelphia in the opening round). A couple of entries down, I conducted an elaborate analysis of the Rockies in the regular season.

A streak-within-a-streak is that Colorado pitcher Jeff Francis improved to 5-0 lifetime at the Diamondbacks' Chase Field. I was pleased to see, via this game article, that Francis appears to have some statistical savvy:

"I really can't explain that," said Francis. "It's just a small sample size of me not being here that long and just having a good run against one particular team."

Over in the American League, the championship series between Cleveland and Boston starts tonight. As pointed out in an ESPNews graphic on television yesterday, Cleveland hit .444 (12-27) in two-out situations with runners in scoring position (RISP) in its opening-round win over the New York Yankees.

Tuesday, October 09, 2007

Alex Rodriguez's streaky stretches, both hot and cold, have been chronicled on this blog. With the Yankees' elimination from this year's MLB play-offs at the hands of Cleveland last night, here's an accounting of his post-season woes from a Yahoo! Sports article...

He is mired in an 8-for-59 (.136) playoff spiral dating to his Game 4 home run against Boston in the 2004 ALCS.

New York's biggest bopper is hitless in his last 18 playoff at-bats with runners in scoring position.

Rodriguez hit a solo homer [in the finale of the Cleveland series]... ending a streak of 57 postseason at-bats without an RBI...

Hitless in his last 27 postseason at-bats with any runners on base, A-Rod is certain to again face some criticism after his up-and-down postseason.

Tuesday, October 02, 2007

With last night's exciting, extra-inning, come-from-behind win over San Diego in the National League one-game tie-breaker for the wild-card play-off slot, the Colorado Rockies are riding a hot streak (14 wins in their last 15 games) into the first round of the post-season (Rockies' game-by-game log).

Colorado's first sign of streakiness this season came when it won 7 straight in late May after starting out 18-27. I have plotted a graph of the Rockies' cumulative winning percentage after each game, starting with the 7-game winning streak, as shown below (you can click on the graphic to enlarge it). The late ending of the Colorado-San Diego game, plus all the little embellishments I added to the chart, kept me up until 2:00 AM last night!


As it says in the caption, the Rockies' last 118 games of the season included a combination of streaks (both hot and cold) and relatively steady, incremental gains.

A statistical technique that's appropriate in this context is the runs test. A "run" is a stretch of all wins (without interruption by a loss) or all losses (uninterrupted by a win). The following hypothetical sequence [WWLWWWLLL] includes four runs.

Given that streakiness entails winning (or losing) games in bunches, and not merely alternating wins and losses, evidence for streakiness would come in the form of a team exhibiting fewer runs than would be expected by chance. During their last 118 games of the season (the part I'm focusing on), the Rockies indeed exhibited fewer runs (55) than would be expected (57), but the difference is not very large.
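
The runs-test logic can be sketched in Python as follows. This is an illustration, not the exact procedure I used: the function names are my own, and the expected-runs and standard-deviation formulas are the standard Wald-Wolfowitz ones. The 72-46 split for the last 118 games is an assumption, inferred from the 18-27 start and a 90-73 final record rather than stated in this post.

```python
from itertools import groupby
from math import sqrt

def count_runs(sequence):
    """Count maximal stretches of identical outcomes in a W/L string."""
    return sum(1 for _ in groupby(sequence))

def expected_runs(wins, losses):
    """Expected number of runs if wins and losses fall in random order."""
    n = wins + losses
    return 2 * wins * losses / n + 1

def runs_sd(wins, losses):
    """Standard deviation of the number of runs under random ordering."""
    n = wins + losses
    a = 2 * wins * losses
    return sqrt(a * (a - n) / (n ** 2 * (n - 1)))

# The hypothetical sequence from the text: WW, L, WWW, LLL.
print(count_runs("WWLWWWLLL"))  # 4

# ASSUMPTION: 72 wins and 46 losses over the last 118 games (inferred, see above).
wins, losses, observed = 72, 46, 55
mu = expected_runs(wins, losses)
z = (observed - mu) / runs_sd(wins, losses)
print(f"expected runs: {mu:.1f}, z = {z:.2f}")
```

A z value well inside plus or minus 2 would fit the conclusion above that 55 observed runs is not far from what chance would produce.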

A lot of teams (or individual players, when it comes to hitting or pitching) appear to be streaky performers. However, finding statistical evidence for such is more difficult than many fans would imagine.

For an earlier example of the runs test, where I went into greater detail, click here.

The Rockies' first-round opponent, the Phillies, have exhibited hot play, too, of late, though not quite as dramatically, closing out the season 13-4 (log). If both teams continue their hot offense, the scoreboard operators should get a real workout!

Thursday, September 27, 2007

A couple days ago, Sports Illustrated's Tom Verducci published a column on "Debunking the biggest myths of MLB's wild-card era" (which I learned about via the ESPN radio show, The Herd). Myth No. 2 was that, "The 'hot' teams -- the ones that play well down the stretch -- are the ones to fear in the postseason." Take a look at Verducci's evidence by clicking here.

Tuesday, September 25, 2007



This upcoming Saturday, September 29 will mark the 20th anniversary of a major article in the St. Louis Post-Dispatch's Science section, on whether there was any evidence of streakiness -- either in wins and losses, or in batting performance -- in the city's beloved baseball club, the Cardinals. The article was written by Charles Franklin, then a relatively new professor at St. Louis's Washington University.

My connection to Dr. Franklin -- including a span of 22 years between any in-person contact -- and how I obtained the images of his article interspersed throughout this write-up make for an interesting story, if I do say so myself. (By the way, you can click on any of the images to enlarge them and be able to read them more easily.)

As with many developments in my life, it all starts with the University of Michigan. During the summer of 1985, after I had completed my first year of social psychology grad school at UM, I took a statistics course (linear models) through the university's ICPSR program.

The instructor of that course was the aforementioned Charles Franklin, who had just completed (or was just completing) his Ph.D. in political science at Michigan and had come back from Wash U to teach the summer class.

After that class, roughly 20 years passed without Charles's and my paths crossing in any way. In 1992, Charles moved to the University of Wisconsin, Madison. Then, in 2005, he founded a blog called Political Arithmetik (yes, it ends with a "k"), which is devoted to quantitative expositions on public-opinion data.


Armed with his palette of graphing software, Charles might track, for example, presidential job-approval ratings over time, or systematic differences between survey firms in whether their polls tend to give higher or lower job-approval readings than other firms (known as "house effects"). Charles now also grinds out his analyses for the website Pollster.com, in collaboration with Mark Blumenthal, himself a Michigan undergraduate alumnus.

I don't remember exactly when I first discovered Charles's blog, but once I did, I e-mailed him about being in his class in 1985, and I've submitted comments on his postings from time to time.

This past summer (2007), I was fortunate enough to get the opportunity to teach a course at Wisconsin-Madison, as a visitor in human development and family studies (the same department I'm in at Texas Tech for my regular, full-time job). Here are some photos from my time in Madison.

Once I knew that I would be going up to Madison for a summer term, I contacted Charles about getting together, which would be our first visit in 22 years. He was agreeable, so we met in his office, just north of the campus's famous Bascom Hill. Charles told me that he had just returned from teaching in the Michigan summer stats program, and that he was calling it quits after 25 summers in Ann Arbor.

We chatted about Michigan, statistics, polling, and blogging, the latter of which led to my mentioning the Hot Hand page. As if we didn't have enough connections between Michigan and all the statistical stuff, Charles then told me about his 1987 Cardinal streakiness article for the St. Louis Post-Dispatch, of which I was completely unaware.


He didn't have any copies around. However, compounding our coincidences in a manner worthy of a Seinfeld episode, I was heading to St. Louis over an upcoming weekend to attend the annual SABR conference, and it seemed likely I could find a microfilm of Charles's article at the downtown St. Louis public library.

I did, indeed, find the microfilm of Charles's article, and you're now seeing some excerpts of my discovery. The staff members in the microfilm room were extremely helpful, for which I thank them.

As you can glean from the inserted newspaper images, Charles didn't find any evidence of streakiness on the part of the Cardinals.

Thursday, September 20, 2007

Matt Holliday of the Colorado Rockies is currently on a home-run explosion, having hit 11 in his last 12 games. Holliday is known for the gaudy distances of some of his homers, as immortalized in this 2006 blast I found on YouTube.

It took Holliday until September 2 to get his 25th homer of the 2007 season. He's now, of course, up to 36 homers, a 44% increase from when he was at 25 (11/25) in less than three weeks.

Holliday's streak has prompted me to seek out other similar ones.

The Yankees' Alex Rodriguez, whose tendencies to hit homers in bunches I analyzed in an earlier posting, began the 2007 season by hitting 12 homers in 15 games.

Another seemingly good place to look was at players who had set (or come close to) single-season records. During Barry Bonds's 73-homer season in 2001, his most scorching stretch appears to have taken place from May 17-22, during which he hit 9 homers in 6 games (game-by-game log).

Mark McGwire and Sammy Sosa in 1998 also seemed worth looking at. A few years ago, I found a copy of Race for the Record: The Great Home Run Chase of 1998 (a fancy magazine-type volume with side-binding) on sale for $2.99, so I was able to consult the charts within. In the eight games from May 18-25, McGwire had 9 homers. Sosa, of course, had the 20-homer month of June; at his hottest during that month, he hit 11 dingers from June 15-25.

By focusing only on big-name home-run hitters in the last decade, I'm sure I'm missing other great homer binges. I invite readers to add other big homer stretches (with documentation please) via the Comments link, below.

Wednesday, September 12, 2007

Pete Ridges just sent a message to the SABR e-mail discussion list, pointing out that this past Sunday, while playing at Cincinnati, Milwaukee became the first team in Major League Baseball history to start off a game by hitting three consecutive home runs (box score and play-by-play sheet).

Ridges offered the opinion that:

Unusually, some reports have undersold this, by saying that the Brewers were the third team to start their first inning with 3 HR. However, the other two cases came in the bottom of the first...

The other trifectas were by San Diego in 1987 and Atlanta in 2003.

Offensively, of course, only a visiting team can start off a game. From this perspective, Milwaukee's feat is technically unique. However, for a home team to lead off its half of the first inning with three straight homers is pretty darn impressive, too.

Whether any given reader considers the Brewers to be in a class by themselves or to share the record with two other teams, Ridges's conclusion helps put everything in context:

By my addition there had been 188,835 major league games through Sunday, so I was extremely impressed by this.

Sunday, September 02, 2007

Charlotte (NC) Independence High School has just had its 109-game football winning streak come to an end -- but it took an out-of-state opponent to do it.

As part of former Ohio State quarterback Kirk Herbstreit's Ohio vs. The USA Challenge, Charlotte Independence ventured to take on Cincinnati Elder in the latter's home city, and dropped a 41-34 overtime decision.

As noted in the above-linked ESPN.com article, "Most of the wins weren't close. Independence had beaten opponents during its win streak by an average of nearly 35 points per game entering the 2007 season."

Independence's situation appears to fit a very simple "theory" of super-long streaks. A team (or individual) is physically superior to its competition, thus winning most of its games in dominant fashion. Then, in the rare circumstance of a tight game, the team with the winning streak benefits from good luck to keep the streak going, until the luck runs out.

One recent memorable example, from college football, was USC's 2005 win at Notre Dame to extend the Trojans' winning streak to 28, a victory that required some favorable bounces of the ball at the end.

When one thinks of other historical streaks, such as Joe DiMaggio's getting a hit in 56 straight games, the UCLA men's basketball team winning 88 straight games, and Tiger Woods making the cut at 142 straight PGA golf tournaments, it should not be surprising that the teams and individuals who accumulated these streaks were already at the top of their crafts.

Sunday, August 26, 2007

It was a Sox vs. Sox weekend, with Boston visiting Chicago for a Friday doubleheader and single games Saturday and Sunday. In the end, one team did a lot more "socking" of the ball than the other, as revealed in the following scores:

Red Sox 11, White Sox 3
Red Sox 10, White Sox 1
Red Sox 14, White Sox 2
Red Sox 11, White Sox 1

According to the Sunday game article, for a team to put up double-digit run totals in each game represented:

...only the fourth time that has happened in a four-game series since 1900, according to the Elias Sports Bureau. It's the first time it has happened in the American League in 85 years.

Thursday, August 23, 2007

Here are a couple of noteworthy hot-hand phenomena from last night's baseball action:

For the first story, the ESPN.com article says it all:

The Texas Rangers... became the first team in 110 years to score 30 runs in a game, setting an American League record Wednesday in a 30-3 rout of the Baltimore Orioles.

Elsewhere, a first-inning Milwaukee run ended Arizona pitcher Brandon Webb's consecutive scoreless innings streak at 42. Webb had been within reasonable striking distance of the record held by former L.A. Dodger Orel Hershiser, who kept opponents off the scoreboard for 59 straight innings in 1988. Hershiser himself had edged out another Dodger great, Don Drysdale, who had blanked opponents for 58 2/3 innings in 1968.

A nice compilation of statistical data on pitchers' scoreless-inning streaks is available here.

Friday, August 10, 2007

Chicago White Sox closer Bobby Jenks tied a league record tonight for most consecutive batters retired. According to this ESPN.com article, Jenks "has retired 38 straight batters, tying David Wells' American League record set in 1998 with the New York Yankees. It's the fourth-longest streak in major league history."

Update 1: The streak is now at 41 straight batters retired, tying the major-league record.

Update 2: Brought in to close out the ninth inning of the White Sox' August 20 contest against Kansas City, Jenks was greeted with a lead-off single by the Royals' Joey Gathright (article).

Jenks thus joins -- but doesn't exceed -- former San Francisco Giant pitcher Jim Barr in retiring a major-league record 41 consecutive batters.

Monday, August 06, 2007

In baseball action tonight, the St. Louis Cardinals tied a major-league record by getting hits in 10 straight at-bats (official at-bats, that is, as one batter walked in between the first eight and last two hits of the streak).

Much of the oddity centered around St. Louis starting pitcher Braden Looper, a converted reliever (I mention that, as relief pitchers would probably have among the fewest at-bats of any National League players and thus little opportunity to gain hitting experience).

For one thing, Cardinal manager Tony LaRussa had Looper batting eighth in the order, a ploy LaRussa tries with his pitcher from time to time. And, more amazingly still, Looper (who, as the above-linked game article noted, "began the game batting .161"), got two of the hits in the Cards' barrage (one of them a bunt single).

The comprehensive, batter-by-batter play-by-play sheet from ESPN.com is available here; your attention should be directed to the Cardinals' at-bats in the bottom of the fifth inning.

Due to the lateness of the hour (1:37 AM Central), I won't attempt any statistical analyses at the moment. I'll probably revisit the matter, though.

Update: One basic kind of analysis that can be done is to take the pre-August 6 batting average for each hitter who took part in the streak and multiply these together to obtain the overall probability of the Cardinals' accomplishing what they did.

As an analogy, if one wants to know the probability of rolling double-sixes with a pair of dice, one multiplies the chances of a six on each die together, (1/6) X (1/6), to obtain 1/36. This multiplication procedure assumes independence of events (i.e., no effect of one event on the other), an assumption that seems to work pretty well for athletic performance data.

Here are the St. Louis hitters who got at least one hit in the streak, along with the type of hit(s), and their batting averages prior to August 6 (for the position players, these averages are taken from the August 5 box score):

Looper (2: single, bunt single) .161 (use twice in multiplication)
Miles (2: infield single, single) .283 (use twice in multiplication)
Eckstein (single) .286
Taguchi (single) .298 (didn't play August 5, so taken from August 4)
Pujols (single) .316
Encarnacion (single) .289
Rolen (homer) .270
Ludwick (homer) .251

Thus, we're left with:

.161 X .283 X .286 X .298 X .316 X .289 X .270 X .251 X .161 X .283 = .000001

which, as an estimate at least, is 1 in a million.
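
As a rough check on the multiplication above, here is a small Python sketch. The dictionary layout and variable names are my own; the averages and hit counts are copied from the list above. It assumes independence between at-bats and ignores the walk that fell in the middle of the streak.

```python
from math import prod

# (batting average before August 6, hits in the streak), from the list above.
HITTERS = {
    "Looper": (.161, 2),
    "Miles": (.283, 2),
    "Eckstein": (.286, 1),
    "Taguchi": (.298, 1),
    "Pujols": (.316, 1),
    "Encarnacion": (.289, 1),
    "Rolen": (.270, 1),
    "Ludwick": (.251, 1),
}

# Each average enters the product once per hit, so Looper and Miles count twice.
p_streak = prod(avg ** hits for avg, hits in HITTERS.values())

print(f"probability: {p_streak:.7f}")
print(f"roughly 1 in {round(1 / p_streak):,}")
```

The same pattern covers the dice analogy: prod([1/6, 1/6]) gives 1/36.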

Of course, each time a player makes an out in a game, his team has a new chance to start a hitting streak. Taking into account the large number of games each team plays in a year and the even larger number of outs it makes, those million opportunities probably come up every several years. Indeed, as alluded to above, there are other teams who share the record of 10 straight hits with the Cardinals.

Thursday, August 02, 2007

A month ago, I conducted runs-test statistical analyses of Alex Rodriguez's alleged tendency to hit homers in bunches. I concluded that there was "very modest evidence" of A-Rod's "being a streaky home-run hitter..."

In the short time since that write-up, Rodriguez has unveiled a new batting stretch -- this time of the cold variety -- to further his credentials as a streaky hitter.

As reported in the ESPN.com article on this afternoon's Yankees loss to the White Sox, A-Rod “ended a career-high hitless streak at 22 at-bats when he singled in the second” (the grey summary box above the article refers to an “0-21 skid,” but I believe 22 is the correct number of at-bats).

His pre-slump batting average was .312 (116/372). This translates into a pre-slump failure rate = 1 - .312 = .688. Raising the latter figure to the 22nd power (for 22 straight at-bats) yields a probability of .0003 (3-in-10,000) of A-Rod having such a drought.

In a bizarre coincidence, the statistical figures of Rodriguez's cold stretch almost exactly parallel those of a 2005 slump by Ichiro Suzuki. As I previously reported:

Seattle’s Ichiro Suzuki, who in 2004 set the single-season record for most hits, suffered through a 0-for-22 slump (longest of his career) in early August 2005. The mid-season Sports Weekly listed him as batting .311 (a failure rate of .689), so the probability of Ichiro’s going hitless in 22 straight official at-bats is .689^22 = .0003.

For the Ichiro analysis, I didn't have his batting average at the exact moment before his slump; I therefore used his average at (roughly) the halfway point of the 2005 season, which would have been a few weeks before his cold spell.
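
The two slump calculations can be reproduced with a few lines of Python. The function name is mine, and the sketch makes the same simplifying assumptions as the text: a fixed batting average and independent at-bats.

```python
def slump_probability(batting_average, at_bats):
    """Probability of going hitless in `at_bats` straight official at-bats,
    assuming a constant hit probability and independent at-bats."""
    failure_rate = 1 - batting_average
    return failure_rate ** at_bats

# A-Rod: pre-slump average of 116/372 (about .312), 22 hitless at-bats.
arod = slump_probability(116 / 372, 22)

# Ichiro: listed mid-season average of .311, 22 hitless at-bats.
ichiro = slump_probability(.311, 22)

print(f"A-Rod:  {arod:.4f}")
print(f"Ichiro: {ichiro:.4f}")
```

Both values round to about .0003, which is why the two slumps look so nearly identical on paper.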

Going back to today's Yankee-White Sox game, another numerical oddity was that, after a scoreless first inning, Chicago scored eight runs in the top of the second, only to have New York put up eight of its own to tie the game. The Yankees clearly would have seemed to have the momentum, but in fact, the White Sox dominated the rest of the way, winning 13-9 (see above-linked article).

Sunday, July 22, 2007

Today's finish to golf's British Open (or "The Open" as the hosts call it) will probably be remembered primarily for the play of Padraig Harrington and Sergio Garcia on the 18th hole in regulation and then the four-hole play-off between the two, won by Harrington.

For sheer hot streaks, though, the Sunday round of Andres Romero, the third-place finisher, would be hard to top. He made 10 birdies for the day, including a stretch of 6-out-of-7 holes in the latter half of the round.

As pointed out during the ABC television broadcast (and as can be seen on Romero's scorecard), he had bested par only nine times during the first three days (54 holes) of the tournament (8 birdies and 1 eagle).

Statistical tests on one athlete in one event are always dicey because of the relatively small sample size. However, with the ready availability of online statistical calculators -- in this case, for chi-square -- let's go for it!

We start with a basic 2 X 2 contingency table, with the values referring to numbers of holes:

              Below par    Par or above
Day 1-3           9             45
Day 4            10              8

The calculator site I used offers three different versions of the chi-square test. Regardless of which one is used, the obtained difference in Romero's percentage of below-par holes between the first three days and the final day would be expected to come up purely by chance less than .005 of the time (5 in 1,000 or 1 in 200). We thus conclude that he performed significantly better on Sunday than during the three previous days.
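
For anyone who would rather compute the test than trust an online calculator, here is a minimal Python sketch using only the standard library. Which three versions the calculator site offered isn't stated here, so the two shown (the plain Pearson chi-square and the Yates continuity-corrected version) are my assumption of common choices; the function name is also my own. With one degree of freedom, the p-value can be obtained from the complementary error function.

```python
from math import erfc, sqrt

def chi_square_2x2(table, yates=False):
    """Chi-square test for a 2 x 2 table given as [[a, b], [c, d]].
    Returns (chi-square statistic, p-value) with 1 degree of freedom."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Continuity correction: shrink each deviation by 0.5.
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Rows: Days 1-3, Day 4. Columns: below par, par or above.
ROMERO = [[9, 45], [10, 8]]

pearson_chi2, pearson_p = chi_square_2x2(ROMERO)
yates_chi2, yates_p = chi_square_2x2(ROMERO, yates=True)
print(f"Pearson: chi2 = {pearson_chi2:.2f}, p = {pearson_p:.4f}")
print(f"Yates:   chi2 = {yates_chi2:.2f}, p = {yates_p:.4f}")
```

Both versions come in under the .005 threshold mentioned above, with the corrected version being the more conservative of the two.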

Of course, the usual cautions apply: I was drawn to doing this analysis by the unusual nature of Romero's spectacular round, I did not test a random cross-section of golfers, and in the aggregate "big picture" of all golfers in all major tournaments, a round like his may not occur any more often than would be expected by chance.

Saturday, July 21, 2007

The Kansas City Royals have been playing some good baseball of late, at least relative to what we'd expect from recent years' incarnations of the team. According to a blog that follows the Royals:

Their June record of 15-12 is their first winning month since July of 2003... realize this team went 22 months with a sub-.500 record. Incredible.

Further, the Royals are 8-6 thus far in July, despite playing against some of the top teams in the American League recently. Here are KC's 2007 game-by-game logs from ESPN.com (first half of season, second half).

Friday, July 20, 2007

With their 6-2 victory over the Arizona Diamondbacks this afternoon, the Chicago Cubs have now won 19 of their last 24 games. An ESPN television graphic showed that the Cubs have steadily increased their month-specific winning percentages from April to July (thus far in the month). As of about a month ago, the Cubs were 32-39 (.451). Their game-by-game log for this season is available here.