MIT Research News' Journal
 

Friday, February 28th, 2014

    5:00a
    Give him the hook: New data shows baseball managers when to replace the starting pitcher
    Last October, the Detroit Tigers won the first game of the American League Championship Series against the Boston Red Sox; the Tigers led the second game, 5-1, going into the eighth inning in Boston’s Fenway Park, with one of the league’s best starting pitchers, Max Scherzer, on the mound. They were six outs from taking command of the series.

    Then Tigers manager Jim Leyland made a disastrous decision: He turned the game over to his bullpen, which promptly blew the lead in the eighth inning and lost the game in the ninth inning. Instead of the Tigers holding a 2-0 series lead heading back to their own ballpark for three games, the series was tied, 1-1, and the Red Sox went on to win it in six games.  

    Should Leyland have taken Scherzer out of the game? No, according to a unique model built by two MIT computer scientists, which indicates that major-league baseball managers have significant room to improve their decision-making.

    Indeed, while managers sometimes seem to remove starting pitchers too hastily, as in Scherzer’s case, they even more frequently stick with starting pitchers too long: The study finds that from the fifth inning on, in close games, pitchers who were left in games when the model recommended replacing them allowed runs 60 percent of the time, compared to 43 percent of the time overall.

    “Clearly the most important decision a manager makes is changing pitchers,” says John Guttag, the Dugald C. Jackson Professor of Computer Science and Engineering at MIT. In making those decisions, he adds, “I think there’s definitely room for improvement.”

    Guttag developed the model with Ganeshapillai Gartheeban, one of his PhD students in MIT’s Computer Science and Artificial Intelligence Laboratory. Their paper, “A Data-driven Method for In-game Decision Making in MLB,” is one of eight finalists in the research paper competition at this year’s MIT Sloan Sports Analytics Conference (SSAC), being held today and tomorrow at the Hynes Convention Center in Boston. 

    To conduct the study, Guttag and Gartheeban took data from the 2006 through 2010 major-league baseball seasons. They used the first 80 percent of the games in those seasons to build a model of how pitchers fare over the course of a game, concentrating on Pitcher’s Total Bases (PTB) — an aggregate measure of hits and unintentional walks allowed — as the leading indicator of future performance. PTB, they note, is a more granular measure of pitcher performance than runs allowed.
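
    The setup described above can be sketched in a few lines. This is only an illustration, not the researchers' code: the per-event base values are an assumption (the article says only that PTB aggregates hits and unintentional walks), and the function names and the game record layout are hypothetical.

```python
# Assumed base values per event; the article does not spell out the exact weights.
BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4, "walk": 1}


def pitcher_total_bases(events):
    """Sum the bases a pitcher allowed; events not in BASES (outs, etc.) count as 0."""
    return sum(BASES.get(event, 0) for event in events)


def chronological_split(games, train_fraction=0.8):
    """Order games by date, then cut so earlier games train and later games test."""
    ordered = sorted(games, key=lambda game: game["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    # Hypothetical game records with ISO dates, which sort correctly as strings.
    games = [{"date": f"2010-04-{day:02d}"} for day in range(1, 11)]
    train, test = chronological_split(games)
    print(len(train), len(test))
    print(pitcher_total_bases(["single", "strikeout", "home_run", "walk"]))
```

    Splitting by date rather than at random keeps the test games strictly later than the training games, which matches the "first 80 percent" and "final 20 percent" description.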

    “There is a lot of randomness involved in giving up a run,” Gartheeban observes. “If you train a model on that, it will attach itself to noisy patterns. We go below that, to a more fundamental variable.”

    The researchers then tested the model on the final 20 percent of those seasons. Over 21,538 innings, the Guttag-Gartheeban model disagreed with the manager’s decision regarding his starting pitcher 48 percent of the time. About 43 percent of the time, the manager left the starting pitcher in when the model indicated he should be replaced. In just 5 percent of the cases did managers pull starting pitchers when the model suggested they should stay in the game — the scenario from the Tigers-Red Sox game last fall.
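
    A tally like the one above could be computed as follows. This is a minimal sketch with a hypothetical function name and input format; the toy data in the usage example only mirrors the reported proportions and is not the study's data.

```python
from collections import Counter

CATEGORIES = ("agree", "left_in_against_model", "pulled_against_model")


def disagreement_breakdown(decisions):
    """Return the share of innings in each category.

    decisions: iterable of (manager_kept_starter, model_says_keep) booleans.
    """
    counts = Counter()
    for manager_kept, model_says_keep in decisions:
        if manager_kept == model_says_keep:
            counts["agree"] += 1
        elif manager_kept:
            # Manager stuck with the starter although the model said to replace him.
            counts["left_in_against_model"] += 1
        else:
            # Manager pulled the starter although the model said to keep him.
            counts["pulled_against_model"] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no decisions to summarize")
    return {name: counts[name] / total for name in CATEGORIES}


if __name__ == "__main__":
    # Toy data shaped like the reported split: 52 agree, 43 left in, 5 pulled.
    toy = [(True, True)] * 52 + [(True, False)] * 43 + [(False, True)] * 5
    print(disagreement_breakdown(toy))
```

    The two disagreement categories add up to the overall disagreement rate, which is how 43 percent and 5 percent combine into the 48 percent figure.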

    Admittedly, in those latter cases, “there is no way to know how the starter would have done had he not been removed,” as the paper notes.

    By focusing on in-game decision-making, the paper brings to baseball a subject that has proven popular in football — where many studies have shown that teams should go for it on fourth down, rather than kicking. Despite the data, NFL coaches have been slow to change their ways.

    Guttag — a lifelong New York Yankees fan — hopes big-league managers will be quicker to use this kind of data, although he emphasizes that they have to consider many complicating factors: how rested the bullpen is, the upcoming schedule of games, and more.

    “The managers are considering a lot of things,” Guttag acknowledges. “I wouldn’t come to the conclusion that the entire gap [between the model and the actual decisions] is due to managers making bad decisions. The managers may well be making better decisions [in some cases] than we would if we knew all the things they have to consider.”

    Channeling the data deluge

    Founded in 2007 and originally held on MIT’s campus, SSAC has since grown to become the biggest, most prominent event of its kind in global sports. This year’s session includes more than 30 panel discussions, featuring the likes of NBA Commissioner Adam Silver, retired basketball coach Phil Jackson, and Red Sox owner John Henry, among dozens of other prominent coaches, general managers, players, and analysts.

    The conference’s research paper competition — whose winner earns $20,000 — features multiple entries based on the new optical tracking data now being gathered in all 30 NBA arenas. The SportVU system, as it is known, records the coordinates of all players, officials, and the ball, at 25 frames per second.

    Using the SportVU data, Guttag, along with students Jenna Wiens and Armand McQueen, has produced another finalist entry in the SSAC research paper competition, titled “Automatically Recognizing On-Ball Screens.” Like the baseball paper, this one applies machine-learning techniques to a wealth of information.

    In this case, Guttag, Wiens, and McQueen have developed a machine learning-based system for recognizing when an on-ball screen — also known as the pick-and-roll, an essential part of basketball’s offensive flow — occurs in the SportVU data. That could allow coaches and scouts to comb through hours of footage more efficiently.

    “In the modern NBA, on-ball screens are used as really the heart of offenses,” says McQueen, an MIT junior with a strong interest in sports analytics. In a trial run of the technique on 14 NBA games, the researchers found that their system identified 80 percent of on-ball screens with a confidence level of 82 percent.

    Among the teams in those games, the researchers also found that the screening patterns used by the Golden State Warriors and the Houston Rockets were the most similar to each other, the kind of affinity that could help coaches and scouts compare opponents.

    “It’s the first step in a machine-learning, pattern-recognition system,” says Wiens, who will receive her PhD in electrical engineering and computer science this spring and is applying for academic jobs as a computer science professor. “I think this work is really a proof of concept of what can be done in the NBA, and it’s really the tip of the iceberg.”

    The research was funded by Quanta Computing and the Qatar Computing Research Institute.
    5:00a
    Inside the minds of voters
    Any analysis of exit polling reveals a welter of numbers whose meaning remains slightly elusive, with issues or candidate characteristics described as “very important,” “somewhat important,” or “not important at all” by voters. But it is not always clear how these findings fit together.

    Now, a new paper co-written by an MIT political scientist suggests a way to assess the relative impact of several factors at once, using a method known as “conjoint analysis” that is not currently employed in political polling.

    The method behind conjoint analysis is fairly simple: Respondents in public opinion surveys are given hypothetical matchups between two candidates whose characteristics — say, religion, wealth, ethnic background — are randomly altered in the survey. Given a representative sample of voters making choices based on these hypothetical matchups, it is possible to determine the relative weight the electorate gives to any of these candidate characteristics.

    Does religion matter more than candidate wealth? Conjoint analysis can provide a direct comparison. Moreover, because most voters weigh several factors at once when making choices at the ballot box, conjoint analysis can reveal the relative weight of many factors at once.
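
    The idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's estimator or data: the attribute levels, the "true" preference penalties, and all function names are hypothetical. Because each attribute is randomized independently, comparing how often profiles with one level are chosen against profiles with another gives a direct estimate of that attribute's weight.

```python
import random

# Hypothetical attributes and levels, chosen only for illustration.
ATTRIBUTES = {"religion": ["A", "B"], "wealth": ["low", "high"]}

# Assumed "true" preferences used to simulate respondents (not real data).
PENALTY = {("religion", "B"): 1.0, ("wealth", "high"): 0.2}


def random_profile(rng):
    """Draw each attribute level independently and uniformly at random."""
    return {name: rng.choice(levels) for name, levels in ATTRIBUTES.items()}


def utility(profile, rng):
    """Simulated appeal of a candidate: minus the penalties, plus random noise."""
    penalty = sum(PENALTY.get(item, 0.0) for item in profile.items())
    return -penalty + rng.gauss(0.0, 1.0)


def simulate(n_matchups, seed=0):
    """Return (profile, chosen) rows, two per hypothetical two-candidate matchup."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_matchups):
        a, b = random_profile(rng), random_profile(rng)
        a_wins = utility(a, rng) > utility(b, rng)
        rows.append((a, int(a_wins)))
        rows.append((b, int(not a_wins)))
    return rows


def choice_rate_gap(rows, attribute, level, baseline):
    """Difference in how often profiles with `level` vs. `baseline` were chosen."""
    def rate(value):
        picks = [chosen for profile, chosen in rows if profile[attribute] == value]
        return sum(picks) / len(picks)
    return rate(level) - rate(baseline)


if __name__ == "__main__":
    rows = simulate(2000)
    print("religion B vs A:", choice_rate_gap(rows, "religion", "B", "A"))
    print("wealth high vs low:", choice_rate_gap(rows, "wealth", "high", "low"))
```

    In this simulation the gap for religion should come out clearly larger in magnitude than the gap for wealth, reflecting the larger penalty assumed for it, which is the kind of direct comparison the method is meant to provide.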

    “Researchers are good at examining the effects of single attributes of a candidate on the voting public,” says Teppei Yamamoto, an assistant professor of political science at MIT. “But people actually usually use different dimensions of a candidate when they decide how to vote. We thought this would be an ideal approach for political science, to find out which aspects of political candidates are important to people.”

    The paper, “Causal Inference in Conjoint Analysis,” is published in the latest issue of Political Analysis, a journal that focuses on political science methodology. In addition to Yamamoto, the authors are Jens Hainmueller of Stanford University and Daniel Hopkins of Georgetown University.

    Choosing a candidate, not a car

    The concept of conjoint analysis is not novel, but until now, the method has been applied primarily to marketing, not politics. Automakers, for instance, will use focus groups to determine the relative significance of a vehicle’s many features, which can help direct the design process. However, there is at least one major difference between this kind of market research and political polling: Consumers are less likely than voters to link together various characteristics when making choices.

    “In designing a car, it’s less of a concern that different aspects of the car will interact with each other in consumer decisions,” Yamamoto explains. “People’s preferences about car color are almost independent of whether or not the car has a manual transmission. But in political decisions, or social decisions in general, those aspects can interact with each other.” For example, he suggests, voters might tend to reflexively link candidate characteristics, such as competence, with a candidate’s political party.

    Yamamoto’s paper lays out a precise way in which conjoint analysis could be translated to politics. For one thing, the successful use of conjoint analysis in politics, he thinks, depends on the randomization of the characteristics presented to survey respondents, as a way of decoupling the connections voters tend to make between certain characteristics. If that is done right, Yamamoto believes, the method would prove useful not only to pollsters, but to campaigns and consultants.

    “I can imagine a political consultant wanting to do a sample of constituents and then advising their politician clients that [voters] seem to value certain things a lot, so why not design a campaign emphasizing that,” Yamamoto says.

    Are voters being honest?

    Others think the paper contains significant methodological insight. Kenneth Scheve, a political scientist at Stanford University, calls the work “a terrific paper” that makes the methodology “useful for modern public opinion scholarship,” in part by describing how conjoint analysis can be adapted to political polling. 

    “I expect their approach to be widely applied to any problems of multidimensional choice, certainly those prevalent in politics but really for any choices that individuals make when the object of choice has a number of different potentially relevant features,” Scheve says.

    In the paper, the authors address potential limitations of the method. Pollsters and researchers may be interested in voter preferences that are not easily expressed in terms of rankings, for instance. A more common concern about conjoint analysis is that it relies too heavily on the stated preferences of respondents, which would be problematic if voters were unwilling to provide candid answers about subjects such as ethnicity.

    To examine this concern, Yamamoto and Hainmueller are conducting studies of how closely such expressed preferences match the actions of voters. Currently, they are studying the stated views of Swiss citizens to see if they match election results. Such studies could help reveal the relationship between what voters tell pollsters and what voters really think.

