MIT Research News' Journal
 

Tuesday, December 19th, 2017

    Time Event
    9:00a
    Auto-tuning data science: New research streamlines machine learning

    The tremendous recent growth of data science — both as a discipline and an application — can be attributed, in part, to its robust problem-solving power: It can predict when credit card transactions are fraudulent, help business owners figure out when to send coupons in order to maximize customer response, or facilitate educational interventions by forecasting when a student is on the cusp of dropping out.

    To get to these data-driven solutions, though, data scientists must shepherd their raw data through a complex series of steps, each one requiring many human-driven decisions. The last step in the process, deciding on a modeling technique, is particularly crucial. There are hundreds of techniques to choose from — from neural networks to support vector machines — and selecting the best one can mean millions of dollars of additional revenue, or the difference between spotting a flaw in critical medical devices and missing it.

    In a paper called "ATM: A distributed, collaborative, scalable system for automated machine learning," which was presented last week at the IEEE International Conference on Big Data, researchers from MIT and Michigan State University present a new system that automates the model selection step, even improving on human performance. The system, called Auto-Tuned Models (ATM), takes advantage of cloud-based computing to perform a high-throughput search over modeling options and find the best possible modeling technique for a particular problem. It also tunes the model's hyperparameters, the settings that control how an algorithm learns, which can have a substantial effect on performance. ATM is now available to enterprises as an open-source platform.

    To compare ATM with human performers, the researchers tested the system against users of a collaborative crowdsourcing platform, openml.org. On this platform, data scientists work together to solve problems, finding the best solution by building on each other's work. ATM analyzed 47 datasets from the platform and was able to deliver a solution better than the one humans had come up with 30 percent of the time. When it couldn’t outperform humans, it came very close, and crucially, it worked much more quickly than humans could. While OpenML users take an average of 100 days to deliver a near-optimal solution, ATM can arrive at an answer in less than a day.

    Empowering data scientists

    This level of speed and accuracy offers much-needed peace of mind for data scientists, who are often plagued by “what-ifs.” “There are so many options,” says Arun Ross, professor in the computer science and engineering department at Michigan State University and a senior author on the paper. “If a data scientist chose support vector machines as a modeling technique, the question of whether a neural network or a different model would have resulted in better accuracy always lingers in her mind.”

    Over the past few years, the problem of model selection and tuning has become the focus of a whole new subfield of machine learning, known as Auto-ML. Auto-ML solutions aim to provide data scientists with the best possible model for a given machine-learning task. There’s just one problem: Competing Auto-ML approaches yield different results, and their methods are often opaque. In other words, while seeking to solve one selection problem, the community created another that is even more complex. “The ‘what-if’ question still remains,” says Kalyan Veeramachaneni, a principal research scientist at MIT’s Laboratory for Information and Decision Systems (LIDS) and a senior author on the paper. “It simply shifts to, ‘what if we used a different Auto-ML approach?’”

    The ATM system works differently, using on-demand cloud computing to generate and compare hundreds (or even thousands) of models overnight. To search through techniques, researchers use an intelligent selection mechanism. The system tests thousands of models in parallel, evaluates each, and allocates more computational resources to those techniques that show promise. Poor solutions fall by the wayside, while the best options rise to the top.
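
    The article does not say which selection mechanism ATM uses, so the following is only a minimal sketch of the general idea of giving more trials to techniques that show promise. It swaps in the well-known UCB1 bandit rule as a stand-in, and every name in it (pick_family, search, simulated_family, FAMILIES) is hypothetical. The "models" are simulated accuracy scores, not real training runs.

```python
import math
import random


def pick_family(stats, c=math.sqrt(2)):
    """Return the next model family to try, using the UCB1 rule.

    stats maps a family name to (number of trials, sum of scores).
    """
    # Try every family once before comparing them.
    for name, (pulls, _) in stats.items():
        if pulls == 0:
            return name
    total = sum(pulls for pulls, _ in stats.values())

    def upper_bound(name):
        pulls, score_sum = stats[name]
        # Average score so far plus a bonus that shrinks with more trials.
        return score_sum / pulls + c * math.sqrt(math.log(total) / pulls)

    return max(stats, key=upper_bound)


def search(families, budget, seed=0):
    """Spend `budget` trials across families, favoring promising ones."""
    rng = random.Random(seed)
    stats = {name: (0, 0.0) for name in families}
    best_name, best_score = None, float("-inf")
    for _ in range(budget):
        name = pick_family(stats)
        score = families[name](rng)  # stands in for "train and evaluate"
        pulls, score_sum = stats[name]
        stats[name] = (pulls + 1, score_sum + score)
        if score > best_score:
            best_name, best_score = name, score
    return stats, best_name, best_score


def simulated_family(mean, spread=0.05):
    """Hypothetical stand-in for a technique: returns a noisy accuracy."""
    def evaluate(rng):
        return min(1.0, max(0.0, rng.gauss(mean, spread)))
    return evaluate


# Illustrative accuracies only; these numbers are assumptions, not results.
FAMILIES = {
    "svm": simulated_family(0.70),
    "neural_net": simulated_family(0.90),
    "decision_tree": simulated_family(0.60),
}
```

    Calling search(FAMILIES, budget=300) returns the per-family trial counts along with the best score seen. In this toy setup the family with the highest average accuracy ends up with the most trials, while the weaker ones are sampled just often enough to confirm they are weaker.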

    Rather than blindly choosing the “best” one and providing it to the user, ATM displays results as a distribution, allowing for comparison of different methods side-by-side. In this way, Ross says, ATM speeds up the process of testing and comparing different modeling approaches without automating out human intuition, which remains a vital part of the data science process.

    Open-source, community-driven approach

    By streamlining the process of model choice, Veeramachaneni and his team aim to allow data scientists to work on more impactful parts of the pipeline. "We hope that our system will free up experts to spend more time on understanding the data, problem formulation, and feature engineering," Veeramachaneni says.

    To that end, the researchers are open-sourcing ATM, making it available to enterprises that might want to use it. They have also included provisions that allow researchers to integrate new model selection techniques and thus continually improve on the platform. ATM can run on a single machine, local computing clusters, or on-demand clusters in the cloud, and can work with multiple datasets and multiple users simultaneously.

    "A small- to medium-sized data science team can set up and start producing models with just a few steps," Veeramachaneni says. And none of those are followed by a "what-if."

    9:10a
    MIT Teaching Systems Lab to address assessments in maker education

    The National Science Foundation (NSF) has awarded an early-concept grant for exploratory research to the Teaching Systems Lab at MIT.

    MIT will partner with Maker Ed, a national nonprofit, to develop and evaluate practices of embedded assessment in maker-centered learning. The two-year grant, Beyond Rubrics: Moving Towards Embedded Assessment in Maker Education, will support design-based research in middle school science and engineering classrooms.

    The San Mateo County Office of Education in Northern California and the Albemarle County Public Schools in Charlottesville, Virginia, have also signed on to co-create embedded assessments for maker-centered curricula. NSF early-concept grants support beginning-stage exploratory research, and the Beyond Rubrics project is considered “high risk, high payoff” because of its potential to apply a novel approach to school-based assessment in maker education and STEM classrooms.

    Maker education — an open-ended, process-driven, youth-centered learning approach — has grown in popularity in K-12 education over the past decade. However, one of the greatest challenges of implementing making in schools is the question of how to assess collaborative, interdisciplinary, and iterative making practices and outcomes.

    “The assessment science community has been innovating embedded assessments in rich digital learning environments,” says Yoon Jeon Kim, the co-principal investigator of the project. “We can apply this knowledge to envision what seamlessly integrated assessment in maker-centered learning can look like in schools and facilitate educators’ assessment capacities beyond the simple use of rubrics. Our project will be sensitive to the maker education advocates’ concerns regarding assessment by innovatively thinking about assessment in this context without losing the richness and complexity that characterize learning in a maker classroom.”

    Stephanie Chang, Maker Ed’s interim executive director, says maker education “continues to grow in thoughtful, diverse, and authentic ways, and as it’s integrated into K-12 environments across the country, assessment is a critical part of the learning equation, not just an afterthought. We at Maker Ed are so excited to invest in this work with three incredible partners and with the support of NSF.”

    The participating school districts also recognize the importance of good assessment in maker-centered learning and how this project can be impactful.

    “K-12 education today is at a critical crossroads,” says Superintendent Pamela R. Moran of the Albemarle County Public Schools. “One path, steeped in traditional teaching-to-the-test methods, narrows career choices. The other takes us in a different direction by connecting learning to the life skills that prepare students for success regardless of career choice. We need to be able to accurately measure the quality and relevance of student learning. We are grateful to be partnered with Maker Ed and with MIT’s Teaching Systems Lab to develop those measures.”

    San Mateo County Superintendent of Schools Anne Campbell said her district is “excited to have the opportunity to partner with Maker Ed and MIT to develop assessment strategies for maker education.”

    “We’re especially eager to work with our partners to develop techniques for evaluating learning as it occurs in real time while students are actively engaged in making,” Campbell says.

    Justin Reich, executive director of the Teaching Systems Lab, says maker education is closely aligned with MIT’s motto: “mens et manus” (“mind and hand”).

    “At MIT we take that seriously as part of our educational philosophy,” Reich says. “Students have powerful learning experiences as they make, tinker, break, struggle, and succeed in spaces that emphasize hands-on, minds-on learning. With this project, we’re committed to experimenting with new assessment tools and approaches that help students and educators better understand the learning that takes place in makerspaces.”

    More information about the Beyond Rubrics project is available via the Teaching Systems Lab.

    11:59p
    Street signs

    Day after day in early 2011, massive crowds gathered in Cairo’s Tahrir Square, calling for the ouster of Egyptian President Hosni Mubarak. Away from the square, the protests had another effect, as a study co-authored by an MIT professor shows. The demonstrations lowered the stock market valuations of politically connected firms — and showed how much people thought a full democratic revolution was possible.

    “When there’s street mobilization, you expect that the future will be different,” says MIT economist Daron Acemoglu, co-author of a paper detailing the results.

    The study opens a keyhole into the hopes and fears of Egyptians at a time of great political uncertainty. After weeks of protest, caused in part by perceptions of government corruption, Mubarak resigned in February 2011, replaced by an interim military government. The moment passed, however. In June 2012, the Islamist leader Mohamed Morsi was elected president, only to be replaced by another phase of military rule, starting in July 2013. Military leader Abdel Fattah el-Sisi was then elected president in May 2014 with 97 percent of the vote.

    Still, in the first half of 2011, an open democracy seemed conceivable — indeed, a democratic revolution was taking place in Tunisia — and that was reflected in market sentiment. In the nine days of market activity after Mubarak left power, the valuations of the firms most politically connected to his National Democratic Party (NDP) fell by 13 percent relative to other firms.

    Moreover, the support for NDP-linked stocks was not shifted to firms linked to other power centers in Egyptian life, including the military or Morsi’s Muslim Brotherhood. Investors were, in part, devaluing the worth of political connections in the country.

    “It’s not just redistribution of a given amount of spoils, but perhaps street mobilization is reducing what the market thinks the available spoils are,” Acemoglu says of investor activity in early 2011.

    More specifically, Acemoglu adds, some investors thought politically connected firms would be “less capable of capturing rents,” the revenues flowing from noncompetitive business activity, and would have “less room for engaging in these corrupt activities.”

    The study also shows a connection to protest-crowd size; an estimated one-day turnout of 500,000 protestors in Tahrir Square would lower the valuation of NDP-connected firms by 0.8 percent relative to other listed firms.

    Among its other findings, the study sheds light on the much-discussed relationship between social media and the Arab Spring uprisings of 2011. In this case, the scholars also found that Twitter activity forecast the amount of street protest that would ensue. By itself, social media activity did not immediately affect stock market valuations, but by encouraging public demonstrations, it had an indirect effect.

    The paper, “The Power of the Street: Evidence from Egypt’s Arab Spring,” is forthcoming in print in the Review of Financial Studies and currently appears online in advance form. The authors are Acemoglu, the Elizabeth and James Killian Professor of Economics at MIT; Tarek A. Hassan, an associate professor of economics at Boston University; and Ahmed Tahoun, an assistant professor of accounting at London Business School.

    Taking stock of protests

    To conduct the study, the researchers used stock-market data concerning 177 firms listed on the Egyptian stock exchange in early 2011, and examined daily closing prices for those firms between 2005 and 2013, as well as total firm assets and leverage (the amount of debt as a fraction of total assets).
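
    As a rough illustration of the quantities mentioned above, the sketch below computes leverage as debt divided by total assets, and compares the average price change of one group of firms with that of another over a window of daily closing prices. It is a simple difference of averages under assumed inputs, not the authors' method, which the article says also controls for firm-level qualities and industrial sectors. The function names and the prices in the usage note are hypothetical.

```python
def leverage(total_debt, total_assets):
    """Debt as a fraction of total assets."""
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    return total_debt / total_assets


def cumulative_return(closing_prices):
    """Fractional price change from the first to the last closing price."""
    if len(closing_prices) < 2:
        raise ValueError("need at least two closing prices")
    return closing_prices[-1] / closing_prices[0] - 1.0


def relative_return(connected, others):
    """Average return of connected firms minus the average for other firms.

    Each argument is a list of per-firm closing-price series.
    """
    def average(values):
        return sum(values) / len(values)

    connected_avg = average([cumulative_return(p) for p in connected])
    others_avg = average([cumulative_return(p) for p in others])
    return connected_avg - others_avg
```

    With made-up prices, relative_return([[100, 80]], [[100, 95]]) is about -0.15, meaning the connected firm lost roughly 15 percentage points more than the comparison firm over the window.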

    Looking at board members and principal shareholders, Acemoglu and his colleagues divided the firms into four main groups: those with connections to the NDP, those with military connections, those with Muslim Brotherhood connections, and those that were unconnected to any of these groups.

    The scholars also used published estimates of crowd sizes from Tahrir Square demonstrations, and to derive the conclusions about Twitter they examined 311 million tweets by over 300,000 Egyptian accounts between Jan. 1, 2011, and July 31, 2013.

    In the paper, the researchers consider but largely rule out a couple of alternate explanations for stock market behavior during this time. One would be that Mubarak’s fall simply created instability which affected firms in varying ways. But the study controls for firm-level qualities and industrial sectors, and the devaluation effect was specific to NDP-aligned companies.

    A second possible alternative is that the stock market was still expecting top-down control over the Egyptian government, but investors were simply altering their bets and identifying the next group of firms they expected to benefit from useful political connections. Acemoglu says that “is definitely a possibility” in theory, but as the paper notes, “there is no evidence of such offsetting shifts” in market investments.

    To be clear, the 13 percent drop experienced by NDP-connected firms shows that many investors were not fazed by the protests, or at least did not expect the protests to lead to massive political changes. On the other hand, a significant portion did think that a ground-up, populist uprising could succeed — even if that ultimately proved not to be the case.

    “It’s whoever the marginal investor is, and obviously the marginal investor was wrong,” Acemoglu says. “If you had perfect foresight, the day Mubarak fell, you would just be selling all the NDP stocks but buying all the military stocks.”

    The social media moment

    The study’s Twitter data suggest a slightly more subtle picture than some commentators described during the eventful days of 2011. Twitter activity did not lead to immediate stock market effects. On the other hand, protest hashtags did predict the occurrence of large demonstrations, and those protests subsequently moved the market.

    “You can scream and shout whatever you want on social media, and it doesn’t [directly] change anything, but if social media acts as a vehicle for people organizing, then it might have an effect,” Acemoglu says.

    Acemoglu, who pursues research projects in multiple areas of economics, is perhaps best known for his work on the relationship between democratic institutions and economic growth, which is summarized in his 2012 book “Why Nations Fail” but remains very much an ongoing project.

    At the same time, Acemoglu has conducted a wide-ranging series of studies analyzing and modeling political change in many countries, often with political scientist James Robinson of the University of Chicago. The paper on Egypt flows, in part, from that vein of research. It also builds on other academic studies of other countries, such as a 2001 paper demonstrating that links to the Indonesian government accounted for about a quarter of the value of well-connected firms in that country during the 1990s.

    Acemoglu, for one, says he does not anticipate a sea change in Egypt’s current system of government any time soon. The current stasis in the country’s politics makes it all the more useful, however, to consider how fluid the political situation appeared in real time as recently as six years ago.

    “Looking at it from the vantage point of 2011, none of that was obvious, that Tunisia would go one way, Egypt would go another way,” Acemoglu says.
