MIT Research News' Journal
 

Tuesday, January 29th, 2019

    Time Event
    10:40a
    Learning to teach to speed up learning

    The first artificial intelligence programs to defeat the world’s best players at chess and the game Go received at least some instruction from humans, and would ultimately prove no match for a new generation of AI programs that learn wholly on their own, through trial and error.

    A combination of deep learning and reinforcement learning algorithms is responsible for computers achieving dominance at challenging board games like chess and Go, a growing number of video games, including Ms. Pac-Man, and some card games, including poker. But for all the progress, computers still get stuck the closer a game resembles real life, with hidden information, multiple players, continuous play, and a mix of short- and long-term rewards that make computing the optimal move hopelessly complex.

    To get past these hurdles, AI researchers are exploring complementary techniques to help robot agents learn, modeled after the way humans pick up new information not only on our own, but from the people around us, and from newspapers, books, and other media. A collective-learning strategy developed by the MIT-IBM Watson AI Lab offers a promising new direction. Researchers show that a pair of robot agents can cut the time it takes to learn a simple navigation task by 50 percent or more when the agents learn to leverage each other’s growing body of knowledge. 

    The algorithm teaches the agents when to ask for help, and how to tailor their advice to what has been learned up until that point. The algorithm is unique in that neither agent is an expert; each is free to act as a student-teacher to request and offer more information. The researchers are presenting their work this week at the AAAI Conference on Artificial Intelligence in Hawaii.

    Co-authors on the paper, which received an honorable mention for best student paper at AAAI, are Jonathan How, a professor in MIT’s Department of Aeronautics and Astronautics; Shayegan Omidshafiei, a former MIT graduate student now at Alphabet's DeepMind; Dong-ki Kim of MIT; Miao Liu, Gerald Tesauro, Matthew Riemer, and Murray Campbell of IBM; and Christopher Amato of Northeastern University.

    “This idea of providing actions to most improve the student's learning, rather than just telling it what to do, is potentially quite powerful,” says Matthew E. Taylor, a research director at Borealis AI, the research arm of the Royal Bank of Canada, who was not involved in the research. “While the paper focuses on relatively simple scenarios, I believe the student/teacher framework could be scaled up and useful in multi-player video games like Dota 2, robot soccer, or disaster-recovery scenarios.”

    For now, the pros still have the edge in Dota 2 and other virtual games that favor teamwork and quick, strategic thinking. (Though Alphabet’s AI research arm, DeepMind, recently made news after defeating a professional player at the real-time strategy game StarCraft II.) But as machines get better at maneuvering in dynamic environments, they may soon be ready for real-world tasks like managing traffic in a big city or coordinating search-and-rescue teams on the ground and in the air.

    “Machines lack the common-sense knowledge we develop as children,” says Liu, a former MIT postdoc now at the MIT-IBM lab. “That’s why they need to watch millions of video frames, and spend a lot of computation time, learning to play a game well. Even then, they lack efficient ways to transfer their knowledge to the team, or generalize their skills to a new game. If we can train robots to learn from others, and generalize their learning to other tasks, we can start to better coordinate their interactions with each other, and with humans.” 

    The MIT-IBM team’s key insight was that a team that divides and conquers to learn a new task — in this case, maneuvering to opposite ends of a room and touching the wall at the same time — will learn faster. 

    Their teaching algorithm alternates between two phases. In the first, both student and teacher decide with each respective step whether to ask for, or give, advice based on their confidence that the next move, or the advice they are about to give, will bring them closer to their goal. Thus, the student only asks for advice, and the teacher only gives it, when the added information is likely to improve their performance. With each step, the agents update their respective task policies and the process continues until they reach their goal or run out of time.

    With each iteration, the algorithm records the student’s decisions, the teacher’s advice, and their learning progress as measured by the game’s final score. In the second phase, a deep reinforcement learning technique uses the previously recorded teaching data to update both advising policies. “With each update the teacher gets better at giving the right advice at the right time,” says Kim, a graduate student at MIT.
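
    The first phase described above can be illustrated with a minimal sketch. It is not the researchers' implementation: in the paper the advising policies are learned, whereas here they are replaced by fixed thresholds for illustration. The confidence measure (the gap between the two highest action values), the function names, and the threshold values are all assumptions.

```python
from typing import Sequence, Tuple


def confidence(action_values: Sequence[float]) -> float:
    """Hypothetical confidence measure: gap between the best and second-best action value."""
    ranked = sorted(action_values, reverse=True)
    return ranked[0] - ranked[1]


def best_action(action_values: Sequence[float]) -> int:
    """Index of the highest-valued action."""
    return max(range(len(action_values)), key=lambda a: action_values[a])


def advise_step(
    student_values: Sequence[float],
    teacher_values: Sequence[float],
    ask_below: float = 0.1,   # assumed threshold: student asks when unsure
    give_above: float = 0.5,  # assumed threshold: teacher advises when sure
) -> Tuple[int, bool]:
    """Return (action the student takes, whether advice was used)."""
    student_asks = confidence(student_values) < ask_below
    teacher_gives = confidence(teacher_values) > give_above
    if student_asks and teacher_gives:
        # Advice flows only when both sides expect it to help.
        return best_action(teacher_values), True
    return best_action(student_values), False


# An unsure student paired with a confident teacher follows the teacher's action.
action, advised = advise_step([0.50, 0.52, 0.10], [0.10, 0.20, 0.90])
```

    In the second phase, the recorded decisions and learning progress would be used to update the advising rules themselves, which this fixed-threshold sketch does not attempt.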

    In a follow-up paper to be discussed in a workshop at AAAI, the researchers improve on the algorithm’s ability to track how well the agents are learning the underlying task (in this case, a box-pushing task) to improve the agents’ ability to give and receive advice. It’s another step that takes the team closer to its longer-term goal of entering the RoboCup, an annual robotics competition started by academic AI researchers.

    “We would need to scale to 11 agents before we can play a game of soccer,” says Tesauro, an IBM researcher who developed the first AI program to master the game of backgammon. “It’s going to take some more work but we’re hopeful.”

    12:10p
    MIT’s REXIS and Bennu’s watery surface

    After flying in space for more than two years, NASA’s spacecraft OSIRIS-REx (Origins, Spectral Interpretation, Resource Identification, Security-Regolith Explorer) recently entered into orbit around its target, the asteroid Bennu. Asteroids like Bennu are considered to be leftover debris from the formation of our solar system. So, in the first mission of its kind flown by NASA, OSIRIS-REx is looking to retrieve a sample and bring it to Earth.

    Among the several instruments onboard the spacecraft is one built by MIT students, the REgolith X-ray Imaging Spectrometer (REXIS), which will provide data to help select the sampling site and support other mission objectives, including characterizing the asteroid and its behavior and comparing those findings to ground-based observations. REXIS is a joint project between the MIT Department of Earth, Atmospheric and Planetary Sciences (EAPS), the MIT Department of Aeronautics and Astronautics (AeroAstro), the Harvard College Observatory, the MIT Kavli Institute for Astrophysics and Space Research, and MIT Lincoln Laboratory.

    Shortly after arriving at Bennu, OSIRIS-REx researchers announced that they had identified water on the asteroid, possibly impacting selection of the sampling site. EAPS spoke with Richard Binzel — an expert on asteroids at MIT and co-investigator on this mission, leading the development of REXIS — about the instrument’s role and what this finding means for the future use of similar devices. Binzel is also professor of planetary sciences in EAPS with a joint appointment in AeroAstro, and a Margaret MacVicar Faculty Fellow.

    Q: What is the purpose of REXIS, as part of the OSIRIS-REx mission?

    A: The goal of the OSIRIS-REx mission is to obtain a pristine sample from the surface of the asteroid, Bennu, that has some of the most original, surviving chemistry from the very beginning of our solar system. The asteroid is like a time capsule, which is going to tell us what the condition of our solar system was like when it formed 4.56 billion years ago.

    The goal of REXIS is to map the composition of Bennu in support of the mission, choosing the location for that sample. The objective is to go to the asteroid and spend up to a year studying it in detail to determine what location can give us the highest scientific return. It is a matter of progressive evaluation and characterization of the asteroid: We will undergo orbits that gradually go lower to the point where we see the surface in extremely good detail — like the characteristics of craters and boulders. In this way, we know where we're going to touch the surface, grab a sample, and bring it safely onboard the spacecraft.

    To do this, onboard OSIRIS-REx, there's a suite of instruments: visible cameras and spectrometers mostly in the visible and near infrared wavelengths that are mapping the asteroid’s surface, in addition to MIT’s REXIS, the REgolith X-ray Imaging Spectrometer. REXIS complements all the other instruments and contributes to the rest of the data by seeing in X-ray light. No other instrument on OSIRIS-REx will see the surface in X-ray light. So, this is quite unique in planetary exploration, and the fact that it was built by students is even more amazing.

    One of our objectives is to corroborate the mineral mapping that's done by the other instruments. The visible and near infrared spectrometers are sensitive to the mineral composition of the surface, and REXIS measures the individual atomic elements that are present. One of the things that we want to accomplish is to see whether the atomic elements that we measure are consistent with the minerals that the other instruments are measuring and vice versa.

    Q: How does REXIS work?

    A: REXIS works by taking advantage of the sun’s X-ray emissions. Some of those X-rays hit the asteroid and interact with the atoms on the surface: They get absorbed and change the electron energy level in the atoms. When the atoms return to their ground state, they emit an X-ray photon, which means the X-rays from the sun caused the asteroid to glow or fluoresce.

    REXIS measures the energy and the locations of the X-rays that are fluorescing away from the asteroid surface, and the energies tell us which atoms are present. The energy of an X-ray photon that gets emitted by an atom corresponds exactly to the energy between two electron orbitals. Every atom has its own unique signature of energy states, so we can deduce the elemental composition of the surface of the asteroid.
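
    The idea that each element has its own X-ray energy signature can be shown with a small sketch. It is an illustration only, not the REXIS data pipeline: the line energies are approximate K-alpha values for the elements named in this interview, and the function name and matching tolerance are assumptions.

```python
from typing import Optional

# Approximate K-alpha fluorescence line energies in keV.
K_ALPHA_KEV = {"O": 0.525, "Si": 1.740, "S": 2.308, "Fe": 6.404}


def identify_element(photon_kev: float, tolerance_kev: float = 0.05) -> Optional[str]:
    """Return the element whose line is nearest the photon energy, or None if no line is close.

    The tolerance is a made-up stand-in for detector energy resolution.
    """
    element, line_kev = min(K_ALPHA_KEV.items(), key=lambda item: abs(item[1] - photon_kev))
    if abs(line_kev - photon_kev) <= tolerance_kev:
        return element
    return None


# A photon measured near 6.4 keV is attributed to iron.
detected = identify_element(6.41)
```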

    We're going to be looking for things like iron, silicon, oxygen, and sulfur — some very basic building blocks of planetary bodies. We'll be able to measure those abundances and determine the composition of this asteroid.

    Now, we are performing all sorts of calibration measurements, and we're learning about the characteristics of the instrument in space: where it's working as expected and where it differs. It's part of the instrument design to monitor the sun's output and calibrate the asteroid observations, taking into account any variation from the sun. REXIS has two parts to it: One part is the main spectrometer that is measuring the X-rays emitted from the asteroid surface; the second is a small solar X-ray monitor, or SXM, and it is constantly looking at the output of the sun, which varies over timescales of minutes, hours, and days. This way, if we are looking at one location on the asteroid and we see this enormous X-ray fluorescence, we'll know whether it's the asteroid that's special in that location, or whether it was just a solar flare, which happened to be occurring at the same time. We're also looking at the cosmic X-ray background, or CXB, and calibrating our instrument's sensitivity by looking at a steady, strong X-ray source in the sky called the Crab Nebula.
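
    The role of the solar monitor can be sketched as a simple normalization: dividing each surface measurement by the solar output recorded at the same time, so that a flare does not masquerade as a special spot on the asteroid. The real calibration is far more involved; the function name, the count values, and the plain ratio used here are assumptions for illustration.

```python
from typing import List, Sequence


def normalize_by_solar_flux(
    surface_counts: Sequence[float],
    solar_counts: Sequence[float],
) -> List[float]:
    """Divide each surface fluorescence reading by the simultaneous solar monitor reading."""
    if len(surface_counts) != len(solar_counts):
        raise ValueError("each surface reading needs a matching solar reading")
    if any(reading <= 0 for reading in solar_counts):
        raise ValueError("solar monitor readings must be positive")
    return [surface / solar for surface, solar in zip(surface_counts, solar_counts)]


# Hypothetical readings: the second surface value is three times brighter,
# but the sun was also three times brighter, so the normalized value is unchanged.
ratios = normalize_by_solar_flux([100, 300, 110], [10, 30, 10])
```

    In this made-up example only the third reading stands out after normalization, which is the kind of signal that would point to the surface rather than the sun.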

    We also calibrate REXIS measurements against laboratory measurements of meteorites, and we're going to be able to pinpoint which meteorite type Bennu is most like. If we see any variation across the surface, we'll be able to say which regions have the most similarity to known meteorites, and this can guide us as to where we get our sample. 

    Q: NASA announced that they found evidence of water on Bennu. What does this mean for REXIS and where the sample is taken from?

    A: The OSIRIS-REx mission found evidence for the presence of hydrated minerals on the surface of the asteroid Bennu. These minerals form when water molecules react with rocky material and become part of the crystal structure. Meteorite studies suggest that this process occurred very early in solar system history. This discovery tells us that Bennu’s surface has not been heated to temperatures high enough to break down these minerals and release the water. Bennu appears to contain this primordial water, providing clues to how such material was delivered to Earth, leading to a habitable world.

    This is enticing news for REXIS because one of the atomic elements we are going to be searching for is oxygen, which of course is a major constituent of water, and REXIS has the potential to confirm the finding of these water molecules in the minerals of Bennu.

    A lot of factors go into the decision of where to sample. First of all, we have to determine which parts of the surface are safe to go to, that we know the spacecraft can navigate, get a sample, and come back safely. Then out of all the safe regions, which ones are the most scientifically interesting — based on what we call the science value map. The objective is to have a complete understanding of the composition of the asteroid’s surface and any variability. Then, we want to find a place to sample that we think has the most original organic chemistry from the beginning of the solar system, and so places on Bennu that may have a signature of water would be very interesting to sample.

    Currently, we're still pretty far from the asteroid and slowly advancing to lower orbital distances. We will reach the orbital distance for REXIS to begin its science operations this coming June. Then, REXIS will fingerprint the composition of the asteroid in terms of its atomic elements. When we get the sample back, we'll be able to check whether REXIS got it right. If we did, it means that we can send a REXIS-like instrument anywhere in the solar system and get a reliable fingerprint of the detailed composition of what these objects are made of.

    If REXIS is successful, it shows that with a small instrument you can get big science. Our nickname for REXIS is, "the little spectrometer that could."
