MIT Research News' Journal
Wednesday, August 30th, 2017
| Time |
Event |
| 10:00a |
Robot learns to follow orders like Alexa
Despite what you might see in movies, today’s robots are still very limited in what they can do. They can be great for many repetitive tasks, but their inability to understand the nuances of human language makes them mostly useless for more complicated requests.
For example, if you put a specific tool in a toolbox and ask a robot to “pick it up,” it would be completely lost. Picking it up means being able to see and identify objects, understand commands, recognize that the “it” in question is the tool you put down, go back in time to remember the moment when you put down the tool, and distinguish the tool you put down from other ones of similar shapes and sizes.
Recently, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have come closer to making this type of request easier: In a new paper, they present an Alexa-like system that allows robots to understand a wide range of commands that require contextual knowledge about objects and their environments. They’ve dubbed the system “ComText,” for “commands in context.”
The toolbox situation above was among the types of tasks that ComText can handle. If you tell the system that “the tool I put down is my tool,” it adds that fact to its knowledge base. You can then update the robot with more information about other objects and have it execute a range of tasks like picking up different sets of objects based on different commands.
“Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds, and 3-D maps generated from sensors,” says CSAIL postdoc Rohan Paul, one of the lead authors of the paper. “This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say.”
The team tested ComText on Baxter, a two-armed humanoid robot developed for Rethink Robotics by former CSAIL director Rodney Brooks.
The project was co-led by research scientist Andrei Barbu, alongside research scientist Sue Felshin, senior research scientist Boris Katz, and Professor Nicholas Roy. They presented the paper at last week’s International Joint Conference on Artificial Intelligence (IJCAI) in Australia.
How it works
Things like dates, birthdays, and facts are forms of “declarative memory.” There are two kinds of declarative memory: semantic memory, which is based on general facts like the “sky is blue,” and episodic memory, which is based on personal facts, like remembering what happened at a party.
Most approaches to robot learning have focused only on semantic memory, which leaves a big knowledge gap about events or facts that may be relevant context for future actions. ComText, meanwhile, can observe a range of visuals and natural language to glean “episodic memory” about an object’s size, shape, position, and type, and even whether it belongs to somebody. From this knowledge base, it can then reason, infer meaning, and respond to commands.
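To make the distinction concrete, here is a minimal sketch of a knowledge base that keeps semantic facts and episodic observations side by side, then uses a declared fact to resolve a phrase like “my tool.” It is an illustration only, not the ComText model: the class names, fields, and object labels (`wrench_1`, `screwdriver_2`) are all hypothetical, and the lookup is a plain dictionary rather than the mathematical formulation the researchers describe.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One episodic record: something the robot saw happen at a given time."""
    time: int
    actor: str
    action: str
    obj: str


@dataclass
class Memory:
    # Semantic memory: general facts, here object type -> category.
    semantic: dict = field(default_factory=dict)
    # Episodic memory: time-stamped events the robot has observed.
    episodic: list = field(default_factory=list)
    # Declared facts: names the user has attached to specific objects.
    names: dict = field(default_factory=dict)

    def observe(self, time, actor, action, obj):
        """Store an observed event in episodic memory."""
        self.episodic.append(Observation(time, actor, action, obj))

    def declare(self, name, actor, action):
        """Bind `name` to the object most recently involved in (actor, action)."""
        matches = [o for o in self.episodic
                   if o.actor == actor and o.action == action]
        if not matches:
            raise LookupError(f"no remembered event: {actor} {action}")
        self.names[name] = max(matches, key=lambda o: o.time).obj

    def resolve(self, name):
        """Return the specific object that a declared name refers to."""
        return self.names[name]


memory = Memory(semantic={"wrench": "tool", "screwdriver": "tool"})
memory.observe(1, "robot", "put down", "screwdriver_2")
memory.observe(2, "user", "put down", "wrench_1")

# "The tool I put down is my tool."
memory.declare("my tool", actor="user", action="put down")

# "Pick up my tool." resolves to the object the user put down.
target = memory.resolve("my tool")
print(target)
```

The point of the sketch is that the semantic entry alone (“a wrench is a tool”) cannot tell the two tools apart; only the remembered event of who put which object down does.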
“The main contribution is this idea that robots should have different kinds of memory, just like people,” says Barbu. “We have the first mathematical formulation to address this issue, and we’re exploring how these two types of memory play and work off of each other.”
With ComText, Baxter was successful in executing the right command about 90 percent of the time. In the future, the team hopes to enable robots to understand more complicated information, such as multi-step commands and the intent of actions, and to use properties of objects to interact with them more naturally.
For example, if you tell a robot that one box on a table has crackers, and one box has sugar, and then ask the robot to “pick up the snack,” the hope is that the robot could deduce that sugar is a raw material and therefore unlikely to be somebody’s “snack.”
By creating much less constrained interactions, this line of research could enable better communications for a range of robotic systems, from self-driving cars to household helpers.
“This work is a nice step towards building robots that can interact much more naturally with people,” says Luke Zettlemoyer, an associate professor of computer science at the University of Washington who was not involved in the research. “In particular, it will help robots better understand the names that are used to identify objects in the world, and interpret instructions that use those names to better do what users ask.”
The work was funded, in part, by the Toyota Research Institute, the National Science Foundation, the Robotics Collaborative Technology Alliance of the U.S. Army, and the Air Force Research Laboratory.
| 12:00p |
Robotic system monitors specific neurons
Recording electrical signals from inside a neuron in the living brain can reveal a great deal of information about that neuron’s function and how it coordinates with other cells in the brain. However, performing this kind of recording is extremely difficult, so only a handful of neuroscience labs around the world do it.
To make this technique more widely available, MIT engineers have now devised a way to automate the process, using a computer algorithm that analyzes microscope images and guides a robotic arm to the target cell.
This technology could allow more scientists to study single neurons and learn how they interact with other cells to enable cognition, sensory perception, and other brain functions. Researchers could also use it to learn more about how neural circuits are affected by brain disorders.
“Knowing how neurons communicate is fundamental to basic and clinical neuroscience. Our hope is this technology will allow you to look at what’s happening inside a cell, in terms of neural computation, or in a disease state,” says Ed Boyden, an associate professor of biological engineering and brain and cognitive sciences at MIT, and a member of MIT’s Media Lab and McGovern Institute for Brain Research.
Boyden is the senior author of the paper, which appears in the Aug. 30 issue of Neuron. The paper’s lead author is MIT graduate student Ho-Jun Suk.
Precision guidance
For more than 30 years, neuroscientists have been using a technique known as patch clamping to record the electrical activity of cells. This method, which involves bringing a tiny, hollow glass pipette in contact with the cell membrane of a neuron, then opening up a small pore in the membrane, usually takes a graduate student or postdoc several months to learn. Learning to perform this on neurons in the living mammalian brain is even more difficult.
There are two types of patch clamping: a “blind” (not image-guided) method, which is limited because researchers cannot see where the cells are and can only record from whatever cell the pipette encounters first, and an image-guided version that allows a specific cell to be targeted.
Five years ago, Boyden and colleagues at MIT and Georgia Tech, including co-author Craig Forest, devised a way to automate the blind version of patch clamping. They created a computer algorithm that could guide the pipette to a cell based on measurements of a property called electrical impedance — which reflects how difficult it is for electricity to flow out of the pipette. If there are no cells around, electricity flows and impedance is low. When the tip hits a cell, electricity can’t flow as well and impedance goes up.
Once the pipette detects a cell, it can stop moving instantly, preventing it from poking through the membrane. A vacuum pump then applies suction to form a seal with the cell’s membrane. Then, the electrode can break through the membrane to record the cell’s internal electrical activity.
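The impedance-based stopping rule described above can be sketched in a few lines. This is a simplified illustration, not the researchers’ algorithm: the function name, the 10 percent rise threshold, the three-sample baseline, and the readings themselves are assumptions chosen for the example.

```python
def advance_until_contact(impedance_readings, rise_fraction=0.1, baseline_samples=3):
    """Return the index of the step at which the pipette should stop, or None.

    impedance_readings holds one measurement (in megaohms) per small forward
    step of the pipette. The first few samples, taken with no cell nearby,
    set the baseline. Contact is assumed when a later reading exceeds the
    baseline by rise_fraction (a hypothetical threshold).
    """
    if len(impedance_readings) <= baseline_samples:
        return None
    baseline = sum(impedance_readings[:baseline_samples]) / baseline_samples
    threshold = baseline * (1 + rise_fraction)
    for step in range(baseline_samples, len(impedance_readings)):
        if impedance_readings[step] >= threshold:
            return step  # stop here, before the tip pokes through the membrane
    return None  # no cell encountered along this path


# Made-up readings: flat while the tip moves through open tissue,
# then rising once it presses against a cell.
readings = [5.0, 5.1, 4.9, 5.0, 5.05, 5.1, 5.8, 6.4]
stop_step = advance_until_contact(readings)
print(stop_step)
```

With these readings the baseline is 5.0 megaohms, so the pipette stops at the first reading of 5.5 or more. Suction and break-in would follow as separate steps.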
The researchers achieved very high accuracy using this technique, but it still could not be used to target a specific cell. For most studies, neuroscientists have a particular cell type they would like to learn about, Boyden says.
“It might be a cell that is compromised in autism, or is altered in schizophrenia, or a cell that is active when a memory is stored. That’s the cell that you want to know about,” he says. “You don’t want to patch a thousand cells until you find the one that is interesting.”
To enable this kind of precise targeting, the researchers set out to automate image-guided patch clamping. This technique is difficult to perform manually because, although the scientist can see the target neuron and the pipette through a microscope, he or she must compensate for the fact that nearby cells will move as the pipette enters the brain.
“It’s almost like trying to hit a moving target inside the brain, which is a delicate tissue,” Suk says. “For machines it’s easier because they can keep track of where the cell is, they can automatically move the focus of the microscope, and they can automatically move the pipette.”
By combining several image-processing techniques, the researchers came up with an algorithm that guides the pipette to within about 25 microns of the target cell. At that point, the system begins to rely on a combination of imagery and impedance, which is more accurate at detecting contact between the pipette and the target cell than either signal alone.
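One way to picture the two-phase approach is as a handoff at 25 microns followed by a contact test that uses both signals. The sketch below is a guess at the general shape, not the published method: only the 25-micron figure comes from the article, while the function names, the 2-micron and 10 percent thresholds, and the rule that both signals must agree are assumptions.

```python
HANDOFF_DISTANCE_UM = 25.0  # from the article: images guide the tip to within ~25 microns


def guidance_mode(distance_um):
    """Pick which signals steer the pipette at a given distance from the target."""
    if distance_um > HANDOFF_DISTANCE_UM:
        return "image"
    return "image+impedance"


def contact_detected(distance_um, impedance_rise_fraction,
                     max_distance_um=2.0, min_rise_fraction=0.1):
    """Declare contact only when the image and the impedance signal agree.

    Both thresholds are hypothetical. Requiring agreement is an assumed
    fusion rule that illustrates why two signals can beat either one alone:
    a rise in impedance far from the target is probably a different cell,
    and a tip that merely looks close may not be touching yet.
    """
    near_in_image = distance_um <= max_distance_um
    impedance_up = impedance_rise_fraction >= min_rise_fraction
    return near_in_image and impedance_up


print(guidance_mode(100.0))
print(guidance_mode(10.0))
print(contact_detected(1.0, 0.2))
```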
The researchers imaged the cells with two-photon microscopy, a commonly used technique that uses a pulsed laser to send infrared light into the brain, lighting up cells that have been engineered to express a fluorescent protein.
Using this automated approach, the researchers were able to successfully target and record from two types of cells — a class of interneurons, which relay messages between other neurons, and a set of excitatory neurons known as pyramidal cells. They achieved a success rate of about 20 percent, which is comparable to the performance of highly trained scientists performing the process manually.
Unraveling circuits
This technology paves the way for in-depth studies of the behavior of specific neurons, which could shed light on both their normal functions and how they go awry in diseases such as Alzheimer’s or schizophrenia. For example, the interneurons that the researchers studied in this paper have been previously linked with Alzheimer’s. In a recent study of mice, led by Li-Huei Tsai, director of MIT’s Picower Institute for Learning and Memory, and conducted in collaboration with Boyden, it was reported that inducing a specific frequency of brain wave oscillation in interneurons in the hippocampus could help to clear amyloid plaques similar to those found in Alzheimer’s patients.
“You really would love to know what’s happening in those cells,” Boyden says. “Are they signaling to specific downstream cells, which then contribute to the therapeutic result? The brain is a circuit, and to understand how a circuit works, you have to be able to monitor the components of the circuit while they are in action.”
This technique could also enable studies of fundamental questions in neuroscience, such as how individual neurons interact with each other as the brain makes a decision or recalls a memory.
Bernardo Sabatini, a professor of neurobiology at Harvard Medical School, says he is interested in adapting this technique to use in his lab, where students spend a great deal of time recording electrical activity from neurons growing in a lab dish.
“It’s silly to have amazingly intelligent students doing tedious tasks that could be done by robots,” says Sabatini, who was not involved in this study. “I would be happy to have robots do more of the experimentation so we can focus on the design and interpretation of the experiments.”
To help other labs adopt the new technology, the researchers plan to put the details of their approach on their web site, autopatcher.org.
Other co-authors include Ingrid van Welie, Suhasa Kodandaramaiah, and Brian Allen. The research was funded by Jeremy and Joyce Wertheimer, the National Institutes of Health (including the NIH Single Cell Initiative and the NIH Director’s Pioneer Award), the HHMI-Simons Faculty Scholars Program, and the New York Stem Cell Foundation-Robertson Award.
| 4:00p |
Making data centers more energy efficient
Most modern websites store data in databases, and since database queries are relatively slow, most sites also maintain so-called cache servers, which list the results of common queries for faster access. A data center for a major web service such as Google or Facebook might have as many as 1,000 servers dedicated just to caching.
Cache servers generally use random-access memory (RAM), which is fast but expensive and power-hungry. This week, at the International Conference on Very Large Databases, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are presenting a new system for data center caching that instead uses flash memory, the kind of memory used in most smartphones.
Per gigabyte of memory, flash consumes about 5 percent as much energy as RAM and costs about one-tenth as much. It also has about 100 times the storage density, meaning that more data can be crammed into a smaller space. In addition to costing less and consuming less power, a flash caching system could dramatically reduce the number of cache servers required by a data center.
The drawback to flash is that it’s much slower than RAM. “That’s where the disbelief comes in,” says Arvind, the Charles and Jennifer Johnson Professor in Computer Science Engineering and senior author on the conference paper. “People say, ‘Really? You can do this with flash memory?’ Access time in flash is 10,000 times longer than in DRAM [dynamic RAM].”
But slow as it is relative to DRAM, flash access is still much faster than human reactions to new sensory stimuli. Users won’t notice the difference between a request that takes 0.0002 seconds to process (a typical round-trip travel time over the internet) and one that takes 0.0004 seconds because it involves a flash query.
Keeping pace
The more important concern is keeping up with the requests flooding the data center. The CSAIL researchers’ system, dubbed BlueCache, does that by using the common computer science technique of “pipelining.” Before a flash-based cache server returns the result of the first query to reach it, it can begin executing the next 10,000 queries. The first query might take 200 microseconds to process, but the responses to the succeeding ones will emerge at 0.02-microsecond intervals.
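The arithmetic behind those figures is short enough to check directly. The sketch below uses only the two numbers quoted above (200 microseconds of latency and 10,000 queries in flight); the variable and function names are made up, and the resulting throughput is an idealized figure implied by those numbers, not a measured result.

```python
latency_us = 200.0          # time for one query to pass through the flash cache
queries_in_flight = 10_000  # queries started before the first one completes

# In a full pipeline, one result emerges every latency / depth microseconds.
interval_us = latency_us / queries_in_flight

# Idealized steady-state throughput implied by that interval.
throughput_per_second = 1_000_000 / interval_us


def completion_time_us(n):
    """Time at which the n-th query (1-indexed) completes in a full pipeline."""
    return latency_us + (n - 1) * interval_us


print(f"interval between results: {interval_us:.2f} microseconds")
print(f"idealized throughput: {throughput_per_second:,.0f} queries per second")
print(f"query 10,000 completes at {completion_time_us(10_000):.2f} microseconds")
```

Without pipelining, 10,000 queries at 200 microseconds each would take 2 seconds. With it, the last of them finishes just under 400 microseconds after the first one started.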
Even using pipelining, however, the CSAIL researchers had to deploy some clever engineering tricks to make flash caching competitive with DRAM caching. In tests, they compared BlueCache to what might be called the default implementation of a flash-based cache server, which is simply a data-center database server configured for caching. (Although slow compared to DRAM, flash is much faster than magnetic hard drives, which it has all but replaced in data centers.) BlueCache was 4.2 times as fast as the default implementation.
Joining Arvind on the paper are first author Shuotao Xu and his fellow MIT graduate student in electrical engineering and computer science Sang-Woo Jun; Ming Liu, who was an MIT graduate student when the work was done and is now at Microsoft Research; Sungjin Lee, an assistant professor of computer science and engineering at the Daegu Gyeongbuk Institute of Science and Technology in Korea, who worked on the project as a postdoc in Arvind’s lab; and Jamey Hicks, a freelance software architect and MIT affiliate who runs the software consultancy Accelerated Tech.
The researchers’ first trick is to add a little DRAM to every BlueCache flash cache: a few megabytes per million megabytes of flash. The DRAM stores a table that pairs a database query with the flash-memory address of the corresponding query result. That doesn’t make cache lookups any faster, but it makes the detection of cache misses (the identification of data not yet imported into the cache) much more efficient.
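The idea of a small, fast index in front of slow storage can be sketched as follows. This is a software stand-in, not the BlueCache design: the class name and fields are hypothetical, a Python dictionary plays the DRAM table, and a list plays flash.

```python
class FlashCacheIndex:
    """Toy cache: a small in-memory index in front of slow bulk storage."""

    def __init__(self):
        self.dram_index = {}  # query -> flash address (small and fast)
        self.flash = []       # stands in for large, slow flash storage
        self.flash_reads = 0  # counts how often the slow storage is touched

    def put(self, query, result):
        """Write the result to flash and record its address in the index."""
        self.flash.append(result)
        self.dram_index[query] = len(self.flash) - 1

    def get(self, query):
        """Return the cached result, or None on a miss."""
        address = self.dram_index.get(query)
        if address is None:
            return None  # miss detected from the index alone, no flash read
        self.flash_reads += 1
        return self.flash[address]


cache = FlashCacheIndex()
cache.put("SELECT name FROM users WHERE id = 7", "Ada")

hit = cache.get("SELECT name FROM users WHERE id = 7")
miss = cache.get("SELECT name FROM users WHERE id = 8")
print(hit, miss, cache.flash_reads)
```

A hit still pays for one flash read, which matches the article’s point that lookups are no faster. The saving is on misses, which are answered without touching flash at all.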
That little bit of DRAM doesn’t compromise the system’s energy savings. Indeed, because of all of its added efficiencies, BlueCache consumes only 4 percent as much power as the default implementation.
Engineered efficiencies
Ordinarily, a cache system has only three operations: reading a value from the cache, writing a new value to the cache, and deleting a value from the cache. Rather than rely on software to execute these operations, as the default implementation does, Xu developed a special-purpose hardware circuit for each of them, increasing speed and lowering power consumption.
Inside a BlueCache server, the flash memory is connected to the central processor by a wire known as a “bus,” which, like any data connection, has a maximum capacity. BlueCache amasses enough queries to exhaust that capacity before sending them to memory, ensuring that the system is always using communication bandwidth as efficiently as possible.
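A rough software analogue of this batching is to group pending requests until the next one would no longer fit. The sketch below is only that, an analogue: BlueCache does this in hardware, and the function name, the unit-less sizes, and the capacity of 16 are assumptions for illustration.

```python
def batch_requests(request_sizes, bus_capacity):
    """Group requests, in arrival order, into batches that fit the bus.

    Each batch is sent only when the next request would overflow the
    capacity, so the bus carries as much as possible on every transfer.
    """
    batches, current, used = [], [], 0
    for size in request_sizes:
        if size > bus_capacity:
            raise ValueError("request larger than bus capacity")
        if used + size > bus_capacity:
            batches.append(current)  # bus is as full as it can get: send it
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        batches.append(current)      # send whatever is left over
    return batches


batches = batch_requests([4, 4, 8, 2, 6, 8], bus_capacity=16)
print(batches)
```

Here six requests go out in two transfers that each use the full capacity of 16, instead of six transfers that each leave most of the bus idle.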
With all these optimizations, BlueCache is able to perform write operations as efficiently as a DRAM-based system. Provided that each of the query results it’s retrieving is at least eight kilobytes, it’s as efficient at read operations, as well. (Because flash memory returns at least eight kilobytes of data for any request, its efficiency falls off for really small query results.)
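The falloff for small results follows from simple division. The sketch below takes the eight-kilobyte minimum from the article and assumes, for illustration, that larger reads are also served in whole eight-kilobyte units; the function name is made up.

```python
import math

FLASH_READ_BYTES = 8 * 1024  # from the article: flash returns at least 8 KB per request


def read_efficiency(result_bytes, read_unit=FLASH_READ_BYTES):
    """Fraction of the bytes read from flash that belong to the query result.

    Assumes reads happen in whole multiples of read_unit, which is an
    assumption beyond the article's stated eight-kilobyte minimum.
    """
    if result_bytes <= 0:
        raise ValueError("result size must be positive")
    units = math.ceil(result_bytes / read_unit)
    return result_bytes / (units * read_unit)


for size in (512, 1024, 8192):
    print(size, read_efficiency(size))
```

A 512-byte result uses about 6 percent of what flash returns, a 1-kilobyte result uses 12.5 percent, and an 8-kilobyte result uses all of it.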
BlueCache, like most data-center caching systems, is a so-called key-value store, or KV store. In this case, the key is the database query and the value is the response.
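In software terms, a key-value cache of this kind needs only the three operations mentioned earlier, plus a small wrapper that falls back to the database on a miss. The sketch below shows that general pattern, not BlueCache itself: the class, the `cached_query` helper, and the fake database function are all hypothetical.

```python
class KVCache:
    """Minimal key-value store with the three basic cache operations."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        """Read a value from the cache, or None if it is absent."""
        return self._store.get(key)

    def set(self, key, value):
        """Write a new value to the cache."""
        self._store[key] = value

    def delete(self, key):
        """Delete a value from the cache; a missing key is ignored."""
        self._store.pop(key, None)


def cached_query(cache, query, run_query):
    """Return the cached response for a query, running it only on a miss."""
    response = cache.get(query)
    if response is None:
        response = run_query(query)  # slow path: ask the database
        cache.set(query, response)   # key = the query, value = the response
    return response


database_calls = []


def fake_database(query):
    """Stand-in for a real database: records the call, returns a canned reply."""
    database_calls.append(query)
    return f"result of {query}"


cache = KVCache()
first = cached_query(cache, "top stories", fake_database)
second = cached_query(cache, "top stories", fake_database)
print(first == second, len(database_calls))
```

The second request is answered from the cache, so the database is consulted only once.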
"The flash-based KV store architecture developed by Arvind and his MIT team resolves many of the issues that limit the ability of today's enterprise systems to harness the full potential of flash,” says Vijay Balakrishnan, director of the Data Center Performance and Ecosystem program at Samsung Semiconductor’s Memory Solutions Lab. “The viability of this type of system extends beyond caching, since many data-intensive applications use a KV-based software stack, which the MIT team has proven can now be eliminated. By integrating programmable chips with flash and rewriting the software stack, they have demonstrated that a fully scalable, performance-enhancing storage technology, like the one described in the paper, can greatly improve upon prevailing architectures.” |
|