MIT Research News' Journal
 

Thursday, March 21st, 2019

    11:48a
    Kicking neural network automation into high gear

    A new area in artificial intelligence involves using algorithms to automatically design machine-learning systems known as neural networks, and the automatically designed networks can be more accurate and efficient than those developed by human engineers. But this so-called neural architecture search (NAS) technique is computationally expensive.

    A state-of-the-art NAS algorithm recently developed by Google to run on a squad of graphics processing units (GPUs) took 48,000 GPU hours to produce a single convolutional neural network, which is used for image classification and detection tasks. Google has the wherewithal to run hundreds of GPUs and other specialized hardware in parallel, but that’s out of reach for many others.

    In a paper being presented at the International Conference on Learning Representations in May, MIT researchers describe an NAS algorithm that can directly learn specialized convolutional neural networks (CNNs) for target hardware platforms — when run on a massive image dataset — in only 200 GPU hours, which could enable far broader use of these types of algorithms.

    Resource-strapped researchers and companies could benefit from the time- and cost-saving algorithm, the researchers say. The broad goal is “to democratize AI,” says co-author Song Han, an assistant professor of electrical engineering and computer science and a researcher in the Microsystems Technology Laboratories at MIT. “We want to enable both AI experts and nonexperts to efficiently design neural network architectures with a push-button solution that runs fast on a specific hardware.”

    Han adds that such NAS algorithms will never replace human engineers. “The aim is to offload the repetitive and tedious work that comes with designing and refining neural network architectures,” says Han, who is joined on the paper by two researchers in his group, Han Cai and Ligeng Zhu.

    “Path-level” binarization and pruning

    In their work, the researchers developed ways to delete unnecessary neural network design components, to cut computing times and use only a fraction of hardware memory to run a NAS algorithm. An additional innovation ensures each outputted CNN runs more efficiently on specific hardware platforms — CPUs, GPUs, and mobile devices — than those designed by traditional approaches. In tests, the researchers’ CNNs, measured on a mobile phone, ran 1.8 times faster than traditional gold-standard models with similar accuracy.

    A CNN’s architecture consists of layers of computation with adjustable parameters, called “filters,” and the possible connections between those filters. Filters process image pixels in grids of squares — such as 3x3, 5x5, or 7x7 — with each filter covering one square. The filters essentially move across the image and combine all the colors of their covered grid of pixels into a single pixel. Different layers may have different-sized filters, and connect to share data in different ways. The output is a condensed image — from the combined information from all the filters — that can be more easily analyzed by a computer.
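
    To make the filter mechanics concrete, here is a minimal NumPy sketch of a single filter sliding across an image and combining each covered patch of pixels into one output value. The 3x3 averaging filter and the toy 5x5 image are illustrative only, not taken from the paper.

        import numpy as np

        def convolve2d(image, kernel):
            # Slide a square filter over a 2-D image; each covered patch of pixels
            # is combined into a single output value (no padding, stride 1).
            k = kernel.shape[0]
            out_h = image.shape[0] - k + 1
            out_w = image.shape[1] - k + 1
            output = np.zeros((out_h, out_w))
            for i in range(out_h):
                for j in range(out_w):
                    patch = image[i:i + k, j:j + k]        # the k-by-k grid the filter covers
                    output[i, j] = np.sum(patch * kernel)  # weighted combination -> one value
            return output

        # Toy usage: a 5x5 "image" and a 3x3 averaging filter give a condensed 3x3 output.
        image = np.arange(25, dtype=float).reshape(5, 5)
        kernel = np.full((3, 3), 1.0 / 9.0)
        print(convolve2d(image, kernel).shape)  # (3, 3)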

    Because the number of possible architectures to choose from — called the “search space” — is so large, applying NAS to create a neural network on massive image datasets is computationally prohibitive. Engineers typically run NAS on smaller proxy datasets and transfer their learned CNN architectures to the target task. This generalization method reduces the model’s accuracy, however. Moreover, the same outputted architecture also is applied to all hardware platforms, which leads to efficiency issues.

    The researchers trained and tested their new NAS algorithm on an image classification task directly on the ImageNet dataset, which contains millions of images in a thousand classes. They first created a search space that contains all possible candidate CNN “paths” — meaning how the layers and filters connect to process the data. This gives the NAS algorithm free rein to find an optimal architecture.

    This would typically mean all possible paths must be stored in memory, which would exceed GPU memory limits. To address this, the researchers leverage a technique called “path-level binarization,” which stores only one sampled path at a time and saves an order of magnitude in memory consumption. They combine this binarization with “path-level pruning,” a technique that traditionally learns which “neurons” in a neural network can be deleted without affecting the output. Instead of discarding neurons, however, the researchers’ NAS algorithm prunes entire paths, which completely changes the neural network’s architecture.
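
    As a rough illustration of the idea (not the authors' code), the sketch below keeps a probability for every candidate operation, or "path", in a layer, activates only one sampled path per forward pass, and prunes down to the single most probable path at the end. The class name and operation names are hypothetical.

        import random

        class MixedLayer:
            # One layer of the search space: several candidate "paths" (operations),
            # each with a selection probability. Only one sampled path is active at a
            # time, so the others need not be held in GPU memory.

            def __init__(self, candidate_ops):
                self.ops = candidate_ops                      # e.g. {"conv3x3": fn, "conv7x7": fn}
                self.probs = {name: 1.0 / len(candidate_ops)  # all paths start equally likely
                              for name in candidate_ops}

            def forward(self, x):
                # Path-level binarization: sample and run a single path.
                names = list(self.probs)
                chosen = random.choices(names, weights=[self.probs[n] for n in names], k=1)[0]
                return self.ops[chosen](x), chosen

            def prune(self):
                # Path-level pruning: discard everything but the highest-probability path.
                return max(self.probs, key=self.probs.get)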

    In training, all paths are initially given the same probability for selection. The algorithm then traces the paths — storing only one at a time — to note the accuracy and loss (a numerical penalty assigned for incorrect predictions) of their outputs. It then adjusts the probabilities of the paths to optimize both accuracy and efficiency. In the end, the algorithm prunes away all the low-probability paths and keeps only the path with the highest probability — which is the final CNN architecture.
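
    The training loop can be sketched in the same toy style: sample one path at a time, score it on a combination of accuracy and efficiency, and shift probability mass toward paths that score well, so that the best path survives the final pruning. This is a simplified reinforcement-style update for illustration, not the paper's exact procedure.

        import random

        def train_step(probs, score_fn, lr=0.1):
            # Sample one path, score it (higher = better accuracy/efficiency trade-off),
            # and shift probability mass toward paths that score well.
            names = list(probs)
            chosen = random.choices(names, weights=[probs[n] for n in names], k=1)[0]
            probs[chosen] *= 1.0 + lr * score_fn(chosen)  # reinforce good choices
            total = sum(probs.values())
            for n in names:                               # renormalize to keep a distribution
                probs[n] /= total

        # Toy run: three candidate paths, all equally likely at the start.
        probs = {"conv3x3": 1 / 3, "conv5x5": 1 / 3, "conv7x7": 1 / 3}
        scores = {"conv3x3": 0.6, "conv5x5": 0.4, "conv7x7": 0.9}
        for _ in range(200):
            train_step(probs, scores.get)
        print(max(probs, key=probs.get))  # usually "conv7x7", the path that would be kept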

    Hardware-aware

    Another key innovation was making the NAS algorithm “hardware-aware,” Han says, meaning it uses the latency on each hardware platform as a feedback signal to optimize the architecture. To measure this latency on mobile devices, for instance, big companies such as Google will employ a “farm” of mobile devices, which is very expensive. The researchers instead built a model that predicts the latency using only a single mobile phone.
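
    The latency model can be thought of as a lookup built from a handful of one-time measurements on a single phone: each candidate operation is timed once, and a candidate network's latency is predicted by summing its layers. The numbers and operation names below are made up for illustration.

        # Hypothetical per-operation latencies (milliseconds), measured once on one phone,
        # keyed by (operation, input resolution, channel count).
        MEASURED_LATENCY_MS = {
            ("conv3x3", 112, 32): 4.1,
            ("conv5x5", 112, 32): 6.8,
            ("conv7x7", 112, 32): 9.5,
        }

        def predict_network_latency(layers):
            # Estimate a candidate architecture's latency as the sum of its layers'
            # measured latencies; this estimate is the feedback signal during the search.
            return sum(MEASURED_LATENCY_MS[layer] for layer in layers)

        candidate = [("conv3x3", 112, 32), ("conv7x7", 112, 32)]
        print(predict_network_latency(candidate))  # 13.6 (illustrative)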

    For each candidate layer of the network, the algorithm queries that latency-prediction model to estimate how long the layer will take to run on the target hardware. It then uses that information to design an architecture that runs as quickly as possible, while achieving high accuracy. In experiments, the researchers’ CNN ran nearly twice as fast as a gold-standard model on mobile devices.

    One interesting result, Han says, was that their NAS algorithm designed CNN architectures that were long dismissed as being too inefficient — but, in the researchers’ tests, they were actually optimized for certain hardware. For instance, engineers have essentially stopped using 7x7 filters, because they’re computationally more expensive than multiple, smaller filters. Yet, the researchers’ NAS algorithm found that architectures with some layers of 7x7 filters ran optimally on GPUs. That’s because GPUs have high parallelization — meaning they compute many calculations simultaneously — so they can process a single large filter at once more efficiently than processing multiple small filters one at a time.

    “This goes against previous human thinking,” Han says. “The larger the search space, the more unknown things you can find. You don’t know if something will be better than the past human experience. Let the AI figure it out.”

    The work was supported, in part, by the MIT Quest for Intelligence, the MIT-IBM Watson AI Lab, SenseTime, and Xilinx.

    11:59p
    Energy monitor can find electrical failures before they happen

    A new system devised by researchers at MIT can monitor the behavior of all electric devices within a building, ship, or factory, determining which ones are in use at any given time and whether any are showing signs of an imminent failure. When tested on a Coast Guard cutter, the system pinpointed a motor with burnt-out wiring that could have led to a serious onboard fire.

    The new sensor, whose readings can be monitored on an easy-to-use graphic display called a NILM (non-intrusive load monitoring) dashboard, is described in the March issue of IEEE Transactions on Industrial Informatics, in a paper by MIT professor of electrical engineering Steven Leeb, recent graduate Andre Aboulian MS ’18, and seven others at MIT, the U.S. Coast Guard, and the U.S. Naval Academy. A second paper will appear in the April issue of Marine Technology, the publication of the Society of Naval Architects and Marine Engineers.

    The system uses a sensor that is simply attached to the outside of an electrical wire at a single point, without requiring any cutting or splicing of wires. From that single point, it can sense the flow of current in the adjacent wire, and detect the distinctive “signatures” of each motor, pump, or piece of equipment in the circuit by analyzing tiny, unique fluctuations in the voltage and current whenever a device switches on or off. The system can also be used to monitor energy usage, to identify possible efficiency improvements and determine when and where devices are in use or sitting idle.
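
    In spirit, the monitoring works like the sketch below: watch the aggregate power drawn on one wire, flag sharp step changes as a device switching on or off, and match each step against known device signatures. The thresholds, signatures, and device names here are invented for illustration.

        import numpy as np

        # Hypothetical signatures: approximate step in real power (watts) when a device turns on.
        DEVICE_SIGNATURES = {"fan motor": 90.0, "bilge pump": 350.0, "jacket water heater": 1200.0}

        def detect_events(power_trace, threshold=50.0):
            # Flag indices where the aggregate power changes sharply (a device switching).
            steps = np.diff(power_trace)
            return [(i, s) for i, s in enumerate(steps) if abs(s) > threshold]

        def identify(step_watts):
            # Match a power step to the closest known device signature.
            name = min(DEVICE_SIGNATURES, key=lambda d: abs(DEVICE_SIGNATURES[d] - abs(step_watts)))
            return name, "on" if step_watts > 0 else "off"

        # Toy trace: the heater turns on, then the pump, then the heater turns off.
        trace = np.array([100.0, 100.0, 1300.0, 1300.0, 1650.0, 1650.0, 450.0])
        for index, step in detect_events(trace):
            print(index, identify(step))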

    The technology is especially well-suited for relatively small, contained electrical systems such as those serving a small ship, building, or factory with a limited number of devices to monitor. In a series of tests on a Coast Guard cutter based in Boston, the system provided a dramatic demonstration last year.

    About 20 different motors and devices were being tracked by a single dashboard, connected to two different sensors, on the cutter USCGC Spencer. The sensors, which in this case had a hard-wired connection, showed that an anomalous amount of power was being drawn by a component of the ship’s main diesel engines called a jacket water heater. At that point, Leeb says, crewmembers were skeptical about the reading but went to check it anyway. The heaters are hidden under protective metal covers, but as soon as the cover was removed from the suspect device, smoke came pouring out, and severe corrosion and broken insulation were clearly revealed.

    “The ship is complicated,” Leeb says. “It’s magnificently run and maintained, but nobody is going to be able to spot everything.”

    Lt. Col. Nicholas Galanti, engineer officer on the cutter, says “the advance warning from NILM enabled Spencer to procure and replace these heaters during our in-port maintenance period, and deploy with a fully mission-capable jacket water system. Furthermore, NILM detected a serious shock hazard and may have prevented a class Charlie [electrical] fire in our engine room.”

    The system is designed to be easy to use with little training. The computer dashboard features dials for each device being monitored, with needles that will stay in the green zone when things are normal, but swing into the yellow or red zone when a problem is spotted.
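
    The dial logic itself is simple; a minimal sketch with made-up thresholds might look like this, with each monitored quantity mapped onto the dashboard's green, yellow, or red zone.

        def zone(reading, normal_max, warning_max):
            # Map a monitored value onto the dashboard's green/yellow/red zones.
            # Real thresholds would be calibrated per device.
            if reading <= normal_max:
                return "green"
            if reading <= warning_max:
                return "yellow"
            return "red"

        print(zone(reading=1250.0, normal_max=1300.0, warning_max=1500.0))  # green
        print(zone(reading=1700.0, normal_max=1300.0, warning_max=1500.0))  # red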

    Detecting anomalies before they become serious hazards is the dashboard’s primary task, but Leeb points out that it can also perform other useful functions. By constantly monitoring which devices are being used at what times, it could enable energy audits to find devices that were turned on unnecessarily when nobody was using them, or spot less-efficient motors that are drawing more current than their similar counterparts. It could also help ensure that proper maintenance and inspection procedures are being followed, by showing whether or not a device has been activated as scheduled for a given test.
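
    For the energy-audit side, the same on/off events can be totaled into per-device run times, as in this toy sketch; the devices and timestamps are hypothetical.

        from datetime import datetime, timedelta

        # Hypothetical on/off event log for one day: (device, state, timestamp).
        events = [
            ("fan motor", "on", datetime(2019, 3, 21, 6, 0)),
            ("fan motor", "off", datetime(2019, 3, 21, 18, 0)),
            ("bilge pump", "on", datetime(2019, 3, 21, 22, 0)),
            ("bilge pump", "off", datetime(2019, 3, 22, 2, 0)),
        ]

        def run_times(event_log):
            # Pair each "on" with the next "off" for the same device and total the run time.
            started, totals = {}, {}
            for device, state, timestamp in event_log:
                if state == "on":
                    started[device] = timestamp
                elif device in started:
                    totals[device] = totals.get(device, timedelta()) + (timestamp - started.pop(device))
            return totals

        for device, total in run_times(events).items():
            print(device, total)  # fan motor 12:00:00, bilge pump 4:00:00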

    “It’s a three-legged stool,” Leeb says. The system allows for “energy scorekeeping, activity tracking, and condition-based monitoring.” But it’s that last capability that could be crucial, “especially for people with mission-critical systems,” he says. In addition to the Coast Guard and the Navy, he says, that includes companies such as oil producers or chemical manufacturers, who need to monitor factories and field sites that include flammable and hazardous materials and thus require wide safety margins in their operation.

    One important characteristic of the system that is attractive for both military and industrial applications, Leeb says, is that all of its computation and analysis can be done locally, within the system itself. It does not require an internet connection at all, so the system can be physically and electronically isolated and thus highly resistant to outside tampering or data theft.

    Although for testing purposes the team has installed both hard-wired and noncontact versions of the monitoring system — both types were installed in different parts of the Coast Guard cutter — the tests have shown that the noncontact version could likely produce sufficient information, making the installation process much simpler. While the anomaly they found on that cutter came from the wired version, Leeb says, “if the noncontact version was installed” in that part of the ship, “we would see almost the same thing.”

    The research team also included graduate students Daisy Green, Jennifer Switzer, Thomas Kane, and Peer Lindahl at MIT; Gregory Bredariol of the U.S. Coast Guard; and John Donnal of the U.S. Naval Academy in Annapolis, Maryland. The research was funded by the U.S. Navy’s Office of Naval Research NEPTUNE project, through the MIT Energy Initiative.

