MIT Research News' Journal
 

Monday, May 6th, 2019

    10:59a
    Merging cell datasets, panorama style

    A new algorithm developed by MIT researchers takes cues from panoramic photography to merge massive, diverse cell datasets into a single source that can be used for medical and biological studies.

    Single-cell datasets profile the gene expression of human cells — such as neurons, muscle cells, and immune cells — to gain insight into human health and the treatment of disease. These datasets are produced by a range of labs using a range of technologies, and they contain extremely diverse cell types. Combining them into a single data pool could open up new research possibilities, but doing so effectively and efficiently is difficult.

    Traditional methods tend to cluster cells by nonbiological patterns — such as the lab or the technology that produced them — or to accidentally merge dissimilar cells that appear the same. Methods that correct these mistakes don’t scale well to large datasets, and they require that all of the merged datasets share at least one common cell type.

    In a paper published today in Nature Biotechnology, the MIT researchers describe an algorithm that can efficiently merge more than 20 datasets of vastly differing cell types into a larger “panorama.” The algorithm, called “Scanorama,” automatically finds and stitches together shared cell types between two datasets — like combining overlapping pixels in images to generate a panoramic photo.

    As long as a dataset shares at least one cell type with any dataset already in the panorama, it too can be merged. But the datasets don’t all need to have a cell type in common, and the algorithm preserves the cell types specific to each dataset.

    “Traditional methods force cells to align, regardless of what the cell types are. They create a blob with no structure, and you lose all interesting biological differences,” says Brian Hie, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and a researcher in the Computation and Biology group. “You can give Scanorama datasets that shouldn’t align together, and the algorithm will separate the datasets according to biological differences.”

    In their paper, the researchers successfully merged more than 100,000 cells from 26 different datasets containing a wide range of human cells, creating a single, diverse source of data. With traditional methods, that would take roughly a day’s worth of computation, but Scanorama completed the task in about 30 minutes. The researchers say the work represents the highest number of datasets ever merged together.

    Joining Hie on the paper are: Bonnie Berger, the Simons Professor of Mathematics at MIT, a professor of electrical engineering and computer science, and head of the Computation and Biology group; and Bryan Bryson, an MIT assistant professor of biological engineering.

    Linking “mutual neighbors”

    Humans have hundreds of categories and subcategories of cells, and each cell expresses a diverse set of genes. Techniques such as RNA sequencing capture that information in sprawling multidimensional space. Cells are points scattered around the space, and each dimension corresponds to the expression of a different gene.
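    To make that concrete, here is a toy illustration (made-up numbers, far smaller than any real dataset) of how such data can be represented, with one row per cell and one column per gene, so that cells with similar expression end up as nearby points:

    ```python
    import numpy as np

    # Hypothetical toy dataset: 4 cells x 5 genes (real datasets have
    # thousands of cells and up to ~20,000 genes).
    # expr[i, j] = expression level of gene j in cell i.
    expr = np.array([
        [9.1, 0.2, 0.0, 3.4, 0.1],  # two cells with similar profiles...
        [8.7, 0.3, 0.1, 3.1, 0.0],  # ...are nearby points in this space
        [0.1, 7.8, 5.2, 0.0, 2.9],  # two cells of a different type...
        [0.0, 8.1, 4.9, 0.2, 3.3],  # ...sit far from the first pair
    ])

    # Euclidean distance between two cells measures how similar
    # their gene-expression profiles are.
    print(np.linalg.norm(expr[0] - expr[1]))  # small: same cell type
    print(np.linalg.norm(expr[0] - expr[2]))  # large: different types
    ```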

    Scanorama runs a modified computer-vision algorithm, called “mutual nearest neighbors matching,” which finds the closest (most similar) points in two computational spaces. Developed at CSAIL, the algorithm was initially used to find pixels with matching features — such as color levels — in dissimilar photos. That could help computers match a patch of pixels representing an object in one image to the same patch of pixels in another image where the object’s position has been drastically altered. It could also be used for stitching vastly different images together in a panorama.

    The researchers repurposed the algorithm to find cells with overlapping gene expression — instead of overlapping pixel features — and in multiple datasets instead of two. The level of gene expression in a cell determines its function and, in turn, its location in the computational space. If stacked on top of one another, cells with similar gene expression, even if they’re from different datasets, will be roughly in the same locations.

    For each dataset, Scanorama first links each cell in one dataset to its closest neighbor among all datasets, meaning they’ll most likely share similar locations. But the algorithm only retains links where cells in both datasets are each other’s nearest neighbor — a mutual link. For instance, if Cell A’s nearest neighbor is Cell B, and Cell B’s is Cell A, it’s a keeper. If, however, Cell B’s nearest neighbor is a separate Cell C, then the link between Cell A and B will be discarded.
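    A minimal sketch of that mutual test, using scikit-learn’s nearest-neighbor search as a stand-in for Scanorama’s own implementation (the coordinates are made up; real cells live in a much higher-dimensional space):

    ```python
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def mutual_links(X, Y, k=1):
        """Return pairs (i, j) where X[i] and Y[j] are each other's
        nearest neighbors across the two datasets."""
        # Nearest neighbors of each X-cell among the Y-cells...
        x_to_y = NearestNeighbors(n_neighbors=k).fit(Y).kneighbors(
            X, return_distance=False)
        # ...and of each Y-cell among the X-cells.
        y_to_x = NearestNeighbors(n_neighbors=k).fit(X).kneighbors(
            Y, return_distance=False)
        links = []
        for i, neighbors in enumerate(x_to_y):
            for j in neighbors:
                if i in y_to_x[j]:  # mutual link: keep it
                    links.append((i, j))
        return links

    # Toy coordinates in a 2-gene space. X[0] and Y[0] pick each other
    # (kept); X[1] picks Y[1], but Y[1] prefers X[0] (discarded).
    X = np.array([[0.0, 0.0], [10.0, 0.0]])
    Y = np.array([[1.0, 0.0], [2.0, 0.0]])
    print(mutual_links(X, Y))  # [(0, 0)]
    ```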

    Keeping mutual links increases the likelihood that the cells are, in fact, the same cell types. Breaking the nonmutual links, on the other hand, prevents cell types specific to each dataset from merging with incorrect cell types. Once all mutual links are found, the algorithm stitches all dataset sequences together. In doing so, it combines the same cell types but keeps cell types unique to any datasets separated from the merged cells. “The mutual links form anchors that enable [correct] cell alignment across datasets,” Berger says.

    Shrinking data, scaling up

    To ensure Scanorama scales to large datasets, the researchers incorporated two optimization techniques. The first reduces the dimensionality of the data. Each cell in a dataset can have up to 20,000 gene expression measurements, and as many dimensions. The researchers leveraged a mathematical technique that summarizes high-dimensional data matrices with a small number of features while retaining the vital information. In practice, this reduced the number of dimensions roughly 100-fold.
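    The article doesn’t spell out the exact method, but a randomized truncated singular value decomposition (SVD) is a standard way to do this kind of compression. A sketch with illustrative sizes:

    ```python
    import numpy as np
    from sklearn.decomposition import TruncatedSVD

    rng = np.random.default_rng(0)

    # Hypothetical expression matrix: 500 cells x 20,000 genes.
    expr = rng.poisson(1.0, size=(500, 20000)).astype(float)

    # Summarize each cell with 100 features while retaining most of
    # the variation; the "randomized" solver keeps this fast at scale.
    svd = TruncatedSVD(n_components=100, algorithm="randomized",
                       random_state=0)
    reduced = svd.fit_transform(expr)

    print(expr.shape, "->", reduced.shape)  # (500, 20000) -> (500, 100)
    ```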

    They also used a popular hashing technique to find mutual nearest neighbors more quickly. Traditionally, computing on even the reduced samples would take hours. But the hashing technique groups cells that are likely neighbors into buckets, so the algorithm only needs to search the highest-probability buckets for mutual links. That shrinks the search space and makes the process far less computationally intensive.
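    As an illustration rather than Scanorama’s exact scheme, random-hyperplane locality-sensitive hashing shows how such buckets can be built:

    ```python
    import numpy as np
    from collections import defaultdict

    rng = np.random.default_rng(0)

    def build_lsh_index(points, n_planes=8):
        """Bucket points by which side of n_planes random hyperplanes
        they fall on; nearby points tend to share a bucket."""
        planes = rng.standard_normal((n_planes, points.shape[1]))
        signatures = points @ planes.T > 0  # one bit per hyperplane
        buckets = defaultdict(list)
        for idx, sig in enumerate(signatures):
            buckets[sig.tobytes()].append(idx)
        return planes, buckets

    points = rng.standard_normal((1000, 100))  # e.g., SVD-reduced cells
    planes, buckets = build_lsh_index(points)

    # To look up neighbors of cell 0, search only its own bucket
    # (about 1000 / 2**8 = 4 candidates on average) instead of all 1,000.
    sig = (points[0] @ planes.T > 0).tobytes()
    print(len(buckets[sig]), "candidates in cell 0's bucket")
    ```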

    In separate work, the researchers combined Scanorama with another technique they developed, which generates compact but comprehensive samples — or “sketches” — of massive cell datasets. Doing so reduced the time to combine more than 500,000 cells from two hours to eight minutes: they generated the “geometric sketches,” ran Scanorama on them, and extrapolated what they learned about merging the sketches back to the full datasets. The sketching technique itself derives from compressive genomics, which was developed by Berger’s group.

    “Even if you need to sketch, integrate, and reapply that information to the full datasets, it was still an order of magnitude faster than combining entire datasets,” Hie says.
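    A hedged outline of that sketch-then-extrapolate workflow: plain random subsampling stands in for the actual geometric sketching method, and a toy integrate() function for Scanorama itself, so everything here is illustrative.

    ```python
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)

    def sketch_integrate_extrapolate(cells, integrate, sketch_size):
        """Run an expensive integration step on a small sample of the
        cells, then propagate its output (labels here) to all cells."""
        # 1. Sketch: pick a small subset. (Random sampling here; the
        #    real geometric sketch is chosen to evenly cover the data,
        #    so rare cell types are not lost.)
        idx = rng.choice(len(cells), size=sketch_size, replace=False)

        # 2. Integrate only the sketch: the costly step now sees a few
        #    hundred cells instead of tens of thousands.
        sketch_labels = integrate(cells[idx])

        # 3. Extrapolate: give every remaining cell the label of its
        #    nearest sketched cell.
        knn = KNeighborsClassifier(n_neighbors=1)
        knn.fit(cells[idx], sketch_labels)
        return knn.predict(cells)

    # Toy stand-ins: 50,000 random "cells" and a fake integration
    # step that just thresholds the first feature.
    cells = rng.standard_normal((50_000, 100))
    labels = sketch_integrate_extrapolate(
        cells, integrate=lambda x: (x[:, 0] > 0).astype(int),
        sketch_size=500)
    print(labels.shape)  # one label per cell, computed from the sketch
    ```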

    12:04p
    North Atlantic Ocean productivity has dropped 10 percent during Industrial era

    Virtually all marine life depends on the productivity of phytoplankton — microscopic organisms that work tirelessly at the ocean’s surface to absorb the carbon dioxide that gets dissolved into the upper ocean from the atmosphere.

    Through photosynthesis, these microbes break down carbon dioxide into oxygen, some of which ultimately gets released back to the atmosphere, and organic carbon, which they store until they themselves are consumed. This plankton-derived carbon fuels the rest of the marine food web, from the tiniest shrimp to giant sea turtles and humpback whales.

    Now, scientists at MIT, Woods Hole Oceanographic Institution (WHOI), and elsewhere have found evidence that phytoplankton’s productivity is declining steadily in the North Atlantic, one of the world’s most productive marine basins.

    In a paper appearing today in Nature, the researchers report that phytoplankton’s productivity in this important region has gone down around 10 percent since the mid-19th century and the start of the Industrial era. This decline coincides with steadily rising surface temperatures over the same period of time.

    Matthew Osman, the paper’s lead author and a graduate student in MIT’s Department of Earth, Atmospheric, and Planetary Sciences, says there are indications that phytoplankton’s productivity may decline further as temperatures continue to rise as a result of human-induced climate change.

    “It’s a significant enough decline that we should be concerned,” Osman says. “The amount of productivity in the oceans roughly scales with how much phytoplankton you have. So this translates to 10 percent of the marine food base in this region that’s been lost over the industrial era. If we have a growing population but a decreasing food base, at some point we’re likely going to feel the effects of that decline.”

    Drilling through “pancakes” of ice

    Osman and his colleagues looked for trends in phytoplankton’s productivity using the molecular compound methanesulfonic acid, or MSA. When phytoplankton expand into large blooms, certain microbes emit dimethylsulfide, or DMS, an aerosol that is lofted into the atmosphere and eventually breaks down into either sulfate aerosol or MSA, which is then deposited by winds onto sea or land surfaces.

    “Unlike sulfate, which can have many sources in the atmosphere, it was recognized about 30 years ago that MSA had a very unique aspect to it, which is that it’s only derived from DMS, which in turn is only derived from these phytoplankton blooms,” Osman says. “So any MSA you measure, you can be confident has only one unique source — phytoplankton.”

    In the North Atlantic, phytoplankton likely produced MSA that was deposited to the north, including across Greenland. The researchers measured MSA in Greenland ice cores — in this case using 100- to 200-meter-long columns of snow and ice that represent layers of past snowfall events preserved over hundreds of years.

    “They’re basically sedimentary layers of ice that have been stacked on top of each other over centuries, like pancakes,” Osman says.

    The team analyzed 12 ice cores in all, each collected from a different location on the Greenland ice sheet by various groups from the 1980s to the present. Osman and his advisor Sarah Das, an associate scientist at WHOI and co-author on the paper, collected one of the cores during an expedition in April 2015.

    “The conditions can be really harsh,” Osman says. “It’s minus 30 degrees Celsius, windy, and there are often whiteout conditions in a snowstorm, where it’s difficult to differentiate the sky from the ice sheet itself.”

    The team was nevertheless able to extract, meter by meter, a 100-meter-long core, using a giant drill that was delivered to the team’s location via a small ski-equipped airplane. They immediately archived each ice core segment in a heavily insulated cold storage box, then flew the boxes on “cold deck flights” — aircraft with ambient conditions of around minus 20 degrees Celsius. Once the planes touched down, freezer trucks transported the ice cores to the scientists’ ice core laboratories.

    “The whole process of how one safely transports a 100-meter section of ice from Greenland, kept at minus-20-degree conditions, back to the United States is a massive undertaking,” Osman says.

    Cascading effects

    The team incorporated the expertise of researchers at various labs around the world in analyzing each of the 12 ice cores for MSA. Across all 12 records, they observed a conspicuous decline in MSA concentrations, beginning in the mid-19th century, around the start of the Industrial era when the widescale production of greenhouse gases began. This decline in MSA is directly related to a decline in phytoplankton productivity in the North Atlantic.
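    For intuition only, using synthetic numbers rather than the paper’s data, a long-term percent decline like the one reported can be estimated by fitting a linear trend to an MSA time series and comparing the fitted endpoints:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic annual MSA concentrations, 1850-2010: a gentle downward
    # trend plus year-to-year noise (illustrative values only).
    years = np.arange(1850, 2011)
    msa = 10.0 - 0.007 * (years - 1850) + rng.normal(0, 0.4, len(years))

    # Least-squares linear trend; compare fitted start and end values.
    slope, intercept = np.polyfit(years, msa, 1)
    start = slope * years[0] + intercept
    end = slope * years[-1] + intercept
    print(f"fitted decline: {100 * (start - end) / start:.1f}%")
    ```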

    “This is the first time we’ve collectively used these ice core MSA records from all across Greenland, and they show this coherent signal. We see a long-term decline that originates around the same time as when we started perturbing the climate system with industrial-scale greenhouse-gas emissions,” Osman says. “The North Atlantic is such a productive area, and there’s a huge multinational fisheries economy related to this productivity. Any changes at the base of this food chain will have cascading effects that we’ll ultimately feel at our dinner tables.”

    The multicentury decline in phytoplankton productivity coincides not only with the long-term rise in surface temperatures; it also shows synchronous variations, on decadal timescales, with the large-scale ocean circulation pattern known as the Atlantic Meridional Overturning Circulation, or AMOC. This circulation pattern typically mixes the deeper layers of the ocean with the surface, allowing the exchange of much-needed nutrients on which phytoplankton feed.

    In recent years, scientists have found evidence that the AMOC is weakening, a process that is still not well understood but may be due in part to warming temperatures increasing the melting of Greenland’s ice. That melting has added an influx of less-dense freshwater to the North Atlantic, which acts to stratify, or separate, its layers, much like oil and water, preventing nutrients in the deep from upwelling to the surface. This warming-induced weakening of the ocean circulation could be what is driving phytoplankton’s decline. And because the warming atmosphere also heats the upper ocean directly, it could strengthen this stratification further, depressing phytoplankton’s productivity even more.

    “It’s a one-two punch,” Osman says. “It’s not good news, but the upshot to this is that we can no longer claim ignorance. We have evidence that this is happening, and that’s the first step you inherently have to take toward fixing the problem, however we do that.”

    This research was supported in part by the National Science Foundation (NSF) and the National Aeronautics and Space Administration (NASA), as well as by graduate fellowship support from the US Department of Defense Office of Naval Research.

    2:59p
    A new approach to targeting tumors and tracking their spread

    The spread of malignant cells from an original tumor to other parts of the body, known as metastasis, is the main cause of cancer deaths worldwide.

    Early detection of tumors and metastases could significantly improve cancer survival rates. However, predicting exactly when cancer cells will break away from the original tumor, and where in the body they will form new lesions, is extremely challenging.

    There is therefore an urgent need to develop new methods to image, diagnose, and treat tumors, particularly early lesions and metastases.

    In a paper published today in the Proceedings of the National Academy of Sciences, researchers at the Koch Institute for Integrative Cancer Research at MIT describe a new approach to targeting tumors and metastases.

    Previous attempts to focus on the tumor cells themselves have typically proven unsuccessful, as the tendency of cancerous cells to mutate makes them unreliable targets.

    Instead, the researchers decided to target structures surrounding the cells known as the extracellular matrix (ECM), according to Richard Hynes, the Daniel K. Ludwig Professor for Cancer Research at MIT. The research team also included lead author Noor Jailkhani, a postdoc in the Hynes Lab at the Koch Institute for Integrative Cancer Research.

    The extracellular matrix, a meshwork of proteins surrounding both normal and cancer cells, is an important part of the microenvironment of tumor cells. By providing signals for their growth and survival, the matrix plays a significant role in tumor growth and progression.

    When the researchers studied this microenvironment, they found certain proteins that are abundant in regions surrounding tumors and other disease sites, but absent from healthy tissues.

    What’s more, unlike the tumor cells themselves, these ECM proteins do not mutate as the cancer progresses, Hynes says. “Targeting the ECM offers a better way to attack metastases than trying to prevent the tumor cells themselves from spreading in the first place, because they have usually already done that by the time the patient comes into the clinic,” Hynes says.

    The researchers began developing a library of immune reagents designed to specifically target these ECM proteins, based on relatively tiny antibodies, or “nanobodies,” derived from alpacas. The idea was that if these nanobodies could be deployed in a cancer patient, they could potentially be imaged to reveal tumor cells’ locations, or even deliver payloads of drugs.

    The researchers used nanobodies from alpacas because they are smaller than conventional antibodies. Specifically, unlike the antibodies produced by the immune systems of humans and other animals, which consist of two “heavy protein chains” and two “light chains,” antibodies from camelids such as alpacas contain just two copies of a single heavy chain.

    Nanobodies derived from these heavy-chain-only antibodies consist of a single binding domain, making them much smaller than conventional antibodies, Hynes says.

    In this way, nanobodies can penetrate more deeply into human tissue than conventional antibodies, and they are cleared from the circulation much more quickly after treatment.

    To develop the nanobodies, the team first immunized alpacas with either a cocktail of ECM proteins, or ECM-enriched preparations from human patient samples of colorectal or breast cancer metastases.

    They then extracted RNA from the alpacas’ blood cells, amplified the coding sequences of the nanobodies, and generated libraries from which they isolated specific anti-ECM nanobodies.

    They demonstrated the effectiveness of the technique using a nanobody that targets a protein fragment called EIIIB, which is prevalent in many tumor ECMs.

    When they injected nanobodies attached to radioisotopes into mice with cancer, and scanned the mice using noninvasive PET/CT imaging, a standard technique used clinically, they found that the tumors and metastases were clearly visible. In this way the nanobodies could be used to help image both tumors and metastases.

    But the same technique could also be used to deliver therapeutic treatments to the tumor or metastasis, Hynes says. “We can couple almost anything we want to the nanobodies, including drugs, toxins or higher energy isotopes,” he says. “So, imaging is a proof of concept, and it is very useful, but more important is what it leads to, which is the ability to target tumors with therapeutics.”

    The ECM also undergoes similar protein changes in other diseases, including cardiovascular, inflammatory, and fibrotic disorders, so the same technique could be used to treat people with those diseases as well.

    In a recent collaborative paper, also published in Proceedings of the National Academy of Sciences, the researchers demonstrated the effectiveness of the technique by using it to develop nanobody-based chimeric antigen receptor (CAR) T cells, designed to target solid tumors.

    CAR T cell therapy has already proven successful in treating cancers of the blood, but it has been less effective in treating solid tumors.

    By targeting the ECM of tumor cells, nanobody-based CAR T cells became concentrated in the microenvironment of tumors and successfully reduced their growth.

    The ECM has been recognized to play crucial roles in cancer progression, but few diagnostic or therapeutic methods have been developed based on the special characteristics of cancer ECM, says Yibin Kang, a professor of molecular biology at Princeton University, who was not involved in the research.

    “The work by Hynes and colleagues has broken new ground in this area and elegantly demonstrates the high sensitivity and specificity of a nanobody targeting a particular isoform of an ECM protein in cancer,” Kang says. “This discovery opens up the possibility for early detection of cancer and metastasis, sensitive monitoring of therapeutic response, and specific delivery of anticancer drugs to tumors.”

    This work was supported by a Mazumdar-Shaw International Oncology Fellowship, fellowships from the Ludwig Center for Molecular Oncology Research at MIT, the Howard Hughes Medical Institute, and a grant from the Department of Defense Breast Cancer Research Program; imaging was performed on instrumentation purchased with a gift from John S. ’61 and Cindy Reed.

    The researchers are now planning to carry out further work to develop the nanobody technique for treating tumors and metastases.

