Data Center Knowledge | News and analysis for the data center industry
 

Thursday, November 17th, 2016

    2:00p
    Five Reasons to Adopt Hybrid Cloud Storage for your Data Center

    Jon Toor is CMO for Cloudian.

    Are you weighing the benefits of cloud storage versus on-premises storage? If so, the right answer might be to use both, and not just in parallel, but in an integrated way. Hybrid cloud is a storage environment that uses a mix of on-premises and public cloud services, with data mobility between the two platforms.

    IT professionals are now seeing the benefit of hybrid solutions. According to a recent survey of 400 organizations in the U.S. and UK conducted by Actual Tech, 28 percent of firms have already deployed hybrid cloud storage, with a further 40 percent planning to implement within the next year. The analyst firm IDC agrees: In its 2016 Futurescape research report, the company predicted that by 2018, 85 percent of enterprises will operate in a multi-cloud environment.

    Hybrid has piqued interest as more organizations look to the public cloud to augment their on-premises data management. There are many drivers for this, but here are five:

    1. We now have a widely-accepted standard interface.

    The emergence of a common interface for on-prem and cloud storage changes everything. The world of storage revolves around interface standards. They are the glue that drives down cost and ensures interoperability. For hybrid storage, the de facto standard is the Amazon S3 API, an interface that began in cloud storage and is now available for on-premises object storage as well. This standardization is significant because it gives storage managers new flexibility to deploy common tools and applications on-prem and in the cloud, and easily move data between the two environments to optimize cost, performance, and data durability.
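    As a rough illustration of that flexibility, the sketch below uses boto3, the AWS SDK for Python, to write the same object through the same S3 API to an on-premises S3-compatible endpoint and to Amazon S3 itself. The endpoint URL, credentials, and bucket names are hypothetical.

```python
# Illustrative sketch: one S3 API, two destinations (endpoint and bucket names are hypothetical).
import boto3

# On-premises S3-compatible object store, reached via a custom endpoint.
on_prem = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.internal",  # hypothetical on-prem endpoint
    aws_access_key_id="LOCAL_KEY",
    aws_secret_access_key="LOCAL_SECRET",
)

# Public cloud (Amazon S3) through the very same API.
cloud = boto3.client("s3")

payload = b"same data, same interface"
on_prem.put_object(Bucket="hot-tier", Key="reports/2016-11-17.csv", Body=payload)
cloud.put_object(Bucket="archive-tier", Key="reports/2016-11-17.csv", Body=payload)
```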

    2. Unprecedented hybrid scalability delivers operational efficiency.

    Managing one large, scalable pool of storage is far more efficient than managing two smaller ones. And hybrid storage is hands-down the most scalable storage model ever devised. It combines on-prem object storage – which is itself scalable to hundreds of petabytes – with cloud storage that is, for all practical purposes, limitlessly scalable. This single-pool storage model reduces data silos and simplifies management with a single namespace and a single view, no matter where the data originated or where it resides. Further, hybrid allows you to keep a copy of all metadata on-premises, ensuring rapid search across both cloud and on-premises data.
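    As a purely conceptual sketch (not any vendor's actual implementation), a single namespace backed by an on-premises metadata catalog might look like this:

```python
# Conceptual sketch only: a local metadata catalog spanning both tiers.
# Object data may live on-premises or in the cloud, but every object's metadata
# stays in one local index, so searches never have to leave the data center.
catalog = {
    "reports/2016-11-17.csv": {"location": "on-prem", "size_bytes": 10_485_760, "tags": ["finance"]},
    "logs/2015/app.log.gz":   {"location": "cloud",   "size_bytes": 524_288_000, "tags": ["archive"]},
}

def search(tag):
    """Search the local catalog regardless of where each object physically resides."""
    return [key for key, meta in catalog.items() if tag in meta["tags"]]

print(search("archive"))  # ['logs/2015/app.log.gz']
```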

    3. Best-of-breed data protection is now available to everyone.

    Data protection is fundamental to storage. A hybrid storage model offers businesses of all sizes strong data protection options, delivering data durability that would previously have been affordable only to the most well-heeled storage users. In a hybrid configuration, you can back up data to object storage on premises, then automatically tier data to the cloud for long-term archive (Amazon Glacier, Google Coldline, Azure Blob). This gives you two optimal results: a copy of data on-site for rapid recovery when needed, and a low-cost, long-term offsite archive copy for disaster recovery. Many popular backup solutions, including Veritas, Commvault and Rubrik, provide Amazon S3 connectors that enable this solution as a simple drop-in.
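    As a minimal sketch of the tiering step (the bucket name and retention periods are hypothetical), an S3 lifecycle rule can move backup objects to Glacier-class storage after a local retention window:

```python
# Minimal sketch: tier backups to Glacier-class storage after 30 days on the primary tier.
# Bucket name and day counts are hypothetical; the same S3 lifecycle call works against
# any S3-compatible endpoint that supports lifecycle configuration.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="backup-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-backups-to-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365 * 7},  # keep the archive copy for seven years
            }
        ]
    },
)
```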

    4. Hybrid offers more deployment options to match your business needs.

    Your storage needs have their own nuances, and you need the operational flexibility to address them. Hybrid can help, with more deployment options than other storage models. For the on-premises component, you can select from options that range from zero-upfront-cost software running on the servers you already own to multi-petabyte turnkey systems. For the cloud component, a range of offerings meets both long-term and short-term storage needs. Across both worlds, a common object storage interface lets you mix and match the optimal solution. Whether the objective is rapid data access on-premises or long-term archival storage, these needs can be met with a common set of storage tools and techniques.

    5. Hybrid helps meet data governance rules.

    External and internal data governance rules play a big part in data storage planning. In a recent survey, 59 percent of respondents reported the need to maintain some of their data on premises. On average, that group stated that only about half of their data can go to the cloud. Financial data and customer records in particular are often subject to security, governance and compliance rules, driven by both internal policy and external regulation. With a hybrid cloud model, you can more easily accommodate these changing needs: you can set policies to ensure compliance, tailoring migration and data protection rules to specific data types.
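    As a purely hypothetical sketch of what such tailoring might look like (the data classes and rules below are illustrative examples, not any product's configuration), placement policies can be keyed to data classification:

```python
# Illustrative only: per-data-class policies deciding which data may leave the premises.
POLICIES = {
    "customer-records": {"allow_cloud_copy": False, "replicas_on_prem": 3},
    "financial":        {"allow_cloud_copy": False, "replicas_on_prem": 3},
    "telemetry":        {"allow_cloud_copy": True,  "cloud_tier_after_days": 30},
    "media-archive":    {"allow_cloud_copy": True,  "cloud_tier_after_days": 0},
}

def placement_for(data_class):
    """Return where an object of this class may be stored."""
    policy = POLICIES.get(data_class, {"allow_cloud_copy": False})
    return "on-prem + cloud" if policy["allow_cloud_copy"] else "on-prem only"

print(placement_for("financial"))  # on-prem only
print(placement_for("telemetry"))  # on-prem + cloud
```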

    While many are seeing the natural advantages of hybrid, some are still unsure. What other factors come into play that I haven’t mentioned? With more and more data being digitized and retained in perpetuity, what opportunities is your organization exploring to deal with the data deluge?

    Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.

    3:51p
    GPU Acceleration Makes Kinetica’s Brute Force Database a Brute

    The big data solution known as Hadoop resolved two huge, critical problems facing data centers at the turn of the last decade:  First, it extended virtual data volumes beyond the bounds of single storage volumes, in a method that was easily managed — or at least, easily enough.  Second, it provided a new mechanism for running analytical and processing jobs in batches and in parallel, using an effective combination of new cloud-based methodologies and old data science methods that had been collecting dust since before the rise of dBASE.

    The way Hadoop resolved this second issue is at one level complex, and from another perspective, brilliant.  But another technology that sat on the shelf for too long — graphics processor-based acceleration — has since ascended to its own critical mass.  Now it’s possible for GPUs to run their own massively parallel tasks, at speeds that conventional, sequential, multi-core CPUs can’t possibly approach, even in clusters.

    So along comes an in-memory database called Kinetica, whose value proposition is based on its GPU acceleration.  A few weeks ago, the firm unveiled its Kinetica Install Accelerator, with a one-year license for a 1 TB installation plus two weeks of personal consultation; and its Application Accelerator, with a 3 TB license and up to four weeks of consultation.

    Smash Bros.

    What isn’t generally known about Kinetica is how radically simplified its schema design has been engineered to be.  Since GPUs accelerate simple operations by deploying their components massively in parallel, Kinetica is. . . shall we say, compelled to process data sets from massive data lakes without so many indexes.

    “We want to be a part of that ecosystem where we solve a lot of problems that take four or five Hadoop ecosystem products, kind of patched together with duct tape, to get working,” said Nima Negahban, Kinetica’s CTO and co-founder, in an interview with Data Center Knowledge.

    Usually just after an enterprise builds its data lake, it embarks on a noble quest to collect every possible bit and byte of data that can be scarfed up, Negahban told us.  But then enterprises have an urge to query the heck out of it.

    “SQL on Hadoop is not what Hadoop is for,” he said.  “Hadoop is great for being an HDFS data lake, even though that’s not the first reason it was built.  And then that produced the whole job-based mentality, where I have a question or I want to generate a model, so let me run a job.  That’s different from the need to be a 24/7 operational database, that needs fast response times and query flexibility.”

    Just after the data lake model first took root in enterprises, there was a presumption that analytics-oriented data models would soon envelop and incorporate transactional data — the type that populates data warehouses.  That is not happening, and now folks are realizing it probably won’t.  There will be a co-existence of the two systems, which will just have to learn to share and get along with one another.

    But must this co-existence necessarily juxtapose the fast with the slow?  Kinetica’s value proposition is that there are methodological benefits that can be gleaned from the new realm of analytical data science and “big data” (which has already become just “data”).  But these benefits are best realized, Negahban argues, when they are applied to a reconfigured processing model that makes optimal use of today’s hardware — of configurations that weren’t available when Hadoop first appeared on the scene.

    Kinetica utilizes its own SQL engine, said Negahban.  By way of API calls, or alternatively ODBC and JDBC connectors (for integration with client/server applications), it parses standard SQL, decomposing it into commands that are directed to virtual processors in its own clusters.  Those processors are delegated to GPU pipelines.  The connectors enable Kinetica to serve as a back end for analytics and business intelligence (BI) platforms such as Tableau and Oracle Business Intelligence.

    “With pretty much any BI platform,” he remarked, “you can drop in our adapters, and quickly start using the tool, just with accelerated performance.”
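    As a rough sketch of what such a drop-in connection can look like (the DSN, credentials, and table and column names are placeholders, not taken from Kinetica's documentation), a client issues standard SQL over ODBC and lets the engine handle the parallelism:

```python
# Sketch only: connecting to a SQL-speaking engine over ODBC with pyodbc.
# "KineticaDSN" is a placeholder DSN assumed to be configured in odbc.ini;
# the credentials, table, and columns are likewise hypothetical.
import pyodbc

conn = pyodbc.connect("DSN=KineticaDSN;UID=analyst;PWD=secret")
cursor = conn.cursor()

# Standard SQL goes in; the engine decides how to parallelize it behind the API.
cursor.execute(
    "SELECT region, AVG(latency_ms) AS avg_latency "
    "FROM sensor_readings GROUP BY region"
)
for region, avg_latency in cursor.fetchall():
    print(region, avg_latency)

conn.close()
```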

    Call of Duty

    Negahban’s journey towards building Kinetica began in 2010, when he was CTO of GIS Federal, one of the intelligence community’s principal consultancies.

    At that time, the firm was part of a U.S. Army project to converge some 200 separate analytics tools into a single API.  This way, developers could produce their own custom applications that utilized this API, rather than cherry-pick one of the 200 tools for which the Army was licensed — at the expense of the other 199.

    The high-water mark for contenders for that Army project was whether the right analysis, at the right time, could save troops’ lives.

    NoSQL, Hadoop + HBase, and Hadoop + Cassandra were the bases for the Army’s first projects.  “Time and time again, the same issues had us arrive at the same conclusion,” related Negahban.  “We had too many indexes to try to drive a query, which caused hardware fan-out to explode, and ingestion time to increase.  We went from being a representation of up-to-the-second data, to being one day old, and after a while, a week old.  We were that behind on ingestion.”

    A year earlier, Nvidia had introduced its Fermi GPGPU microarchitecture, the successor to Tesla.  Negahban believed Fermi offered virtually unlimited compute capability, so he argued in favor of experimenting with leveraging Fermi as a rather ordinary database engine. . . multiplied by millions.  Rather than re-architecting a column store that would graft Hadoop + Cassandra onto Fermi, and generating a plethora of indexes to orchestrate data distribution from big data lakes among big Cassandra clusters, Negahban’s idea was to multiply a brute-force SQL engine across a huge swath of pipelines.

    “We were pretty much the first to think of that in a distributed fashion,” he said.  “Our real contribution to the GPGPU world has been, basically opening people’s eyes to its ability to be used in data processing, where it’s been so focused on machine learning and the kinds of lower I/O/higher compute problem sets.  Data processing and OLAP workloads are heavily I/O intensive, so we were one of the first to say, this has an application in this [new] context as well.”
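    To make the contrast with index-heavy designs concrete, here is a toy illustration, with NumPy on a CPU standing in for the idea; a GPU engine would run the same kind of full-column sweep across thousands of cores instead of maintaining indexes at ingest time:

```python
# Toy illustration of a brute-force, index-free scan (NumPy standing in for GPU pipelines).
import numpy as np

rng = np.random.default_rng(0)
latency_ms = rng.exponential(scale=20.0, size=10_000_000)  # 10M synthetic rows

# "SELECT COUNT(*) ... WHERE latency_ms > 100" as a full-column sweep: no index to build,
# nothing to update on ingest, and the work parallelizes trivially across cores.
mask = latency_ms > 100.0
print(int(mask.sum()), "rows over threshold")
```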

    Since advancing the cause of GPGPU for the U.S. Army, Negahban and GIS Federal CEO Amit Vij (also CEO of Kinetica) have found success with another major customer, the U.S. Postal Service.

    “That trend that we got to see early at the Army, because they’re such a massive organization,” he told Data Center Knowledge, “we think is going to be a prevailing trend for enterprises all over, where they’ve spent millions of dollars in data creation and data infrastructure.  The whole Hadoop movement, oddly enough, is enabling enterprises to have their ‘A-ha!’ moment.  ‘I have this massive data lake, I’ve spent all this money on IoT and asset management, I’m dumping terabytes of data a day into my huge Hadoop cluster.  Now how do I make this an operational tool that will give real-time ROI to my enterprise?’”

    9:17p
    Cisco Sales Forecast Points to Slower Technology Spending

    (Bloomberg) — Cisco Systems Inc., the biggest maker of the equipment that’s the backbone of the internet, projected sales and profit that indicate corporate spending on technology hardware is slowing.

    Profit before certain costs in the period that ends in January will be 55 cents to 57 cents a share and revenue may decline as much as 4 percent, the company said Wednesday. Analysts projected profit of 59 cents a share and a 2 percent increase in sales to $12.1 billion, according to data compiled by Bloomberg.

    The forecast signals the difficulties facing Chief Executive Officer Chuck Robbins, who is seeking to recast Cisco as a provider of networking services amid tightening corporate purchasing. With switching and routing still providing the biggest chunk of sales, Robbins is struggling to show growth as customers move away from the mix of fixed hardware and closed software that helped the company become dominant.

    “When you have more economic uncertainty enterprises tend not to spend,” said David Heger, an analyst at Edward Jones. The new businesses’ contribution is “still not big enough and the transition can’t move enough relative to the overall business.”

    Cisco shares declined as much as 5.4 percent in extended trading following the announcement. The stock had gained 16 percent this year to $31.57 at the close in New York.

    Orders Decline

    Robbins said most of the caution in Cisco’s forecast comes from weak orders from telecommunications service providers for switches, devices that create networks of computers. Orders fell 12 percent in the quarter ended Oct. 29 from a year ago, Cisco said. Some companies are uncertain how political changes will affect their regulatory environment and others are concentrating spending on their mobile networks rather than Cisco gear for their data centers, he said.

    The outlook reflects customers tightening their spending and not a loss of market share to competitors, Robbins said in a phone interview.

    “We’ve gone account by account with these customers; it’s not losses, it’s pauses” in spending, he said. “It’s not because we’re losing franchises.”

    Revenue growth has slowed significantly since fiscal 2010, when Cisco reported an 11 percent increase. On Wednesday, the company projected sales in the fiscal second quarter ending in January will decline to a range of $11.36 billion to $11.6 billion, based on adjusted revenue of $11.83 billion in the quarter a year earlier.
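    For reference, the guided dollar range and the projected percentage decline are consistent with each other; a quick check against the year-earlier adjusted revenue figure cited above:

```python
# Quick sanity check on the guidance math (figures taken from the paragraphs above).
prior_year_adj_revenue = 11.83  # $ billions, fiscal Q2 a year earlier
low, high = 11.36, 11.60        # $ billions, guided range for the coming quarter

print(f"decline at low end:  {(1 - low / prior_year_adj_revenue) * 100:.1f}%")   # ~4.0%
print(f"decline at high end: {(1 - high / prior_year_adj_revenue) * 100:.1f}%")  # ~1.9%
```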

    Quarterly Results

    In the fiscal first quarter, Cisco’s net income was $2.3 billion, or 46 cents a share, down from $2.4 billion, or 48 cents, a year earlier, the San Jose, California-based company said in a statement. Sales rose 1 percent to $12.4 billion. Excluding some costs, profit was 61 cents a share, compared with analysts’ estimates of 59 cents on revenue of $12.3 billion.

    Sales in Cisco’s biggest business, switching, declined 7 percent to $3.7 billion from a year earlier. Routing, the second-biggest unit, was a standout with a revenue increase of 6 percent to $2.09 billion.

    Some investors no longer expect top-line expansion from the company and are holding the stock based on its ability to generate cash and return it to investors. Cisco’s backers also like its ability to maintain high levels of profitability, even amid more difficult market conditions, said Brian White, an analyst at Drexel Hamilton.

    They’re also looking forward to a possible lower tax rate on repatriating overseas corporate earnings, he said. That would give Cisco the opportunity to bring home more than $50 billion it has parked outside of the U.S., White said. That cash influx would help boost dividend payments and stock repurchases.

    10:09p
    LinkedIn’s Award-Winning Hillsboro Data Center Goes Live

    After more than a year in the works, LinkedIn took its new, 8-megawatt Hillsboro, Oregon data center live today—the company’s first facility to implement a new hyperscale infrastructure strategy similar to those used by Google, Facebook and Microsoft.

    The facility, which LinkedIn is leasing from Infomart Data Centers, features custom electrical and mechanical design, with 96 servers per cabinet drawing just below 18 kW of power per cabinet. The cooling design allows for densities of up to 32 kW per rack, however. For cooling at that density, LinkedIn is using heat-conducting doors on every cabinet. Each cabinet acts as its own contained ecosystem, too, so there are no hot and cold aisles as in a typical data center.
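    A quick back-of-the-envelope calculation from the figures above shows what that density works out to per server:

```python
# Back-of-the-envelope math from the figures above.
servers_per_cabinet = 96
cabinet_power_w = 18_000          # just under 18 kW per cabinet as deployed
cabinet_cooling_limit_w = 32_000  # cooling design headroom per rack

print(f"{cabinet_power_w / servers_per_cabinet:.0f} W per server as deployed")          # ~188 W
print(f"{cabinet_cooling_limit_w / servers_per_cabinet:.0f} W per server at the limit")  # ~333 W
```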

    “The advanced water side economizer cooling system communicates with outside air sensors to utilize Oregon’s naturally cool temperatures, instead of using energy to create cool air,” wrote Michael Yamaguchi, LinkedIn’s Director of Data Center Engineering, in a blog post. “Incorporating efficient technologies such as these enables our operations to run a PUE (Power Usage Effectiveness) of 1.06 during full economization mode.”
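    PUE is the ratio of total facility power to the power delivered to IT equipment, so a PUE of 1.06 means roughly six watts of overhead (cooling, power conversion, lighting) for every 100 watts of IT load. A short worked example, assuming for illustration that the full 8 MW capacity is IT load:

```python
# PUE = total facility power / IT equipment power.
# Assuming (purely for illustration) that the full 8 MW capacity is IT load:
it_load_mw = 8.0
pue = 1.06

total_facility_mw = it_load_mw * pue
overhead_mw = total_facility_mw - it_load_mw
print(f"total facility power: {total_facility_mw:.2f} MW")                         # 8.48 MW
print(f"overhead (cooling, power conversion, etc.): {overhead_mw:.2f} MW")          # 0.48 MW
```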


    Custom network switches are also unique to the data center. The switches are one of the biggest parts of the new data center transformation. LinkedIn has designed its own 100-Gigabit switches and a scale-out data center network fabric for better performance. It went with 100G for future capacity needs, and currently splits the 100G into two 50G ports using the PSM4 optical interface standard. This is less expensive than using 40G optical interconnect, according to the company.

    It is also the first data center designed to enable the company to go from running on tens of thousands of servers to running on hundreds of thousands of servers.

    The other LinkedIn data centers, located in California, Texas, Virginia, and Singapore, will transition to the new hyperscale infrastructure gradually.

    “This is the most technologically advanced and highly efficient data center in our global portfolio, and includes a sustainable mechanical and electrical system that is now the benchmark for our future builds,” wrote Yamaguchi.

    Testament to the data center’s efficiency is the fact that Uptime Institute announced on Wednesday that it has awarded LinkedIn the Efficient IT (EIT) Stamp of Approval.

    “This award includes a complete evaluation of enterprise leadership, operations, and computing infrastructure designed to help organizations lower costs and increase efficiency, and leverage technology for good stewardship of corporate and environmental resources. It also certifies an organization’s sustainable leadership in IT, evidencing better control of how resources are both consumed and allocated,” according to an Uptime press release.


    10:30p
    Intel Sets FPGA Goal: Two Orders of Magnitude Faster than GPGPU by 2020

    Intel used to be capable of delivering process miniaturization and configuration improvements in a predictable, “tick-tock” cadence.  That was before the laws of physics stomped on the company’s best-laid plans for miniaturization.  So at a special event Thursday in San Francisco, Intel formally unveiled the hardware it expects will take over from CPUs the job of scaling the next big obstacle in performance improvement: a PCI Express-based FPGA accelerator aimed exclusively at deep learning applications, and a customized derivative of its “Knights Landing”-edition Xeon Phi processors.

    “I know that today AI has definitely become overhyped,” admitted Barry Davis, Intel’s general manager for its Accelerated Workload Group, in a press briefing prior to the event.  “But the reality is that it’s a rapidly growing workload — whether we’re talking about the enterprise, the cloud, everyone is figuring out how to take advantage of AI.”

    At least, that’s Intel’s best hope, as the company shoves all its chips into the center of the table, betting on hardware-driven acceleration to carry on the company’s founding tradition.

    What’s being called the Deep Learning Inference Accelerator (DLIA) puts to work the Arria 10 FPGA design that Intel acquired last year in its purchase of Altera.  DLIA is available for select customer testing today, subject to Intel approval, though the accelerator card will not be generally available until mid-2017.

    In a statement released Thursday, Intel Executive Vice President and Data Center Group General Manager Diane Bryant boasted, “Before the end of the decade, Intel will deliver a 100-fold increase in performance that will turbocharge the pace of innovation in the emerging deep learning space.”

    That statement came with the requisite footnote under “performance,” reminding readers that such displays of performance will be registered on the company’s benchmarks of choice (including SYSmark), and not necessarily in everyday practice.

    As originally conceived, Arria 10 would be put to work in reconfigurable communications equipment such as wireless routers, transceiver towers, and live HDTV video camera gear.  But Intel is leveraging it now as a caretaker for a convolutional neural network (CNN, albeit without James Earl Jones).

    Think of a CNN as a way that an algorithm can “squint” at an image, reducing its resolution selectively, and determining whether the result faintly resembles something it’s been trained to see beforehand.

    As Facebook research scientist Yangqing Jia, Caffe’s principal contributor, writes, “While the conventional definition of convolution in computer vision is usually just a single channel image convolved with a single filter (which is, actually, what Intel IPP’s [Integrated Performance Primitives] convolution means), in deep networks we often perform convolution with multiple input channels (the word is usually interchangeable with ‘depth’) and multiple output channels.”

    If you imagine a digital image as a product of several projections “multiplied” together, you see what Jia is getting at: an emerging picture of something an algorithm is training a system to recognize.
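    A minimal NumPy sketch of the multi-channel convolution Jia describes (purely illustrative; this is neither Caffe’s nor Intel’s implementation) looks like this:

```python
# Minimal, purely illustrative multi-channel convolution (not Caffe's implementation).
# Input: an image with C_in channels; filters: C_out filters, each of shape C_in x K x K.
import numpy as np

def conv2d(image, filters):
    c_in, h, w = image.shape
    c_out, _, k, _ = filters.shape
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for o in range(c_out):                           # one output channel per filter
        for y in range(h - k + 1):
            for x in range(w - k + 1):
                patch = image[:, y:y + k, x:x + k]   # all input channels at once
                out[o, y, x] = np.sum(patch * filters[o])
    return out

image = np.random.rand(3, 32, 32)     # 3 input channels (e.g., RGB)
filters = np.random.rand(8, 3, 5, 5)  # 8 output channels, 5x5 kernels
print(conv2d(image, filters).shape)   # (8, 28, 28)
```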

    Intel intends to position DLIA towards customers interested in tackling the big, emerging AI jobs: image recognition and fraud detection.

    “FPGAs are very useful for artificial intelligence,” Intel’s Davis told reporters.  “They’ve been used quite extensively for inference or scoring, to augment the Xeon CPU.  Today, 97 percent of scoring or inference is actually run on Intel Xeon processors.  But sometimes people do need a bit more.”

    Those people, specifically, are customers who are well aware of the purposes they have in mind, and are familiar with the algorithms they intend to use for those purposes, said Davis.  It was a surprising statement, given that the DLIA project has been described as geared toward an entry-level product for a broader range of AI users.

    Intel has already produced its own fork of the Caffe deep learning framework, and Davis said DLIA will be geared to accelerate that library.

    To be released at or about the same time as DLIA is a derivative of the “Knights Landing” generation of Intel’s Xeon Phi processors — its original effort to supplement CPU power.  Back in August, at the Intel Developer Forum, Diane Bryant spilled the beans on “Knights Mill,” a variant of the existing generation that’s optimized for AI workloads.  Specifically, the variant will focus on improving performance for mixed-precision operations, which can be critical when pairing high-precision arithmetic for accuracy with low-precision arithmetic for speed.
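    As a conceptual illustration of the mixed-precision pattern (NumPy on a CPU here, not Knights Mill internals), the usual trade-off is to keep operands in a low-precision format while carrying out the accumulation at higher precision:

```python
# Conceptual illustration of mixed precision (illustrative only, not Knights Mill internals).
# Operands live in float16 to halve memory traffic; upcasting before the product keeps
# accumulation error in check compared with carrying float16 all the way through.
import numpy as np

rng = np.random.default_rng(1)
a = rng.random((256, 256)).astype(np.float16)
b = rng.random((256, 256)).astype(np.float16)

c_all_low = a @ b                                         # float16 end to end
c_mixed   = a.astype(np.float32) @ b.astype(np.float32)   # accumulate in float32

# The gap between the two results is the rounding error the mixed approach avoids.
print(np.abs(c_mixed - c_all_low.astype(np.float32)).max())
```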

    At AI Day on Thursday, Intel promised a 4x performance boost for Knights Mill versus the previous generation of Xeon Phi (which would be “Knights Corner”), for deep learning workloads.  The company also plans to imbue the next generation of its mainline Xeon CPUs with its Advanced Vector Extensions (AVX-512) instruction set, potentially boosting the product line’s performance on floating-point operations run in parallel.

