Data Center Knowledge | News and analysis for the data center industry
 

Thursday, June 2nd, 2016

    2:00p
    Dell Designs Custom Liquid Cooling System for eBay Data Centers

    It’s no secret that water is a more efficient cooling medium than air for the mightiest of servers. Using liquid to carry heat away from computer chips is a common data center cooling method in the world of supercomputers, but today, as some internet-based services develop a more complex set of backend capabilities, such as Big Data analytics or machine learning, data centers that host them are taking cues from supercomputing facilities.

    One example is eBay. A special unit within Dell that makes custom tech for operators of the world’s largest data centers has designed a water-based system for cooling custom server chips developed together with Intel Corp. and eBay itself.

    The system is different from typical liquid cooling solutions, however. It brings water from the facility’s cooling towers directly to every chip inside server chassis. There are no central distribution units, which typically sit between cooling towers and server racks in liquid-cooled data centers.

    Another atypical aspect of the design, codenamed Triton, is its ability to use water that’s warmer than usual. The system, now deployed at one of eBay’s data centers, uses water that’s 33 degrees Celsius – because the customized CPUs run at very high frequency – but if Triton is used with lower-power CPUs, supply water temperature can be as high as 60 degrees Celsius, Austin Shelnutt, principal thermal engineer at Dell, said.

    Cranking up the CPU

    The eBay processor that needs all the cooling it can get is a modified version of the chips in Intel’s latest Xeon E5 v4 family. The highest-performing off-the-shelf part in the family has 22 cores and thermal design power (TDP) of 145W. eBay’s chip, modified to run at higher frequency, has 24 cores and TDP of 200W.

    Intel has been designing custom chips for hyperscale data center operators like eBay, Facebook, and Amazon for several years, and this business really ramped up starting about three years ago. Some of it was driven by cloud providers, such as Amazon, who wanted to launch more and more powerful types of cloud servers.

    Rack-Scale Architecture, 21-Inch Form Factor

    Triton is a rack-scale architecture, meaning all components that go into a rack, including servers, are designed holistically as a single system. This is a different architectural approach developed specifically for hyperscale data centers.

    Vendors have traditionally focused on designing individual self-sufficient boxes, be they servers, storage units, or network switches. In rack-scale architecture, basic resources, such as power, cooling, or network connectivity, can be shared among the nodes for efficiency, and components like memory cards or CPUs can be swapped out individually, without the need to replace entire servers.

    Triton uses 21-inch-wide server chassis, similar to the racks and chassis Facebook developed and open sourced through the Open Compute Project. It is not, however, an OCP design, Shelnutt pointed out.

    Facility Water Directly to Chip

    Water in the system travels from the cooling tower to the rack, and individual pipes bring it inside every chassis and to a cold plate that sits on top of every CPU. The only pumps in the system are the facility pumps that push water between the cooling towers and the building.

    The absence of the additional pumps found in other designs, the kind that sit between mechanical chillers and computer room air handlers in air-cooled data centers or feed central distribution units in traditional liquid-cooled data centers, makes for a very energy-efficient system. “Energy required to cool CPUs in the rack is zero,” Shelnutt said.

    Triton’s Power Usage Effectiveness, or PUE, is 1.06, according to internal analysis by Dell. That’s compared to the industry average data center PUE of 1.7, according to the most recently available data from a 2014 survey by the Uptime Institute.
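
    PUE is total facility power divided by the power delivered to IT equipment, so 1.06 implies roughly 6 percent overhead for cooling and power distribution, versus 70 percent overhead at the 1.7 industry average. Below is a minimal sketch of that arithmetic; the kilowatt figures are invented for illustration and are not Dell or eBay data.

        # Illustrative only: PUE = total facility power / IT equipment power.
        # The kW figures are made up for the example, not Dell or eBay measurements.
        def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
            """Power Usage Effectiveness: total facility power over IT power."""
            return total_facility_kw / it_equipment_kw

        it_load_kw = 1000.0              # hypothetical IT load
        print(pue(1060.0, it_load_kw))   # 1.06 -> ~6% overhead (Triton, per Dell's internal analysis)
        print(pue(1700.0, it_load_kw))   # 1.70 -> 70% overhead (2014 industry average cited above)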

    Dell addressed the common worry about bringing water into expensive electronics with “extreme testing” of the welded copper pipes, running high-pressure simulations in which it pumped water at more than 350 PSI. The system’s normal supply-water pressure is 70 PSI.

    Each server, chassis, and rack has a leak-detection mechanism and an emergency shut-off device. The system borrows a dripless disconnect design from military applications, according to Dell.

    Hyperscale Doesn’t Always Mean High Density

    Not all hyperscale data center operators shoot for high power density like eBay does. Some of them – Facebook, for example – prefer highly distributed low-density systems working in parallel. In an earlier interview, Jason Taylor, Facebook’s VP of infrastructure, told us that the average power density in the social network’s data centers is around 5.5kW per rack. It may have some servers that require 10kW or 12kW per rack and some that only take about 4.5kW, but the facilities are designed for low power density overall.

    “We definitely see it both ways,” Shelnutt said. “We have customers on both sides of that fence.”

    Where a company falls on this low-to-high-density continuum depends a lot on the nature of its application. If the application requires a lot of compute close to dense population centers, for example, the company may be inclined to go for higher density, because real estate costs or tax rates may be higher in those areas, he explained. There is a long list of factors that affects these design decisions, and “not every customer derives the same benefit from tweaking the same knobs.”

    Shelnutt’s colleague Jyeh Gan, director of product marketing and strategy for Dell’s Extreme Scale Infrastructure Unit, said their team is starting to see more demand for high-frequency, high-core-count CPUs and alternative cooling solutions among hyperscale data center operators.

    Shelnutt and Gan’s unit, whose name is abbreviated as ESI, was officially formed about six months ago. It’s tasked with developing custom data center solutions for hyperscale technology companies.

    Those customizations may be as simple as adding extra SSD slots to an existing server design or as involved as designing an entire data center cooling system, Gan said.

    But the unit’s focus is solely on the biggest of customers. Something like eBay’s Triton is not available to any customer off-the-shelf, but if a company with a big enough requirement wants it, the ESI unit is where it would turn.

    3:00p
    Delivering Effective Quality of Service

    Brandon Salmon works in the Office of the CTO for Tintri.

    When most folks in the data center think about Quality of Service (QoS), networking is most likely to come to mind. QoS typically refers to the capability of a network to provide better service to selected network traffic. But the term is increasingly being adopted by storage, where you can think of it as QoS for IOPS, a measure of performance.

    Why does QoS for IOPS matter? Businesses depend on applications, and applications depend on storage; they need sufficient performance from storage in order to operate smoothly. As a result, storage admins need to understand how QoS can help them guarantee application performance and even differentiate the services they offer to (internal and external) customers.

    LUNs and the Noisy Neighbor Problem

    You’ve probably heard of noisy neighbors—in fact you’ve probably dealt with them yourself. Why do noisy neighbors exist? LUNs.

    Conventional storage is built on LUNs, which made sense as the unit of storage management when workloads were physical and relatively few. But today more than 80 percent of workloads have been virtualized, so organizations are stuffing those same LUNs with tens or hundreds of virtual machines (VMs). Within a LUN, if a single VM goes rogue and starts demanding more than its share of performance, it can negatively affect the performance of other VMs in that same LUN. It’s a noisy neighbor. Even worse, you’ll only see that the LUN is behaving badly; you won’t know which resident VM is the real troublemaker.

    Fortunately, there’s an alternative—just move out of the LUN neighborhood. More organizations are turning to VM-aware storage (VAS), which uses individual VMs as the unit of management. There are no LUNs, and so there are no neighbors. If an individual VM goes wrong, it doesn’t affect any other VMs on the VAS storage platform.

    Band-aids for Bottlenecks

    You can eliminate the conflict over resources, or you can simply increase the performance resources available. That’s one of the reasons for the explosion in all-flash storage; organizations are throwing more and more all-flash at their performance problems.

    But all-flash alone is not enough—it’s a band-aid. It postpones having to deal with the underlying problem (LUNs), and you have to apply more and more over time. It’s easy to see how costs can spiral out of control.

    Now, some storage providers tout QoS despite having a LUN-based architecture, but that’s not a solution either. You can only set QoS for an entire LUN. If a VM within that LUN goes rogue, all you can do is assign the entire LUN even more performance. Since you can’t see which specific VM is causing problems, you’re just throwing performance resources at the LUN, not addressing the root cause.

    Use Cases for VM-level QoS

    With VM-aware storage you have visibility into every VM, which means you can take action when behavior changes: you can set the minimum and maximum QoS for IOPS on any individual VM. For example:

    When a VM Goes Rogue …

    On occasion a VM will start misbehaving. Maybe it’s expected (a finance server at the end of the month), or perhaps it’s not (a print server that goes awry). Either way, with VM-level QoS, you can set a maximum ceiling for IOPS. To keep things contained, clamp that ceiling down.


    When a VM is Mission Critical …

    If you’ve got a mission-critical VM that must get sufficient performance, you need the ability to raise its minimum IOPS to a set level. You’ve then used QoS to guarantee performance for that VM.

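    To make the two scenarios above concrete, here is a minimal sketch of what per-VM IOPS settings could look like. The class, field, and VM names are hypothetical rather than any particular vendor’s API; the point is simply that the floor and the ceiling attach to individual VMs, not to a LUN.

        # Hypothetical per-VM QoS settings; names and numbers are illustrative,
        # not Tintri's (or any other vendor's) actual API.
        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class VmQosPolicy:
            vm_name: str
            min_iops: Optional[int] = None  # guaranteed floor for a mission-critical VM
            max_iops: Optional[int] = None  # ceiling to contain a rogue, noisy VM

        policies = [
            VmQosPolicy("print-server-01", max_iops=500),    # clamp the misbehaving print server
            VmQosPolicy("finance-db-01", min_iops=10_000),   # guarantee the month-end finance server
        ]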

    When You Want to Differentiate Tiers of Service …

    And when you’ve got VM-level QoS you can even create multiple tiers of service on a single platform. That’s incredibly difficult with LUNs, since the residents might be a mix of mission critical and less critical VMs. In the past, enterprises and service providers typically bought multiple storage devices, with some dedicated to “gold” applications, others to “silver” applications and so on. The device itself was the dividing line. But with VM-aware storage you can establish gold, silver, bronze and/or other tiers on one device, and then assign each VM to a tier.
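
    As a rough sketch of such tiering, assuming invented IOPS figures rather than anything prescribed here, each tier can simply be a preset pair of per-VM floor and ceiling values, and each VM on the device is assigned to one of them.

        # Hypothetical service tiers expressed as per-VM IOPS floors and ceilings.
        # The figures are invented for illustration only.
        TIERS = {
            "gold":   {"min_iops": 10_000, "max_iops": None},   # guaranteed floor, no cap
            "silver": {"min_iops": 2_000,  "max_iops": 20_000},
            "bronze": {"min_iops": None,   "max_iops": 5_000},   # best effort, capped
        }

        # Assign VMs to tiers on a single storage platform:
        vm_tiers = {"erp-db-01": "gold", "build-agent-12": "bronze"}

        for vm, tier in vm_tiers.items():
            print(vm, tier, TIERS[tier])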

    Importantly, in any of the above scenarios you receive immediate, visual feedback. You know whether your changes are causing any contention or latency, and exactly where it is coming from. That way you know your actions are having the intended effect.

    To guarantee the performance of your virtualized applications, you need more than all-flash alone; you need all-flash with VM-aware storage capabilities. That way you can add per-VM QoS to your toolbox and rapidly fix (or prevent) the problems that might otherwise plague performance.

    Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.

    4:57p
    ARM Expects to Challenge Intel for Server Customers

    (Bloomberg) — ARM Holdings, the British chip designer whose technology is found in most smartphones, says it can become a genuine challenger to Intel Corp. for server customers from next year even though it has less than 1 percent of that market.

    While the challenge to Intel’s dominance in servers is “in its infancy,” ARM-based systems are being trialed by global cloud operators, CEO Simon Segars said at the Computex trade show in Taipei. The company is targeting a quarter of the market by 2020.

    “We’re expecting that to turn into real production deployments,” Segars said in an interview. “Over the next couple of years, 2017 to 2018, we’ll start to see some of those deployments.”

    ARM has been transformed by the mobile boom, evolving from a small lab in a converted barn to a company whose designs are found in 95 percent of smartphones. With the mobile phone market slowing, ARM is adding new customers in the automotive industry and targeting growth in processors for network equipment makers and servers.

    Segars said public cloud providers, such as Amazon Web Services, were a key target. Alibaba Group Holding is working with ARM customer Nvidia Corp. as part of a $1 billion bet on cloud computing that could make it one of Asia’s biggest providers of hosted storage and web services.

    ARM sees the global smartphone market growing 6 percent annually until 2020 thanks to demand for mid-range devices. That compares with International Data Corp., which this week cut its forecast for 2016 shipment growth to 3.1 percent, in what it calls a “substantial slowdown.” The market grew 10.5 percent in 2015 and about 8 percent the year before.

    Segars said seasonal changes were expected but that his company remained convinced the market would continue to expand. He predicted that 100 million more smartphones will be shipped this year compared with 2015.

    “At various points in our history we’ve thought, OK, give it a couple of years and the ratio will be 20 percent mobile and 80 percent non-mobile, but mobile has just kept growing,” Segars said in an interview. “Based on that history, I don’t write off smartphones at all.”

    ARM focuses on chip design and generates royalties when companies adopt its architecture. That can include parts of the chip that perform specific functions, such as communications, or the whole semiconductor itself.

    As smartphones took off, so did ARM’s share price. The stock has surged almost 700 percent since the start of 2007, the year Apple introduced the iPhone.

    Segars said he hasn’t heard of any proposal for Apple to move to a longer release cycle for the iPhone.

    Apple may move to a full-model change every three years, instead of the current two, Nikkei reported last month, without saying where it got the information.

    5:14p
    Google Hires Box Chief of Engineering for Cloud Role

    (Bloomberg) — Google has hired Sam Schillace, senior VP of engineering for Box, to help run engineering within its cloud division.

    Schillace is a Google veteran who previously helped oversee engineering for the company’s Docs, Gmail, Calendar, Reader, and other products. He will work on Google’s cloud products, reporting to Diane Greene, said a person familiar with the matter.

    Box spokesman Denis Roy confirmed Schillace’s departure. Google spokesman Michael Moeschler confirmed Schillace’s hiring.

    See also: What Cloud and AI Mean for Google’s Data Center Strategy

    6:49p
    Infomart Expanding in Tight Silicon Valley Data Center Market

    Infomart Data Centers has kicked off an expansion project on its Silicon Valley data center site, expecting to bring an additional 6MW of capacity to a market where demand for data center space significantly outpaces available supply, the company announced this week.

    Technology companies have been taking up data center space in Silicon Valley quickly and in big chunks, while wholesale data center providers have taken a more careful, phased approach to expansion, in contrast with past practices. These dynamics have led to supply shortages and growing lease rates in the market.

    Some of the biggest data center leases in the Valley last year were signed by Microsoft, Uber, Alibaba, IBM SoftLayer, Google, VMware, Arista, and Amazon, according to the commercial real estate firm North American Data Centers.

    Read more: Who Leased the Most Data Center Space in 2015

    In its annual US data center market report released earlier this year, Jones Lang LaSalle, also a commercial real estate company, estimated demand in the San Francisco/Silicon Valley market exceeded 80MW when the report was published. JLL said it expected rates in the market to continue growing over the course of the year.

    Infomart is one of few companies building additional data center capacity in Silicon Valley, where real estate is expensive and land that’s suitable for data center construction is scarce.

    The company that’s building more than others is Vantage Data Centers, which has announced two separate expansion projects – a 6MW and a 21MW one – on its Santa Clara campus. Another provider building in Silicon Valley is DuPont Fabros Technology, which is expanding capacity in Santa Clara by 16MW.

    Other major players expanding in the market are CoreSite and Equinix.

    The first phase of Infomart’s data center in San Jose is about 9MW. The company is now fitting out the second 50,000-square-foot building at the site.

    This was the original site operated by Fortune Data Centers, which merged with Dallas Infomart in 2014 to create the company that exists today. Infomart also operates data centers in Dallas, Portland, and Ashburn, Virginia.

