Data Center Knowledge | News and Analysis for the Data Center Industry
 

Thursday, January 7th, 2016

    3:00p
    Vantage Kicks Off 6 MW Data Center Construction in Silicon Valley

    Vantage Data Centers has kicked off construction of a fourth building on its Silicon Valley data center campus, expecting to bring an additional 6 MW of capacity to the supply-constrained market in the fourth quarter.

    The market is tight, and few providers have wholesale-size chunks of data center space available, while demand is high, driven by the Valley’s booming high-tech sector. Software companies, digital content providers, and cloud service providers, including Software-as-a-Service and Infrastructure-as-a-Service companies, are looking for data center capacity there, and Vantage is hoping to get some of those customers onto its Santa Clara campus.

    “We continue to see good demand, and we were fortunate to come up with a way to accommodate that,” Vantage CEO Sureel Choksi said.

    Another major multi-tenant data center construction project happening in the Valley is a 230,000-square-foot facility in Santa Clara by CoreSite, a major Vantage competitor. CoreSite is also building a 140,000-square-foot data center in the market, but that building is going to be occupied by a single customer, whose name CoreSite has not disclosed.

    Last year, Vantage brought 12 MW of available capacity to the market that was freed up by an existing customer that had overestimated its needs when it first signed the lease.

    The fourth building on Vantage’s campus will be adjacent to the existing V1 data center there. It will be a two-story facility with a design similar to V1’s.

    It will provide 200 watts per square foot on a raised floor and leverage airside economization, or free cooling.

    4:00p
    Why Working Sets May Be Working Against You

    Pete Koehler is an engineer at PernixData.

    Lack of visibility into how information is being used can be extremely problematic in any data center, resulting in poor application performance, excessive operational costs, and over-investment in infrastructure hardware and software.

    One of the biggest mysteries in modern-day data centers is the “working set,” which refers to the amount of data a process or workflow uses in a given time period. Many administrators find it hard to define, let alone understand and measure how working sets impact data center operations.

    Virtualization helps by providing an ideal control plane for visibility into working set behavior, but hypervisors tend to present data in ways that can be easily misinterpreted, which can actually create more problems than are solved.

    So how can data center administrators get the working set information they need in a manner that is most useful for proper planning, design, operations, and management?

    What Is It?

    For all practical purposes, a working set is the data most recently and frequently accessed from persistent storage. But that simple explanation leaves a handful of terms that are difficult to qualify and quantify. How recent is “recent”? Does “amount” mean reads, writes, or both? What happens when the same data is written over and over again?

    Determining a working set’s size helps administrators understand the behavior of workloads for better design, operations, and optimization. For the same reason administrators pay attention to compute and memory demands, it is also important to understand storage characteristics like working sets. Understanding and accurately calculating working sets can have a profound effect on the consistency of a data center. Have you ever heard about a real workload performing poorly, or inconsistently, on a tiered storage array, hybrid array, or hyper-converged environment? That is because all of these architectures are extremely sensitive to right-sizing of the caching layer, and failing to accurately account for the working set sizes of production workloads is a common cause of such issues.

    To explore this more, let’s review a few traits associated with working sets:

    • Working sets are driven by the workload, the applications driving the workload, and the virtual machines (VMs) on which they run. Whether the persistent storage is local, shared, or distributed really doesn’t matter from the perspective of how the VMs see it; the size will be largely the same.
    • Working sets always relate to a time period. However, it’s a continuum, with cycles in the data activity over time.
    • A working set comprises reads and writes. The amount of each is important to know, because reads and writes have different characteristics and demand different things from your storage system.
    • Working set size refers to an amount, or capacity. But how many I/Os it takes to make up that capacity will vary due to ever-changing block sizes.
    • Data access types may be different. Is one block read a thousand times, or are a thousand blocks read one at a time? Are the writes mostly overwriting existing data, or is it new data? This is part of what makes workloads so unique.
    • Working set sizes evolve and change as your workloads and data center change. Like everything else, they are not static.

    A simplified visual interpretation of the data activity that defines a working set might look like the figure below.

    [Figure: data activity over time that defines a working set]

    If a working set is always related to a period of time, then how can it ever be defined? A workload often has a period of activity followed by a period of rest. This is sometimes referred to as the “duty cycle.” A duty cycle might be the pattern that shows up after a day of activity on a mailbox server, an hour of batch processing on a SQL server, or 30 minutes of compiling code. Over a larger period of time, the duty cycles of a VM might look something like the figure below.

    [Figure: duty cycles of a VM over a longer period of time]

    Working sets can be defined at whatever time increment is desired, but the goal in calculating a working set should be to capture, at a minimum, one or more full duty cycles of each individual workload.
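    To make this concrete, below is a minimal sketch of how a working set could be estimated from an I/O trace. The record format (timestamp, block address, operation) and the fixed block size are simplifying assumptions made for illustration; real traces differ, and, as noted above, real block sizes vary.

```python
from collections import defaultdict

def working_set_sizes(io_trace, window_seconds=3600, block_size=4096):
    """Estimate working set size per time window from an I/O trace.

    io_trace yields (timestamp, block_address, op) tuples, where op is
    'read' or 'write'. This record format and the fixed block size are
    assumptions for this sketch, not a real trace format.
    """
    windows = defaultdict(lambda: {"read": set(), "write": set()})
    for timestamp, block, op in io_trace:
        windows[int(timestamp // window_seconds)][op].add(block)

    results = {}
    for window, ops in sorted(windows.items()):
        # Count unique blocks only: re-reading or overwriting the same
        # block grows I/O volume, but not the working set.
        unique = ops["read"] | ops["write"]
        results[window] = {
            "read_bytes": len(ops["read"]) * block_size,
            "write_bytes": len(ops["write"]) * block_size,
            "total_bytes": len(unique) * block_size,
        }
    return results
```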

    Classic Methods for Calculating Working Sets

    There are various ways administrators have attempted to measure working sets, all of which fall short for different reasons. These include:

    • Calculate working sets using known (but not very helpful) factors, such as IOPS over the course of a given time period. This is flawed, however, as it assumes one knows all of the various block sizes for that given workload, and that block sizes for a workload are consistent over time. It also assumes all reads and writes use the same block size, which is also not true. (A toy calculation after this list shows how far off this method can be.)
    • Measure working sets at the array, as a feature of the array’s caching layer. This attempt often fails because it measures at the wrong location. The array may know which blocks of data are commonly accessed, but it has no context about the VM or workload imparting the demand. Most of that intelligence about the data is lost the moment the data exits the host. Lack of VM awareness can even make an accurately guessed cache size on an array insufficient at times, due to cache pollution from noisy-neighbor VMs.
    • Take an incremental backup, and look at the amount of changed data. It seems logical, but this can be misleading because it will not account for data that is written over and over, nor does it account for reads. The incremental time period of the backup may also not be representative of the duty cycle of the workload.
    • Guesswork. You might see “recommendations” that say a certain percentage of your total storage capacity used is hot data, but this is a more formal way to admit that it’s nearly impossible to determine. Guess large enough, and the impact of being wrong will be smaller, but this has a number of technical and financial implications for data center design.
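    The toy calculation below, with invented numbers, illustrates the first bullet: an IOPS-based estimate assumes a single block size and ignores repeated access to the same blocks, so it matches neither the bytes actually transferred nor the true working set.

```python
# All figures here are made up for illustration.
AVG_IOPS = 2000
ASSUMED_BLOCK = 8 * 1024        # the single block size the method assumes
PERIOD = 3600                   # a one-hour duty cycle, in seconds

naive_estimate = AVG_IOPS * ASSUMED_BLOCK * PERIOD

# The same hour described more honestly: I/O counts per block size,
# plus the fact that most accesses hit blocks already touched.
actual_io_mix = {4096: 4_000_000, 65536: 1_000_000, 262144: 200_000}
unique_fraction = 0.15          # only 15% of accesses touch new blocks

actual_bytes = sum(size * count for size, count in actual_io_mix.items())
actual_working_set = actual_bytes * unique_fraction

print(f"naive estimate:     {naive_estimate / 2**30:.1f} GiB")
print(f"bytes transferred:  {actual_bytes / 2**30:.1f} GiB")
print(f"working set (est.): {actual_working_set / 2**30:.1f} GiB")
```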

    As you can see, these old strategies do not hold up well and still leave the administrator without a real answer. A data center architect deserves better when factoring this element into the design or optimization of an environment.

    A New Approach

    The hypervisor is the ideal control plane for measuring a lot of things. Take storage I/O latency as an example: what matters is not the latency a storage array advertises, but the latency the VM actually sees. So why not extend the functionality of the hypervisor kernel so that it provides insight into working set data on a per-VM basis?

    By understanding and presenting storage characteristics such as block sizes in a way never previously possible, you can understand, on a per-VM basis, the key elements necessary to calculate working set sizes. Furthermore, you can estimate working sets for each individual VM in a vSphere cluster, and/or estimate for VMs on a per-host basis.
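    Purely as an illustration (the function and inputs below are hypothetical, not any vendor’s API), per-VM estimates could be rolled up into per-host totals like this:

```python
from collections import defaultdict

def rollup_per_host(vm_working_sets, vm_to_host):
    """Aggregate per-VM working set estimates (in bytes) into per-host
    totals. Both inputs are plain dicts, assumed for this sketch; a
    hypervisor-level implementation would gather the estimates from
    kernel instrumentation rather than take them as arguments.
    """
    per_host = defaultdict(int)
    for vm, ws_bytes in vm_working_sets.items():
        per_host[vm_to_host[vm]] += ws_bytes
    return dict(per_host)

# Hypothetical example: per-host totals suggest how much flash or RAM
# each host needs to keep its VMs' working sets cached.
vm_ws = {"sql01": 80 * 2**30, "mail01": 120 * 2**30, "build01": 30 * 2**30}
vm_host = {"sql01": "esx-01", "mail01": "esx-01", "build01": "esx-02"}
print(rollup_per_host(vm_ws, vm_host))
```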

    Once working set sizes have been established, it opens a lot of doors for better design and optimization of an environment. Here are some examples of what can be achieved:

    • Properly size persistent storage in a storage array.
    • If using server-side storage acceleration, size the flash and/or RAM on a per-host basis correctly to maximize the offload of I/O from an array.
    • If replicating data to another data center, look at the writes committed in the working set estimate to gauge how much bandwidth you might need between sites (see the back-of-the-envelope sketch after this list).
    • Learn how much of a caching layer might be needed for hyper-converged environments.
    • Chargeback/showback. Working set data is one more way of identifying the heavy consumers in your environment, and would fit nicely into a chargeback/showback arrangement.
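    For the replication item above, a back-of-the-envelope sketch with invented numbers might look like this; the overwrite and compression factors are assumptions that depend entirely on the workload and the replication technology.

```python
# All inputs are invented for illustration.
write_ws_bytes = 40 * 2**30     # 40 GiB of unique data written per duty cycle
duty_cycle_seconds = 8 * 3600   # an eight-hour business day
overwrite_factor = 1.6          # some blocks are rewritten and shipped again
compression_ratio = 0.6         # WAN optimization shrinks the stream

bytes_on_wire = write_ws_bytes * overwrite_factor * compression_ratio
required_bps = bytes_on_wire * 8 / duty_cycle_seconds

print(f"average replication bandwidth: {required_bps / 1e6:.0f} Mbps")
```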

    Summary

    Understanding and accurately accounting for working set sizes can make the difference between a successfully designed, implemented, and operated data center and an environment that leaves you with erratic performance and dissatisfied application owners and users. Accommodating working set sizes correctly will not only help with predictable application delivery, but may also yield significant cost savings by avoiding overprovisioning of data center resources.

    Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.

    7:34p
    Amazon Launches Its First Cloud Data Centers in Korea

    Promising to reduce cloud latency for its Korean customers, Amazon Web Services has launched several cloud data centers in the country, establishing a fifth availability region in Asia Pacific. The other four are Singapore, Beijing, Tokyo, and Sydney.

    The company didn’t specify how many data centers the new region consisted of or where exactly they were in Korea. They’re likely in or just outside of Seoul, since it’s called the Seoul region. The region currently has two availability zones, and each zone usually consists of one or more data centers.

    Amazon said existing customers who are either based in Korea or do business in the Korean market had requested that the provider launch physical data centers there. Because of latency and, in some cases, data-sovereignty requirements, providing cloud infrastructure services globally has become a race to expand the geographic reach of physical infrastructure, and only a few players have the resources to participate.

    So far, Amazon and Microsoft have been the two main contenders in the race. Google, considered to be the third cloud giant, doesn’t have nearly as much of its global data center capacity dedicated to its cloud infrastructure services, which may start to change this year.

    IBM, following its acquisition of data center service provider SoftLayer, went on a global cloud data center expansion push last year and the year before. Many others, such as HP and Dell, have dropped out of the race, while big telcos, including CenturyLink, Verizon, and AT&T, are reassessing their future in the cloud and data center services market, exploring alternatives to owning the massive data center portfolios they built out in recent years to chase the cloud opportunity.

    The Seoul region brings Amazon’s cloud to 32 availability zones across 12 regions. The company is promising to bring online nine more availability zones in four regions (China, India, Ohio, and the UK) this year.

    In the announcement, the cloud giant touted two eager Korean customers: a gaming company called Nexon and an asset-management firm called Mirae Asset Global Investments Group.

    The former said cloud infrastructure allows it to test new video games for market traction before it commits a lot of money to the data center infrastructure needed to support them. The latter has already moved all of its web properties from on-prem data centers to AWS. Now that there are AWS data centers in Korea, it is considering the cloud for more mission-critical workloads.

    8:06p
    Shaw to Offer Microsoft Cloud Out of ViaWest Data Centers

    Canadian telecom Shaw Communications has partnered with Microsoft to make the latter’s Azure cloud services available out of all of its data centers, including the ones operated by ViaWest, the Greenwood Village, Colorado-based data center provider Shaw acquired in 2014.

    The partnership, announced Thursday, is aimed at hybrid IT deployments, where customers combine their own dedicated servers hosted at colocation data centers with public cloud services. Hybrid is generally said to be the model enterprises prefer today, giving them the control of having their most valuable and critical data and applications on their own servers while leveraging the flexibility of public cloud where appropriate.

    Shaw also announced the launch of a new data center in Calgary. The purpose-built facility has 40,000 square feet of raised floor, the company said.

    Microsoft cloud will be available out of 30 Shaw and ViaWest data centers in the US and Canada.

    Shaw has been expanding its technological capabilities and data center capacity. Last year it acquired Applied Trust, a consulting company that specializes in infrastructure, security, compliance, and DevOps.

    It also launched a new data center in the Portland market and acquired INetU, a data center provider, an acquisition that expanded its footprint to the US East Coast and Europe.

    11:24p
    Stream Sells Dallas Data Center to Zayo

    Zayo Group, the network connectivity and data center services provider, has acquired a 36,000-square-foot Dallas data center from Stream Data Centers, doubling its footprint in one of the hottest data center markets in the US.

    The company said it has seen accelerating demand for data center and interconnection services in Dallas and expects at least a 40-percent return on its investment in the facility. It did not, however, disclose the size of the investment.

    This is the second data center Stream has sold. Last year, financial services company TD Ameritrade bought the company’s data center in nearby Richardson, considered to be part of the same Dallas-Fort Worth market.

    Stream has three more data centers in the Dallas market, as well as in Austin, Houston, and San Antonio. In addition to Texas, it has data centers in Silicon Valley, Denver, and Minneapolis.

    Zayo leverages its network infrastructure assets to attract data center customers. Its new Dallas data center will provide access to its fiber backbone in the region, which spans more than 3,500 miles.

    It will also link directly to Zayo Points of Presence in the area: one at the Dallas Infomart and another at the Digital Realty Trust facility at 2323 Bryan Street – both key data center and network interconnection hubs in the Dallas market.

    The first week of the year has proven to be a week of data centers either changing hands or trying to. On Tuesday, DuPont Fabros Technology announced it was selling its New Jersey data center, and on Wednesday, GI Partners said it acquired a data center, also in the Dallas market, occupied by ViaWest.

    Also on Tuesday, Reuters reported that Verizon had kicked off the process to auction about $2.5 billion worth of data center assets.

