Data Center Knowledge | News and analysis for the data center industry

Monday, June 9th, 2014

    12:00p
    Mid-Year Cloud Jobs Update: Get Noticed, Get Your Head in the Cloud

    We’re well into June already and the cloud world continues to speed up. New types of delivery models, improved optimizations and better infrastructure convergence mechanisms are all impacting how we utilize the modern data center. Through it all – the engineer, architect and IT professional must continue to evolve alongside technology and the business organization.

    So what’s new? Over the past few months, a few new types of platforms and technologies have been creating increased demand from evolving businesses. If you’re in the cloud job world – here are a few things to keep an eye on:

    • Security. This is a growing area of expertise. A variety of healthcare breaches, the Target incident and, more recently, Heartbleed have all created strong demand for good security professionals. More than anything – these professionals must be fluent in the language of cloud and ecosystem interoperability, and there is a growing need for the ever-effective whitehat security professional. New types of advanced persistent threats aimed at a variety of logical and physical resources have created a new dynamic in the security industry. If you’re an IT professional, it’s time to take a look at the many security concepts that are currently impacting your environment.
    • Application interconnectivity. Cloud computing and the modern data center are already interconnected. Now, applications and the data that supports them are becoming even more distributed. Cloud and application professionals are finding new ways to deliver rich content to a variety of users and end-points. These apps must be agile and secure, and must always keep the user’s performance in mind. There is a shift happening around mobility and a movement away from the modern desktop toward the world of cloud computing and application delivery. HTML5 and new types of web delivery methodologies allow us to interact with online applications far more than ever before.
    • Mobility architecture. The topic of mobility is not going anywhere. Within just a few years there will be more cloud-connected devices than there are people on this planet. All of the data, applications, workloads, and delivery methodologies around mobility have to be planned out and architected. Here’s the important point: this isn’t like deploying a desktop or a laptop. Mobility architecture is a new breed of end-point computing. It revolves around a new type of user and supports a new type of organization — a next-generation one. Applying aging concepts like desktop or end-point deployment simply won’t cut it. As a cloud professional, begin to examine the implications of mobility and how it can all be optimized. Organizations are actively jumping on the mobility bandwagon, mainly because of the demands from the end-user.
    • High-performance computing. So here’s a bit of a curve ball. Modern enterprises are finding new ways to apply data research to create powerful quantitative results. The high-performance computing community is gaining a lot of traction. Even large shops like VMware are now actively looking at ways to optimize and virtualize HPC workloads. Scalability and the capability to process critical web-based workloads have become necessary components for many modern enterprises. Prediction engines – ones that predict geological events, for example – are being built on HPC platforms. This is certainly an interesting area to explore.

    There is no doubt that the cloud environment is going to facilitate new types of careers and opportunities. Organizations are now creating their entire business model around the capabilities of their data center and their cloud, which means the modern cloud engineer is becoming a critical piece of the entire organization. Remember, it’s not all about the techie side of things. It’s important to understand and articulate how modern technologies can positively impact your business, which means discussing benefits with key business stakeholders. Ultimately – this will help you get noticed in your organization as well as help keep both your business and IT infrastructure proactive in the industry.

    12:00p
    Intel Tackles Water Supply Problems With Big Data

    With the majority of the Big Data conversation focused on solving business problems, it is easy to lose sight of the fact that the ultimate opportunity of Big Data is solving big problems — problems that affect our entire society. One of them is hunger, and the mix of Big Data and agriculture has the potential to arm companies and governments to tackle it more effectively than has ever been possible.

    One of the companies looking at ways food-supply problems can be addressed with Big Data is Intel. The company currently has two research projects aimed at using Big Data to solve the world’s food and farming challenges. One looks at irrigation and the other at snow mapping in the Sierras.

    The two initiatives are part of a wider program at Intel to apply Big Data to a range of specific problems. Its short-term goal is to drive research insights, while the long-term aim is creating reference architectures that can apply across a variety of industries and drive value.

    “We see these in terms of grand challenges and tough problems,” Vin Sharma, Intel’s director of planning and marketing for Hadoop, said. “We want to push the boundary. We do see commercial value, but it doesn’t have to be immediate.”

    The commercial value for Intel can be enormous in the long term. Every industry is looking at ways to use Big Data, different verticals at different stages. “With Big Data, we’re starting to see a secular movement across all industries,” Sharma said.

    If Intel creates a platform and reference architecture that works across all verticals, it will guarantee itself a big piece of the action, whatever it may be in the future. “We provide the platform and components, but we also talk to the end users to understand their demand patterns. It’s a good mix of things. By being able to solve problems, document, create a reference architecture, we’re able to use that as reference to replicate.”

    Big Data in ‘Precision Farming’

    The irrigation project is called Precision Farming. Intel is working with University of California, Davis, to perfect irrigation techniques by placing sensors in crops and other strategic areas to monitor soil and air moisture levels. “The rate of water supply to the crop is [currently] determined on an ad-hoc basis,” Sharma said. “The notion is that there is a lot of waste because of over-provisioning. You instrument better and you tie that back to the supply of water and reduce waste.”

    He said this approach could reduce the amount of water used for irrigation by as much as 50 percent.
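
    To make the idea concrete, here is a minimal, purely illustrative sketch of sensor-driven irrigation versus a fixed ad-hoc schedule. It is not Intel’s or UC Davis’s actual system; the target moisture level, conversion factor and readings are all invented for the example.

        # Hypothetical sketch: irrigate based on sensor readings instead of a fixed schedule.
        FIXED_RATE_MM = 10.0     # water applied per interval under an ad-hoc schedule (assumed)
        TARGET_MOISTURE = 35.0   # soil-moisture percentage to maintain (assumed)
        MM_PER_POINT = 0.8       # assumed water needed to raise moisture by one point

        def sensor_driven_mm(reading):
            """Apply only enough water to bring the field back up to the target level."""
            deficit = max(0.0, TARGET_MOISTURE - reading)
            return deficit * MM_PER_POINT

        readings = [40.2, 33.5, 36.8, 30.1, 38.9, 34.4]  # one soil-moisture reading per interval

        fixed_total = FIXED_RATE_MM * len(readings)
        sensor_total = sum(sensor_driven_mm(r) for r in readings)
        print("fixed schedule: %.1f mm, sensor-driven: %.1f mm, saved: %.0f%%"
              % (fixed_total, sensor_total, 100 * (1 - sensor_total / fixed_total)))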

    There are challenges to this beyond technology, however. “We sense there is concern within the farming community. Let’s say the farmer is renting the land. If they transfer the ownership of the land, who owns the data? We want to be part of the effort to democratize [it] as well.”

    Mapping snow to plan ahead

    The second project uses data from a lab that monitors the snow pack in the Sierra Nevada mountain range, correlating it with the size of California’s water supply. The goal is to create a database of images that will enable governments and farmers to predict drought conditions and plan accordingly.

    This is not a new problem. What is new is the amount of data available now that the lab uses remote sensing equipment instead of a stick in the ground to measure snow levels. The lab’s “snow-mapping” technique deduces snow coverage by rating reflectance of snow on a one-to-seven scale.
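
    As a rough illustration of the idea (not the lab’s actual algorithm), the sketch below treats each pixel’s one-to-seven reflectance rating as an indicator of snow and estimates fractional coverage for a small tile; the threshold separating snow from bare ground is an assumption made up for the example.

        # Illustrative only: estimate snow coverage from a grid of 1-to-7 reflectance ratings.
        SNOW_THRESHOLD = 4  # ratings at or above this value count as snow (hypothetical cutoff)

        tile = [
            [7, 6, 5, 2],
            [6, 6, 4, 1],
            [3, 5, 6, 7],
        ]

        pixels = [rating for row in tile for rating in row]
        snow_pixels = sum(1 for r in pixels if r >= SNOW_THRESHOLD)
        print("estimated snow coverage: %.0f%%" % (100.0 * snow_pixels / len(pixels)))  # 9 of 12 -> 75%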

    “It generates about two terabytes every 15 days, and the source files [are] 75 gigabytes a day,” Sharma said. “The challenge they want to address is make that data ‘queryable.’ We correct the files and load them into a database called EarthDB, based on work that Michael Stonebraker did.” Stonebraker is a famous computer scientist who specializes in databases. Intel’s Science and Technology Center funds the EarthDB project at the University of California, Santa Barbara.

    California a key state for tackling water issues

    It is relevant that these projects are being conducted in California. The state produces nearly half of U.S.-grown fruits, nuts and vegetables and is currently facing one of the worst droughts in recent years.

    Extreme weather is causing the price of some foods, such as lettuce and avocados, to skyrocket, and many predict that shortages of certain kinds of produce will soon become a fact of life.

    Benefit potential beyond Intel’s bottom line

    By tackling the big problems, Intel is investing in the future of all humans as well as its own. One of the immediate benefits for the company has been a quickly growing and varied pool of Big Data talent.

    “In the beginning we had one or two data scientists,” Sharma said. “Because of our engagements with a number of these, now we have a very interesting group with specialized expertise in healthcare, telco and now increasingly in food and agronomics.”

    Success of these efforts will extend far beyond Intel’s profits.

    12:30p
    How to Choose Intelligently Between Hybrid and Flash Storage Arrays

    Len Rosenthal is the vice president of marketing at Load DynamiX where he is responsible for corporate and product marketing. 

    Flash storage, or solid state drives (SSDs), is one of the most promising new technologies to affect data centers in decades. Like virtualization, flash storage will likely be deployed in every data center over the next decade. The performance, footprint, power and reliability benefits are too compelling. However, flash arrays come at a price, literally. As the importance of storage infrastructure has increased, so has the budget necessary to meet performance and capacity requirements. The cost of storage infrastructure can now consume up to 40 percent of an IT budget, and flash storage is not an inexpensive solution.

    Despite vendor claims, flash arrays can run as much as 3X-10X the price of spinning media (HDDs). Informed IT managers and architects know that the best solution to meet both application performance and budget demands will be a combination of the two technologies. The question remains: how do you know when and where to invest in each?

    Below are two ways that every storage architect can go about analyzing their current and future requirements to understand which workloads will benefit from flash storage and which will perform better with HDD or a hybrid solution.

    Characterize Your Application Workloads to Create a Workload Model

    One of the smartest ways to understand your storage deployment requirements is to have an accurate model that represents your current storage I/O profiles or workloads. The goal here is to enable the development of a realistic-enough workload model to compare different technologies, devices, configurations, and even software/firmware versions that would be deployed in your infrastructure.

    To effectively model these types of workloads, you’ll need to know the key storage traffic characteristics that have the biggest potential performance impact. For any deployment, it is critical to understand the peak workloads, specialized workloads such as backups and end-of-month/year patterns, and impactful events such as login/logout storms.

    There are some basic areas to consider when characterizing a workload; a minimal sketch of the resulting model appears after the list below.

    1. The description of the size, scope and configuration of the environment itself.
    2. The access patterns for how frequently and in what ways the data is accessed. Proper characterization of these access patterns will be different for file (NAS) and block (SAN) storage.
    3. The data types representative of the applications that use the storage, to understand how well pattern recognition operates in the environment.
    4. The load patterns over time. Load patterns determine how much demand can fluctuate over time, and understanding the environment differs for file and block storage, each of which has unique characteristics that must be understood to create an accurate workload model. To generate a real-world workload model, it is essential to understand how the key metrics vary over time: IOPS per NIC/HBA, IOPS per application, read and write IOPS, metadata IOPS, read, write and total bandwidth, data compressibility, and the number of open files.
    5. The basic command mix, whether data is accessed sequentially or randomly, the I/O sizes, any hotspots, and the compressibility and deduplicability of the stored data. This is critical for flash storage deployments as compression and inline deduplication facilities are essential to making flash storage affordable.
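
    As a rough illustration of what a captured workload model might contain, the characteristics above can be summarized in a simple structure like the one sketched here. The field names and sample values are invented for the example and do not reflect any particular modeling product.

        # Hypothetical structure summarizing the workload characteristics described above.
        from dataclasses import dataclass

        @dataclass
        class WorkloadModel:
            name: str               # e.g. "VDI login/boot storm"
            protocol: str           # "NAS" (file) or "SAN" (block)
            read_pct: float         # share of I/Os that are reads
            random_pct: float       # share of I/Os that are random rather than sequential
            io_size_kb: int         # dominant I/O size
            peak_iops: int          # peak demand observed over the capture window
            metadata_iops: int      # metadata operations (mostly relevant for NAS)
            compressibility: float  # fraction by which the data shrinks under compression
            dedupe_ratio: float     # e.g. 4.0 means 4:1 deduplication

        boot_storm = WorkloadModel(
            name="VDI login/boot storm", protocol="SAN", read_pct=0.8, random_pct=0.9,
            io_size_kb=4, peak_iops=120000, metadata_iops=0, compressibility=0.5,
            dedupe_ratio=4.0,
        )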

    A number of products and vendor-supplied tools exist to extract this information from storage devices or by observing network traffic. This information forms the foundation of a workload model that accurately characterizes workloads. The data is then input into a storage workload modeling solution.

    Running & Analyzing the Workload Models

    Once you have created an accurate representation of the workload model, the next step is to define the various scenarios to be evaluated. You can start by directly comparing identical workloads run against different vendors or different configurations. For example, most hybrid storage systems allow you to trade off the amount of installed flash versus HDDs. Doing runs via a load-generating appliance to compare latencies and throughput for a 5% flash / 95% HDD configuration versus a 20% flash / 80% HDD configuration usually produces surprising results.

    After you have determined which products and configurations to evaluate, you can then vary the access patterns, load patterns, and environment characteristics.  For example:

    1. What happens to performance during the log-in/boot storms?
    2. What happens during end-of-day or end-of-month processing?
    3. What if the file size distribution changes?
    4. What if the typical block size was changed from 4KB to 8KB?
    5. What if the command mix shifts to be more metadata intensive?
    6. What is the impact of a cache miss?
    7. What is the impact of compression and inline deduplication?

    All of these factors can be modeled and simulated in an automated fashion that allows direct comparisons of IOPS, throughput and latencies for each workload.  With such information, you will know the breaking points of any variation that could potentially impact response times.
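
    A minimal sketch of that kind of automated comparison is shown below. The run_workload function is a stand-in stub and every number it returns is invented; in practice the results would come from a load-generating appliance replaying the workload model against real arrays.

        # Illustrative scenario sweep: run the same workload model against different configurations.
        def run_workload(config, block_size_kb):
            # Stub results; real latency/IOPS figures would be measured, not computed like this.
            flash_pct = {"5% flash / 95% HDD": 5, "20% flash / 80% HDD": 20}[config]
            latency_ms = 8.0 / (1 + flash_pct / 10.0) * (block_size_kb / 4.0)
            iops = int(50000 * (1 + flash_pct / 25.0) / (block_size_kb / 4.0))
            return {"latency_ms": round(latency_ms, 2), "iops": iops}

        for config in ("5% flash / 95% HDD", "20% flash / 80% HDD"):
            for bs in (4, 8):  # "what if the typical block size changed from 4KB to 8KB?"
                result = run_workload(config, bs)
                print("%-22s @ %dKB: %7d IOPS, %5.2f ms"
                      % (config, bs, result["iops"], result["latency_ms"]))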

    Before deploying any flash or hybrid storage system, storage architects need a way to proactively identify when performance ceilings will be breached and to evaluate the technology and product options that best meet application workload requirements. Vendor-provided benchmarks are usually irrelevant, as they can’t show how flash storage will benefit your specific applications.

    Workload modeling, combined with load-generating appliances, is the most cost-effective way to make intelligent flash storage decisions and to align deployment decisions with specific performance requirements. A new breed of these solutions on the market can provide workload modeling, load generation and decision management in a single 2U chassis. These new technologies easily replace older in-house tools that require purchasing dozens or even hundreds of servers and countless man-hours to reproduce these workloads under various network conditions.

    Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.


    2:00p
    5 Great Reasons to Use Colored PDUs

    Let’s start this conversation with two very important points:

    • More than 90% of the data center operators responding to a recent survey reported that their data center had at least one unplanned outage in the past two years (Ponemon Institute).
    • The overwhelming majority of outages were attributed to human error.

    The variation in outage frequency among survey respondents reinforces the importance of a comprehensive management and operations program to ensure continuous data center availability. This eBook from Raritan describes where colored rack PDUs can directly help.

    Your data center will continue to be a critical piece of your business environment. As reliance on your IT platform grows – it will be important for you to find ways to maximize uptime and reduce outages due to human error. One of the best ways to control your physical data center platform is to optimally organize everything that makes the entire environment run.

    Download this eBook today to learn exactly how colored rack PDUs simplify the management process and increase uptime. There are a number of reasons administrators look at new types of rack PDUs for help. As the eBook outlines, 5 of these great reasons include:

    • Easily identify redundant power feeds to IT equipment
    • Make it easier for technicians working in your IT rack
    • Clearly identify your data center power chain
    • Identify different voltages in your data center
    • Reduce lighting requirements in your data center

    Little things can cause a server to go offline – like accidentally unplugging a power cord. Good rack PDU technologies, like those from Raritan, offer SecureLock (colored locking plug cords) to prevent cords from accidentally being disconnected from rack PDUs. By reducing the chances of human error within your data center, you keep the environment healthier and uptime higher. Furthermore, good PDU design allows administrators to better track physical data center resources.

    5:07p
    Docker Turns 1.0: Production-Ready Containers for Moving Apps Out

    Docker aims to solve perhaps the biggest problem in IT today: how to move your apps from one place to another without breaking them. The company’s answer is containerization, for which it provides a unique lightweight runtime solution.

    Docker, the open source project, has hit a major milestone just 15 months after its birth: version 1.0 has been released, meaning it’s ready for production deployments. Docker also announced enterprise support packages and is hosting its first, and already sold-out, DockerCon in San Francisco this week.

    The company provides enterprise training and support for the release. There are two tiers of support: a 24/7 premium tier and a standard package, which provides support during work hours.

    Docker is OS-level virtualization, a level up from hypervisor virtualization. It is an open platform for developers and sysadmins to build, ship and run distributed applications. It consists of two major pieces: the Docker Engine, the container standard that helps build, ship and run distributed applications built on Docker, and Docker Hub, a cloud-based service that acts as a repository for users’ content and workflows.

    Docker enables applications to be quickly assembled from components and eliminates the friction between environments.  As a result, IT can ship faster and run the same app, unchanged, on laptops, virtual machines running in an in-house data center, or the cloud.
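
    For readers new to the tool, a minimal, hypothetical example of that build-ship-run flow might look like the following: a short Dockerfile describes the image, and the resulting image runs unchanged wherever a Docker Engine is present. The application name and files are placeholders, not a recommended setup.

        # Dockerfile for a hypothetical example application
        FROM ubuntu:14.04
        # Install the runtime the app needs
        RUN apt-get update && apt-get install -y python
        # Add the application code into the image
        COPY app.py /app/app.py
        # Command the container runs when started
        CMD ["python", "/app/app.py"]

        # Build once, then run the same image on a laptop, a VM or a cloud host:
        #   docker build -t myorg/hello .
        #   docker run -d myorg/hello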

    Since its inception 15 months ago, the open source project has seen unprecedented community growth and adoption, including more than 2.75 million downloads, 95 percent of contributions coming from outside of Docker the company, more than 8,500 commits and 6,500 Docker-related projects on GitHub.

    A grassroots approach

    Scott Johnston, senior vice president of product at the firm, said app mobility problems have been “bedeviling” IT professionals for decades. “Unless you catch everything, there’s always something that breaks,” he said.

    Shipping an application to a server is hard, and there’s been tremendous investment on the part of tech giants to try and solve this issue. Docker took a grassroots-movement approach to it and has succeeded in involving a lot of community effort, including participation by IT software heavyweights.

    “We’re from the school of container technology which we simplified along with the packaging and shipping of the container,” said Johnston. “There’s been 15 months of rapid innovation. There’s over 440 contributors to the project, including lots of contributions from Red Hat, Google. A lot of rapid change is going on.”

    All the big cloud players already support or will announce support for Docker. Amazon’s cloud embeds Docker in its Linux and there’s Docker support for its Platform-as-a-Service offering Elastic Beanstalk. Rackspace and Google both support it with their Infrastructure-as-a-Service and PaaS products. Red Hat is embedding it in RHEL 7,  and the open-source PaaS community Cloud Foundry supports it as well.

    Problem close to founder’s heart

    How did such an old issue get solved in 15 months? Johnston said it was because Docker became an impartial aggregation point for tech minds to meet and solve the problem. “It also comes from founder and CTO Solomon’s [Hykes] background being close to this issue,” he said.

    “He started as system admin on the operations side.  He saw the same problem in the configuration management space. The genesis was: what if we took another view of the golden image – some people call it ‘golden tarballs’ because they frequently got out of date, etc. We asked ourselves, what if we made it a fine-grained control. It came from a very true place.”

    In a statement, Hykes said, “We’d also like to salute the many enterprises that ignored our statements about ‘production readiness’ and deployed Docker in prior releases. Your bravery (and unvarnished feedback) has been critical as well.”

    Docker touches on several of Data Center Knowledge’s cloud predictions for 2014: namely the one that said open source was moving from alternative to prime time, and the one that forecasted a take-off of container technology as an easy way to spin applications up and down.


    6:49p
    Mesosphere Raises $10.5M to Commercialize Twitter ‘Fail Whale’ Killer Mesos

    Mesosphere, a startup with technology that aims to centralize management of IT resources across multiple data centers and provider cloud environments, closed a $10.5 million Series A funding round led by Andreessen Horowitz.

    The company is commercializing Apache Mesos, which lets organizations manage a mix of distributed infrastructure resources like a single machine. Mesos powers Twitter’s private data centers and is widely credited for eliminating the “Fail Whale”, Twitter’s fail screen that was ingrained in the cultural zeitgeist during Twitter’s formative shaky-uptime years. Mesosphere will provide enterprise products based on the open source technology as well as commercial support.

    Making management of an entire data center seem like management of a single gigantic computer is widely touted as the next-generation approach to data center management. It would enable resource-starved organizations to scale their data centers the way tech giants such as Google or Twitter scale theirs.

    ‘The holy grail of cloud computing’

    “Managing your data center as if it’s a single computer is the holy grail of cloud computing, and Mesosphere actually delivers on that compelling vision,” Brad Silverberg, a Mesosphere investor and former top Microsoft executive who led the company’s Windows, Office and Internet platform business units, said. “Mesos is the first technology that actually executes the vision.”

    Mesosphere co-founder and CEO Florian Leibert said companies that try to scale as efficiently as Google usually get stuck because they do not have enough engineering muscle or orchestration and automation tools sophisticated enough to pull it off. “The most efficient path to cloud computing is to run on top of Mesos, and Mesosphere is unlocking that opportunity and making it more applicable for mainstream enterprise developer and ops teams.”

    Google has recently taken its data center management approach a step forward, bringing online a “neural network,” an artificial intelligence system that uses the massive amount of IT operations data generated by its data centers to make decisions that will ultimately drive even greater efficiency of its infrastructure.

    Mesos helps optimize utilization of data center resources, automate IT operations and make application deployment easier. Mesosphere aims to enable enterprises to scale applications across private data centers and clouds.

    Mesos provides developers with simple command-line and API access to compute clusters for deploying and scaling applications. It solves the base-layer “plumbing” problems required to build distributed applications and makes applications portable, so they can run on different cluster environments, including private or public cloud, without rewriting code.
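
    As a loose illustration of that developer-facing API, the snippet below sketches how an application might be deployed and then scaled through Marathon, the Mesosphere orchestration tool mentioned later in this article. The host name, app definition and field values are invented, and the exact endpoints are an assumption based on Marathon’s v2 REST API.

        # app.json -- a hypothetical Marathon application definition
        {
          "id": "hello-service",
          "cmd": "python -m SimpleHTTPServer 8000",
          "cpus": 0.25,
          "mem": 64,
          "instances": 2
        }

        # Deploy the app, then scale it to five instances (host and port are placeholders):
        curl -X POST -H "Content-Type: application/json" \
             -d @app.json http://marathon.example.com:8080/v2/apps
        curl -X PUT -H "Content-Type: application/json" \
             -d '{"instances": 5}' http://marathon.example.com:8080/v2/apps/hello-service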

    For operations teams, it abstracts and automates even difficult low-level tasks related to deploying and managing services, virtual machines and containers in scale-out cloud and data center environments. Mesosphere claims two- or three-fold resource utilization improvements and promises built-in fault tolerance, reduced system administration time and more predictable, reliable and efficient scaling.

    Examples of companies using Mesos include Airbnb and HubSpot, both of whom use it to manage their Amazon cloud infrastructures, as well as eBay, Netflix, OpenTable, PayPal and Shopify, among others.

    From Twitter and Airbnb to Mesosphere

    Some of the big-name users of Mesos were not only early adopters but also contributors to the project. Mesos started as an open source project at the University of California, Berkeley’s AMPLab, where it was created by then-graduate student Benjamin Hindman.

    Mesosphere’s Leibert founded his company after having helped Twitter and Airbnb stand up their Mesos environments.

    Twitter discovered the open source project, deployed Mesos in production and has been running its entire application infrastructure on top of it ever since, and Leibert was one of the architects that brought it to Twitter. He then went on to build a Mesos-based analytics infrastructure for Airbnb, the hyper-growth company that disrupted the hotel industry, before founding Mesosphere.

    “We’re using Mesos to manage cluster resources for most of our infrastructure,” said Brenden Matthews, distributed computing engineer at Airbnb. “We run Chronos, Storm and Hadoop on top of Mesos in order to process petabytes of data.”

    Mesosphere’s seed funding round included $2.25 million from Andreessen Horowitz, Kleiner Perkins, Foundation Capital and SV Angel. The company has built tools for making Mesos easier to use and deploy, including Marathon for large-scale orchestration and Deimos for Docker (a container solution) integration.

    11:33p
    DuPont Fabros Completes 9 MW in Santa Clara Using New Electrical Design

    DuPont Fabros Technology has completed a new phase at its Santa Clara, California, data center, the first project to use the company’s new electrical system design, which enables it  to deploy capacity in smaller chunks than it has traditionally done.

    The wholesale data center provider, whose biggest customers are Facebook, Microsoft, Yahoo and Rackspace, has made the change to more closely match the demand it is seeing. The move reflects a broader market shift away from the massive long-term data center lease contracts that have traditionally been bread and butter for developers like DFT and its competitor Digital Realty Trust.

    According to a January market report by Avison Young, a commercial real estate company, there was “significant downward pricing pressure” on data center providers throughout 2013, and that pressure was particularly seen in larger wholesale deals.

    DFT has been rethinking its data center design overall, including both power and cooling infrastructure. The ACC7 data center it is currently building in Ashburn, Virginia, is going to feature all of the developer’s latest design innovations.

    Traditional phase sliced

    The company announced that it has finished commissioning work on Phase IIA of its SC1 facility in Santa Clara on Monday.

    It is called Phase IIA precisely because the space that was previously referred to as Phase II has been split into smaller chunks, DFT spokesman Christopher Warnke said. Phase II as a whole is 18.2 MW, only half of which has now been commissioned.

    With the older design, DFT could not, for example, bring 4.5 MW of capacity online in a larger building and then bring another 4.5 MW online without disrupting power delivery to the capacity already in use, Warnke explained.

    There isn’t a static smallest capacity block the company can deploy, but a number of factors beyond client need determine what the size of that increment will be. The biggest are the kind of uninterruptible power supplies and generators used, which, together with DFT’s N+2 redundancy scheme, are the primary limiting factors.

    The smallest possible increment in Santa Clara was 4.55 MW, but the company deployed 9.1 MW because of demand. “It depends on where we see potential prospects, lease signing, etc.,” Warnke said. “We could go down to 4.55 MW in that facility if we wanted to.”

    Space in Phase IIA going quickly

    Three-quarters of the newly completed phase has been leased, and development of the 9.1 MW Phase IIB has commenced. “Pricing remains constant, and we’re seeing good demand out there,” Warnke said about the Silicon Valley market.

    He could not share the names of any of DFT’s tenants in Santa Clara, saying only that one of them was one of its top four customers (Facebook, Microsoft, Yahoo and Rackspace). According to the Avison Young report, however, Microsoft signed for 6 MW with DFT in Santa Clara in 2013. Another SC1 customer mentioned in the report is Dropbox, which last year signed for 1.5 MW.

    DFT plans to finish Phase IIB in March of 2015, at which point its entire SC1 data center will be built out, totaling 36.4 MW of available critical load across 176,000 square feet of computer-room space.

