Data Center Knowledge | News and analysis for the data center industry

Friday, June 26th, 2015

    12:00p
    Startup Translates All Data to Data Hadoop Can Use

    While popular, Hadoop is a notoriously difficult framework to deploy. Standing up a Hadoop cluster in a company data center is a complex and lengthy process.

    There is a market for helping companies do that, which is why so many startups and incumbent IT vendors have flooded the space in recent years.

    One startup, BlueData, founded by a pair of VMware alumni, is tackling one of the hardest problems: presenting data stored in a variety of different formats on a variety of different systems to Hadoop in a uniform way that Hadoop understands and doing it quickly.

    The need for enterprise analytics systems to process a wide variety of data is growing. More and more companies need to combine data from internal and external sources and process it all together, according to market research firm Gartner. The role of data generated by the Internet of Things is also growing in importance.

    Going against the orthodox notion that Hadoop should run on bare-metal servers, BlueData’s platform is based on OpenStack and uses KVM virtual machines. Convincing the market that you can stand up a Hadoop cluster using VMs and still get the performance right is one of the biggest hurdles in meetings with customers and investors, Jason Schroedl, BlueData’s VP of marketing, said.

    But they did manage to convince a handful of big-name customers, including Comcast, Orange, and Symantec, and a group of VCs who have pumped $19 million into the Mountain View, California-based startup since it was founded three years ago.

    “When we go to pitch our stuff, it’s always, ‘Why don’t I move to Amazon? How do you beat bare metal?’” the company’s co-founder and chief architect Tom Phelan, a 10-year VMware veteran, said.

    Translating Everything to HDFS in Real Time

    The pitch is simple. BlueData’s platform, called Epic, presents data from a set of disparate file systems or object stores as if it were coming from the Hadoop Distributed File System (HDFS). Using proprietary technology, it converts data to HDFS in real time and delivers it directly to VMs in a Hadoop cluster.
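
    To make the idea concrete, here is a minimal sketch of what “data presented as HDFS” looks like from the consuming side: a client reads an ordinary HDFS path, regardless of where the underlying bytes actually live. The host, port, and path below are placeholders rather than details of BlueData’s product, and the example uses the pyarrow library’s generic HDFS client, not anything Epic-specific.

    ```python
    # Hypothetical illustration: read a path through an HDFS-compatible endpoint.
    # Host, port, and path are placeholders; a Hadoop client (libhdfs) is required.
    from pyarrow import fs

    hdfs = fs.HadoopFileSystem(host="namenode.example.internal", port=8020)

    # A Hadoop or Spark job would address the same path; here we just peek at it.
    with hdfs.open_input_stream("/warehouse/events/part-00000.csv") as stream:
        first_chunk = stream.read(4096)  # read the first 4 KB as a smoke test

    print(len(first_chunk), "bytes read via the HDFS interface")
    ```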

    It supports a variety of Hadoop distributions, including numerous versions of Cloudera and Hortonworks. It supports both native Spark and Spark on top of Hadoop. Spark is an open source distributed-processing framework for real-time analytics.

    One big reason the system is so fast is BlueData’s own caching architecture. Its VMs are fast and lightweight, and it runs HDFS outside of the cluster.

    Customers access Epic through a RESTful API or through a user interface. The user selects the number of nodes in the cluster and a Hadoop application, and a few minutes later they have a virtual cluster ready to run their analytics job.

    Because the cluster is virtual, a user can stand it up temporarily, run the job, and dismantle it after, freeing up IT resources for something else. It can be used to generate infrequent reports, for example, or for testing applications that rely on Hadoop.
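
    The workflow described in the two preceding paragraphs maps naturally onto a small provisioning script. The sketch below is hypothetical: the endpoint, payload fields, and token are invented for illustration and are not BlueData’s documented Epic API; they simply show the create, run, and dismantle pattern that a RESTful interface like this enables.

    ```python
    # Hypothetical sketch of an ephemeral virtual Hadoop cluster via a REST API.
    # Endpoint, fields, and token are illustrative, not BlueData's actual API.
    import requests

    API = "https://epic.example.internal/api/v1"
    HEADERS = {"Authorization": "Bearer example-token"}

    # Request a five-node virtual cluster running a chosen Hadoop distribution.
    resp = requests.post(
        f"{API}/clusters",
        headers=HEADERS,
        json={"name": "quarterly-report", "nodes": 5, "distribution": "cdh-5.4"},
        timeout=30,
    )
    resp.raise_for_status()
    cluster_id = resp.json()["id"]

    # ... submit the analytics job against the new cluster here ...

    # Dismantle the cluster afterward so the underlying resources are freed.
    requests.delete(f"{API}/clusters/{cluster_id}", headers=HEADERS, timeout=30).raise_for_status()
    ```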

    Hadoop on AWS in Docker Containers

    BlueData focuses primarily on on-premises enterprise deployments, but the company recently introduced a cloud-based version of the platform called Epic Light. It runs in cloud VMs on Amazon Web Services and uses Docker containers.

    Cloud-based deployments are not really BlueData’s sweet spot, Phelan said. Epic cannot take advantage of the high performance it’s engineered for without backend access to the data center systems it runs on.

    The company introduced Epic Light mainly because Docker containers are so popular nowadays, but also because containers can potentially provide a performance improvement over using virtual machines when deployed on-prem. “We do pay a penalty for virtual machines,” Phelan said. “They have a CPU overhead.”

    Epic consumes more CPU cycles to run Hadoop jobs than bare-metal solutions, and there are customers that need to consolidate as many Hadoop workloads into as few CPU cycles as possible. “Containers are the best solution for them,” he said.

    There are also customers that simply don’t have dedicated bare-metal servers in their data centers to put Epic on. They can only allocate VMs, and you cannot run BlueData’s VMs inside other VMs. These customers have to use the container-based version too.

    Will Containers Replace VMs?

    Asked whether the full enterprise version of Epic can be entirely container-based, Phelan said it would not be feasible, at least today. “State-of-the-art containers still have security liability issues,” he said.

    Epic is a multi-tenant system, and Docker containers cannot really isolate applications from one another all that well when sharing a host. Secure separation between different users is a must for many enterprise customers, and there is “still a pretty good attack surface within containers,” Phelan said.

    Whatever form virtualization will ultimately take, companies have access to more data today than they have ever had, and many of them are anxious to put it to work. As enterprise use of data analytics continues to grow, so does the size of the opportunity for companies like BlueData.

    Companies look at Big Data as a way to grow business value, and the easier it is for their analytics engines to access wide pools of data, the more effective those engines will be.

    3:00p
    Five Real-World Tips for Creating Corporate Cloud Policies

    With cloud computing taking off at a very fast pace, some administrators are scrambling to jump into the technology. Unfortunately, many organizations purchase the right gear and deploy the right technologies but still neglect the corporate policy creation process.

    One big push for cloud computing has been the concept of “anytime, anywhere, and any device.” More often than not this means allowing users to bring their own devices while pulling data from one or several corporate locations. Although this can be a powerful solution, there are some key points to remember when working on cloud computing policy creation.

    Having worked on several cloud deployments, both large and small, I’ve seen some big success factors that help move the process along. The following five pieces of advice fit use cases ranging from simple cloud storage projects to entire cloud application and content delivery migrations. With that, if you’re working with the cloud, consider the following:

    1) Create Champions and Train Your Users

    A positive cloud experience often begins with the end user. This is why, when creating a BYOD or mobile cloud computing initiative, it’s important to train the user. Simple workshops, booklets, and training documentation can really help solidify a cloud deployment. Within large enterprises, executives will actually create and assign cloud champions. These may be people on temporary assignment from the cloud provider who are there full-time to assist with questions, comments, concerns, and even demos. One large organization set up several small kiosks to allow employees to test new devices, applications, and interfaces. Easing into the cloud this way helps create much more comfortable and accepting end users. Plus, post-deployment, you have users who are much more familiar with their new cloud environment. In turn, you see fewer help-desk calls and more user productivity.

    2) Design New Cloud and Device Policies

    Although a device accessing a cloud architecture may belong to the user, the data being delivered is still corporate-owned. This is where the separation of responsibilities and actions must take place. Users must still be aware of their actions when they are accessing internal company information. It’s important to create a policy that separates personal and corporate data. There are a number of ways this can be accomplished; creating secure application sandboxes on users’ devices is one way to lock down corporate data on non-corporate devices. When it comes to creating a new usage policy, a big recommendation is to actually explain what’s changing and how you, the organization, will be delivering data to user devices. This will help put users at ease when using personal devices: they’ll understand the boundaries and how the information actually gets there. Similarly, if the device is corporate-owned, show the user how usage policies are evolving and how this allows them to stay secure and still be productive.

    3) Allow Users to Get Their Own Devices

    One of the greatest strengths of cloud computing is that it can eliminate the need to manage the endpoint. Some estimates put the cost of managing a corporate desktop at between $3,000 and $5,000 over the life of the computer. Many organizations are creating stipend programs that allow users to purchase their own devices and take responsibility for the hardware, after which the company delivers the entire workload via the cloud. This applies to both traditional endpoints and mobile devices. Remember, the whole point of the cloud is to allow workloads to be delivered to any device, anywhere. This kind of agility can allow for significant cost savings and more productivity. However, successful stipend programs work very closely with user training. This is also where a “device champion” can help users understand options, use cases, and how to be most effective.

    4) It’s Not Free for All. Have Approved Devices

    Following on from the previous point, when creating a cloud policy it’s important to work around approved and tested devices. If BYOD is the plan, test out a specific set of devices that are known to work with the corporate workload. This should be done during the very initial stages of a cloud deployment. Are you creating an application delivery architecture? Are you designing a cloud app to be pushed down via a browser? Or maybe you’re creating a platform for desktop access and data delivery. The point is that different devices will behave differently given the workloads, and it’s important to test this out. One successful way to do so is through a proof of concept. Today, device makers are eager to lend gear to organizations for end-user testing, which means you can create small user test groups and really understand application and cloud behavior.

    5) Update General IT and Computer Policies

    Almost every organization has a computer usage policy. With cloud integration, it’s time for an update. Devices are no longer sitting on the LAN; they are now distributed around the world. The policy should have a subsection outlining usage requirements, considerations, and responsibilities aimed at both the user and the organization. Remember, this goes beyond a new HR handbook. The new cloud and computer usage policies should define geo-fencing, access to data that requires a VPN, and even how information is logged and monitored when delivered via the cloud. These policies extend to the users, your executive teams, and, very importantly, your IT teams. As critical as it is to get the end user on board, your IT team has to be completely in sync with a cloud deployment. Defining new computer usage and IT policies around the cloud helps assign roles, clarify new responsibilities, and explain how the changes impact the overall team.

    The reality is simple: how much control you have over your cloud environment will depend on the amount of time spent planning the deployment. There are many different verticals in the industry and lots of different ways to approach cloud policy creation. Still, the best piece of advice I can leave you with is this: do your absolute best to align the entire organization. Some of the most successful cloud projects I’ve been a part of involved every employee. Information was readily available and was translated for each user based on their specific role: basically a “how the cloud will empower you” sort of guidance. In turn, you create users, champions, and a productive force firing on all cylinders to push your cloud initiative forward.

    3:30p
    Friday Funny: Pick the Best Caption for “Hot Aisle”

    Hot aisle containment can be tasty!

    Here’s how it works: Diane Alber, the Arizona artist who created Kip and Gary, creates a cartoon, and we challenge our readers to submit the funniest, most clever caption they think will be a fit. Then we ask our readers to vote for the best submission and the winner receives a signed print of the cartoon.

    Congratulations to Michael S., whose caption for the “Colors” edition of Kip and Gary won the last contest with: “Crayola’s in the cloud now.”

    Several great submissions came in for last week’s cartoon: “Green Data Center” – now all we need is a winner. Help us out by submitting your vote below!

    Take Our Poll

    For previous cartoons on DCK, see our Humor Channel. And for more of Diane’s work, visit Kip and Gary’s website!

    4:00p
    Weekly DCIM News Roundup: June 26

    FNT Software and Future Facilities advance DCIM offerings, FieldView Solutions talks about the physical world meeting the virtual world in healthcare facilities, and Part 4 of our DCIM Decisions series talks about the challenges of implementing a DCIM solution in your data center.

    1. FNT Software Advances DCIM Solution. FNT Software announced release 10 of its FNT Command DCIM software. The new edition includes Graphic Center Technology, FNT Command Dashboard, EasySearch, Business Gateways, and FNT Command Mobile as well as enhanced usability, intelligence and analytics.
    2. Future Facilities Updates 6SigmaDCX Software. Future Facilities announced release 9.3 of its 6SigmaDCX suite. The new edition includes new features and speed improvements.
    3. DCIM in Healthcare: The Physical World Meets the Virtual World. FieldView Solutions CMO Sev Onyshkevych discusses the role DCIM tools play in helping healthcare IT organizations monitor and track physical environments, as well as examine and pinpoint potential failure points and efficiency gains.
    4. DCIM Implementation – the Challenges. Part 4 of a five-part series on the DCIM journey, about the challenges of retrofitting an operating data center, as well as some of the considerations for incorporating DCIM systems into a new design.
    6:06p
    Google Provides Beta Git Repository Hosting Service for Google Cloud Platform


    This article originally appeared at The WHIR

    Without issuing an official announcement, Google beta launched a private Git repository hosting service on Google Cloud Platform known as Cloud Source Repositories.

    According to a Google webpage spotted by VentureBeat, Cloud Source Repositories provides beta users free hosting of up to 500 MB of storage. It requires Git and the Google Cloud SDK. Using the Git command line tool, users can set up a Cloud Source Repository as a remote Git repository for their local repository. They can also connect a Cloud Source Repository to a hosted repository service such as GitHub or Bitbucket that will automatically sync.
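
    For readers unfamiliar with the workflow, the sketch below shows the general pattern of adding a hosted repository as a Git remote and pushing to it, wrapped in a small Python script. The remote URL format and remote name are placeholders, and the authentication step (handled via the Google Cloud SDK) is omitted; consult Google’s Cloud Source Repositories documentation for the exact values for your project.

    ```python
    # Illustrative sketch: register a hosted repository as a Git remote and push.
    # The URL below is a placeholder pattern, not a verified endpoint; gcloud-based
    # authentication must already be configured for the push to succeed.
    import subprocess

    def git(*args, cwd="."):
        """Run a git command in the given working directory and fail loudly on error."""
        subprocess.run(["git", *args], cwd=cwd, check=True)

    repo_dir = "/path/to/local/repo"  # an existing local Git repository
    remote_url = "https://source.developers.google.com/p/<project-id>/r/<repo-name>"

    git("remote", "add", "google", remote_url, cwd=repo_dir)  # name the hosted repo "google"
    git("push", "google", "master", cwd=repo_dir)             # mirror local history to it
    ```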

    Cloud Source Repositories is integrated with the Google Developers Console, providing a source code editor for viewing repository files and making quick edits. It also works with Google Cloud Debugger providing insight into Java applications running on Google Compute Engine and App Engine.

    As VentureBeat notes, it could be difficult for Google to overtake GitHub and Bitbucket, which are currently the most popular source code hosting providers. Cloud Source Repositories could be considered another tool in Google’s garage that lets developers get most of their needs satisfied without leaving the Google ecosystem.

    “Google Cloud Source Repositories provides a crucial part of our end-to-end cloud tooling story,” Google Cloud Platform product manager Chris Sells wrote in an email to VentureBeat. “By allowing you to manage your source in your cloud projects along with your other cloud resources, you’ve got a one-stop shop for everything you’re doing in Google Cloud Platform. The Cloud Source Repositories service provides a private Git repository that works with your existing tools while providing a high degree of replication and encryption to make sure that your code is as safe and secure as you’d expect from Google’s cloud infrastructure.”

    Indeed, building an ecosystem of tools that make it easier for developers to build and deploy applications on their own platform is becoming a major way for cloud providers to attract and keep developers in a competitive landscape. For instance, Amazon Web Services introduced its Git repository, CodeCommit, in November, and Microsoft provides the Visual Studio Online repository service for its Azure cloud.

    It is unclear what the pricing of Cloud Source Repositories will be once it’s generally released. GitHub’s freemium pricing model provides public repositories for free, but plans including private repositories start at $7 per month for individuals and $25 per month for organizations.

    Earlier this year, Google announced that it would be discontinuing its project hosting service Google Code, which was a more direct equivalent to GitHub and Bitbucket. In contrast to Google Code, Cloud Source Repositories’ integration with Google Cloud Platform could make it the default tool for developers in the Google ecosystem.

    This first ran at http://www.thewhir.com/web-hosting-news/google-provides-beta-git-repository-hosting-service-for-google-cloud-platform

    7:33p
    In Pennsylvania, Keystone Stacks Data Center Modules Three-High

    A company that converted a former Pennsylvania steel mill into a data center recently brought online the first set of data center modules in the facility.

    Taking an unusual approach to modular data center design, Keystone NAP is stacking custom shipping-container-like modules three-high. The approach takes advantage of the building’s unique physical characteristics and maximizes the use of real estate.

    “In order for us to best use the space, we wanted to have multiple levels … inside the building,” Shawn Carey, the company’s co-founder and senior VP of marketing, said.

    Modular data center design can mean several different things. It could mean using modular components that come together to form the building, which is the approach Facebook has used. It could also mean using modular electrical and mechanical infrastructure components.

    In Keystone’s case, modular means shipping entire container-like modules, manufactured at a remote factory, to the site for quick deployment. The value of this approach, used by enterprises and some data center providers, is in being able to deploy data center capacity quickly and in small increments instead of spending a lot of money upfront to build infrastructure that supports a lot of capacity, most of which may sit unused for years.

    Keystone collaborated on design of the modules with Schneider Electric, which manufactures and supplies them to the data center provider. Custom elements of the design include hot-aisle containment, high rack density (22 racks per module), and the ability to stack them on top of each other.

    Schneider delivered the first KeyBlock to the location earlier this year.

    The initial deployment consists of six modules, arranged as two adjacent stacks of three. All six are up and running in the facility in Fairless Hills, but the building can accommodate up to 100.

    That’s in addition to expansion space on the property, which can accommodate 50 more if necessary. “We’re continuing to innovate and evolve the design for each phase, and the size could shift smaller or bigger as we need to,” Carey said.

    All six modules that are already on site have been leased to customers, but he declined to name them. The customers are in the financial services, network services, and managed services businesses.

    Keystone NAP uses a 50-ton crane in the building to move the modules and customer equipment. Upper-level modules are accessible by stairs. (Photo: Keystone NAP)

    Keystone management decided to stack data center modules on top of each other because the building is long (1,000 feet) and fairly narrow (60 feet) but has high ceilings (60 feet). Stacking modules was an easier alternative to building out multiple stories inside the building.

    The building’s five-foot-thick foundation is strong enough to support that kind of weight. Besides the weight of the modules themselves, some customers, according to Carey, are bringing in extremely heavy custom high-performance computing gear.

    Power density and the building’s power capacity allow for high-density computing equipment. The modules support up to 400 watts per square foot, and the building’s current total power capacity is 32 MW, Carey said.
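
    A quick back-of-envelope calculation, using only the figures quoted in this article (not Keystone’s own analysis), shows how those numbers relate to the stacking decision:

    ```python
    # Back-of-envelope arithmetic based on figures quoted in the article.
    total_power_w = 32_000_000   # 32 MW of current total building power capacity
    density_w_per_sqft = 400     # maximum module design density

    supportable_area_sqft = total_power_w / density_w_per_sqft   # 80,000 sq ft
    footprint_sqft = 1_000 * 60                                  # 1,000 ft x 60 ft building

    print(f"Floor area the power budget covers at full density: {supportable_area_sqft:,.0f} sq ft")
    print(f"Single-story building footprint:                    {footprint_sqft:,.0f} sq ft")
    # At full density the power budget covers more area than a single floor offers,
    # which is one rough way to see why stacking modules vertically makes sense here.
    ```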

    Schneider manufactures the modules at a factory in North Carolina and delivers them on site for assembly. It takes about 90 days between the time a customer signs a contract with Keystone and the time they can start bringing in their IT equipment.

    There was a lot of skepticism in the data center industry about this kind of modular data center build-out when it started. But the use of prefabricated modular data centers has been growing steadily.

    A recently published report by Research and Markets estimated that the modular data center market will grow from $8.37 billion this year to $35.11 billion by 2020.

    Being able to deploy data center capacity quickly and matching the size of the deployment to the requirement at a certain period of time are big advantages in the capital-intensive data center business, and modular data centers provide both of those abilities.

    8:52p
    Data Center Design Today Has to Account for Constant Change

    While most legacy application workloads never go away, new ones get added all the time. That puts a lot of pressure on data center operators, which need to continue supporting legacy applications while addressing the needs of emerging ones that now often appear without much notice.

    At the Data Center World conference in National Harbor, Maryland, this September, Trevor Slade, product manager for IO, a provider of data center colocation services, will show how a modular approach to data center management enables IT organizations not only to “future-proof” their data centers, but also to achieve higher levels of IT agility.

    His session is titled "Strategies for Responding to Demands for Increased Capacity."

    Slade said one of the biggest challenges IT organizations now face is pressure on the data center facility created by the rise of hyperconverged systems. It’s now easier than ever to scale out systems.

    As those systems scale out, the amount of power and cooling required to support them changes considerably over time.

    Given varied levels of density inside the data center, IT organizations need to better isolate those workloads to make sure the needs of one set of applications don’t wind up impinging on another, Slade said.

    Over time, this approach also makes it simpler to absorb refreshes to the systems themselves.

    “You need to think about it as managing data centers within data centers,” he said. “That way [you] can future-proof from an operations standpoint.”

    Just as critical, future-proofing also needs to encompass making certain that the physical network can extend to external cloud services, which should really be treated as a logical part of the overall virtual data center environment.

    There’s no doubt that managing data centers is a more complex endeavor than ever. To rise to that challenge, the data center itself needs to be fundamentally designed from the ground up to absorb as much change as possible with the least amount of disruption to IT operations.

    For more information, sign up for Data Center World, which will convene in National Harbor, Maryland, on September 20-23, 2015, and attend Trevor Slade’s session, "Strategies for Responding to Demands for Increased Capacity."

