Data Center Knowledge | News and analysis for the data center industry
Wednesday, May 11th, 2016
Data Center Transformation Will Unfold in Four Steps
A multiplicity of trends is simultaneously altering our collective vision of what a data center is and what it is becoming, and those trends are not necessarily acting in concert. We thought software-defined networking would make it easier for data centers to stage workloads more efficiently on a Layer 3 that was more effectively decoupled from Layer 2. But then NFV came along, and suddenly telcos are introducing the rest of the world to a completely new way to envision the role of the data plane in SDN.
It’s not as easy to predict where data center technology is going when all these trends converge. At the OpenStack Summit in Austin, Texas, a few weeks ago, network functions virtualization stole the show. Attendance at sessions that had even the slightest relationship to NFV was as much as two orders of magnitude higher than at those dealing with ordinary OpenStack administration. IT professionals are curious as to whether this new methodology for workload orchestration will have any impact, directly or indirectly, on data center architecture.
NFV came about as a result of communications providers’ shared need to automate the provisioning of customer services deployed on commodity servers. Virtualization was essentially a means to an end; NFV’s initial goal was automation. What makes NFV attractive to data centers outside of telcos is that high-level automation. What makes it risky is the degree to which NFV would reshape data centers to make that automation feasible.
Read more: Telco Central Offices Get Second Life as Cloud Data Centers
The Four-Step Program
It would be technically inaccurate to say that Tom Nadeau wrote the definitive book on SDN, because he actually wrote or co-wrote several (with co-author Ken Gray). He’s currently busy completing a book on NFV, due out in August from Morgan Kaufmann (Elsevier). At Cisco, Nadeau’s achievements included serving as the principal architect of the MIBs for the MPLS protocol; at Juniper, he led the SDN development effort. Now with Brocade, he is the driving force behind VNF Manager, a commercial implementation of OpenStack’s Tacker component that stages virtual network functions on an NFV platform.
Related: Specialized Data Center Network Gear on Its Way Out
In an interview with Data Center Knowledge at the recent OpenStack Summit in Austin, he told us he disagreed with the opinion held by a number of OpenStack contributors that NFV will be a methodology confined mainly to telcos.
“If you step forward in the evolutionary progress of virtualization, Step 1 is what we’re doing with virtual machines,” said Nadeau. “OpenStack deploys a virtual machine, and there you have it. If you look at the cost model around that, it’s going to be difficult to make that cost-effective in the long run. Where you need to go is Step 2, which is containers; Step 3, microkernels; Step 4, Platform-as-a-Service.”
Arguably, the whole point of OpenStack is to enable data centers to deploy resources using a service model inspired by Amazon. Even internally within organizations, users should be able to provision the services and applications that apply to them through a self-service portal. From that perspective, Nadeau’s Step 4 (which will be fleshed out in his forthcoming book) was the goal all along.
But it’s Steps 2 and 3 that tell the full story here. For data centers to deliver services to their users the way modern telcos do, Nadeau plots a course that leads not only to Docker-style containerization but to minimal, bootstrap operating loaders, capable of provisioning and managing servers at startup until a full OS kernel is available. OpenStack’s microkernel (MK) is called Razor, which its proponents also describe as a “provisioning engine.” In practice, Razor enables OpenStack deployments on bare metal servers.
Coupled with a container environment, Razor changes the picture of the optimum physical server in a private cloud. Now it looks a lot more like a product of the Open Compute Project, the effort begun by Facebook and since joined by Microsoft and Google to socialize specifications for “plain vanilla” hardware in the data center.
Related: Equinix, AT&T, Verizon Join Facebook’s Open Source Data Center Project
Nadeau said VNF Manager will be developed with this trajectory in mind: weaning data centers off of reliance upon first-generation virtual machines such as the VMware variety, and toward a highly scalable NFV platform where heterogeneous workloads may co-exist. In such an environment, he said, it should not matter much to the admin where network functions are deployed.
“In fact, the more you cater to the enterprise environment, the more ubiquitous that’s going to be,” he continued. “I think enterprises, large and small, have a need for these things.”
This, Too, Shall PaaS
Regardless of their size, Nadeau believes, enterprises are driving a more cloud-native application development model. That drive is pressing all clouds, including the private variety championed by OpenStack, to adopt one path toward PaaS-style provisioning: a single path that works for everyone, including the largest customers.
“If you can be part of that heterogeneous PaaS model that has physical elements at the end, and maybe some virtualized functions and then applications running over a message bus in the middle — which is the enterprise application model — it all makes sense,” said Nadeau.
“Plain vanilla” OpenStack, consisting of just the open source components without any vendor support baked in, does enable containerization today, by way of a component called Magnum. In a containerized model, highly reduced virtual containers stored in an open source format rely upon the Linux kernel for virtualization, rather than upon a hypervisor. While this does represent the state of OpenStack today, Nadeau acknowledged, most industries that have adopted OpenStack remain at Step 1 of their journey to the PaaS model.
“They’re still using this model of what I call ‘aggregated virtual machines,’” he explained. Software-defined networks rely upon virtualized appliances such as virtual routers — which happen to be Brocade’s key products. But these vRouters, vSwitches, and vNICs are too often deployed within VMs, introducing tremendous overhead and making automated deployment difficult.
What’s more, such an environment is more difficult to scale up — by some accounts, more so than a physical environment, by virtue of all the automation instructions that have to be accounted for. With the more clearly defined NFV that Nadeau and his colleagues seek, virtual network appliances would lose the “appliance” motif — the notion that they’re effectively emulators for physical devices, floating blithely in a VM envelope.
Disaggregation
“As we progress down that roadmap,” Nadeau predicted, “you’ll see disaggregated VMs, and more and more disaggregation, until we get to that point at the end where you have just the nuggets of the functionality that you would need from that router, and maybe from somewhere else… database functions, for example, or analysis functions, and combining them together to create the service.”
Once we have more microkernels and microservices running in this NFV environment, we asked Nadeau, will we have inadvertently weaned ourselves from OpenStack as we have come to know it? Or will OpenStack take on a new mission?
“I think a lot of what exists today in OpenStack is fine and will be preserved,” he responded. “There are things that you need to wrap around OpenStack to augment it, to make it suitable for that microservices/PaaS model going forward. And a lot of guys are doing that today: Cloud Foundry, Open Baton — there’s a variety of these things happening. And there are people seeing that.
“What I see happening in this technology, there are service providers and enterprises, and they’re all coming this way,” Nadeau continued, locking his fingers together to illustrate his point. “I think a lot of what both sides of this coin want to do, eventually, is the same thing.”
Akamai Pledges to Source Renewable Energy for Data Centers
Akamai has well over 200,000 servers running in data centers spread across 126 countries. That’s the kind of distributed system you build if you want to be one of the world’s largest content delivery networks.
This week, the company announced it wants to get to a point where at least half of that infrastructure is powered by renewable energy. It has given itself four years to get there.
But how do you source renewable energy for a network that’s so spread out? A company like Google or Microsoft can commit to a long-term power purchase agreement with a developer of a 10MW wind farm on the same grid as one of its mega data centers, knowing that single facility will need the entire 10MW, if not more.
A company like Akamai doesn’t have a large concentration of servers in any single location. It takes only a little bit of capacity in each location from a variety of colocation providers, and most of the global colocation industry hasn’t prioritized renewable energy to power data centers.
“We don’t own any facilities and already pay the landlords for our electricity,” Nicola Peill-Moelter, Akamai’s senior director of environmental sustainability, wrote in a blog post. “Solar panels on roofs and wind-farm power purchase agreements for our individual facilities are not options for us.”
While the data center colocation industry as a whole can hardly be referred to as a major steward for renewable energy, some individual companies have recently made commitments to renewable energy of unprecedented scale.
Equinix, for example, has signed wind power deals it said would make its North American footprint 100 percent renewable. Digital Realty is offering its customers one year of premium-free renewable energy anywhere around the world where it has data centers. Las Vegas-based Switch has made big renewable energy commitments in Michigan and Nevada.
To reach its 50-percent renewable goal, Akamai will try out a method it hopes other companies with highly distributed infrastructure will be able to use too. The approach is called a Contract for Differences.
The way it works is that Akamai agrees to act as an “off-taker” for X amount of energy from a renewable generation developer over a long term at a fixed energy price. The generation facility will be located on the same grid as one of Akamai’s colocation data centers.
The developer sells the energy on the wholesale market, and the difference between the actual sale price and the fixed price Akamai has agreed to goes to Akamai as either credit or debit. Akamai continues to buy electricity from its utility provider but gets to keep and retire renewable energy credits it receives through its contract with the developer.
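To make the settlement mechanics concrete, here is a minimal sketch in Python with purely hypothetical numbers; the fixed price, market prices, and volumes below are illustrative, not Akamai’s actual terms:

```python
# Minimal sketch of contract-for-differences settlement; all numbers are
# hypothetical illustrations, not Akamai's actual terms.

def cfd_settlement(fixed_price, market_price, mwh):
    """Amount owed to the off-taker (positive = credit, negative = debit)."""
    return (market_price - fixed_price) * mwh

fixed = 35.0  # hypothetical fixed price, $/MWh, agreed with the developer

# Month 1: wholesale market clears at $42/MWh for 1,000 MWh -> $7,000 credit.
print(cfd_settlement(fixed, 42.0, 1000))   # 7000.0

# Month 2: market clears at $30/MWh -> $5,000 debit owed to the developer.
print(cfd_settlement(fixed, 30.0, 1000))   # -5000.0
```

Either way, the developer gets a predictable price for its output, and the off-taker’s exposure is limited to the gap between the fixed price and the wholesale market.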
This may seem like an elegant “finance-innovation” solution, but it gets messy. First, Akamai’s footprint doesn’t remain static. The network is constantly growing. Its traffic increased 20-fold over the last seven years, and it expects a 45-percent increase between now and 2020.
Second, financial instruments like Contract for Differences and renewable energy sources aren’t as readily available in places like India or Australia as they are in California, for example, Peill-Moelter wrote. But the company believes the situation will improve.
“No one said this would be easy, or that we would have all the answers at the start. That’s why it’s called a commitment,” she wrote.
Read more: Cleaning Up Data Center Power is Dirty Work
The Top Five Reasons Your Application Will Fail
Robert Haynes is a Networking Architecture Expert with F5 Networks.
Applications fail. Large applications. Small applications with the potential to be the next big thing. Applications with redundant infrastructures. And even applications in the cloud. Sometimes they fail suddenly, and sometimes they just can’t cope with demand.
When they do fail, it’s not long before you give your vendors a call. You are paying them for support, after all. We’ve been on the other side of that phone call many times because we have thousands of customers running every kind of app, in almost every infrastructure model you can think of.
Here are the top five causes of application failures we’ve seen over the years:
Mostly, it’s Human Error
Most failures are due to admin error. In fact, several of my colleagues put this as reasons 1-3 of their top 5. These errors can be simple mistakes, such as rebooting the production database cluster instead of the QA cluster. Or they can be systemic errors in the overarching architecture design, like synchronous storage mirroring without checkpoints – replicating database corruption to the DR site in real time. My advice: mitigate these risks through increased automation and testing. Your changes should be preconfigured, tested, and then executed in production with minimal opportunity for error. For this to work, it’s important that every component from your live environment is represented in the test environment. Fortunately, nearly all vendors now offer virtual versions of their components in a ‘lab’ edition. This allows you to create a test environment that behaves as close to your production environment as possible.
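One simple way to reduce the “wrong cluster” class of mistake is to have scripted changes refuse to run against production unless the operator confirms the target explicitly. A minimal sketch in Python; every name here is hypothetical rather than a specific tool:

```python
# Minimal sketch (all names hypothetical) of a guard that keeps a scripted,
# pre-tested change from being run against the wrong environment by accident.
import argparse
import sys

def run_change(target, confirm):
    """Apply a previously tested change set to the named environment."""
    if target == "production" and confirm != "production":
        # Force the operator to retype the environment name for production runs.
        sys.exit("Refusing to touch production without --confirm production")
    print("Applying tested change set to %s..." % target)
    # ... the actual, rehearsed change steps would go here ...

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run a preconfigured change")
    parser.add_argument("target", choices=["qa", "staging", "production"])
    parser.add_argument("--confirm", help="retype the environment name to confirm")
    args = parser.parse_args()
    run_change(args.target, args.confirm)
```

The point is not the specific guard but the habit: the riskier the target, the more deliberate the operator has to be.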
It Looks Like it’s Working, but it’s Not
Another common cause of failure is an application server failure or misbehavior that remains undetected by monitoring systems. Just because the application server responds to an ICMP ping or returns a “200 OK” to an HTTP request does not mean things are working properly. Monitoring and health-checking services must report application health accurately. For a web application, make sure that your health checks perform a realistic request and look for a valid response. Some organizations even create a specific monitor page that exercises critical application functions and returns a known-good response.
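As an illustration, here is a minimal Python health check along those lines; the /healthcheck URL and the marker string are hypothetical and would be whatever your own monitor page actually exposes:

```python
# Minimal sketch of an application-level health check, assuming a dedicated
# /healthcheck page (hypothetical URL) that exercises critical functions and
# returns a known marker string only when everything it touched succeeded.
import requests

def app_is_healthy(url="https://app.example.com/healthcheck", marker="DB_OK"):
    """Return True only if the page responds quickly AND contains the expected marker."""
    try:
        resp = requests.get(url, timeout=5)
    except requests.RequestException:
        return False
    # A 200 status alone is not enough; check that the page content is valid.
    return resp.status_code == 200 and marker in resp.text

if __name__ == "__main__":
    print("healthy" if app_is_healthy() else "UNHEALTHY")
```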
Capacity Planning Failures
Sure, the application worked in test and flew through user acceptance testing. What happens once it goes live and twice as many users as predicted turn up? An unusably slow application is effectively offline. In an ideal world, of course, we would test our production applications against the expected load and beyond. But testing applications at scale can be complex and expensive, and predicting application demand can be difficult. The best mitigation is to build application architectures that can scale reliably and rapidly. Fortunately, thanks largely to cloud computing, there are plenty of design patterns for applications that can scale horizontally to meet demand. Designing an application architecture built to scale from the start will help you respond rapidly to unexpected demand.
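Even when full-scale load testing is out of reach, a cheap smoke test can catch gross capacity problems before users do. A minimal sketch in Python, with a hypothetical endpoint and request counts; a dedicated load-testing tool is still the right answer for real capacity work:

```python
# Smoke-level load test sketch: fire N concurrent requests at a hypothetical
# endpoint and report the error count and slowest response time. Real capacity
# testing needs dedicated tooling; this only catches gross problems early.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://app.example.com/"   # hypothetical endpoint
REQUESTS = 200
CONCURRENCY = 20

def timed_get(_):
    start = time.monotonic()
    try:
        ok = requests.get(URL, timeout=10).status_code == 200
    except requests.RequestException:
        ok = False
    return ok, time.monotonic() - start

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    results = list(pool.map(timed_get, range(REQUESTS)))

errors = sum(1 for ok, _ in results if not ok)
slowest = max(elapsed for _, elapsed in results)
print(f"{errors}/{REQUESTS} requests failed; slowest response {slowest:.2f}s")
```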
With a Whimper, not a Bang
Some of the hardest application failures to detect don’t happen with a shower of sparks before the (virtual) lights go out. They occur over time, slowly building up until their effects become noticeable: memory leaks, connections held open, database cursors consumed. Because applications are now complicated, interconnected entities composed of many processes, finding the culprit can be tricky. Under pressure to fix the problem quickly, the old standby of “have you tried turning it off and on again?” can be a tempting fix. However, unless you have appropriate resource monitoring, you’re probably going to be back here soon. And if the application doesn’t restart cleanly, you might need to rely on your backup or DR procedures.
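Appropriate resource monitoring can be as simple as trending key counters over time and alerting when they only ever go up. A minimal Python sketch, assuming the psutil package and a hypothetical process ID and sampling window:

```python
# Resource-trend sketch: sample a process's memory use at intervals and warn
# if it only ever grows. Assumes the psutil package; the PID, sample count,
# and interval are hypothetical and would come from your monitoring setup.
import time

import psutil

PID = 1234                # hypothetical application server process
SAMPLES = 12              # e.g. one hour of samples at 5-minute intervals
INTERVAL_SECONDS = 300

proc = psutil.Process(PID)
history = []
for _ in range(SAMPLES):
    history.append(proc.memory_info().rss)
    time.sleep(INTERVAL_SECONDS)

# If every sample is higher than the one before it, memory never came back down.
if all(later > earlier for earlier, later in zip(history, history[1:])):
    print(f"WARNING: RSS grew steadily from {history[0]} to {history[-1]} bytes")
```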
When the DR, Backup, or Failover Doesn’t Work
Although you might scoff at the thought of backups and DR not being tested, it’s surprisingly common. Stories abound of backups silently failing for months, critical servers being missed from schedules, and, in an extreme example, DR equipment being ‘repurposed’ to meet another project’s deadlines. In an example I personally witnessed, this last case led to eight days of downtime during a critical business period. When the production database went down and stayed down, the operations team enacted the DR failover procedure. Except, half the DR site was now missing. The result: the company was unable to sign up any new users, leading to a significant financial impact on the business. This is an unusual example, but unless your backups, your DR, and your procedures are testable and tested, you might find yourself in a similar situation.
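“Testable and tested” starts with something as mundane as checking, automatically, that backups are actually being produced. A minimal Python sketch, with a hypothetical backup directory, file pattern, and 24-hour freshness window; a real test would also restore a backup and query it:

```python
# Backup freshness sketch: fail loudly if no backup exists or the newest one is
# too old. The directory, file pattern, and 24-hour window are hypothetical.
import sys
import time
from pathlib import Path

BACKUP_DIR = Path("/backups/production-db")
MAX_AGE_HOURS = 24

backups = list(BACKUP_DIR.glob("*.dump"))
if not backups:
    sys.exit("CRITICAL: no backup files found at all")

newest = max(backups, key=lambda p: p.stat().st_mtime)
age_hours = (time.time() - newest.stat().st_mtime) / 3600
if age_hours > MAX_AGE_HOURS:
    sys.exit(f"CRITICAL: newest backup {newest.name} is {age_hours:.1f} hours old")

print(f"OK: newest backup {newest.name} is {age_hours:.1f} hours old")
```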
What Probably isn’t to Blame? Hardware
When in doubt, blame the build, not the bricks. About the least common cause of application outage is hardware failure, which happens when a device just crashes or stops working. Clean failures are usually easy to deal with, and most critical components run in clusters of two or more. Application server farms span multiple physical hosts, and storage subsystems have RAID and other technologies to protect data. Everyone I’ve consulted uniformly places individual hardware failure at the bottom of the list (and human error at the top).
Looking at this list, it’s clear that focusing on reducing the chances for human error and designing for scalability can prevent application failure. Just as important should be having excellent visibility for spotting problems early, combined with robust and tested recovery procedures when all else fails.
Industry Perspectives is a content channel at Data Center Knowledge highlighting thought leadership in the data center arena. See our guidelines and submission process for information on participating. View previously published Industry Perspectives in our Knowledge Library.
IBM Takes Former Hybrid Cloud Sales Head to Court
By Talkin’ Cloud
IBM is suing former senior sales executive Louis Attanasio for disclosing corporate secrets related to its cloud computing business to a competitor, according to a report by Fortune on Wednesday.
According to the report, the complaint against the company’s former general manager for global sales of hybrid cloud was filed in Manhattan on Monday for $500,000.
It is alleged that Attanasio sent confidential documents to his personal email address before departing IBM for Informatica on April 6. The emails included a series of confidential messages with his manager discussing “extremely sensitive details about IBM’s revenue targets, its performance relative to those targets, and resource allocation,” the complaint said.
A long-time IBM employee, Attanasio is now employed as Chief Revenue Officer at Informatica. The company provides a range of services that compete with IBM including cloud and data integration, big data management, and data governance. Its customers include Western Union and Citrix.
IBM is looking for an injunction ordering Attanasio to “honor a 12-month global non-compete agreement and to return a $250,000 payment he accepted to stay on in 2015, plus an additional $251,357 of equity compensation he recently received,” according to Fortune.
Violations of non-compete agreements are nothing new in the cloud computing space as vendors look to scoop up top talent to fill in skill gaps. A couple years ago AWS sued a former employee over violation of a non-compete when he left to join Google.
The issue comes as the White House has released a report calling for non-compete reform. According to the report, only 24 percent of all workers, and fewer than half of workers bound by non-competes, say they possess trade secrets.
The lawsuit with Attanasio is not the only thing keeping IBM’s legal team busy these days; Groupon filed a countersuit against IBM this week two months after IBM accused it of patent infringement.
This first ran at http://talkincloud.com/cloud-computing/ibm-takes-former-hybrid-cloud-sales-head-court-over-non-compete
IBM’s Watson Goes to Cybersecurity School
By The WHIR
IBM will address the cybersecurity skills gap by sending Watson to school, the company announced Tuesday. Watson for Cyber Security is part of a year-long research project in collaboration with 8 universities in the US and Canada.
The cloud-based cognitive system has been “trained” in the language of security, and beginning this fall Watson will receive training from California State Polytechnic University, Pomona; Pennsylvania State University; the Massachusetts Institute of Technology; New York University; the University of Maryland, Baltimore County (UMBC); the University of New Brunswick; the University of Ottawa; and the University of Waterloo. IBM’s X-Force research library will also be used as training material for Watson.
IBM hopes Watson will discover patterns and evidence of otherwise-hidden cyber attacks, allowing IBM to improve security analysts’ capabilities. Cognitive systems could automate the connections between data, emerging threats, and remediation strategies, the company said. It plans to begin beta production deployments of Watson for Cyber Security this year.
Security analysts may need help, given the explosion in data available. IBM says the average organization sees over 200,000 pieces of security event data each day, and enterprises spend $1.3 million and nearly 21,000 hours just on false positives. The company also notes that the 75,000 items in the National Vulnerability Database, the 10,000 security research papers a year, and 60,000 security blogs published each month challenge analysts to move at the speed of information.
The looming problem is that they may not have an easy time hiring help, as studies have indicated that cybersecurity skills are in short supply.
“Even if the industry was able to fill the estimated 1.5 million open cyber security jobs by 2020, we’d still have a skills crisis in security,” said Marc van Zadelhoff, General Manager, IBM Security. “The volume and velocity of data in security is one of our greatest challenges in dealing with cybercrime. By leveraging Watson’s ability to bring context to staggering amounts of unstructured data, impossible for people alone to process, we will bring new insights, recommendations, and knowledge to security professionals, bringing greater speed and precision to the most advanced cybersecurity analysts, and providing novice analysts with on-the-job training.”
In addition to Watson’s training, UMBC announced it will create an Accelerated Cognitive Cybersecurity Laboratory in collaboration with IBM Research.
Cybersecurity may be the niche Watson needs to get out there and get a job in the real world, after failing to impress at this year’s Consumer Electronics Show.
This first ran at http://www.thewhir.com/web-hosting-news/ibms-watson-goes-to-cybersecurity-school
AMD, the Best Bet in Semiconductors, Looks to Defy History
(Bloomberg) — Advanced Micro Devices Inc. is in its fifth year of losses. Its investors started the year holding a stock down 90 percent from its peak a decade ago. Its industry-leading 27 percent stock rally in 2016 is a sign, though, that it finally might pose a threat to Intel Corp. rather than to shareholder returns.
AMD is one of only three companies capable of the extreme engineering that crams billions of transistors onto postage-stamp-sized squares of silicon, producing the microprocessors and graphics processors that are the main components of computers. But during most of the last decade it has been too slow to field new products, and its chips haven’t stacked up against the competition. In 2012, analysts said it was on course to run out of cash.
On April 22, those shareholders who stuck with the company through the turmoil got a payoff: The stock rallied 52 percent a day after AMD announced a deal to bring in cash from licensing technology to China.
“I don’t think anyone expected that,” said Raymond James analyst Hans Mosesmann. “It’s very disruptive potentially.”
Read more: China Server Deal Gives AMD Stock Biggest Surge in 35 Years
Lisa Su, AMD’s chief executive officer, was personally involved in the China deal. It’s a rebuke to skepticism about her plan to bring in revenue from technology licensing, and it has given her company another source of cash that Intel can’t easily take away.
Investors and analysts like AMD’s China joint venture because it could provide the chipmaker with a path back into the lucrative server market. AMD is partnering with Tianjin Haiguang Advanced Technology Investment Co. — which is backed by the Chinese Academy of Sciences — to create a company that will make server parts for the Chinese market. AMD will get $293 million in licensing revenue in return for providing processor know-how. The Chinese gain access to technology that they can use to reduce their dependence on imports.
Intel, which owns some of the fundamental technology that will be transferred, is unlikely to challenge the arrangement, fearing it would risk its own right to sell chips in the world’s most populous nation, Mosesmann and others argue.
Mosesmann, a long-time AMD supporter, described the China deal as a “clear and present danger” to Intel’s hold on the lucrative server business. It’s evidence that AMD has value not appreciated by the market, he said.
But is this the sign of a turnaround or merely a short-term gain? Intel has dominated the server business for years: More than 99 percent of worldwide server processor shipments are Intel’s. In 2006, AMD’s Opteron server chip had more than 20 percent of the server market. But its follow-up, Barcelona, was late and never performed as billed, beginning the company’s slide back into obscurity in that market.
In the first three months of this year, Intel’s data center group, the division that makes server chips, produced about $1.8 billion of operating profit. AMD lost $109 million during the first quarter, its sixth quarterly loss in a row.
Su is more up front about her company’s challenges and less prone to promise what she can’t deliver, according to Topeka Capital Markets analyst Suji De Silva.
“She’s the least hyping CEO they’ve had,” he said. The deal with China has “brought something that people had written off back to life.”
Under Su, the highest-ranking woman in the semiconductor industry, AMD is pushing into what it calls semi-custom chips. By designing products for individual customers, it’s become the heart of Microsoft Corp.’s Xbox One and Sony Corp.’s PlayStation 4. While that’s helped stabilize earnings, it still gets more than half of its revenue from the personal computer and graphics chip markets, where it’s in the cross hairs of Intel and Nvidia Corp.
In the first quarter, when worldwide PC shipments fell to their lowest level since 2007, AMD surprised analysts by not coughing up more market share to Intel. That helped the stock rally and raised hopes that its products can hold their own, particularly in graphics, where it has lost share to Nvidia. AMD is starting to see its long-term plan toward profitability bear fruit, Drew Prairie, a company spokesman, said in an e-mail.
Still, even shareholders who have supported it want more evidence that the improvements are sustainable. In the PC market it’s possible to create the appearance of stronger sales by shipping more products into inventory. That typically unravels in the next quarter if those chips don’t sell and unused stockpiles mount up.
The big test of whether AMD is really on the way back will come when it introduces a new design for PC processors called Zen, which the company aims to start selling next year.
“If Zen is a home run this thing probably works,” said Stacy Rasgon, an analyst at Sanford C. Bernstein. Like many of those who’ve followed AMD for a long time, he’s not yet ready to bet that this is the beginning of a new chapter at the company. “History would suggest it’s not, but you never know.”