Data Center Knowledge | News and analysis for the data center industry
Friday, September 26th, 2014
12:00p
Web caching: Facebook’s Problem of a Thousand Servers
Distributed computing systems at extreme scale are governed by a different set of rules than small-scale systems. One of the differences Facebook has discovered is the role of web caching.
Traditionally used to take some load off database servers and make websites load more quickly, web caching has become a necessity for the site rather than a nice-to-have optimization. And Facebook isn’t the only one. Caching plays a key infrastructure role at Twitter, Instagram and Reddit as well.
Facebook infrastructure engineers have built a tool called mcrouter to manage caching. Earlier this month, the company open sourced the code for mcrouter, making the announcement at its @Scale conference in San Francisco.
It is essentially a memcached protocol router that manages all traffic to, from and between thousands of cache servers in dozens of clusters in Facebook data centers. It makes memcached possible at Facebook’s scale.
Memcached is a system for caching data in server memory in distributed infrastructure. It was originally developed for LiveJournal but is now an indispensable part of infrastructure at many major Internet companies, including Zynga, Twitter, Tumblr and Wikipedia.
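To make the caching pattern concrete, here is a minimal sketch of the kind of get/set usage the article describes, written with the third-party pymemcache client in Python. The host, port, key scheme and lookup function are all illustrative placeholders, not anything from Facebook's codebase; because mcrouter speaks the memcached protocol, the same client code could point at an mcrouter endpoint instead of a cache server directly.

```python
from pymemcache.client.base import Client

# Connect to a memcached (or mcrouter) endpoint; host/port are placeholders.
cache = Client(("127.0.0.1", 11211))

def get_user_profile(user_id, db_lookup):
    """Return a profile from cache, falling back to the database on a miss.

    db_lookup is assumed to return a serialized string (e.g. JSON), since
    plain pymemcache stores str/bytes values unless a serializer is configured.
    """
    key = f"user:{user_id}"              # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:
        return cached                     # cache hit: no database work needed
    profile = db_lookup(user_id)          # cache miss: hit the database once
    cache.set(key, profile, expire=300)   # keep it warm for five minutes
    return profile
```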
Instagram adopted mcrouter for its infrastructure when it was running on Amazon Web Services, before it was moved to Facebook data centers. Reddit, which runs on AWS, has been testing mcrouter and plans to implement it at scale in the near future.
Formalizing open source software for web scale
Facebook uses and creates a lot of open source software to manage its data centers. At the same conference, the company announced a new initiative, formed together with Google, Twitter, Box and GitHub, among others, to make open source tools such as mcrouter easier to adopt.
Together, the companies formed an organization called TODO (Talk Openly, Develop Openly) which will act as a clearinghouse of sorts. Details about its plans are scarce, but in general, the organization will develop best practices and endorse certain open source projects so users can have certainty that what they are adopting has been used in production by one or more of its members.
Where “Likes” live
Mcrouter became necessary at Facebook as the site added certain features. Rajesh Nishtala, a software engineer at Facebook, said one of these features was the social graph, the application that tracks connections between people, their tastes and the things they do on Facebook.
The social graph contains people’s names, their connections and objects, which are photos, posts, Likes, comment authorship and location data. “These small pieces of data … are what makes caching important,” Nishtala said.
These bits of data are stored on Facebook’s cache servers and get pulled from them every time a user’s device loads a Facebook page. The cache tier performs more than 4 billion operations per second, and mcrouter is what ensures the infrastructure can handle the volume.
From load balancing to failover
Mcrouter is a piece of middleware that sits between a client and a cache server, communicating on the cache’s behalf, Nishtala explained. It has a long list of functions, three of the most important ones being cache connection pooling, splitting of workloads into “pools” and automatic failover.
Pooling cache connections helps maintain site performance. If every client connected directly to a cache server on its own, the cache server would get easily overloaded. Mcrouter runs as a proxy that allows clients to share connections, preventing such overloads.
When a variety of workloads are competing for cache memory space, the middleware splits them into pools and distributes those pools across multiple servers.
If a cache server goes down, mcrouter automatically switches to a backup cache and then continuously checks whether the connection to the primary server has been restored.
It can also do “pool-level” failover. When an entire pool of cache servers is inaccessible, mcrouter automatically shifts the workload to a pool that’s available.
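The pool and failover behavior described above is driven by mcrouter's JSON configuration. The sketch below, expressed as a Python dict for consistency with the other examples, follows the pool/route structure shown in mcrouter's public documentation; the pool names, server addresses and exact route-handle names (PoolRoute, FailoverRoute) are assumptions to verify against the mcrouter wiki for your version.

```python
import json

# A hypothetical mcrouter configuration: two pools of cache servers, with
# traffic routed to the primary pool and failing over to the backup pool.
# The structure follows mcrouter's documented config format; confirm the
# route-handle names and startup flags for the version you run.
config = {
    "pools": {
        "primary": {"servers": ["10.0.0.1:11211", "10.0.0.2:11211"]},
        "backup":  {"servers": ["10.0.1.1:11211", "10.0.1.2:11211"]},
    },
    "route": {
        "type": "FailoverRoute",
        "children": ["PoolRoute|primary", "PoolRoute|backup"],
    },
}

with open("mcrouter.json", "w") as f:
    json.dump(config, f, indent=2)

# Clients then point their memcached connections at the local mcrouter port
# (started with something like: mcrouter --config file:mcrouter.json -p 5000),
# which shares connections and handles pool-level failover transparently.
```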
Reddit expects mcrouter to make AWS easier
Reddit has been testing mcrouter on one production pool, said Ricky Ramirez, a systems administrator at the company.
The site scales from about 170 to 300 application servers on AWS every day and has about 70 backend cache nodes with about 1TB of memory. Ramirez's team of three operations engineers relies on memcached for a lot of things.
The pain point they are addressing with mcrouter is their inability to switch to new instance types Amazon constantly cranks out. “It’s very stressful and takes a lot of time out of the operations engineers,” Ramirez said.
After a successful test run on one production pool, doing about 4,200 operations per second, the team plans to use mcrouter a lot more. The next pool that involves Facebook’s middleware will do more than 200,000 operations per second.
The engineers plan to use it to test new cloud VM instance types and to replace servers seamlessly, without downtime. Ramirez said he expects mcrouter to offload some of the complexity of managing infrastructure changes and to deliver significant performance gains.
More details on mcrouter are available on the Facebook Engineering Blog.

4:00p
Data Center Jobs: Peter Kazella & Associates
At the Data Center Jobs Board, we have a new job listing from Peter Kazella & Associates, which is seeking a Critical Facilities Solutions Manager in Atlanta, Georgia.
The Critical Facilities Solutions Manager is responsible for:
- conducting critical facilities operations and maintenance (O&M) program analysis and reporting
- designing critical environment programs for internal and external clients
- coordinating with the design/construction team
- providing consulting services on existing critical environment O&M programs
- assessing, recommending and implementing industry best practices in new and existing critical environment programs
- providing feedback to senior management for the continuous improvement of the critical environments program
To view full details and apply, see job listing details.
Are you hiring for your data center? You can list your company’s job openings on the Data Center Jobs Board, and also track new openings via our jobs RSS feed.

4:30p
Friday Funny: Pick the Best Caption for Ghosts
Kip and Gary are at it again, bringing us the late-week laugh we know and love. Let’s roll into another weekend of fun with our Data Center Knowledge Caption Contest.
Several great submissions came in for last week’s cartoon – now all we need is a winner! Help us out by scrolling down to vote.
Here’s how it works: Diane Alber, the Arizona artist who created Kip and Gary, creates a cartoon and we challenge our readers to submit a humorous and clever caption that fits the comedic situation. Then we ask our readers to vote for the best submission and the winner receives a signed print of the cartoon!
For previous cartoons on DCK, see our Humor Channel. And for more of Diane’s work, visit Kip and Gary’s website!

4:49p
Microsoft Slashes Azure Connectivity Service Rates
Microsoft has yet again lowered Azure pricing for some of its cloud connectivity services. The new cuts affect a less visible part of the cost of cloud computing, one that isn’t talked about nearly as much as storage and compute: the connection and delivery part of the chain.
The new prices are effective immediately. Compute and storage pricing, which was cut recently, remains unchanged.
The price for ExpressRoute, a service that enables private connections, has gone down. For example, ExpressRoute NSP (1Gbps) was dropped from $12,000 a month to $8,700 a month. ExpressRoute is one of the more popular services among data center providers looking to enable customers to securely connect to Azure.
BizTalk saw several price cuts. BizTalk is a cloud integration service that provides business-to-business (B2B) enterprise application integration, Electronic Data Interchange (EDI) processing and hybrid connections capabilities. Two BizTalk Server tiers and four BizTalk Services tiers saw price reductions; for example:
- BizTalk Services: Standard dropped from $4.03/hour to $2.93/hour
- BizTalk Server: Standard dropped from $0.66/hour to $0.48/hour
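For context on the size of these cuts, a quick back-of-the-envelope calculation (an illustrative helper, not part of any Microsoft announcement) shows the listed reductions all land in roughly the same range:

```python
# Percentage reductions implied by the prices quoted in this article.
cuts = {
    "ExpressRoute NSP (1Gbps), per month": (12_000.00, 8_700.00),
    "BizTalk Services Standard, per hour": (4.03, 2.93),
    "BizTalk Server Standard, per hour":   (0.66, 0.48),
}

for name, (old, new) in cuts.items():
    pct = (old - new) / old * 100
    print(f"{name}: {pct:.1f}% reduction")
# ExpressRoute: ~27.5%; both BizTalk Standard tiers: ~27.3%
```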
Cache, CDN, and Data Transfer services saw incremental price cuts as well.
In addition to the cuts across network-centric services, prices were dropped for Media Services encoding, Mobile Services, multi-factor authentication, SQL Server for virtual machines, Scheduler, Traffic Manager and a few more.
Price cutting continues to occur in the cloud. In May, Google reduced the rates of its on-demand virtual server business called Compute Engine by about 32 percent.
Amazon followed almost immediately by announcing its 42nd consecutive rate reduction, IaaS price cuts that ranged from 10 percent to 40 percent, depending on the type of service.
About one week later Microsoft slashed the rates for rentable virtual compute and storage infrastructure on its Azure cloud by up to 35 percent.
The full Azure pricing update list is here.

5:55p
European Data Center Startup Zenium Buys Frankfurt Site
Zenium Technology Partners, a new London-based data center provider, has bought a data center in Frankfurt from the Dutch design and engineering giant Imtech.
Zenium has been around for only a few years, and this is its second data center. Frankfurt is one of Europe’s biggest and most active data center markets; the company now has about 54,000 square feet of live data center space there and a campus with room for expansion.
The data center came online in 2012 and is partially occupied by an unnamed global telecoms and cloud services provider, according to a Zenium announcement. It is a carrier-neutral facility, and numerous providers are offering connectivity services there.
The campus can accommodate more than 100,000 square feet of additional data center space across two buildings.
Called Frankfurt One, it is the second data center location in Zenium’s portfolio. A three-building campus in Istanbul is currently under construction.
The Istanbul location will have about 130,000 square feet of data center space at full build-out. Zenium has already secured a deal for about 5,000 square feet at Istanbul One with a major Turkish systems integrator.
Zenium is privately funded. The company secured an equity investment from private investment fund Quantum Strategic Partners in 2013 but did not disclose the amount.
The company says its strategy is to enter both top-tier European markets and emerging ones. With Frankfurt and Istanbul, it now has a presence in both types of markets.
Zenium’s founder and CEO Franek Sodzawiczny co-founded Sentrum, a London colocation provider whose three-site greater-London portfolio was acquired by Digital Realty Trust in 2012 for about £716 million.

6:09p
Colt Lights Up Second Dublin-London Network Route
Colt has added a second fully operational network route from Dublin to London, this one via Manchester. The high-bandwidth route adds capacity and resiliency between the two major European metros.
Colt’s investment strategy is to grow existing infrastructure and create new long-distance routes to address customer demand. Dublin’s data center profile is on the rise and London is a key European hub, making a second route to Dublin for Colt a wise expansion. The announcement follows Colt’s launch of another low-latency route from Dublin to London via Birmingham.
The network is a 100 Gbps route providing up to 4.2Tbps network capacity, the equivalent of 280 15Gb HD films every second. It is ready to support both IPv6 and 1000Gbps Ethernet.
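The “280 HD films every second” figure follows directly from the quoted capacity; the short sketch below (a simple arithmetic check, not from Colt’s announcement) shows how it works out:

```python
# Sanity-check Colt's films-per-second figure from the quoted numbers.
capacity_gbps = 4_200    # 4.2 Tbps expressed in gigabits per second
film_size_gb = 15        # one HD film, per the article, in gigabits
films_per_second = capacity_gbps / film_size_gb
print(films_per_second)  # 280.0
```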
Colt’s network now spans 29,200 miles, connecting 42 metropolitan area networks, plus 23,300 miles of transatlantic routes and a U.S. presence.
The latest expansion completes a network ring that links the two capital cities and connects hundreds of Ireland-based companies in financial, media and retail sectors to more than 195 European cities. It connects Colt customers to all major business parks in Dublin, 20 data centers in and around the city and provides fiber access to more than 250 buildings.
New routes will launch in the Iberian Peninsula, Northern Europe and the Netherlands later this year. The new network expansions will help address European organizations’ infrastructure concerns. A Tech Deficit research study revealed eight in 10 organizations admitted their current network would not meet business needs in two years’ time.
“With half of the world’s top banks in Ireland, we’re seeing an increasing demand for high-bandwidth connectivity, driven by the move to cloud services and the mission critical nature of data in the digital economy,” said François Eloy, executive vice president of network services at Colt.
Dublin’s data center profile on the rise
Dublin has a growing data center profile, which means connectivity between it and London is growing in importance. It is close to mainland Europe and offers low taxes and a desirable cool climate. Dublin has become one of the top destinations for massive data center construction projects.
Major tech companies Microsoft, Google and Amazon have all built and continue to expand data centers there. European providers Interxion and TelecityGroup are present. Digital Realty recently launched a fourth Dublin data center. SunGard also operates in the market.

6:58p
Bash Bug Has Cloud Providers, Linux Distro Firms on High Alert
The widespread critical vulnerability Shellshock is the new Heartbleed. Also dubbed the “Bash Bug,” it affects GNU Bash, a very common open source program. It is a major vulnerability, but it might not become a major threat, depending on how quickly systems get patched.
The GNU Bash bug is widespread and requires very little technical knowledge to exploit. It allows someone to remotely take control of a system that uses Bash. It is on par with the recent Heartbleed vulnerability in terms of the scale of potential damage.
GNU Bash is a command shell used on Linux, Mac OS X and BSD. Linux is everywhere: it runs on more than half the servers on the Internet, on Android phones and on most of the connected devices collectively referred to as the “Internet of Things,” because it is open source and often the operating system of choice for web infrastructure.
Complicating the matter is the fact that there are many Linux distributions. All of the major distribution providers have released patches in their base repositories that provide at least a partial fix, and many are working feverishly toward a complete one.
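Administrators who want a quick local sanity check after patching can run the widely circulated test for the original CVE-2014-6271 vector. The sketch below wraps that test in Python for consistency with the other examples; it only detects the original bug, not the follow-on CVE-2014-7169 variant, and is an illustration rather than a complete scanner.

```python
import os
import subprocess

# Widely published check for the original Shellshock bug (CVE-2014-6271):
# export an environment variable containing a crafted function definition.
# A vulnerable bash executes the trailing command while importing the variable.
env = {
    "x": "() { :;}; echo SHELLSHOCK-VULNERABLE",
    "PATH": os.environ.get("PATH", "/usr/bin:/bin"),
}
result = subprocess.run(
    ["bash", "-c", "echo harmless test"],
    env=env,
    capture_output=True,
    text=True,
)
if "SHELLSHOCK-VULNERABLE" in result.stdout:
    print("bash executed code from an environment variable -- patch immediately")
else:
    print("original CVE-2014-6271 vector not triggered (later CVEs may still apply)")
```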
Cloud and hosting providers are all trying to keep customers safe. Given the number of customers on a cloud and the amount of control they have over their own configurations, the vulnerability is a major concern.
This problem is not unique to one service provider, though all providers are notifying customers. Rackspace, for example, is advising customers to patch, and others are providing ongoing status or rolling out patches to those that have automatic updates. Updates for Rackspace customers are available at https://status.rackspace.com/.
Popular digital currency Bitcoin is also a potential target. Bitcoin Core is controlled by Bash, possibly affecting Bitcoin miners and systems. Given the worth of Bitcoin, it’s a potentially attractive target, according to Trend Micro.
Major Linux distro provider Red Hat updated customers today: “Red Hat has become aware that the patch for CVE-2014-6271 is incomplete. An attacker can provide specially-crafted environment variables containing arbitrary commands that will be executed on vulnerable systems under certain conditions. The new issue has been assigned CVE-2014-7169.”
Trend Micro has seen attacks in the wild already. It is providing some tools here.
Troy Hunt goes into more detail about Bash, what it is, what the problem is and the potential ramifications. “The potential is enormous – ‘getting shell’ on a box has always been a major win for an attacker because of the control it offers them over the target environment,” he writes.

8:46p
Data Center Connectivity News: 365 Main, Hurricane Electric, Zayo, ByteGrid
Fremont, California-based Hurricane Electric has added a second Point of Presence (PoP) at Digital Realty Trust’s 365 Main, a well-known colocation data center in San Francisco. The PoP expands Hurricane’s connectivity options in the Bay Area, including 10 and 100 gigabit Ethernet services. Hurricane also operates two colocation data centers of its own in the Bay Area.
This is the second network expansion in San Francisco for Hurricane Electric this year. The company added a PoP at Telx’s SFR1 data center at 200 Paul (also owned by Digital Realty) earlier this year.
“The addition of a second PoP at Digital 365 Main represents both a strategic network expansion for Hurricane Electric and also an opportunity for customers and networks in the San Francisco Bay Area,” said Mike Leber, president of Hurricane Electric. “In addition, this new PoP will provide customers of Digital 365 Main with reduced router hops and improved quality in the delivery of next generation IP.”
The 365 Main facility currently has 8,600 kW of critical IT capacity and serves early-stage, middle-market and Fortune 1000 corporations.
European exchange operator AMS-IX (Amsterdam Internet Exchange) recently established a PoP in the Digital 365 Main data center. At the time of opening, AMS-IX Bay Area had written commitments from a handful of parties and verbal commitments from about 15 more to peer in San Francisco.
Hurricane Electric’s global Internet backbone is IPv6-native and does not rely on internal tunnels for IPv6 connectivity. The company provides both IPv4 and IPv6 connectivity.
Zayo Plugs Dark Fiber Network Into ByteGrid in Maryland
ByteGrid, which recently welcomed Mid-Atlantic Crossroads’ MAX network into its Maryland data center, added Zayo Group to its “Maryland Connect” program, its interconnection capacity and connectivity ecosystem. Zayo will offer its bandwidth infrastructure and network transport solutions. Customers have access to Zayo’s dark fiber network.
ByteGrid’s enterprise, government and service provider customers can now connect to Zayo’s network, which spans more than 77,000 route miles in over 295 markets and includes over 14,000 on-net buildings across the U.S. and Europe.
“Zayo continues to expand its network throughout Maryland, Virginia and Washington, D.C., to meet customer bandwidth demands,” said John Real, vice president of strategic channels for the carrier. “Connecting to ByteGrid’s Silver Spring, Maryland, data center enables us to provide our services to the data center’s small and large commercial enterprise and government customers.”

9:00p
Report: Feds Underreport Data Center Consolidation Savings by Billions
A federal government watchdog agency released a report charging that agencies have been inaccurately reporting the cost savings they expect to get by consolidating data centers, collectively missing the mark by billions of dollars.
The Government Accountability Office estimated that agencies would save as much as $3.1 billion through next year. The amount of savings they reported for the same period was $876 million, however, according to a GAO report released Thursday.
The report attributes a portion of the gap to the inability of six agencies to calculate their baseline, pre-consolidation data center costs. These agencies reported having closed 67 data centers but claimed little or no savings.
They are:
- Department of Health and Human Services
- Department of the Interior
- Department of Justice
- Department of Labor
- General Services Administration
- National Aeronautics and Space Administration (NASA)
Another big part of the problem is underreporting of planned cost savings between fiscal 2012 and 2015 by 11 of the 24 agencies participating in the Federal Data Center Consolidation Initiative (now in its fifth year). Some agencies blamed communication problems, while others gave the GAO no reason at all.
As was the case two years ago, one of the most common reporting problems for agencies is obtaining power usage data for their data centers, without which true cost savings cannot be calculated.
Vivek Kundra, former CIO of the federal government, rolled out FDCCI in 2010 to rein in unchecked sprawl of government data centers. The initial inventory, completed in 2011, concluded that there were about 3,000 data centers total, but as the initiative’s working definition of data center changed, the number grew, reaching about 9,600 facilities as of May of this year.
The Department of Defense has had more data centers than any other government agency but has also shut down the most facilities since the initiative started. Its most current baseline is 2,308 data centers, of which 374 had been shut down as of May. Including those, Defense has identified about 940 data centers for closure.
Agriculture and Treasury follow Defense closely, the former’s baseline being 2,277 data centers and the latter 2,137. Agriculture is planning to get rid of 2,254 of its facilities, while Treasury has put only 111 data centers (5 percent of the total) on the chopping block.
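The percentages behind those baselines are easy to reproduce from the numbers in the report (a quick illustrative calculation, not from the GAO’s own tooling):

```python
# Share of each agency's baseline data centers identified for closure,
# using the figures cited in this article.
baselines = {
    "Defense":     (2_308, 940),    # (baseline, identified for closure)
    "Agriculture": (2_277, 2_254),
    "Treasury":    (2_137, 111),
}

for agency, (baseline, closing) in baselines.items():
    print(f"{agency}: {closing / baseline:.0%} of baseline slated for closure")
# Defense ~41%, Agriculture ~99%, Treasury ~5%
```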
Since 2012, the government has been trying to take a more nuanced approach to optimizing its IT infrastructure than simply counting buildings and IT rooms. It merged FDCCI with another initiative, called PortfolioStat, with the goal of analyzing and rationalizing the applications agencies actually use and consolidating the physical infrastructure that supports those application portfolios.
Earlier this week, however, the U.S. Senate passed a bill meant to expedite the government’s data center consolidation.
The legislation sets deadlines and requires agencies that are behind on their inventories and consolidation strategies to act. It also directs the GAO to verify the data agencies produce and the Office of Management and Budget to report routinely to Congress on cost savings.