Data Center Knowledge | News and analysis for the data center industry - Industr's Journal
 
Friday, April 15th, 2016

    9:00a
    DataStax Brings Graph Databases to Enterprise Cassandra

    The term you may hear is “convergence.” The definition, if we’re using an honest dictionary, is the state of affairs when a vendor wants to be your one-stop-shop for every selection in a given product space. There’s nothing particularly wrong with this, if the results add measurable value to your data center above and beyond the obvious cost savings.

    In February 2015, San Francisco-based DataStax, which produces a commercial implementation of Apache Cassandra, acquired Aurelius, the producer of a graph database engine called TitanDB. It’s not a database in itself, but rather a system for interpreting elements of data in terms of their type of relationship to one another. More to the point, a graph database is concerned with how data is related (for example, X is in B’s file because X owes B money), rather than simply that data is related (e.g., X and Y are members of table Q).
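    The distinction between "how" and "that" can be sketched in a few lines of Python. This is only an illustration, not TitanDB's data model or API: the Edge type, the related_by helper, and the sample records are all hypothetical names invented for the example.

```python
from collections import namedtuple

# A labeled, directed edge: the label records *how* two records relate.
# (Hypothetical structure for illustration only.)
Edge = namedtuple("Edge", ["source", "label", "target"])

# Made-up data echoing the article's example.
edges = [
    Edge("X", "owes_money_to", "B"),
    Edge("Y", "owes_money_to", "B"),
    Edge("X", "employs", "Y"),
]

# Table-style membership only says *that* X and Y belong together.
table_q = {"X", "Y"}


def related_by(edges, label, target):
    """Return every source connected to `target` by an edge with `label`."""
    return sorted(e.source for e in edges if e.label == label and e.target == target)


debtors_of_b = related_by(edges, "owes_money_to", "B")
```

    The set table_q can answer whether X and Y are both members, but only the labeled edges can answer why X appears in B's file.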

    What made TitanDB (or just “Titan”) unique was that it didn’t have to build an entirely new database to function using graph methodologies. In fact, Titan required a back-end data store. Some organizations were using Hadoop’s HBase as a back-end, while others chose BerkeleyDB. But many preferred Cassandra, partly because it enabled them to leverage its continuous availability, as well as the absence of a single point of failure.

    Multi-modal

    As Robin Schumacher, DataStax’ vice president of products, told Datacenter Knowledge in an interview, it’s this seemingly natural dependency between the two components that compelled it to acquire Aurelius, and then to add graph database methodology to its ongoing collection of access modes, for what it now calls DataStax Enterprise Graph.

    “The rise of the multi-modal database is born out of these cloud applications, where it’s becoming ‘the new normal,’” said Schumacher. “If you can’t handle this in a single database, you are back to what you’ve seen in the relational world, where people will use one data management vendor for transactions, another for analytics, a third for search. (Then) your application has to be smart enough to direct it to the right data management vendor, that has different security paradigms, backup paradigms, etc. Let’s not do that.”

    DataStax Enterprise was already capable of addressing a mixed workload problem, he said, where customers could run transactions, analytics, and search in the same cluster. Add to that the capability to run graph traversals, he said, and it becomes feasible to process new and informative styles of workload with the same data already being managed by Cassandra.

    Regardless of whether your data is stored in the cloud, on-premises, or a combination of the two, Schumacher contended, cloud applications will mandate that this same data is accessible through any number of modes simultaneously. Retail applications for both desktop and mobile may utilize multiple modules, such as a product catalog, a user profile manager, a fraud detection system, a recommendation engine, a clickstream analyzer, and a log analyzer.

    “Each module may have different data management requirements,” he explained. “One may need a data model that is very adept at handling time-series data, and that can write data very, very fast. You may need a model that is more JSON-oriented, if I have a particular Web application that’s communicating with browsers. Maybe I have a recommendation engine, and I need to be able to smartly analyze the moves that you’re making on my Web site or mobile app, do some analysis, understand the relationships between you, the products I’m selling, the various vendors involved, and maybe come back to you with smart, real-time recommendations.”

    Traversal

    It’s this last mode of access that is best suited for graph database access — in DataStax’ case, accessing the Cassandra data as though it were a graph database. A graph relationship is different from a conventional RDBMS relation, because it begs to be explained on a blackboard using circles, arrows, and geometry.

    As database research analyst Curt Monash explained on his firm’s website a few years back, a graph database describes the full relationship between two nodes of data — and here, the “-ship” suffix is extremely powerful. It implies that the associations between nodes can be both qualitatively and quantitatively different from one another. The quality of that difference is a stored property in the graph database. The quantity is expressed as a kind of “weight,” which represents the degree of importance or prominence of a relationship.

    [Figure: Graph data model]

    The diagram above (courtesy DataStax) represents a typical graph database schema. When a database built upon a schema like this represents the association between a Web site’s customer and an inventory item, the relationship can represent a purchase, a glance on the item’s Web page, or a comment of approval or disapproval. And the weight could perhaps be employed to relate this customer association with all the other customers for that same item, or all the other items that same customer has purchased.
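    A minimal sketch of typed, weighted relationships follows, assuming a plain list of tuples rather than any real graph engine. The customer names, relationship types, weights, and the customers_for_item helper are all made up for illustration.

```python
from collections import defaultdict

# (customer, relationship type, item, weight) -- every value is invented.
# The type is the "quality" of the relationship; the weight is its "quantity".
relationships = [
    ("alice", "purchased", "laptop", 1.0),
    ("alice", "viewed", "tablet", 0.2),
    ("bob", "purchased", "laptop", 1.0),
    ("bob", "commented", "tablet", 0.5),
    ("carol", "viewed", "laptop", 0.2),
]


def customers_for_item(relationships, item):
    """Rank customers by the total weight of their relationships to `item`."""
    totals = defaultdict(float)
    for customer, _kind, target, weight in relationships:
        if target == item:
            totals[customer] += weight
    # Heaviest relationships first; ties are broken alphabetically.
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))


laptop_ranking = customers_for_item(relationships, "laptop")
```

    Here a purchase is arbitrarily weighted more heavily than a page view, so the ranking places buyers of an item ahead of customers who merely glanced at it.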

    There are ways to accomplish this same type of relationship representation within a conventional RDBMS, explained Monash in a note to Datacenter Knowledge. But none of them are easy.

    “You can implement a graph in a relational DBMS, in one or more long, narrow tables,” he told us. “For some use cases that works well. In others, it requires a lot of joins, and indeed an unpredictable number of joins.” By “joins,” Monash is referring to the grafting of tables onto each other to form wider records of relations.

    “That’s because the number of joins needed is tied to the path length (to a first approximation, it’s the path length minus 1), so if the path length is unpredictable, so is the number of joins,” the database analyst continued. “Needing many joins stresses performance. Needing an unpredictable number of joins stresses SQL syntax.”
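    Monash's point about join counts can be made concrete with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the edges table, its src and dst columns, and the path_query helper are hypothetical, and the graph is stored as one long, narrow table of edges.

```python
import sqlite3


def path_query(path_length):
    """Build SQL that finds every path made of `path_length` edges.

    Each edge in the path is one alias of the `edges` table, so the
    generated query contains path_length - 1 JOIN clauses.
    """
    if path_length < 1:
        raise ValueError("path_length must be at least 1")
    columns = ["e1.src"] + [f"e{i}.dst" for i in range(1, path_length + 1)]
    sql = f"SELECT {', '.join(columns)} FROM edges AS e1"
    for i in range(2, path_length + 1):
        sql += f" JOIN edges AS e{i} ON e{i}.src = e{i - 1}.dst"
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d")],
)

three_hop_sql = path_query(3)  # three edges, therefore two joins
three_hop_paths = conn.execute(three_hop_sql).fetchall()
```

    Because the text of the query changes with the path length, a chain of joins like this cannot be written ahead of time when the path length is unknown.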

    Masterless-ness

    DataStax’ Schumacher argued that this type of unpredictability, as introduced by the requirements of RDBMS and traditional SQL syntax, translates into non-determinism for customers looking to perform real-time analytics transactions.

    “The underlying architecture limited you,” he said. “So the beautiful thing we have here is, we build on Cassandra, which allows us no downtime. With the older structures, the best you could do was ‘high availability.’ You couldn’t have continuous availability.”

    That continuous availability, he continued, is made feasible through Cassandra’s native “masterless” architecture, in which nodes are continuously replicated, and no single node serves as the “master” over the remainder as “slaves.”
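    The idea of a masterless cluster can be sketched as a hash ring in which every node is a peer and each key is copied to several of them. This is a deliberately simplified illustration, not Cassandra's actual partitioner or replication strategy: the Ring class, the token function, the node names, and the replication factor of 3 are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right


def token(value):
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy masterless ring: any node can own or replicate any key."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self.ring = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        tokens = [t for t, _ in self.ring]
        start = bisect_right(tokens, token(key)) % len(self.ring)
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def readable_from(self, key, down=()):
        """Replicas still able to serve `key` when the nodes in `down` have failed."""
        return [n for n in self.replicas(key) if n not in down]


ring = Ring(["node1", "node2", "node3", "node4"])
owners = ring.replicas("customer:42")
```

    With three copies of each key and no coordinating master, losing any single node still leaves at least two replicas that can answer for the key.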

    Schumacher did make it clear to us that his company is primarily an operational database vendor, and that DataStax Enterprise addresses use cases that are not as centered upon analytics as typical “big data” deployments. “We are happy to leave the purely analytic and/or data warehousing, data lake use cases to the Hadoop vendors. That said, we certainly support operational analytics in our database, and you need that to be able to make real-time decisions that are necessary for a transactional system.”

     

    7:26p
    Microsoft to DOJ: The Cloud Isn’t an Automatic Fourth Amendment Exemption
    By WindowsITPro

    Microsoft has filed suit against the DOJ, stating that the federal government is routinely violating the Fourth Amendment rights of its customers by preventing the company from notifying users of government requests for their data.

    It’s a case that could have wide-ranging implications for how much consumers can trust increasingly popular cloud providers, including Microsoft, with their most sensitive data, at a time when the industry is pushing to have more and more of that data stored remotely.

    At issue is whether users give up some of their expectations of privacy when they ask a third party to hold onto their data, and what due process is given to those users.

    “People do not give up their rights when they move their private information from physical storage to the cloud,” Microsoft stated in the blistering filing. As individuals and businesses have moved their most sensitive information to the cloud, the government has increasingly adopted the tactic of obtaining the private digital documents of cloud customers not from the customers themselves, but through legal process directed at online cloud providers like Microsoft. At the same time, the government seeks secrecy orders under 18 U.S.C. § 2705(b) to prevent Microsoft from telling its customers (or anyone else) of the government’s demands.

    Microsoft maintains that while cloud storage is a natural evolution of how people store and access their private data, the courts and legislation have not kept up.

    The impact is getting hard for Microsoft to ignore, even as the company is prevented from releasing specifics even to impacted customers. In the filing, Microsoft said that over the last 18 months it has received 2,600 secrecy orders that prohibit it from disclosing orders for information — and that in two-thirds of those cases, the orders have no end date to the secrecy.

    In addition to denying customers their Fourth Amendment rights, the company argues, it infringes on the company’s First Amendment rights to discuss the cases.

    The Department of Justice stated that it is reviewing the case.

    The confrontation comes at a time of high tension between tech companies and the federal government: Apple is fighting efforts, through the courts or legislation, to put backdoors in device encryption, a fight that Microsoft ultimately endorsed.

    But while Apple was defending a precedent on the side of encryption that had been won in the 90s, Microsoft is picking what could be a more challenging case: Reversing a precedent that has been running for decades, and establishing a stronger Fourth Amendment than the courts have so far recognized.

    Without such protections, Microsoft and other cloud providers might have to get increasingly creative about how they protect customer data, as the company did recently by opening a German Azure instance that it legally does not control.

    Original article appeared at http://windowsitpro.com/cloud/microsoft-doj-cloud-isnt-automatic-fourth-amendment-exemption

    10:04p
    Optiv Security Prepares to Enter Tech IPO Market
    By The VAR Guy

    Nutanix and Dell’s SecureWorks are reportedly getting some company in the 2016 tech IPO market, which is experiencing a freeze unlike any we’ve seen since 2009. Bloomberg recently reported that Optiv Security has hired Goldman Sachs and Morgan Stanley to help it prepare for an initial public offering.

    Planning for the IPO has just begun, but sources say it could move forward later this year, depending on the state of the market. 2016 has been a desolate year so far in the IPO market in general, and the tech market specifically.

    Optiv and SecureWorks are both part of the booming cybersecurity market, which is poised to hit more than $170 billion in 2020. SecureWorks’ business model is based on subscription services and contracts, and notably has yet to turn a profit despite rising annual revenues. Despite lukewarm analyst predictions, the company hopes to raise up to $158 million in next week’s IPO.

    Denver-based Optiv, though making the transition toward a service-based model, still operates in large part as a traditional VAR. It was formed early last year by the merger of Accuvant, backed by Blackstone Group (BX), and FishNet Security, backed by InvestCorp. Together, the two entities had revenues totaling $1.5 billion in 2014. Optiv reported revenue of $2.0 billion by the end of last year. Blackstone, the largest private equity firm in the world, remains the company’s majority shareholder.
