Eventual Consistency in NoSQL Databases: Theory and Practice
One of NoSQL's goals: handle previously-unthinkable amounts of data.
One of unthinkable-amounts-of-data's problems: previously-improbable events become extremely probable, precisely because the set of interactions is so large. Flip a coin a hundred times, and you're not likely to get 50 heads in a row. But flip it a few quadrillion times (2^50 is roughly 10^15), and you probably will find a 50-heads streak.
So NoSQL's performance strength is also its mathematical weakness.
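The coin-flip intuition can be checked with a rough calculation. The sketch below is a back-of-the-envelope estimate, not part of the original argument: it assumes a fair coin with independent flips, and the function names are made up for illustration. It computes the expected number of streaks by linearity of expectation, then uses a Poisson approximation (an assumption that works well for long streaks) for the chance of seeing at least one.

```python
import math


def expected_streaks(n_flips: int, streak_len: int) -> float:
    """Expected number of distinct runs of at least `streak_len` heads."""
    if n_flips < streak_len:
        return 0.0
    # A qualifying run can begin at flip 1 (probability 2^-k) or at any later
    # flip that directly follows a tail (probability 2^-(k+1) per position).
    at_start = 0.5 ** streak_len
    later = (n_flips - streak_len) * 0.5 ** (streak_len + 1)
    return at_start + later


def prob_at_least_one(n_flips: int, streak_len: int) -> float:
    """Approximate P(at least one streak), treating streaks as Poisson events."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-expected_streaks(n_flips, streak_len))


if __name__ == "__main__":
    for flips in (100, 3 * 10**15):
        print(flips, prob_at_least_one(flips, 50))
```

The probability stays negligible until the number of flips approaches 2^50, and then climbs quickly. That is the scale effect in miniature: nothing about the coin changes, only the number of chances.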
This order of scale can cause lots of problems, but one of the most common is consistency -- the C in ACID -- clearly a fundamental desideratum for any database system, but in principle much harder to achieve for NoSQL databases than for others.
Emerging database technologies have forced developers and computer scientists to define more exactly what kind of consistency is really needed for any given application. Two years ago, the ACM (Association for Computing Machinery) published an extremely helpful examination of the attenuated notion of consistency called 'eventual consistency'. Their summary:
Data inconsistency in large-scale reliable distributed systems must be tolerated for two reasons: improving read and write performance under highly concurrent conditions; and handling partition cases where a majority model would render part of the system unavailable even though the nodes are up and running.
The article surveys technical solutions as well as user considerations that might soften the undesirability of anything less than perfect, instantaneous consistency. It's not long (4 pages plus pictures), and explains some deep database issues quite clearly.
On the more practical side of the problem: Russell Brown recently gave a talk at the NoSQL Exchange 2011 on exactly this topic. More specifically, he showed how some distributed systems (Riak in particular) try to minimize conflicts, and suggested some ways to reconcile conflicts automatically using smart semantic techniques.
Check out the NoSQL Exchange page for Russell's talk here, which includes an embedded video. But read the ACM article first for a broader overview, since Russell launches into technical details pretty quickly.
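To make "reconcile conflicts automatically" a bit more concrete, here is a minimal sketch of one common semantic approach: when replicas hold divergent versions of a set-like value (a shopping cart, say), merge them by taking the union. This is an illustration only, not Russell's code and not Riak's API; `merge_siblings` and the cart contents are hypothetical.

```python
from typing import Iterable


def merge_siblings(siblings: Iterable[Iterable[str]]) -> frozenset:
    """Resolve conflicting versions of a set-valued record by unioning them.

    Each sibling is one replica's view of the same record. Union is
    commutative and associative, so every node reaches the same merged
    value no matter which order the siblings arrive in.
    """
    return frozenset().union(*siblings)


if __name__ == "__main__":
    # Two replicas accepted different writes while they could not talk to
    # each other (hypothetical data).
    replica_a = {"book", "pen"}
    replica_b = {"book", "lamp"}
    print(sorted(merge_siblings([replica_a, replica_b])))
```

The trade-off with a plain union is that an item removed on one replica can reappear after the merge, so this fits data where losing an addition is worse than resurrecting a deletion.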


