Today I’m going to share a bunch of links related to some stuff I’ve been digging into recently.
In my quest to broaden my knowledge of modern NoSQL / Big Data solutions, I came across Apache Cassandra, which led me straight to HBase. That was already the edge of an abyss known as Hadoop, which sucked me in immediately (there’s a good occasion for that - the 1.0 milestone of Hadoop was released something like a week ago). For those who haven’t heard about it - Hadoop is an open source implementation (Java-based, sadly…) of MapReduce. And MapReduce is a distributed computing paradigm considered to be the foundation of Google’s search engine’s success. Surprisingly, the idea is very simple (always “Keep It Simple, Stupid”!) - you can find the details here:
http://en.wikipedia.org/wiki/MapReduce
and here:
http://hadoop.apache.org/
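To give a feel for how simple the MapReduce idea is, here is a minimal single-process sketch in Python that counts words. It is not Hadoop's API and nothing here is distributed: the function names (map_phase, shuffle, reduce_phase, word_count) are my own, made up for illustration. In a real cluster the map and reduce calls would run on many machines and the framework would do the shuffle over the network.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Run the map step over every document, then group, then reduce per key.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["keep it simple", "keep it stupid simple"]
    print(word_count(docs))
```

The point of the split is that each map call only sees one document and each reduce call only sees one key, so both steps can be spread across machines without sharing state.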
I’m not saying you’re going to use Hadoop on your next project. I probably won’t either. But it’s always better to know the options :)
Now back to NoSQL - why would anyone want to give up on classic SQL-based RDBMSes? We have all used them for decades and they seem to meet our needs. Unfortunately (or rather fortunately!) the market is changing, demands are evolving, and software has to follow. Modern-day data storage has:
- to be able to hold humongous amounts of data
- to support more flexible data schemas
- to scale horizontally (yeah, welcome to the cloud reality)
- and last but not least, it has to remain ultra-fast
If you want some quick hands-on experience, I’d suggest two interesting technologies:
- HBase (http://hbase.apache.org/) - for those interested in the Big Data approach (but in a quite classic tabular style) - btw. HBase is the database that runs on top of Hadoop (it keeps its data in HDFS).
- MongoDB (http://www.mongodb.org/) - document-oriented (documents are essentially maps; some say key-value oriented), non-Java-based (there are .NET drivers available!) implementation - definitely worth checking out
P.S.