Saturday, April 12, 2014

NoSQL Databases


NoSQL Features

  • Support large volumes of data
  • Can be used for data of record or big data analysis
  • Non-relational (no foreign key constraints between tables)
  • Easy to distribute data amongst multiple nodes (cluster-friendly)
  • No set schema - even within one table each row may contain data with different attributes (columns)
  • Popular databases: MongoDB, Cassandra, CouchDB, RavenDB, Neo4j, HBase, Redis, Riak, Project Voldemort, DynamoDB, Azure Tables

Data Models

Key-value

  • Given a key, give me the value. The value can be anything. Persistent hash map.
  • The value's content cannot be queried directly, but you can store metadata alongside it and write queries against the metadata.
  • Riak, Redis, Project Voldemort, Dynamo
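
As a rough illustration of the points above, here is a minimal in-memory sketch in Python. It is not the API of any of the databases listed, and it is not persistent; the class name KeyValueStore and its methods (put, get, find_by_metadata) are hypothetical names made up for this example. The value is treated as an opaque blob, and only the metadata stored next to it can be searched.

```python
class KeyValueStore:
    """Toy in-memory key-value store with a metadata side index.

    Hypothetical sketch: a real store would persist data to disk and
    distribute keys across nodes.
    """

    def __init__(self):
        self._values = {}    # key -> opaque value (the store never looks inside)
        self._metadata = {}  # key -> dict of metadata tags that can be queried

    def put(self, key, value, **metadata):
        """Store a value under a key, with optional metadata tags."""
        self._values[key] = value
        self._metadata[key] = metadata

    def get(self, key):
        """Given a key, return the value."""
        return self._values[key]

    def find_by_metadata(self, **criteria):
        """Return the sorted keys whose metadata matches every criterion."""
        return sorted(
            key
            for key, meta in self._metadata.items()
            if all(meta.get(name) == wanted for name, wanted in criteria.items())
        )


store = KeyValueStore()
store.put("user:1", b'{"name": "Ann"}', kind="user", region="eu")
store.put("user:2", b'{"name": "Bob"}', kind="user", region="us")
store.put("cart:9", b"\x00\x01", kind="cart", region="eu")

print(store.get("user:1"))                  # lookup by key
print(store.find_by_metadata(kind="user"))  # query on metadata, not on the value
```

Note that nothing in the sketch can answer "which users are named Ann?" without reading every value back out, which is the trade-off of the key-value model.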

Document Data Model

  • Stores a large number of documents; each document is a data structure, usually JSON
  • The contents can be queried
  • MongoDB, RavenDB, CouchDB
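
To make the difference from a key-value store concrete, here is a small Python sketch, assuming documents are plain JSON objects held in a list. The helper names find and find_where and the sample invoice data are invented for this example and do not correspond to any real database's query API. The point is that the store can look inside each document, and that documents in the same collection need not share the same fields.

```python
import json

# Three JSON documents in one "collection"; note that the fields differ.
documents = [
    json.loads('{"_id": 1, "customer": "Ann", "total": 40, "items": ["pen", "ink"]}'),
    json.loads('{"_id": 2, "customer": "Bob", "total": 15}'),
    json.loads('{"_id": 3, "customer": "Ann", "total": 5, "coupon": "SPRING"}'),
]


def find(docs, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in docs
        if all(doc.get(field) == wanted for field, wanted in criteria.items())
    ]


def find_where(docs, predicate):
    """Return the documents for which an arbitrary predicate is true."""
    return [doc for doc in docs if predicate(doc)]


anns_orders = find(documents, customer="Ann")
large_orders = find_where(documents, lambda doc: doc.get("total", 0) > 10)

print([doc["_id"] for doc in anns_orders])
print([doc["_id"] for doc in large_orders])
```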

Column-family

  • A key maps to one or more column families. Each column family holds key-value pairs. One record can hold multiple families (e.g. Invoice and InvoiceLineItems).
  • Conducive to distributing data amongst nodes (don't have to jump nodes to get one aggregate)
  • Entire aggregate in one record.
  • Cassandra, HBase
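
The layout described above can be pictured as nested maps: row key, then column family, then columns. The Python sketch below is only a mental model using plain dictionaries, not how Cassandra or HBase are actually accessed; the row keys, column names and the helpers get_columns and get_aggregate are assumptions made for illustration.

```python
# row key -> column family -> {column name: value}
rows = {
    "invoice:1001": {
        "Invoice": {"customer": "Ann", "date": "2014-04-12"},
        "InvoiceLineItems": {"1": "pen x2", "2": "ink x1"},
    },
    "invoice:1002": {
        # Rows in the same family may carry different columns (no set schema).
        "Invoice": {"customer": "Bob", "date": "2014-04-13", "po_number": "PO-7"},
        "InvoiceLineItems": {"1": "paper x5"},
    },
}


def get_columns(row_key, family):
    """Read a single column family of one row."""
    return rows[row_key][family]


def get_aggregate(row_key):
    """Read the whole aggregate (all families) with a single key lookup."""
    return rows[row_key]


print(get_columns("invoice:1001", "InvoiceLineItems"))
print(sorted(get_aggregate("invoice:1002")))
```

Because the invoice and its line items live under one row key, the whole aggregate can sit on one node, which is what makes this model easy to distribute.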

Graph 

  • NoSQL works well when accessing a pre-defined aggregate. Accessing (re-arranging) a different aggregate is more difficult in NoSQL than in an RDBMS
  • Special queries for graph structures
  • Node-and-arc structure. Easy to move from relation to relation.
  • Neo4j
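
A small Python sketch of the node-and-arc idea, assuming a simple adjacency list where each arc carries a relation name. The Graph class, its methods and the sample data are hypothetical and are not Neo4j's API or query language; the sketch only shows why hopping from relation to relation is cheap in this model.

```python
from collections import defaultdict


class Graph:
    """Toy directed graph: nodes connected by named arcs (hypothetical sketch)."""

    def __init__(self):
        self._arcs = defaultdict(list)  # source node -> list of (relation, target)

    def add_arc(self, source, relation, target):
        """Connect two nodes with a named relation."""
        self._arcs[source].append((relation, target))

    def follow(self, node, relation):
        """Return the nodes reachable from `node` over one arc of `relation`."""
        return [target for rel, target in self._arcs.get(node, []) if rel == relation]


g = Graph()
g.add_arc("Ann", "FRIEND", "Bob")
g.add_arc("Bob", "FRIEND", "Cid")
g.add_arc("Ann", "LIKES", "NoSQL")
g.add_arc("Cid", "LIKES", "Graphs")

# Two hops over FRIEND, then one hop over LIKES:
# "what do the friends of Ann's friends like?"
friends_of_friends = [
    fof for friend in g.follow("Ann", "FRIEND") for fof in g.follow(friend, "FRIEND")
]
likes = [topic for person in friends_of_friends for topic in g.follow(person, "LIKES")]

print(friends_of_friends)
print(likes)
```

The same question against an aggregate-oriented store would mean loading each person's record in turn, since the data was not grouped with this access path in mind.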