Friday, November 20, 2015

Amazon DynamoDB Summary

In General

Amazon describes DynamoDB as follows.
  • DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.
  • You can use DynamoDB to create a database table that can store and retrieve any amount of data, and serve any level of request traffic (provided you follow best practices for hash key values). 
  • DynamoDB automatically spreads the data and traffic for the table over a sufficient number of servers to handle the request capacity specified by the customer and the amount of data stored, while maintaining consistent and fast performance. The requested capacity can be adjusted anytime.
  • All data items are stored on solid state disks (SSDs) and are automatically replicated across multiple Availability Zones in a Region to provide built-in high availability and data durability.
  • DynamoDB resides within (and stays within) a particular AWS region. Within that region, however, the data is replicated synchronously across three Availability Zones.

Tables, Items and Attributes

  • DynamoDB consists of tables. Tables in turn contain items. Each item contains attributes (name-value pairs); there is no fixed schema. Item size is limited to 400 KB.

Primary Key

  • Items have a primary key. The primary key can be a single attribute of an item; this attribute is designated as the hash key. The primary key of an item is unique within a table. Alternatively, if the data model dictates, the primary key can be composed of two item attributes: the first is the hash key as before and the second is called the range key. For a composite primary key, the combination of an item's hash and range values must be unique within the table.
  • Primary key attributes must be scalar: Number, String or Binary.
  • Queries must specify equality on a particular hash key value; the range key, however, can be queried with range conditions. Range values are sorted lexicographically for strings. A table with a composite primary key is sketched below.
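
As a rough illustration, a table with a composite primary key could be created with the Python SDK (boto3) roughly as follows; the table and attribute names ("Orders", "CustomerId", "OrderDate") are hypothetical.

    # A minimal sketch using the boto3 low-level client; names are hypothetical.
    import boto3

    dynamodb = boto3.client("dynamodb", region_name="us-east-1")

    dynamodb.create_table(
        TableName="Orders",
        # Composite primary key: hash (partition) key plus range (sort) key.
        KeySchema=[
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        # Key attributes must be scalar: S (string), N (number) or B (binary).
        AttributeDefinitions=[
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    )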

Data Structure

Table

  • Contains items. Max item size 400KB.
  • Hash groups items together.
  • Hash and Range together uniquely define an item; if there is no Range defined, the Hash must be unique. The Range determines sort order (ASCII/byte ordering for strings).

Data Types

  • Number (N) & Number Set (NS, unordered)
  • String (S) & String Set (SS, unordered)
  • Binary (B) (Base64 encoded) & Binary Set (BS, unordered)
  • Boolean (BOOL)
  • NULL
  • List (L) (Ordered), equivalent to JSON Array
  • Map (M) (Unordered Name-Value pairs), equivalent to JSON object
  • A JSON document can be stored as an attribute value using (possibly heterogeneous and nested) lists ([]) and maps ({}); an example item is sketched after the mapping list below.

JSON Mapping to DynamoDB Types

  • string -> S
  • number -> N
  • boolean -> BOOL
  • null -> NULL
  • array -> L
  • object -> M
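
To illustrate the mappings above, a hypothetical item written with the low-level API might look like this (the "Products" table and its attributes are made-up names; the low-level client expects the type descriptors explicitly).

    import boto3

    client = boto3.client("dynamodb")

    client.put_item(
        TableName="Products",
        Item={
            "ProductId": {"S": "p-100"},                      # string -> S
            "Price": {"N": "19.99"},                          # number -> N (sent as a string)
            "InStock": {"BOOL": True},                        # boolean -> BOOL
            "Discontinued": {"NULL": True},                   # null -> NULL
            "Tags": {"L": [{"S": "summer"}, {"S": "sale"}]},  # array -> L
            "Dimensions": {"M": {                             # object -> M
                "Width": {"N": "10"},
                "Height": {"N": "20"},
            }},
        },
    )

The higher-level document APIs perform this mapping to and from native types automatically.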

Local and Global Secondary Indexes

  • For faster lookup, additional indexes can be defined for a table.
  • Local Secondary Index (LSI): Same hash key with an alternate scalar range key. 
  • Global Secondary Index (GSI): Different hash (and optionally range) key than the table.

Local Secondary Index (LSI)

  • Table must have Hash and Range
  • Provides an alternate range key over the same hash key
  • Can project selected attributes into the index for faster lookup
  • 10 GB limit per hash key value (item collection)

Global Secondary Index (GSI)

  • Different hash than the table.
  • Kept as a second table.
  • Can project selected attributes into the index for faster lookup.
  • Updated asynchronously (eventually consistent reads only). A table definition with both an LSI and a GSI is sketched below.
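
A sketch of a table definition carrying one LSI and one GSI (all names are hypothetical); note that a GSI has its own provisioned throughput, separate from the table's.

    import boto3

    boto3.client("dynamodb").create_table(
        TableName="Orders",
        KeySchema=[
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        AttributeDefinitions=[
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "OrderStatus", "AttributeType": "S"},
            {"AttributeName": "ProductId", "AttributeType": "S"},
        ],
        # LSI: same hash key, alternate range key, selected attributes projected in.
        LocalSecondaryIndexes=[{
            "IndexName": "ByStatus",
            "KeySchema": [
                {"AttributeName": "CustomerId", "KeyType": "HASH"},
                {"AttributeName": "OrderStatus", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "INCLUDE", "NonKeyAttributes": ["OrderTotal"]},
        }],
        # GSI: different hash (and range) than the table; maintained asynchronously.
        GlobalSecondaryIndexes=[{
            "IndexName": "ByProduct",
            "KeySchema": [
                {"AttributeName": "ProductId", "KeyType": "HASH"},
                {"AttributeName": "OrderDate", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        }],
        ProvisionedThroughput={"ReadCapacityUnits": 10, "WriteCapacityUnits": 10},
    )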

Write Operations

  • Items can be added, updated or deleted.
  • UpdateItem: can update individual attributes of an item without rewriting the whole item
  • Atomic increments on an item's numeric attributes are supported.
  • Conditional writes to items are supported.
  • Condition Expressions (PutItem/UpdateItem/DeleteItem): specify a condition that must hold for the operation to succeed (see the sketch below)
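
For example, an atomic increment guarded by a condition expression might look like this with boto3 (the table, key and attribute names are hypothetical):

    import boto3

    table = boto3.resource("dynamodb").Table("Pages")

    table.update_item(
        Key={"PageId": "home"},
        # Atomic counter: ViewCount is incremented server-side, no prior read required.
        UpdateExpression="ADD ViewCount :inc",
        # Conditional write: fail (rather than create a new item) if the page does not exist.
        ConditionExpression="attribute_exists(PageId)",
        ExpressionAttributeValues={":inc": 1},
    )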

Query

  • DynamoDB tables can be queried using the primary key and secondary indexes
  • When querying, the application can decide whether it wants strong or eventual consistency when making API calls. Strong consistency costs twice as much as eventual consistency in terms of Read Capacity.
  • For JSON documents stored in items, individual values can be projected out of the document on GetItem using expressions.
  • Projection Expression (Query/GetItem/Scan): reach into the item (including nested JSON document paths) and pull out selected values
  • Filter Expression (Query/Scan): a relational expression applied to results after the key condition (see the sketch below)
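
A query combining these expressions might look roughly like this (table and attribute names are hypothetical):

    import boto3
    from boto3.dynamodb.conditions import Key, Attr

    table = boto3.resource("dynamodb").Table("Orders")

    resp = table.query(
        # Equality on the hash key, range condition on the range key.
        KeyConditionExpression=Key("CustomerId").eq("c-42")
                               & Key("OrderDate").between("2015-01-01", "2015-11-20"),
        # Reach into the item (including a nested document path) and return only these.
        ProjectionExpression="OrderDate, OrderTotal, ShipTo.City",
        # Applied after the key condition; filtered items still consume read capacity.
        FilterExpression=Attr("OrderTotal").gt(100),
        # Strongly consistent read: costs twice the read capacity of an eventually consistent one.
        ConsistentRead=True,
    )
    items = resp["Items"]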

Scan

  • It is possible to iterate through each item of a table without using primary or secondary indexes.
  • Scans are slower and consume a large amount of capacity. Queries are preferred when looking up specific table items.

Streams - Change Log of DynamoDB

  • Time-ordered change log; records remain available for 24 hours
  • Can use a Lambda function or a KCL (Kinesis Client Library) application to process the stream (a handler is sketched below)
  • Can use to replicate to a different region (open source library available)
  • Create derivative table
  • Update ElastiCache
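
A minimal sketch of a Lambda handler over the stream; the record fields follow the DynamoDB Streams event format, while the handling logic (cache or derivative-table updates) is just a placeholder.

    def handler(event, context):
        for record in event["Records"]:
            event_name = record["eventName"]                 # INSERT, MODIFY or REMOVE
            keys = record["dynamodb"]["Keys"]                # primary key, in attribute-value format
            new_image = record["dynamodb"].get("NewImage")   # present if the stream view includes new images
            if event_name in ("INSERT", "MODIFY"):
                pass  # e.g. update ElastiCache or a derivative table here
            elif event_name == "REMOVE":
                pass  # e.g. invalidate a cache entry or delete from the derivative table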

Security

  • Access can be controlled at the table level as well as at the level of individual items and attributes (via IAM policies and roles assumed by users)

APIs

  • DynamoDB operations are performed over HTTP using a POST request. All requests are signed and authenticated.
  • A low-level API is available for constructing, signing, and making the requests. 
  • Two higher-level APIs are also available: a document-based API and an object persistence API.

Best Practices

Understand how partitioning works

  • DynamoDB creates additional partitions for your table, as the amount of data in your table grows, or as you provision additional read and write capacity. Each partition can potentially be on a different compute node.
  • Each partition is limited to 10 GB, so if your table is over 10 GB, it will have at least two partitions.
  • The read capacity of a partition is limited to 3,000 RCUs (read capacity units), and the write capacity of a partition is limited to 1,000 WCUs (write capacity units). So if you provision 3,001 RCUs, your table will have at least two partitions.
  • The number of partitions in a table is the MAX of required partitions per the previous two criteria (data size and capacity).
  • The total capacity of the table cannot be realized if your data (and the traffic against it) is not evenly distributed amongst partitions. Say you have two partitions, P1 and P2, and you have specified 6,000 RCUs for the table; P1 and P2 are each individually limited to 3,000 RCUs.
  • The most important takeaway from all this discussion is to use hash keys that spread requests over a large number of partitions. A back-of-the-envelope estimate of partition count is sketched below.
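
As a back-of-the-envelope sketch of the limits above (a heuristic based on the figures in this section, not DynamoDB's exact internal formula):

    import math

    def estimate_partitions(table_size_gb, rcu, wcu):
        by_size = math.ceil(table_size_gb / 10.0)              # 10 GB per partition
        by_capacity = math.ceil(rcu / 3000.0 + wcu / 1000.0)   # 3,000 RCUs / 1,000 WCUs per partition
        return max(by_size, by_capacity, 1)

    # Example: a 25 GB table provisioned at 6,000 RCUs and 1,000 WCUs needs at least
    # max(ceil(25/10), ceil(6000/3000 + 1000/1000)) = max(3, 3) = 3 partitions,
    # so each partition serves roughly 6,000 / 3 = 2,000 RCUs.
    print(estimate_partitions(25, 6000, 1000))  # -> 3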

Use varied hash keys

  • Use varied hash keys, if possible, so that the data is distributed to different partitions and no particular partition gets hot. 
  • For example, say you want to increase a counter for a particular item ABC. For a given item, save multiple counters (stored in multiple partitions), ABC_1...ABC_10 (write sharding). 
  • To get the total, add up counters in the 10 partitions (scatter and gather query). Large number of requests can be handled and no particular partition gets too hot.
  • This makes getting a specific item more cumbersome (you need to look in all 10 partitions). To get around this, calculate the shard number from some inherent attribute of the item; when looking up an item, use that attribute value to determine which hash key (partition) to hit. Example shard calculation: val % (num shards) + 1. A sharded counter is sketched below.
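
A sketch of such a sharded counter with scatter-gather reads (table and attribute names are hypothetical; here the shard is chosen randomly on write):

    import random
    import boto3

    NUM_SHARDS = 10
    table = boto3.resource("dynamodb").Table("Counters")

    def increment(counter_id):
        # Spread writes over ABC_1 ... ABC_10 so no single partition gets hot.
        shard = random.randint(1, NUM_SHARDS)
        table.update_item(
            Key={"CounterId": "{}_{}".format(counter_id, shard)},
            UpdateExpression="ADD CounterValue :inc",
            ExpressionAttributeValues={":inc": 1},
        )

    def total(counter_id):
        # Scatter-gather: read every shard and add them up.
        count = 0
        for shard in range(1, NUM_SHARDS + 1):
            resp = table.get_item(Key={"CounterId": "{}_{}".format(counter_id, shard)})
            count += int(resp.get("Item", {}).get("CounterValue", 0))
        return count

If the shard is instead computed deterministically from an inherent attribute of the item (val % NUM_SHARDS + 1, as above), a point lookup only needs to read the single shard that attribute maps to.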

How to test at scale

  • When you start out, your production tables will be small
  • Over time, the data size will increase resulting in additional partitions being added to the table
  • Based on how partitioning works and how your data is distributed among partitions, the resulting performance might be drastically different from what you started with.
  • To test at a scale that is in line with your production loads, you would want to create a table with about the same number of partitions as you expect in production.
  • There are two ways to create a table with lots of partitions. 
  • You can load the table with a large amount of data (recall that if the table grows beyond 10 GB, an additional partition is added).
  • An easier way is to initially create the table with a very large provisioning of RCUs and WCUs (recall each partition is limited to 3,000 RCUs/1,000 WCUs). After the table has been created with multiple partitions, you can reduce the table's throughput provisioning (the partitioning remains the same even after the reduction) and perform your test. Keep the storage-to-capacity ratio the same as you would expect in production to get representative behavior.

Read/Write throughput throttling

  • When a particular partition gets hot, i.e. receives more read or write requests than its share of the provisioned capacity supports, throttling occurs (errors surface on your application side).
  • If this load occurs in short bursts, throttling may not occur, because DynamoDB reserves a portion of your unused capacity for later "bursts" of throughput usage. DynamoDB currently reserves up to 5 minutes (300 seconds) of unused read and write capacity, but you cannot count on it being available at all times.

Use parallel processing when bulk export and upload

  • When doing bulk export, use parallel scan (so that one partition does not get hot). 
  • When uploading items in a sequence, mix up the items based on their partition key so that the upload keeps all partitions equally busy.
  • Limit the page size and rate so that the capacity is not totally consumed by the export/upload, starving the table's throughput for production needs. A parallel scan is sketched below.
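
A sketch of a parallel scan, with each worker handling one segment and a small page size (the table name is hypothetical):

    import boto3
    from concurrent.futures import ThreadPoolExecutor

    TOTAL_SEGMENTS = 4
    table = boto3.resource("dynamodb").Table("Orders")

    def scan_segment(segment):
        # Scan one segment, page by page (Limit keeps each page small).
        items, start_key = [], None
        while True:
            kwargs = {"Segment": segment, "TotalSegments": TOTAL_SEGMENTS, "Limit": 100}
            if start_key:
                kwargs["ExclusiveStartKey"] = start_key
            resp = table.scan(**kwargs)
            items.extend(resp["Items"])
            start_key = resp.get("LastEvaluatedKey")
            if not start_key:
                return items

    # One worker per segment, so the read load is spread across partitions.
    with ThreadPoolExecutor(max_workers=TOTAL_SEGMENTS) as pool:
        all_items = [item for segment_items in pool.map(scan_segment, range(TOTAL_SEGMENTS))
                     for item in segment_items]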

Keep item size small

  • To conserve read and write capacity, limit item size
  • Store large data as blobs in S3 and keep URLs/pointers to it in DynamoDB.
  • For large records, use master-detail items. The master contains "header" information; details go in separate items. This way less read capacity is required to get a list of masters.

How to support different access patterns

  • Use Global Secondary Indexes to support different access patterns. Use GSI to model N:M relationships.
  • Hash and range keys can be used to model 1:N relationships. Use Local Secondary Indexes (LSI) to support different access patterns. 
  • Use composite range keys where appropriate. For example, INPROGRESS#20151120. Now we can easily find all INPROGRESS items by using begins_with("INPROGRESS#"); see the sketch below.
  • DynamoDB only writes to an index if there is a value for the attribute. In other words, DynamoDB uses sparse indexes, i.e. does not use capacity when there is no attribute value to be written to the index. 
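
For example, with a composite range key of the form STATUS#DATE (the "Jobs" table, the "StatusDateIndex" index and its attributes are hypothetical):

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("Jobs")

    resp = table.query(
        IndexName="StatusDateIndex",  # hypothetical index keyed on CustomerId / StatusDate
        KeyConditionExpression=Key("CustomerId").eq("c-42")
                               & Key("StatusDate").begins_with("INPROGRESS#"),
    )
    in_progress = resp["Items"]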

How to handle time series data

  • For time series data, it is better to have separate tables for periods of time (daily, monthly, weekly). 
  • Old data can be deleted by just deleting the table.
  • Provisioning on old tables that are not accessed as frequently can be reduced to save cost.

Use separate tables for hot and cold data

  • Don't mix hot and cold partitions in one table. 
  • Use separate tables and adjust provisioning accordingly. 
  • Some data might be write intensive while other data may be read intensive; recall that read and write capacity are provisioned separately for a table. 
  • Possibly, use different tables for different user types (e.g. real time, analytics)

Use conditional updates to save on consumed capacity and consistency

  • Items in a table can be conditionally updated. For example, the API can specify that an item should only be updated if one of its attributes has a certain value on the server. Say we want to allow a person to vote only once: write the vote item with a condition that fails if a vote for that person already exists (see the sketch below).
  • Another way to implement conditional updates is the Read/Modify/Write pattern. The DynamoDB mapper API allows you to maintain a "Version" attribute on an entity; the write succeeds only if the item's version has not changed since we read and modified it (optimistic locking).
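
A sketch of the voting example using a condition expression (table and attribute names are hypothetical):

    import boto3
    from botocore.exceptions import ClientError

    table = boto3.resource("dynamodb").Table("Votes")

    def cast_vote(voter_id, candidate):
        try:
            table.put_item(
                Item={"VoterId": voter_id, "Candidate": candidate},
                # Reject the write if a vote for this voter already exists.
                ConditionExpression="attribute_not_exists(VoterId)",
            )
            return True
        except ClientError as err:
            if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
                return False  # this person has already voted
            raise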

Use transactions library when you must support transactions

  • Use the Java SDK transaction library for updating multiple items in a transaction (uses conditional updates, version attribute of items and two additional tables). 
  • Alternatively, augment DynamoDB with a RDBMS for recording transactions and duplicating values into DynamoDB tables.

Use eventual consistency

  • Use eventually consistent reads whenever possible; they use half the read capacity of strongly consistent reads.
  • The data is typically consistent on all replicas within about a second anyway.

Use application cache to save on read capacity

  • Use an application cache (such as Amazon ElastiCache) for hot reads. 
  • This will reduce the read capacity needed.

Adjust table provisioning with changing loads

  • Table provisioning can be changed throughout the day based on traffic. This saves cost.

Use indexes for performance and for saving capacity

  • Use global or local secondary indexes to reduce the amount of provisioned read capacity required to quickly find items.

Use streams for propagating changes made to DynamoDB items

  • DynamoDB can publish changes (essentially a DB change log) to a stream that can be processed by another client (such as a Lambda function or a KCL client). 
  • Streams are ordered, and persistent for 24 hours. 
  • Streams can be used to replicate data from one region to another (cross region replication library is available).  
  • Can use to perform aggregations. 
  • Can use to create indexes/derivative tables. 
  • Can use Lambda to update an application cache when an item in table changes. 
  • All this can be done in near real-time.

Data Analysis

  • DynamoDB is designed for fast key-based access, not for slicing and dicing of data.
  • Export data to a RDBMS or data warehouse such as Amazon Redshift for analytics. 
  • Can use EMR. 
  • Can also use Streams to analyze in real time.

Keep back ups of your tables

  • For backups, export DynamoDB tables to S3.