Friday, December 18, 2015

Redis - Introduction

What is it?

  • An in-memory database
  • Can be persisted to storage
  • A key-value NoSQL database. 
  • Every value, whatever its data type, is stored under a unique key name. Use the key and a command to get the value (or a field within the value)
  • Can set expiration on keys
  • Data types mimic JSON (sets, lists)
  • A single Redis node can only scale vertically (keys need to stay together on one node to manage sorted sets)
  • May partition Redis into special purpose clusters
  • Good fit for leaderboards (sorted sets keep track of counts)
  • Can perform union of sets (useful for recommendation engines, perform unions to find related items)
  • Share data at high speed across a fleet of instances
  • Can have a primary endpoint for writes and read replicas for reads (read replicas are updated asynchronously); a read replica can be used for failover. The app needs to be aware of which endpoint is for reads and which one is for writes. The write endpoint is the primary and has a non-changing DNS name.
  • Redis can be used as primary database. Use replicas for availability, create snapshot to S3 off the read replicas
  • Large key names take up space, create smaller hashes for these keys
  • A single Redis data structure (the value under one key) cannot be horizontally sharded across nodes.
  • A Redis primary node can handle both reads and writes from the app. Redis replica nodes (up to 5) can only handle reads, similar to Amazon RDS Read Replicas. Redis asynchronously replicates the data from the primary to the read replicas.
  • Because Redis supports replication, you can also fail over from the primary node to a replica in the event of failure.
  • ElastiCache for Redis has the concept of a primary endpoint, which is a DNS name that always points to the current Redis primary node. If a failover event occurs, the DNS entry will be updated to point to the new Redis primary node. To take advantage of this functionality, make sure to configure your Redis client so that it uses the primary endpoint DNS name to access your Redis cluster.
  • In order to split reads and writes, you will need to create two separate Redis connection handles in your application: one pointing to the primary node, and one pointing to the read replica(s). Configure your application to write to the DNS primary endpoint, and then read from the other Redis nodes.
  • ElastiCache auto-failover will update the DNS primary endpoint with the IP address of the promoted read replica. If your application is writing to the primary endpoint as recommended earlier, no application change will be needed for writes. However, because you read from individual node endpoints, you will need to replace the read endpoint of the replica that was promoted to primary with the new replica's endpoint.
  • For Redis, because the engine is single-threaded, you will need to multiply the CPU percentage by the number of cores to get an accurate measure of CPU usage. Once Redis maxes out a single CPU core, that instance is fully utilized, and a larger instance is needed. Suppose you're using an EC2 instance with four cores and the CPU is at 25 percent. This situation actually means that the instance is maxed out, because Redis is essentially pegging one CPU at 100 percent. 
  • Amazon ElastiCache publishes a number of notifications to Amazon SNS when a cluster change happens
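
The CPU rule of thumb from the list above can be written as a one-line calculation. This is only a sketch: the function name `redis_effective_cpu` is made up for illustration and is not part of any Redis or AWS tooling, and it assumes Redis is the only significant CPU consumer on the instance.

```python
def redis_effective_cpu(reported_percent: float, cores: int) -> float:
    """Scale a host-wide CPU percentage to the single core Redis runs on.

    Assumption: Redis is the only significant CPU consumer on the instance.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Redis is single-threaded, so the host-wide figure understates its load
    # by a factor equal to the core count. One core cannot exceed 100 percent.
    return min(reported_percent * cores, 100.0)


# The example from the text: four cores at 25 percent means one core is pegged.
print(redis_effective_cpu(25.0, 4))  # 100.0
```

A result near 100 suggests the instance is fully utilized, even though the reported figure looks low.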
Key
  \
    \
      Value (can be any of the following data types)

Data Types

  • Strings
  • Lists - List of strings, kept in the order they are added
  • Sets - Collection of unique strings; adding the same member again has no effect; not ordered
  • Hashes - Nested key-value stores, good for representing objects, good performance for up to 100 fields/keys
  • Sorted Sets - Items are unique and are kept sorted by a score attached to each item

General

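  • SCAN 0 - Start an incremental scan of all keys (returns a batch of keys plus a cursor; repeat with that cursor until it comes back as 0)
  • KEYS pattern - Show keys matching a pattern (KEYS * returns every key and can block the server on a large dataset)
  • TTL - Remaining time to live of a key, in seconds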
  • INFO - Information about the server (e.g. INFO CLIENTS)

String Commands

  • SET
  • GET
  • APPEND
  • INCR & DECR (if the string holds an integer)
  • GETRANGE (substring, get part of string)
  • MGET/MSET (get/set multiple values at the same time)
  • STRLEN

List Commands (Ordered List, Queues and Stacks)

  • LPUSH & RPUSH - add item to left or right side (performant)
  • LREM - remove elements
  • LSET - set the item at an existing index
  • LINDEX - item at a given index
  • LRANGE - items from given range
  • LLEN - length of list
  • LPOP & RPOP - remove from left or right side. imparts queue/stack semantics to lists
  • LTRIM - trim to specified range

Redis lists can be used to hold items in a queue. When a process takes an item from the queue to work on it, the item is pushed onto an "in-progress" queue, and then deleted when the work is done. Open source solutions such as Resque use Redis as a queue; GitHub uses Resque. Resque is a Redis-backed Ruby library for creating background jobs, placing them on multiple queues, and processing them later.
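
The "in-progress" queue pattern might look like the sketch below. It assumes a client object exposing `rpoplpush(src, dst)` and `lrem(name, count, value)`, which is how the redis-py client names these commands to my knowledge. `InMemoryClient` is a hypothetical stand-in so the sketch runs without a server, and the queue names are made up.

```python
class InMemoryClient:
    """Hypothetical stand-in for a Redis client; implements only what is used below."""

    def __init__(self):
        self.lists = {}

    def lpush(self, name, value):
        self.lists.setdefault(name, []).insert(0, value)

    def rpoplpush(self, src, dst):
        # In real Redis this move from the tail of src to the head of dst is atomic.
        if not self.lists.get(src):
            return None
        value = self.lists[src].pop()
        self.lists.setdefault(dst, []).insert(0, value)
        return value

    def lrem(self, name, count, value):
        # Simplified: removes the first match only, which is enough for count=1.
        items = self.lists.get(name, [])
        if value in items:
            items.remove(value)
            return 1
        return 0


PENDING = "jobs:pending"          # assumed queue names
IN_PROGRESS = "jobs:in-progress"


def claim_job(client):
    """Take the oldest pending job and park it on the in-progress list."""
    return client.rpoplpush(PENDING, IN_PROGRESS)


def finish_job(client, job):
    """Delete the job from the in-progress list once the work is done."""
    client.lrem(IN_PROGRESS, 1, job)


client = InMemoryClient()
client.lpush(PENDING, "job-1")
client.lpush(PENDING, "job-2")

job = claim_job(client)   # "job-1": pushed first, so it sits at the tail
finish_job(client, job)
```

If a worker dies between claiming and finishing, the job is still on the in-progress list, so another process can find it and retry.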

Set Commands

  • SADD - adds a member
  • SCARD - number of members
  • SDIFF, SINTER, SUNION - set math (difference, intersection, union)
  • SISMEMBER - is the value a member of the set
  • SMEMBERS - all members
  • SMOVE - move item from one set to another
  • SREM - remove one or more items

Hash Commands

  • HSET - set value of hash field
  • HMSET - set multiple hash values at the same time
  • HGET - get value of hash field
  • HMGET - get multiple values
  • HGETALL - all fields and values
  • HDEL - delete one or more fields (and their values)
  • HEXISTS - does a field exist
  • HINCRBY - increase a number value by given amount
  • HKEYS - all keys in hash
  • HVALS - all values in hash

Sorted Set Commands

  • ZADD - add one or more items (updates score for the item)
  • ZCARD - number of members
  • ZCOUNT  - count members within the specified score range
  • ZINCRBY - increment score for item
  • ZRANGE - range of members by index
  • ZRANK - rank of member in a set by score
  • ZREM - removes item
  • ZSCORE - score for an item

Pub and Sub

  • Publish and subscribe to a channel - can be used as message bus
Client 1 (subscriber)
subscribe greeting-channel

Client 2 (publisher)
publish greeting-channel "hello"

Client 1 receives the "hello" message because it has subscribed to that channel

Patterned channel name
psubscribe greet*

unsubscribe
punsubscribe
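
To make the delivery rules concrete, here is a toy in-process model of the subscribe / psubscribe / publish flow above. It is not a Redis client: `TinyBus` and the inbox lists are invented for illustration, and Python's `fnmatchcase` is assumed to be close enough to Redis glob patterns for a simple pattern like `greet*`.

```python
from fnmatch import fnmatchcase


class TinyBus:
    """Toy in-process model of Redis pub/sub; not a Redis client."""

    def __init__(self):
        self.channels = {}   # channel name -> list of inboxes
        self.patterns = {}   # glob pattern -> list of inboxes

    def subscribe(self, channel, inbox):
        self.channels.setdefault(channel, []).append(inbox)

    def psubscribe(self, pattern, inbox):
        self.patterns.setdefault(pattern, []).append(inbox)

    def publish(self, channel, message):
        """Deliver to exact and pattern subscribers; return the delivery count."""
        delivered = 0
        for inbox in self.channels.get(channel, []):
            inbox.append((channel, message))
            delivered += 1
        for pattern, inboxes in self.patterns.items():
            if fnmatchcase(channel, pattern):
                for inbox in inboxes:
                    inbox.append((channel, message))
                    delivered += 1
        return delivered


bus = TinyBus()
client1, client3 = [], []                 # each list plays the role of a client
bus.subscribe("greeting-channel", client1)
bus.psubscribe("greet*", client3)

count = bus.publish("greeting-channel", "hello")
missed = bus.publish("news", "nobody is listening")
```

Both clients receive "hello", one through the exact channel name and one through the pattern. The second publish reaches no one and is simply dropped, which mirrors the fact that Redis pub/sub does not store messages for later subscribers.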

Transactions

Atomic transactions. No rollback.

set account-a 100
set account-b 200

Transfer 50 from a to b.

// Start a transaction block (use multi) - commands are queued (not executed yet)

multi                         // start transaction queue
incrby account-a -50          // queued (incr takes no amount, so use incrby)
incrby account-b 50           // queued
exec                          // perform the transaction

50 is deducted from a and added to b atomically, but the transaction does not check whether the balance of either account changed before it ran.

What if account-a drops to 0 before the debit?

Need to use watch command.

watch account-a               // exec will abort if account-a changes from here on
multi                         // start transaction queue
incrby account-a -50          // queued
incrby account-b 50           // queued
exec                          // perform the transaction only if account-a has not changed
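
The check-then-commit behaviour of watch can be illustrated with a toy in-memory model. This is not a Redis client: `TinyStore`, `transfer`, and the `before_exec` hook are invented for illustration, and the version counter is just one simple way to model "has this key changed since it was watched".

```python
class TinyStore:
    """Toy in-memory model of WATCH/MULTI/EXEC; not a Redis client."""

    def __init__(self):
        self.data = {}
        self.versions = {}   # key -> number of writes seen

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def get(self, key):
        return self.data.get(key)

    def watch(self, key):
        # Remember which version of the key the caller has seen.
        return key, self.versions.get(key, 0)

    def execute(self, watched, queued):
        """Apply queued (key, delta) increments unless the watched key changed."""
        key, seen_version = watched
        if self.versions.get(key, 0) != seen_version:
            return None   # models exec replying nil: nothing was applied
        results = []
        for target, delta in queued:
            self.set(target, self.data.get(target, 0) + delta)
            results.append(self.data[target])
        return results


def transfer(store, src, dst, amount, before_exec=None):
    """Move amount from src to dst; return False if refused or aborted."""
    watched = store.watch(src)
    if store.get(src) < amount:
        return False                 # insufficient funds, nothing queued
    if before_exec is not None:
        before_exec()                # simulates another client writing in between
    queued = [(src, -amount), (dst, amount)]
    return store.execute(watched, queued) is not None


store = TinyStore()
store.set("account-a", 100)
store.set("account-b", 200)

ok = transfer(store, "account-a", "account-b", 50)
# A concurrent write empties account-a after the watch, so the transfer aborts.
aborted = transfer(store, "account-a", "account-b", 50,
                   before_exec=lambda: store.set("account-a", 0))
```

The first transfer succeeds; the second is abandoned without touching account-b. Against a real server the usual follow-up is to re-read the balance and retry the whole watch / multi / exec sequence.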

Client Libraries

  • C# - ServiceStack.Redis - Has options for setting timeouts and handling disconnected clients
  • Java - Jedis

Sizing

  • Estimate size - serialize the value, count the characters, multiply by 2, then add about 2K per entry for overhead (key, pointers)
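
The sizing rule above can be turned into a small calculator. This is a rough sketch under stated assumptions: JSON is assumed as the serialization format, "2K" is read as 2048 bytes added once per entry, and the function names are made up for illustration.

```python
import json

PER_ENTRY_OVERHEAD = 2 * 1024   # assumed reading of "2K" per entry (key, pointers)


def estimate_entry_bytes(value) -> int:
    """Rough size of one entry: serialized characters x 2, plus fixed overhead."""
    serialized = json.dumps(value)   # assumption: values are stored as JSON text
    return len(serialized) * 2 + PER_ENTRY_OVERHEAD


def estimate_total_bytes(values) -> int:
    """Sum the per-entry estimates for a collection of values."""
    return sum(estimate_entry_bytes(v) for v in values)


sample = {"id": 1, "name": "Ann"}
print(estimate_entry_bytes(sample))
print(estimate_total_bytes([sample] * 1_000_000))
```

For small values the fixed per-entry overhead dominates the estimate, so the number of entries matters more than the size of each one.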