AWS - DynamoDB

DynamoDB Deepdive

  • RDBMS is optimized for storage; NoSQL is optimized for compute
  • DynamoDB supports both Key-value and document data models

Terminology

Keys:

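  • Partition Key (also called Hash Key)

  • Sort Key (also called Range Key)

    • Local secondary index (LSI)
    • Global secondary index (GSI) (max 5 GSIs per table in the talk; the default quota has since been raised to 20)
    • If the data under one partition key value exceeds 10 GB, use a GSI (a table with an LSI is limited to 10 GB per partition key value)
    • Primary Key = Partition Key, or Partition Key + Sort Key
  • Attribute

  • Composite attributes / composite key: a way of constructing the partition key from several attributes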

  • The Partition Key decides which partition an item belongs to (by building an unordered hash index)

  • A partition has a 10 GB limit; if the total storage for one partition key exceeds that limit, the sort key is used to split the data across partitions.

  • Table --> Items --> Attributes --> Partition Key (mandatory attribute); Sort Key (optional)

  • Each DynamoDB partition has 3 copies in total (including itself); a write returns success once 2 of the copies have been written.

  • Hash-Range Table: a table where Hash Key + Range Key together identify an item.

Scaling

  • Scaling on throughput: WCU and RCU (CU = Capacity Unit)
    • Partitions needed = roundup((total RCU / 3,000) + (total WCU / 1,000))
  • Scaling on size (max size per item = 400 KB, max size per partition = 10 GB)
  • Final partition count = max(partitions by throughput, partitions by size)
  • Heat map: shows, by time and by partition, which data is being requested. If all access is concentrated on one partition, we have hot keys, which should be avoided by redesigning the partition key.
  • DynamoDB burst capacity is built in
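
The partition arithmetic above can be tried out with a small sketch. It assumes the figures quoted in these notes (3,000 RCU and 1,000 WCU per partition, 10 GB per partition) and assumes that the final count is the larger of the two estimates. The function names are made up for illustration, and the real service may size partitions differently.

```python
import math

# Per-partition limits as quoted in these notes (assumptions, not an API).
RCU_PER_PARTITION = 3000
WCU_PER_PARTITION = 1000
GB_PER_PARTITION = 10


def partitions_by_throughput(total_rcu, total_wcu):
    """Partitions needed to serve the provisioned read and write capacity."""
    return math.ceil(total_rcu / RCU_PER_PARTITION + total_wcu / WCU_PER_PARTITION)


def partitions_by_size(total_size_gb):
    """Partitions needed to hold the data (assumed: one per 10 GB, rounded up)."""
    return math.ceil(total_size_gb / GB_PER_PARTITION)


def total_partitions(total_rcu, total_wcu, total_size_gb):
    """Take the larger of the throughput-based and size-based estimates."""
    return max(
        partitions_by_throughput(total_rcu, total_wcu),
        partitions_by_size(total_size_gb),
    )


if __name__ == "__main__":
    # 5,000 RCU + 500 WCU on 8 GB of data: throughput dominates.
    print(total_partitions(5000, 500, 8))
    # 1,000 RCU + 1,000 WCU on 45 GB of data: size dominates.
    print(total_partitions(1000, 1000, 45))
```

Note that provisioned throughput is divided evenly across partitions, so a table that grows to many partitions because of size gets less throughput per partition.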

Data Modeling

Store the data how you will access it.

  • New feature since 2015: Support documents (JSON)

Patterns

  • Use Lambda as DynamoDB Stored Procedure

  • Real Time Voting

    • Write sharding (add a random suffix to the key so that writes are spread across multiple partitions)
  • Event logging – Don’t mix hot data and cold data

    • Time series table (static TTL timestamps)
      • Archive cold data to S3
      • Pre-create daily, weekly, and monthly tables, and provision capacity mainly for the current table
    • Time series table (dynamic TTL timestamps): data is still being updated
      • Have a GSI key attribute to label recently updated data (??? 36min:36s)
      • Use the GSI key as the partition key and the update timestamp as the sort key
      • Use Lambda to filter out expired data and rotate it into the data lake
  • Product Catalog - Black Friday

    • Use a cache; the logic for updating the cache can be implemented in Lambda
  • Common Pattern - Online Gaming (filter)

    • Problem: filtering on an attribute that is not the sort key. The engine reads every item that matches the key condition and only then applies the filter.
      • Solution 1: create a composite key (in MongoDB it is called a compound index)
      • Solution 2: sparse indexes
  • Messaging App - mixing small and large attributes in one table is wrong

    • Large attributes consume more RCU
    • Solution: split the table. Separate metadata and message body into different tables and provision RCU for each accordingly.
  • Sparse Index – an index created on an optional attribute

    • For example, a game-score table has an attribute called award. To filter by award, create a GSI on award. Although this attribute is missing on most items, only the items that have it appear in the index, so the index stays small.
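
The write-sharding idea from the Real Time Voting pattern can be sketched in a few lines. This is a minimal illustration that uses a plain dict in place of the table. The shard count, the key format `candidate_N`, and the helper names are all assumptions made for the example, not anything DynamoDB prescribes.

```python
import random

SHARD_COUNT = 10  # hypothetical number of shards per hot key


def sharded_key(candidate_id, shard_count=SHARD_COUNT, rng=random):
    """Append a random shard suffix so writes land on different partition keys."""
    return f"{candidate_id}_{rng.randrange(shard_count)}"


def all_shard_keys(candidate_id, shard_count=SHARD_COUNT):
    """Every partition key that may hold part of this candidate's count."""
    return [f"{candidate_id}_{n}" for n in range(shard_count)]


def record_vote(counts, candidate_id, rng=random):
    """Write path: increment the counter stored under one random shard."""
    key = sharded_key(candidate_id, rng=rng)
    counts[key] = counts.get(key, 0) + 1


def total_votes(counts, candidate_id):
    """Read path: sum the counters of all shards to get the real total."""
    return sum(counts.get(key, 0) for key in all_shard_keys(candidate_id))


if __name__ == "__main__":
    table = {}  # stands in for the DynamoDB table
    for _ in range(1000):
        record_vote(table, "candidate_A")
    print(total_votes(table, "candidate_A"))
```

The trade-off is that reads become more expensive: the total has to be gathered from every shard, which in the talk's pattern is done by a background aggregation step rather than on each request.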

MVCC – Transaction model in a NoSQL DB

  • Multiversion Concurrency Control
    • For example, creating partition locks.
      • Manage versioning across items with metadata
      • Tag attributes to maintain multiple versions
      • Code your app to recognize when updates are in progress
      • App layer error handling and recovery logic

New Feature: DynamoDB Streams

  • Stream of updates to a table (asynchronous)
  • Exactly once; strictly ordered (per item)
  • 24-hour lifetime
  • Cross-region replication uses this feature
  • Also enables multi-master scenarios
  • Pattern: use Lambda to read the stream and apply updates
  • Pattern: use Kinesis to interact with the stream and aggregate into multiple targets
  • Use DynamoDB Streams to update ElastiCache content
  • ListStreams / DescribeStream / GetShardIterator…
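
The "Lambda reads the stream and updates a cache" pattern above might look roughly like the sketch below. The record layout (`Records`, `eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) is the shape Lambda receives from a DynamoDB stream. The dict standing in for the cache and the function name `apply_stream_records` are assumptions for illustration; a real function would call an ElastiCache client instead.

```python
import json

# Stand-in for a real cache client (assumption for this sketch).
LOCAL_CACHE = {}


def apply_stream_records(event, cache):
    """Replay stream records, in order, onto a key -> item cache."""
    for record in event.get("Records", []):
        change = record["dynamodb"]
        # Serialize the primary key attributes into a stable cache key.
        cache_key = json.dumps(change["Keys"], sort_keys=True)
        if record["eventName"] == "REMOVE":
            cache.pop(cache_key, None)
        else:
            # INSERT or MODIFY. NewImage is only present when the stream
            # view type includes new images.
            cache[cache_key] = change.get("NewImage", {})
    return cache


def handler(event, context):
    """Lambda entry point: apply the batch and report how many records it had."""
    apply_stream_records(event, LOCAL_CACHE)
    return {"processed": len(event.get("Records", []))}
```

Because records for the same item arrive in order, replaying them sequentially leaves the cache holding the latest image of each item.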

Reference

Deepdive DynamoDB — Very good
https://youtu.be/bCW3lhsJKfw

Deepdive DynamoDB
https://youtu.be/VuKu23oZp9Q

How to choose the right key
https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/

038.mp4 039.mp4 – DynamoDB Overview

NoSQL DB

  • Secondary indexes
  • Atomic Counters

Terminology

040.mp4 – DynamoDB Handson

  • Create a Table

  • Select the Partition Key & Sort Key for the table; select the Partition Key & Sort Key for each index

  • Add item from Console

  • Prepare a JSON file to bulk upload data

  • Use the command line to batch-upload the data

    aws dynamodb batch-write-item --request-items file://path/to/prepared.json
  • Use the command line to query the table
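
The prepared JSON file has to follow the `batch-write-item` request format: a map from table name to a list of `PutRequest` entries, with every attribute wrapped in a type descriptor such as `S` or `N`, and at most 25 requests per call. A minimal sketch for generating such a file is shown here. The table name `Music`, the sample attributes, and the helper names are made up for illustration, and only strings, numbers, and booleans are handled.

```python
import json

MAX_REQUESTS_PER_BATCH = 25  # batch-write-item limit per call


def to_attribute_value(value):
    """Wrap a Python value in a DynamoDB type descriptor."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def build_request_items(table_name, items):
    """Build the --request-items payload for one batch of put requests."""
    if len(items) > MAX_REQUESTS_PER_BATCH:
        raise ValueError("a single batch-write-item call accepts at most 25 requests")
    return {
        table_name: [
            {"PutRequest": {"Item": {k: to_attribute_value(v) for k, v in item.items()}}}
            for item in items
        ]
    }


if __name__ == "__main__":
    # Hypothetical table and items, for illustration only.
    songs = [
        {"Artist": "No One You Know", "SongTitle": "Call Me Today", "Year": 2015},
        {"Artist": "Acme Band", "SongTitle": "Happy Day", "Year": 2010},
    ]
    print(json.dumps(build_request_items("Music", songs), indent=2))
```

Redirecting the printed output to a file gives something that can be passed to `--request-items` with the `file://` prefix.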

Supported Data types
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBMapper.DataTypes.html

Table name rule
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html#HowItWorks.NamingRules
Allowed characters: letters (a-z, A-Z), digits (0-9), underscore (_), hyphen (-), and dot (.); the name must be 3 to 255 characters long
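
The naming rule can be checked before calling the API with a small sketch like this one, which assumes the rule from the linked page: 3 to 255 characters drawn from letters, digits, underscore, hyphen, and dot. The function name is made up for illustration.

```python
import re

# Letters, digits, underscore, hyphen, dot; 3 to 255 characters.
TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,255}")


def is_valid_table_name(name):
    """Return True if the name satisfies the DynamoDB table naming rule."""
    return TABLE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["game-scores.v1", "ab", "bad name"]:
        print(candidate, is_valid_table_name(candidate))
```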
