Hey Hey Hey



AWS - Amazon Route 53

Posted on 2019-07-23 |

Route53 Resolver (Released in 2018.12)

  • Issue: in a hybrid architecture, the VPC can’t resolve data center names, and the data center can’t resolve VPC private DNS names.
  • Traditional workaround:
    • spin up EC2 instances running BIND or Unbound as DNS servers that forward requests to the VPC “+2” resolver
    • you need to plan for failover, and sometimes run a group of DNS servers per VPC
  • This requirement is called recursive DNS lookup.

How Route53 Resolver works

  • works within a single region only (it can’t span regions)
  • multiple VPCs across multiple accounts (as long as they are in the same region) can share the same Resolver endpoint
  • You need to provision ENIs for the Resolver endpoint; for HA and performance, provision multiple ENIs
    • Each endpoint, with its ENIs, serves one direction of queries (for example, from the VPC to on-premises)
  • When a query is received, it is checked against all resolver rules; if none match, it is treated as local.
    • rules can be shared between accounts (via Resource Access Manager, RAM)
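
The rule check described above can be modelled in a few lines. This is only a sketch of the idea, not AWS's implementation: it assumes that when several rules match, the one with the most specific domain name wins, and that a "." rule matches everything. The `Rule` class, the `target` labels, and the example domains are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    domain: str  # e.g. "corp.example.com", or "." for a catch-all rule
    target: str  # hypothetical label for where matching queries are sent


def _labels(name: str) -> List[str]:
    """Split a DNS name into lowercase labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def pick_rule(query: str, rules: Sequence[Rule]) -> Optional[Rule]:
    """Return the matching rule with the most specific domain, or None."""
    q = _labels(query)
    best: Optional[Rule] = None
    best_len = -1
    for rule in rules:
        r = _labels(rule.domain)  # "." becomes [] and therefore matches any name
        # A rule matches when its labels are a suffix of the query's labels.
        if len(r) <= len(q) and q[len(q) - len(r):] == r:
            if len(r) > best_len:
                best, best_len = rule, len(r)
    return best


if __name__ == "__main__":
    rules = [Rule(".", "on-prem DNS"), Rule("corp.example.com", "corp DNS")]
    for name in ("db.corp.example.com", "example.org"):
        match = pick_rule(name, rules)
        # No match means the query is treated as local.
        print(name, "->", match.target if match else "local")
```

Matching on whole labels rather than raw string suffixes keeps a name such as notcorp.example.com from matching a rule for corp.example.com.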

Route 53 Resolver Demo

  • Resolving sequence

    • Auto-defined rules: VPC / Private Hosted Zones / Internet Resolver
    • Extra rules
      • tip: have a “.” rule act as the default forwarding rule, so anything that doesn’t fit the auto-defined rules goes to ns.mycompany.com
      • tip: ns.mycompany.com has its own “.” rule to recurse out to the internet if no rule matches
      • tip: add a rule that forwards any request for acquiredcompany.com to ns.acquiredcompany.com
  • APIs used in the demo:

    • API to create endpoints; an endpoint needs an attached security group that allows port 53
    • API to create rules
    • API to share defined rules
  • Monitoring: CloudWatch and CloudTrail
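
As a rough illustration of the endpoint and rule APIs listed above, the sketch below only builds the request arguments, so it runs without AWS credentials. It assumes the boto3 `route53resolver` client and its `create_resolver_endpoint` / `create_resolver_rule` calls; the helper function names and all IDs, names, and IP addresses are placeholders, not values from the demo.

```python
import uuid
from typing import Any, Dict, List


def outbound_endpoint_request(name: str, security_group_id: str,
                              subnet_ids: List[str]) -> Dict[str, Any]:
    """Keyword arguments for route53resolver.create_resolver_endpoint."""
    return {
        "CreatorRequestId": str(uuid.uuid4()),   # idempotency token
        "Name": name,
        "Direction": "OUTBOUND",                 # VPC -> on-premises
        "SecurityGroupIds": [security_group_id],  # must allow port 53
        # One entry per ENI; use subnets in different AZs for HA.
        "IpAddresses": [{"SubnetId": subnet} for subnet in subnet_ids],
    }


def forward_rule_request(name: str, domain: str, endpoint_id: str,
                         target_ips: List[str]) -> Dict[str, Any]:
    """Keyword arguments for route53resolver.create_resolver_rule."""
    return {
        "CreatorRequestId": str(uuid.uuid4()),
        "Name": name,
        "RuleType": "FORWARD",
        "DomainName": domain,
        "ResolverEndpointId": endpoint_id,
        "TargetIps": [{"Ip": ip, "Port": 53} for ip in target_ips],
    }


# Hypothetical usage with boto3 (not executed here):
#   client = boto3.client("route53resolver")
#   ep = client.create_resolver_endpoint(**outbound_endpoint_request(...))
#   rule = client.create_resolver_rule(
#       **forward_rule_request(..., ep["ResolverEndpoint"]["Id"], ...))
#   client.associate_resolver_rule(
#       ResolverRuleId=rule["ResolverRule"]["Id"], VPCId="vpc-...")
```

Sharing a rule with another account goes through Resource Access Manager rather than the Resolver API itself.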

Terminology

Authoritative DNS
Recursive DNS

Reference

https://youtu.be/D1n5kDTWidQ


AWS - Database family

Posted on 2019-07-22 |

Use the right Tool for the right job

Aurora benefits:

  • Up to 5x the throughput of MySQL and 3x that of PostgreSQL

  • Up to 15 read replicas

  • Six copies of data across 3 AZs, with continuous backup to S3

  • AWS DMS (Database Migration Service)

New Tools

Data tools do not compete with each other; they complement each other.
Pick the use case first, then apply the corresponding technology.

  • RDB
  • Key-value
  • Document
  • In-memory
  • Graph (Neptune)
  • Time-Series
  • Ledger

RDB Key-value Graph

RDB: data integrity ; transaction
Key-value: partitioned by keys, consistent performance at scale
Graph: Vertices and Edges

Case Study

  • Airbnb

    • DynamoDB for user search history
    • ElastiCache: caching
    • RDS: transaction data
  • A book store

    • Used DynamoDB (key-value) to store book information
    • Elasticsearch: stream DynamoDB changes to trigger a Lambda function that writes them into the Elasticsearch index
    • Leaderboard: use ElastiCache; (???) sorting
    • Recommendation engine: use a graph DB to record people, books, and purchases
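
The DynamoDB-to-Elasticsearch step in the book store case could look roughly like the sketch below. It assumes the standard DynamoDB Streams event shape delivered to Lambda (`Records`, `eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) and a stream configured to include new images. The action dictionaries and the `isbn` key are hypothetical, and the actual write to the search index is left out.

```python
from typing import Any, Dict, List


def _plain(attr: Dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed attribute, e.g. {"S": "Dune"}, to a plain value."""
    (kind, value), = attr.items()
    if kind == "N":
        return float(value) if "." in value else int(value)
    if kind == "NULL":
        return None
    if kind == "L":
        return [_plain(item) for item in value]
    if kind == "M":
        return {key: _plain(item) for key, item in value.items()}
    return value  # S and BOOL map directly; sets and binary pass through unchanged


def to_index_actions(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a DynamoDB Streams event into index/delete actions for a search index."""
    actions = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        # Build a document id from the key attributes, in a stable order.
        doc_id = "|".join(str(_plain(v)) for _, v in sorted(data["Keys"].items()))
        if record["eventName"] == "REMOVE":
            actions.append({"op": "delete", "id": doc_id})
        else:  # INSERT or MODIFY
            doc = {key: _plain(v) for key, v in data["NewImage"].items()}
            actions.append({"op": "index", "id": doc_id, "doc": doc})
    return actions


def handler(event, context):
    """Lambda entry point: a real function would send the actions to the index."""
    actions = to_index_actions(event)
    return {"processed": len(actions)}
```

Keeping the conversion separate from the write makes the mapping easy to test without a running cluster.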

AWS - EFS

Posted on 2019-07-22 |

Reference

https://youtu.be/4FQvJ2q6_oA

  • AWS describes 3 main adoption patterns, which map to 3 storage categories
    • Re-Hosting – Block Storage
    • Re-Platform – File Storage – EFS
    • Re-Architecting – Object Storage

What’s new

  • EFS supports Linux only; the new FSx for Windows File Server covers Windows
  • FSx for Lustre
  • Multi-VPC access is supported
  • AWS DataSync: an initial full copy, then incremental transfers of changed data to the cloud; multi-threaded
  • TCO example: 100 GB of standard storage plus 400 GB of Infrequent Access comes to around $50/month

Deep Dive

  • Performance mode
    • General Purpose: focuses on low latency (max 7k IOPS); recommended starting point
    • Max I/O: focuses on I/O (higher latencies)
  • Throughput mode
    • Bursting Throughput: recommended starting point
    • Provisioned Throughput (can be decreased once every 24 hours)
  • EFS Infrequent Access (85% cheaper)
    • Automatic lifecycle management (applies to any file not accessed for more than 30 days)
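
The TCO figure quoted earlier (100 GB standard plus 400 GB infrequent access, around $50/month) can be sanity-checked with a small calculation. This is a sketch under stated assumptions: the $0.30 per GB-month standard price is an assumed list price and not from these notes, the Infrequent Access price is derived from the "85% cheaper" figure, and per-access charges for Infrequent Access are ignored.

```python
def efs_monthly_cost(standard_gb: float, ia_gb: float,
                     standard_price: float = 0.30,   # assumed USD per GB-month
                     ia_discount: float = 0.85) -> float:
    """Estimate monthly EFS storage cost in USD (storage only)."""
    ia_price = standard_price * (1 - ia_discount)  # 85% cheaper than standard
    return standard_gb * standard_price + ia_gb * ia_price


if __name__ == "__main__":
    print(f"${efs_monthly_cost(100, 400):.2f} per month")  # prints "$48.00 per month"
    print(f"${efs_monthly_cost(500, 0):.2f} per month")    # all-standard, for comparison
```

Under these assumptions the mix costs about $48, in line with the "around $50" estimate, against $150 if all 500 GB stayed in standard storage.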

Security Model

Network access via ACLs; file access via POSIX permissions or IAM; encryption; compliance with HIPAA etc.

Use case


AWS - GreenGrass

Posted on 2019-07-04 |

AWS - S3

Posted on 2019-06-21 |

Troubleshooting: Public Object Access Denied

  • The ACL and the bucket policy are both set to public
  • Public access is allowed at both the account and the bucket level
  • Observation: objects uploaded from the console work; objects uploaded from another account fail.

Add the option below to the upload so that the object is publicly readable and its owner keeps full control

--acl public-read

use aws-js-s3-explorer

https://github.com/awslabs/aws-js-s3-explorer


Kimball Dimensional Modeling Techniques Overview

Posted on 2019-06-17 |

Fundamental Concepts

Gather Business Requirements and Data Realities

Samples in the book

Chapter 1, DW/BI and Dimensional Modeling Primer, p. 5
Chapter 3, Retail Sales, p. 70
Chapter 11, Telecommunications, p. 297
Chapter 17, Lifecycle Overview, p. 412
Chapter 18, Dimensional Modeling Process and Tasks, p. 431
Chapter 19, ETL Subsystems and Techniques, p. 444

Collaborative Dimensional Modeling Workshops

Dimension models should be designed by folks who fully understand the business and their needs.

Four-Step Dimensional Design process

  • Select the business process
  • Declare the grain
  • Identify the dimensions
  • Identify the facts

Business Processes

Operational Activities


Kimball Dimensional Modeling Techniques applied to Inventory Sample

Posted on 2019-06-17 |

Value Chain Introduction

Three models are introduced for the value chain.

Inventory Periodic Model

  • Scenario: a grocery chain with 60,000 products * 100 stores; with a daily periodic snapshot there would be 60k * 100 = 6 million records per day.
  • Estimation
    • 14 bytes per row * 6 million rows = 84 MB per day; 3 years would be 84 MB * 1,095 days ≈ 92 GB of data
    • or keep 60 days of daily snapshots and roll older data up into weekly snapshots
  • Semi-Additive Facts
    • Be careful with SQL AVG when summarizing: inventory levels add up across products and stores but not across dates, so an average over time must divide by the number of time periods, not the number of rows
  • Enhanced Inventory Facts
    • Add more columns to the fact table, including quantity on hand and quantity sold
      • daily quantity sold / daily quantity on hand = number of turns
      • quantity sold over the whole year / average daily quantity on hand = number of turns for the year
      • estimated number of days’ supply = current quantity on hand / average quantity sold per day
    • Add inventory at cost and inventory value at the latest selling price

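
The storage estimate and the semi-additive calculations above can be worked through in a short script. This is a sketch for illustration: the function names and the tiny snapshot data set are made up, and sizes use decimal units (1 MB = 1,000,000 bytes).

```python
from collections import defaultdict


def storage_estimate(products=60_000, stores=100, bytes_per_row=14, days=1095):
    """Return (rows per day, MB per day, total GB) for a daily periodic snapshot."""
    rows_per_day = products * stores
    mb_per_day = rows_per_day * bytes_per_row / 1_000_000
    total_gb = mb_per_day * days / 1_000
    return rows_per_day, mb_per_day, total_gb


def average_daily_on_hand(snapshots):
    """Average inventory over time from (date, store, quantity_on_hand) rows.

    Quantities are summed across stores within each day, then divided by the
    number of days, which is what a plain AVG over all rows would get wrong.
    """
    per_day = defaultdict(int)
    for date, _store, quantity in snapshots:
        per_day[date] += quantity
    return sum(per_day.values()) / len(per_day)


def turns(quantity_sold, average_on_hand):
    """Number of inventory turns over the period the inputs cover."""
    return quantity_sold / average_on_hand


def days_supply(current_on_hand, average_sold_per_day):
    """Estimated number of days the current stock will last."""
    return current_on_hand / average_sold_per_day


if __name__ == "__main__":
    print(storage_estimate())  # 6,000,000 rows and 84 MB per day, about 92 GB in total
    rows = [("d1", "A", 10), ("d1", "B", 20), ("d2", "A", 30), ("d2", "B", 40)]
    print(average_daily_on_hand(rows))                 # 50.0 (divides by 2 days)
    print(sum(q for _, _, q in rows) / len(rows))      # 25.0 (naive AVG over 4 rows)
```

The last two lines show the gap: the naive row average halves the true average inventory because it divides by rows instead of days.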
Inventory Transactions model

P117


Data Warehousing, Business Intelligence, and Dimensional Modeling Primer

Posted on 2019-06-02 |

Key difference between operational system and Data warehouse

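  • One puts data in; the other gets data out
  • One requires transactions and an accurate current state, with business logic that strictly follows the process; the other requires heavy querying and comparison, with query requirements that keep changing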

Goals of Data Warehousing and Business Intelligence

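  • The collected data is hard to use
  • The collected data is not query-friendly
  • Business users find it inconvenient to use
  • The data is inconsistent
  • We want to make fact-based decisions

So the DW needs the following:

  • Data that is close to business users and easy to understand
  • Consistent: the same name must mean the same thing
  • Able to support changing requirements, and to stay transparent to users when changes happen
  • Data must be timely, even though it needs to be cleaned and validated
  • Data security is very important; the information in the DW determines “what the company sells to whom at what price”
  • The DW is a decision support system
  • The DW only succeeds if business users support and use it; unlike operational systems, the DW is optional and will be abandoned if it is hard to use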

Publishing Metaphor for DW/BI Managers

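Think of the DW as publishing a magazine. The DW needs to

  • Understand the readers
  • Delight the readers
  • Sustain the publication

As with publishing a magazine, the DW needs to select its data sources, ensure the data is accurate, present it to readers (users) in the right way, and update it regularly.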


PostgreSQL

Posted on 2019-06-01 |

Reference

https://aws.amazon.com/blogs/database/managing-postgresql-users-and-roles/

-- Read-write role, following the pattern in the reference above
CREATE ROLE readwrite;
GRANT CONNECT ON DATABASE "Datawarehouse" TO readwrite;
-- USAGE allows access to objects in the schema; CREATE allows creating new ones
GRANT USAGE, CREATE ON SCHEMA "dw_cons" TO readwrite;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA "dw_cons" TO readwrite;
-- Apply the same table privileges to tables created later
ALTER DEFAULT PRIVILEGES IN SCHEMA "dw_cons" GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO readwrite;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA "dw_cons" TO readwrite;
ALTER DEFAULT PRIVILEGES IN SCHEMA "dw_cons" GRANT USAGE ON SEQUENCES TO readwrite;

-- Assign the roles to users (the readonly role is assumed to be created separately)
GRANT readonly TO "tableau_read";
GRANT readwrite TO "tibco_write";


Google Cloud Study Jam

Posted on 2019-05-14 |

gcloud ai-platform local predict \
  --model-dir output/export/census/1557796906 \
  --json-instances …/test.json

MODEL_BINARIES=$OUTPUT_PATH/export/census/1557797507/

Rachel Rui Liu

© 2021 Rachel Rui Liu