AWS - ELB

Terminology

SSL Offloading

Example: the POODLE vulnerability. 62% of ELBs were updated within 24 hours.

Proxy Protocol

  • works at the TCP layer (layer 4)

https://www.52os.net/articles/PROXY_protocol_pass_client_ip.html
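To make the idea concrete, here is a minimal sketch of how a backend might read the human-readable (version 1) Proxy Protocol line that arrives before the payload. The function and type names are made up for illustration, and the sketch only handles the TCP4/TCP6 form, not the "PROXY UNKNOWN" form or the binary version 2.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    protocol: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def parse_proxy_v1(header: bytes) -> ProxyInfo:
    """Parse a Proxy Protocol v1 line such as
    b"PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\\r\\n"."""
    if not header.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = header[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY":
        raise ValueError("not a Proxy Protocol v1 TCP header")
    _, protocol, src_ip, dst_ip, src_port, dst_port = parts
    if protocol not in ("TCP4", "TCP6"):
        raise ValueError(f"unsupported protocol: {protocol}")
    return ProxyInfo(protocol, src_ip, dst_ip, int(src_port), int(dst_port))


# The first address/port pair is the original client, the second the proxy side.
info = parse_proxy_v1(b"PROXY TCP4 198.51.100.22 203.0.113.7 35646 80\r\n")
```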

X-Forwarded-For

  • designed for HTTP (Application Layer)

https://en.wikipedia.org/wiki/X-Forwarded-For
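A small sketch of reading the header on the backend. Each proxy appends the address it received the request from, so the rightmost entries are the ones added by infrastructure you control. The function names and the `trusted_proxies` parameter are assumptions made for this example.

```python
from typing import List, Optional


def forwarded_chain(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its hops, left to right."""
    return [hop.strip() for hop in header_value.split(",") if hop.strip()]


def client_ip(header_value: str, trusted_proxies: int = 1) -> Optional[str]:
    """Return the address recorded by the outermost trusted proxy.

    With one trusted proxy (for example a single load balancer in front
    of the application) that is the rightmost entry. Entries further to
    the left are supplied by the client and can be spoofed.
    """
    if trusted_proxies < 1:
        raise ValueError("trusted_proxies must be at least 1")
    hops = forwarded_chain(header_value)
    if len(hops) < trusted_proxies:
        return None
    return hops[-trusted_proxies]


seen_by_elb = client_ip("203.0.113.7, 198.51.100.22")
```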

ELB architectures

  • ELB has its own VPC, managed by AWS
  • ELB load balancing normally relies on Route 53
    • Route 53 uses round robin to dispatch requests to the ELB nodes sitting in different AZs
    • Route 53 can health-check the ELBs (it takes about 150 sec to shift all traffic to healthy ELBs)
    • ELB DNS records change over time, so Route 53 records shouldn't point to an ELB IP address.
      • ELB records change when the ELB scales; they don't change on failure (that is handled automatically)
      • When an IP changes, the old one is drained and quarantined for 7 days
  • ELB as a managed service automatically supports multiple AZs; it can also do cross-zone load balancing (needs a config change)

ELB load balancer

Classic ELB

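|               | TCP / SSL                                   | HTTP / HTTPS                                                    |
|---------------|---------------------------------------------|-----------------------------------------------------------------|
| network layer | layer 4                                     | layer 7                                                         |
| connection    | passing through (optional: SSL offloading)  | terminated & pooled                                             |
| payload       | no modification                             | header might be modified                                        |
| routing       | Proxy Protocol                              | X-Forwarded-For header                                          |
| algorithm     | round-robin                                 | Least-Outstanding-Requests (might combine with sticky sessions) |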
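The two algorithms in the last row can be sketched as follows. This is an illustration of the idea only, not how ELB implements it; the instance ids and in-flight counts are made up.

```python
import itertools
from typing import Dict, Iterable, Iterator


def round_robin(backends: Iterable[str]) -> Iterator[str]:
    """Yield backends in a fixed, repeating order."""
    return itertools.cycle(list(backends))


def least_outstanding(outstanding: Dict[str, int]) -> str:
    """Pick the backend with the fewest in-flight requests.
    Ties go to the backend that appears first in the dict."""
    return min(outstanding, key=outstanding.get)


# Round robin ignores load and simply rotates.
rr = round_robin(["i-a", "i-b", "i-c"])
picks = [next(rr) for _ in range(4)]

# Least outstanding requests looks at current load before choosing.
in_flight = {"i-a": 3, "i-b": 1, "i-c": 2}
chosen = least_outstanding(in_flight)
in_flight[chosen] += 1  # the request has been dispatched to that backend
```

Round robin suits long-lived TCP connections where the balancer cannot see individual requests; least outstanding requests needs request-level visibility, which is why it sits in the layer 7 column.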

New Application ELB, with extensions for:

  • Path based routing (content based routing)
  • HTTP/HTTPS only
  • containers
  • native WebSocket and HTTP/2

Latest category:

Elastic Load Balancing supports three types of load balancers. You can select the appropriate load balancer based on your application needs. If you need flexible application management and TLS termination then we recommend you to use Application Load Balancer. If extreme performance and static IP is needed for your application then we recommend you to use Network Load Balancer. If your application is built within the EC2 Classic network then you should use Classic Load Balancer.

The Classic Load Balancer is ideal for simple load balancing of traffic across multiple EC2 instances, while the Application Load Balancer is ideal for applications needing advanced routing capabilities, microservices, and container-based architectures.

New feature with Application Load Balancer

Content based routing


  • A very important feature for microservices.
  • Before, if different services were hosted by different clusters, you had to run multiple ELBs to route requests to each individual cluster.

Should we use a single Application ELB for everything, since it supports content based routing?

  • Consider the blast radius and isolation
  • It becomes a single point of failure (SPOF) and a configuration nightmare

Support for container based and microservice architectures

  • Multi-port
    • Classic: one port per instance; Application ELB: multiple ports per instance
  • ECS works with ELB and updates its configuration seamlessly

better API to interact with Application ELB

  • Listener: protocol & port
    • one ELB can have min 1 and max 10 listeners
    • Routing rules are defined per listener
  • Target groups: groups of EC2 instances, containers or microservices
    • A target group can be associated with an Auto Scaling group
  • Targets: EC2 instances, containers or microservices
    • a single target can be registered with multiple target groups
  • Rules:
    • provide the link between listeners and target groups
    • limited to 10-100 rules per ELB (the limit keeps changing)
  • Delete protection (prevents deletion via the API)
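The relationship between listeners, rules, target groups and targets can be modelled in a few lines. This is a toy model to show how the pieces link together, not the AWS API: the class names, priorities and ids are invented, and `fnmatchcase` only approximates the wildcard matching of real path patterns.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List


@dataclass
class TargetGroup:
    name: str
    targets: List[str] = field(default_factory=list)  # instance or container ids


@dataclass
class Rule:
    priority: int              # lower number is evaluated first
    path_pattern: str          # e.g. "/api/*"
    target_group: TargetGroup


@dataclass
class Listener:
    protocol: str
    port: int
    default_group: TargetGroup
    rules: List[Rule] = field(default_factory=list)

    def route(self, path: str) -> TargetGroup:
        """Return the target group of the first matching rule, else the default."""
        for rule in sorted(self.rules, key=lambda r: r.priority):
            if fnmatchcase(path, rule.path_pattern):
                return rule.target_group
        return self.default_group


web = TargetGroup("web", ["i-web-1", "i-web-2"])
api = TargetGroup("api", ["i-api-1"])
img = TargetGroup("images", ["i-web-1"])  # one target registered in two groups

listener = Listener("HTTPS", 443, default_group=web, rules=[
    Rule(10, "/api/*", api),
    Rule(20, "/img/*", img),
])
```

With this model, `listener.route("/api/users")` lands on the `api` group, while any path without a matching rule falls through to the default group.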

better support for WebSocket , HTTP2, Real-time streaming

On by default:
WebSocket: for example, a tablet downloading something over an HTTP connection
HTTP/2: better performance through support for concurrent requests
Real-time streaming

other improvements

  • Native support for IPV6
  • Improved health checks: a health check now returns a detailed result.
  • Fail open: if every target fails its health check, traffic is still routed to all of them. This guards against a misconfigured health check that is too deep.
  • Running in 3 zones is cheaper if we want to guarantee failover with the same capacity when 1 zone fails.
  • Cross-zone load balancing is always enabled for Application ELB.
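The claim that 3 zones is cheaper follows from simple arithmetic: the zones that survive a failure must carry the full load between them. A quick worked example, assuming a hypothetical service that needs 12 instances at peak:

```python
import math


def instances_per_zone(required: int, zones: int) -> int:
    """Instances each zone needs so that the remaining zones still
    provide `required` instances after one zone is lost."""
    if zones < 2:
        raise ValueError("need at least two zones to survive a zone failure")
    return math.ceil(required / (zones - 1))


def total_instances(required: int, zones: int) -> int:
    """Total fleet size across all zones."""
    return zones * instances_per_zone(required, zones)


for zones in (2, 3):
    print(zones, "zones:", total_instances(12, zones), "instances in total")
```

With 2 zones each zone must hold all 12 instances (24 in total); with 3 zones each holds 6 (18 in total), so the same failover guarantee costs 25% less.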

ACM integration

  • Fully integrated with ELB (free)
  • Certificates renew automatically

Web Application Firewall


ELB Health check

  • TCP health check or HTTP health check
    • The bar for a TCP health check is very low
    • Use an HTTP check expecting a 2xx response if possible
  • Customize the frequency and the failure thresholds
  • Consider the depth and accuracy of the health check
    • If the check includes the backend DB, then a single DB failure might result in all web servers being labelled unhealthy
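The effect of the thresholds can be sketched as a small state machine: a target only changes state after several consecutive probe results in the other direction, so one slow response does not take it out of rotation. The class name and the default threshold values here are assumptions for the example, not ELB defaults.

```python
class HealthTracker:
    """Flip state only after N consecutive results in the other direction."""

    def __init__(self, healthy_threshold: int = 3, unhealthy_threshold: int = 2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that disagree with current state

    def record(self, status_code: int) -> bool:
        """Record one HTTP probe result and return the current health state."""
        passed = 200 <= status_code < 300
        if passed == self.healthy:
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= needed:
            self.healthy = passed
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
states = [tracker.record(code) for code in (500, 500, 200, 200, 200)]
```

Here the target is marked unhealthy after the second failure and only returns to service after three passing probes in a row.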

ELB offloading

  • SSL protocols supported: TLS 1.0, 1.1, 1.2 and SSLv3
  • “Server Order Preference” is used to negotiate the cipher. The server walks its own list and picks the first cipher that the client supports.
    • So the server can put its most preferred cipher at the top of its list.
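Server order preference boils down to a short loop. The sketch below uses OpenSSL-style cipher names as sample data; the lists themselves are invented for the example.

```python
from typing import List, Optional


def negotiate_cipher(server_order: List[str], client_offer: List[str]) -> Optional[str]:
    """Return the first cipher in the server's list that the client also offers."""
    offered = set(client_offer)
    for cipher in server_order:
        if cipher in offered:
            return cipher
    return None  # no common cipher: the handshake would fail


server_order = [
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-RSA-AES256-SHA",
    "AES128-SHA",
]
# The client lists AES128-SHA first, but its order is ignored.
client_offer = ["AES128-SHA", "ECDHE-RSA-AES256-SHA"]

selected = negotiate_cipher(server_order, client_offer)
```

Even though the client prefers `AES128-SHA`, the server's ordering wins and `ECDHE-RSA-AES256-SHA` is selected.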

ELB key configurations

  • Idle timeout
    • Default 60 sec; configurable from 1 sec to 1 hour
    • Recommended setting: how long the customer is willing to wait for a result (before they click retry)
    • Anything that takes longer than expected should be treated as an error.

ELB metrics, integrated with CloudWatch

  • ELB metrics have 1-minute granularity by default.

Useful Metrics :

  • HealthyHostCount; UnhealthyHostCount.
    • Choose the health check timeout carefully so that unhealthy hosts are flagged correctly
  • Latency: the time from the ELB sending out a request until the ELB receives the response
    • Min / Max / Average latency are provided
    • Use the access logs to debug individual requests
  • SurgeQueue (max length = 1024) and Spillovers
    • This is the old design; the new idea is to fail earlier
    • The surge queue holds requests waiting for a backend; spillovers are requests rejected because the surge queue is full (the client receives HTTP 503 Service Unavailable or HTTP 504 Gateway Timeout errors)
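A bounded queue with a rejection counter captures the surge queue and spillover behaviour described above. This is a conceptual sketch with invented names, not ELB code; only the 1024 limit comes from the notes.

```python
from collections import deque


class SurgeQueue:
    """Bounded FIFO of pending requests that counts rejected ones."""

    def __init__(self, max_length: int = 1024):
        self.max_length = max_length
        self._pending = deque()
        self.spillover_count = 0

    def offer(self, request) -> bool:
        """Queue a request, or reject it (a spillover) if the queue is full."""
        if len(self._pending) >= self.max_length:
            self.spillover_count += 1
            return False
        self._pending.append(request)
        return True

    def take(self):
        """Hand the oldest queued request to a backend that has capacity."""
        return self._pending.popleft()

    def __len__(self) -> int:
        return len(self._pending)


# A tiny queue makes the overflow easy to see.
queue = SurgeQueue(max_length=2)
accepted = [queue.offer(name) for name in ("req-1", "req-2", "req-3")]
```

A growing queue length means the backends are saturated; a non-zero spillover count means clients are already seeing errors.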

Updates with Application ELB

  • Metrics can be collected at the target group level or at the ELB level
  • Application ELB removed the surge queue and recommends the RejectedConnectionCount metric instead
  • CloudWatch percentiles
    • e.g. P99: the response time that 99% of requests stay within.
  • Request tracing
    • Adds the X-Amzn-Trace-Id header, used to follow a request through every ELB it passes (which can happen when services depend on each other); it is also logged to S3.
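To see why percentiles matter, compare the average with P99 on a sample where a few requests are very slow. The sketch uses the nearest-rank definition of a percentile, which is an assumption for the example; CloudWatch may compute its percentiles differently. The latency values are made up.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# 100 hypothetical response times: 98 fast requests and 2 slow ones.
latencies_ms = [20.0] * 98 + [900.0, 1500.0]
average = sum(latencies_ms) / len(latencies_ms)
p99 = percentile(latencies_ms, 99)
```

The average comes out at roughly 44 ms and looks healthy, while P99 is 900 ms and exposes the slow tail that some customers actually experience.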

ELB integration with Auto Scaling

Auto Scaling can scale based on metrics collected by ELB.

Don't just look at peak time. Sometimes Auto Scaling shrinks the cluster too much, which also results in a bottleneck.

ELB sticky sessions

  • Lets the ELB always route a given user to the same backend instance; if that instance becomes unhealthy, the stickiness moves to another instance
  • Application controlled: the application decides whether or not to pass the cookie generated by the ELB back to the client
  • Sticky sessions are provided but not recommended. The recommendation is to use ElastiCache.

Best Practices & Troubleshooting

Issue: tablets, clients or ISPs cache the IP address returned by DNS, so requests fail when the ELB IP changes.

To solve the problem:

  1. When registering with Route 53, use a wildcard alias like *.mydomain.com
  2. When the application needs to refresh its connection, prepend a GUID to the host name to force a fresh DNS lookup, e.g. send the request to 55152E66-3C6A-4F6D-B1B0-D05C506F0528.mydomain.com
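Step 2 can be sketched on the client side in a couple of lines. The function name is invented, and the approach only works if a wildcard record for the domain exists, as in step 1.

```python
import uuid


def cache_busting_host(domain: str) -> str:
    """Prefix the domain with a fresh GUID so that no resolver along the
    way can answer from a cached entry for the same name."""
    return f"{str(uuid.uuid4()).upper()}.{domain}"


# Each call yields a name that has never been resolved before.
host = cache_busting_host("mydomain.com")
```

Because every generated name is new, the lookup has to go back to the authoritative DNS and therefore returns the current ELB addresses.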

Enable access logs for troubleshooting

Once enabled, every request is logged to S3 for analysis (delivered every 5 min or every hour).
Analyze the logs using Splunk, EMR, Hive etc.
Log files are organized by date, and the file name includes the ELB IP address.
The log content is detailed enough to find out which URL is responding slowly.

Methodology when something fails

Mitigation; Isolation; Restore redundancy

Something to memorize

ELB Access Log Naming Convention

s3://my-loadbalancer-logs/my-app/AWSLogs/123456789012/elasticloadbalancing/us-west-2/2014/02/15/123456789012_elasticloadbalancing_us-west-2_my-loadbalancer_20140215T2340Z_172.160.001.192_20sg8hgm.log
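The naming convention is regular enough to parse when you need to locate the log for a given load balancer node and time window. A minimal sketch: the regular expression is derived from the example name above rather than from a formal specification, and the field names are my own.

```python
import re
from datetime import datetime, timezone

# account_elasticloadbalancing_region_name_endtime_ip_random.log
LOG_NAME = re.compile(
    r"(?P<account>\d{12})_elasticloadbalancing_(?P<region>[a-z0-9-]+)_"
    r"(?P<load_balancer>[^_]+)_(?P<end_time>\d{8}T\d{4}Z)_"
    r"(?P<ip>[0-9.]+)_(?P<random>[^_.]+)\.log$"
)


def parse_log_key(key: str) -> dict:
    """Extract the fields encoded in an ELB access log object name."""
    match = LOG_NAME.search(key)
    if match is None:
        raise ValueError(f"not an ELB access log name: {key}")
    fields = match.groupdict()
    fields["end_time"] = datetime.strptime(
        fields["end_time"], "%Y%m%dT%H%MZ"
    ).replace(tzinfo=timezone.utc)
    return fields


fields = parse_log_key(
    "s3://my-loadbalancer-logs/my-app/AWSLogs/123456789012/elasticloadbalancing/"
    "us-west-2/2014/02/15/123456789012_elasticloadbalancing_us-west-2_"
    "my-loadbalancer_20140215T2340Z_172.160.001.192_20sg8hgm.log"
)
```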

References

Best Practise 2014

https://youtu.be/K-YFw9-_NPE

Best Practise 2016

https://youtu.be/qy7zNaDTYGQ

081.mp4 082.mp4 - ELB Overview

ELB controller service
http://jayendrapatil.com/aws-elastic-load-balancing/

ELB trouble shooting (METHOD_NOT_ALLOWED)
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/ts-elb-error-message.html

Extended reading
http://jayendrapatil.com/aws-elastic-load-balancing/
