EC2 + MySQL DB
Vertical scaling
Basic monitoring: CPU, memory, I/O, network
EC2 with public static IP (AWS Elastic IP)
DNS (Route 53) to map the domain to the public IP
Security
Allow inbound traffic only on:
- 80 for HTTP
- 443 for HTTPS
- 22 for SSH (ideally restricted to trusted source IPs)
- restrict outbound connections to only what is required
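The inbound rules above can be sketched as data with a small check helper; the port set and function are illustrative, not an AWS API:

```python
# Inbound firewall rules from the notes, expressed as data.
# Illustrative sketch only -- a real setup would configure an
# EC2 security group, not application code.

ALLOWED_INBOUND_PORTS = {
    80,   # HTTP
    443,  # HTTPS
    22,   # SSH (ideally restricted to trusted source IPs)
}

def is_inbound_allowed(port: int) -> bool:
    """Return True if inbound traffic on this port should be accepted."""
    return port in ALLOWED_INBOUND_PORTS
```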
DNS -> Web Server -> MySQL + Object Store
Add an object store (e.g., S3) to store static content
DNS, CDN, Load Balancer -> Web servers, Application Servers -> MySQL (master-slave), Object Store
Horizontal scaling
- Multiple Servers across multiple AZs
- Multiple DBs in master-slave failover mode
Load Balancer
- AWS ELB is highly available
- Terminate SSL on the LB to offload encryption work from backend servers
Application Servers separate from Web Servers
- web servers can act as reverse proxies
- some app servers process write APIs, some process read APIs
- they scale independently
Add CDN such as CloudFront
DNS, CDN, Load Balancer -> Web servers, Application Servers -> MySQL (master-slave), MySQL Read replicas, Memory Cache, Object Store
First tune MySQL's built-in caching to see if it's sufficient; if not, add a memory cache (e.g., Memcached or Redis) to store:
- frequently accessed content from MySQL
- session data
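The cache-aside pattern described above can be sketched as follows; the dicts stand in for Memcached/Redis and MySQL, and the key names are made up:

```python
# Cache-aside sketch: check the memory cache first, fall back to the
# database on a miss, then populate the cache for future reads.

cache = {}                                   # stands in for Memcached/Redis
db = {"user:1": {"id": 1, "name": "Ada"}}    # stands in for MySQL

def get(key):
    if key in cache:            # cache hit: skip the database entirely
        return cache[key]
    value = db.get(key)         # cache miss: read from the database
    if value is not None:
        cache[key] = value      # populate the cache for future reads
    return value
```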
Add read replicas for MySQL to reduce load on the write master
- add an LB in front of the read replicas
- most services are read-heavy rather than write-heavy
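Read/write routing can be sketched like this; the host names are illustrative, and the random pick stands in for the LB in front of the replicas:

```python
import random

# Routing sketch: all writes go to the single master, reads are
# spread across the replicas. Host names are made up.

MASTER = "mysql-master"
REPLICAS = ["mysql-replica-1", "mysql-replica-2"]

def pick_host(is_write: bool) -> str:
    if is_write:
        return MASTER                  # writes must hit the master
    return random.choice(REPLICAS)     # reads balanced across replicas
```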
More server instances
DNS, CDN, Load Balancer -> Web servers, Application Servers -> MySQL (master-slave), MySQL Read replicas, Memory Cache, Object Store
Add auto-scaling
- AWS AutoScaling
- a group per app server type/web server type, place each group in multiple AZs
- set up min/max number of instances
- scale out/in via CloudWatch alarms, using metrics like CPU, latency, network traffic, or custom metrics
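The scaling decision above can be sketched as a pure function; the CPU thresholds, step size, and min/max defaults are illustrative assumptions (in practice CloudWatch alarms and the Auto Scaling group enforce this):

```python
# Auto-scaling sketch: adjust the instance count on a metric breach,
# clamped between the group's min and max size. Thresholds are made up.

def desired_capacity(current: int, cpu_pct: float,
                     min_size: int = 2, max_size: int = 10) -> int:
    if cpu_pct > 70:          # scale out under load
        current += 1
    elif cpu_pct < 30:        # scale in when idle
        current -= 1
    return max(min_size, min(max_size, current))
```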
Automate DevOps
- Chef
- Puppet
- Ansible
Monitor metrics
- host level: single EC2 instance metrics
- aggregate level: load balancer stats
- log analysis: Splunk, CloudWatch, CloudTrail
- external site performance: New Relic
- incidents: PagerDuty
- error reporting: Sentry
DNS, CDN, Load Balancer -> Web servers, Application Servers -> MySQL (master-slave), MySQL Read replicas, Memory Cache, Object Store, NoSQL
Consider using a data warehouse to store long-lived data if the DB grows too large.
- Redshift can comfortably handle the constraint of 1TB of new content per month
Scale memory cache if we reach 40k reads/s
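Scaling the cache past one node means spreading keys across several nodes; a minimal hash-based sketch follows (node names are made up, and real deployments often use consistent hashing to limit key remapping when nodes change):

```python
import hashlib

# Key-distribution sketch: map each cache key to one of several cache
# nodes once a single node can no longer serve the read rate.

NODES = ["cache-1", "cache-2", "cache-3"]

def node_for(key: str) -> str:
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]  # same key -> same node
```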
Think about other scaling patterns for DBs
- federation
- sharding
- denormalization
- SQL tuning
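Sharding can be sketched as picking a shard from the user id; shard count and names are illustrative (federation, by contrast, would split by function: a users DB, a products DB, and so on):

```python
# Sharding sketch: each MySQL shard holds a slice of the users,
# selected by user id. Shard names and count are made up.

SHARDS = ["users-shard-0", "users-shard-1", "users-shard-2", "users-shard-3"]

def shard_for(user_id: int) -> str:
    return SHARDS[user_id % len(SHARDS)]
```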
Some data can be moved to a NoSQL DB such as DynamoDB
Processes that do not need to run in real time can be handled asynchronously with queues and workers
- SQS + Lambda
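The queue-and-worker flow can be sketched locally; here queue.Queue stands in for SQS and handle_job for a Lambda worker, with made-up job names:

```python
import queue

# Async processing sketch: producers enqueue jobs (e.g., sending emails,
# resizing images) and a worker drains the queue outside the request path.

jobs = queue.Queue()
results = []

def handle_job(job: dict) -> None:
    results.append(f"processed {job['task']}")   # placeholder side effect

def worker() -> None:
    while not jobs.empty():
        handle_job(jobs.get())
        jobs.task_done()

jobs.put({"task": "send-welcome-email"})
jobs.put({"task": "resize-avatar"})
worker()
```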