What are the auto-scaling strategies in cloud environments? What is the difference between horizontal and vertical scaling?
Horizontal vs Vertical Scaling
Vertical Scaling (Scale Up): Upgrade single instance specs (more CPU, memory)
- Pros: No application changes needed
- Cons: Capped by the largest available instance size, resizing usually requires a restart (downtime), cost rises steeply at the high end, and the instance remains a single point of failure
- Best for: Databases (short-term solution)
Horizontal Scaling (Scale Out): Add more instances
- Pros: Theoretically unlimited capacity, high availability through redundancy, pay only for the instances that are running
- Cons: Requires stateless application design, needs load balancing
- Best for: Web services, APIs, microservices (mainstream approach)
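As a minimal sketch of why horizontal scaling depends on load balancing and stateless instances, the following round-robin balancer spreads requests over a pool and picks up a newly added instance without any change to the others. The class name, method names, and instance names are hypothetical and chosen only for illustration.

```python
class RoundRobinBalancer:
    """Hypothetical balancer that rotates requests across instances."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self.instances = list(instances)
        self._next = 0  # index of the next request in the rotation

    def add_instance(self, name):
        # Scaling out: the new instance simply joins the rotation.
        self.instances.append(name)

    def route(self):
        # Pick the next instance; any instance can serve any request
        # because the instances are assumed to be stateless.
        instance = self.instances[self._next % len(self.instances)]
        self._next += 1
        return instance


balancer = RoundRobinBalancer(["app-1", "app-2"])
before = [balancer.route() for _ in range(4)]

balancer.add_instance("app-3")
after = [balancer.route() for _ in range(3)]
```

Real load balancers also weigh health and current load, which this sketch leaves out.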
Cloud Auto-Scaling Types
Target Tracking: Set a target value for a metric, such as average CPU utilization of 70%, and the cloud platform adds or removes instances to keep the metric near that target. This is the simplest approach and the recommended default.
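One way to picture target tracking is as a proportional rule: scale the instance count by the ratio of the observed metric to the target, then clamp the result. The sketch below assumes that rule and hypothetical `min_instances` and `max_instances` bounds; managed platforms apply additional smoothing and cooldowns that are not modeled here.

```python
import math


def desired_capacity(current_instances, current_metric, target_metric,
                     min_instances=1, max_instances=20):
    """Return the instance count that would bring the metric near target.

    Assumes load is spread evenly, so the per-instance metric falls in
    proportion to the number of instances.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_instances * current_metric / target_metric)
    # Clamp so the group never drops below the floor or exceeds the cap.
    return max(min_instances, min(max_instances, raw))


scale_out = desired_capacity(4, current_metric=90, target_metric=70)
scale_in = desired_capacity(4, current_metric=35, target_metric=70)
```

With 4 instances at 90% CPU and a 70% target, the rule asks for 6 instances; at 35% CPU it asks for 2.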
Step Scaling: Take different actions when metrics breach different thresholds:
- CPU > 70%: Add 2 instances
- CPU > 90%: Add 5 instances
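The two thresholds above can be sketched as a small lookup. The `STEP_POLICY` table and `step_adjustment` function are hypothetical names, and the sketch assumes that only the highest breached threshold applies.

```python
# (CPU threshold in percent, instances to add), taken from the example above.
STEP_POLICY = [(70.0, 2), (90.0, 5)]


def step_adjustment(cpu_percent, policy=STEP_POLICY):
    """Return how many instances to add for the given CPU utilization."""
    # Check the highest threshold first so the largest matching step wins.
    for threshold, instances_to_add in sorted(policy, reverse=True):
        if cpu_percent > threshold:
            return instances_to_add
    return 0  # no threshold breached, no change


adjustments = [step_adjustment(cpu) for cpu in (50, 70, 75, 95)]
```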
Predictive Scaling: Forecasts future demand from historical traffic patterns and adds capacity before the load arrives. Good for applications with regular traffic patterns, such as a morning peak at 9 a.m.
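As a rough illustration of the idea, the sketch below forecasts load for each hour of the day as the average of past samples for that hour, then converts the forecast into an instance count. This averaging is a deliberately simple stand-in, not the model any cloud provider uses, and the sample data and `rps_per_instance` capacity figure are made up.

```python
import math
from collections import defaultdict


def hourly_forecast(history):
    """Average requests per second for each hour of day.

    history: iterable of (hour_of_day, requests_per_second) samples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, rps in history:
        totals[hour] += rps
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


def instances_for_hour(forecast, hour, rps_per_instance, min_instances=1):
    """Instances needed to serve the forecast load for the given hour."""
    expected = forecast.get(hour, 0.0)  # no history means no forecast load
    return max(min_instances, math.ceil(expected / rps_per_instance))


# Made-up samples: traffic peaks around 9 a.m.
history = [(8, 200), (9, 900), (9, 1100), (10, 400), (8, 300)]
forecast = hourly_forecast(history)
peak = instances_for_hour(forecast, 9, rps_per_instance=100)
quiet = instances_for_hour(forecast, 3, rps_per_instance=100)
```

A scheduler could call `instances_for_hour` shortly before each hour begins so that capacity is ready when the peak starts.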
Scheduled Scaling: Add capacity in advance of known high-traffic periods (e.g., 1 hour before a promotion starts)
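A scheduled rule can be sketched as a function of the clock: raise the capacity floor from a lead time before the event until the event ends. The function name, the baseline and boosted counts, and the dates below are hypothetical.

```python
from datetime import datetime, timedelta


def scheduled_min_capacity(now, event_start, event_end,
                           baseline=2, boosted=10,
                           lead_time=timedelta(hours=1)):
    """Return the minimum instance count that should be in force at `now`."""
    # Boost from `lead_time` before the event until the event ends.
    if event_start - lead_time <= now < event_end:
        return boosted
    return baseline


promo_start = datetime(2024, 1, 1, 12, 0)
promo_end = datetime(2024, 1, 1, 14, 0)

before_window = scheduled_min_capacity(datetime(2024, 1, 1, 10, 59), promo_start, promo_end)
warm_up = scheduled_min_capacity(datetime(2024, 1, 1, 11, 0), promo_start, promo_end)
after_event = scheduled_min_capacity(datetime(2024, 1, 1, 14, 0), promo_start, promo_end)
```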
Scaling Design Considerations
- Cold start time: Faster application startup means faster scaling response (target < 30 seconds)
- Health checks: Load balancers rely on health checks to determine if an instance is ready
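- Scale-in protection: Set a minimum instance count so that scale-in can never reduce the group to 0 instances
- Sticky session problem: Sticky sessions tie each user to one instance, so new instances receive little existing traffic and sessions are lost when an instance is removed; externalize session state to a shared store so any instance can serve any request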
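To illustrate the last point, the sketch below keeps session data in a store shared by all instances, so a request can land on any instance and still see the same session. `SessionStore` is an in-memory stand-in for an external store such as a cache or database; it and `AppInstance` are hypothetical names used only for this example.

```python
class SessionStore:
    """In-memory stand-in for a shared, external session store."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppInstance:
    """A stateless application instance: all session state lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = list(session.get("cart", []))
        cart.append(item)
        session["cart"] = cart
        self.store.put(session_id, session)
        return cart


store = SessionStore()
app_1 = AppInstance("app-1", store)
app_2 = AppInstance("app-2", store)  # e.g. added by a scale-out event

app_1.add_to_cart("session-1", "book")
cart = app_2.add_to_cart("session-1", "pen")  # a different instance, same session
```

Because neither instance holds session data itself, either one can be terminated during scale-in without losing the user's cart.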
