What are the auto-scaling strategies in cloud environments? What is the difference between horizontal and vertical scaling?

Horizontal vs Vertical Scaling

Vertical Scaling (Scale Up): Upgrade a single instance's specs (more CPU, memory)

  • Pros: No application changes needed
  • Cons: Hardware limits, usually requires downtime to resize, expensive at the high end, doesn't eliminate the single point of failure
  • Best for: Databases (short-term solution)

Horizontal Scaling (Scale Out): Add more instances

  • Pros: Theoretically unlimited, high availability, pay per use
  • Cons: Requires stateless application design, needs load balancing
  • Best for: Web services, APIs, microservices (mainstream approach)

Cloud Auto-Scaling Types

Target Tracking: Set a target for a metric, such as keeping average CPU utilization at 70%, and the cloud platform automatically adds or removes instances to hold it there. This is the simplest approach and the usual recommendation.
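
The target tracking idea can be sketched as a proportional rule: size the group so the average metric lands near the target. This is a simplified illustration, not any platform's actual algorithm; real services add cooldowns, alarms, and smoothing. The function name, the instance limits, and the 70% default are assumptions for the example.

```python
import math


def desired_capacity(current_instances: int, current_cpu: float,
                     target_cpu: float = 70.0,
                     min_instances: int = 1, max_instances: int = 20) -> int:
    """Return the instance count that should bring average CPU near the target.

    Hypothetical helper: assumes load spreads evenly, so average CPU is
    inversely proportional to the number of instances.
    """
    if current_instances <= 0:
        return min_instances
    raw = math.ceil(current_instances * current_cpu / target_cpu)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, raw))


if __name__ == "__main__":
    print(desired_capacity(4, 90.0))  # above target: scale out
    print(desired_capacity(4, 35.0))  # below target: scale in
```

With 4 instances at 90% CPU the rule asks for more capacity, and at 35% it asks for less; the clamp keeps the result inside the allowed range either way.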

Step Scaling: Take different actions as a metric breaches successive thresholds:

  • CPU > 70%: Add 2 instances
  • CPU > 90%: Add 5 instances
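
The two thresholds above can be written as a small lookup table checked from the highest threshold down. This is a minimal sketch; `STEP_POLICY` and `instances_to_add` are names made up for the example, and a real policy would also define scale-in steps and a cooldown.

```python
# (CPU threshold in percent, instances to add), highest threshold first.
# Values mirror the example in the text.
STEP_POLICY = [(90.0, 5), (70.0, 2)]


def instances_to_add(cpu_percent: float) -> int:
    """Return how many instances to add for the given CPU reading."""
    for threshold, add in STEP_POLICY:
        if cpu_percent > threshold:
            return add
    return 0  # no threshold breached: no action


if __name__ == "__main__":
    for cpu in (65.0, 75.0, 95.0):
        print(cpu, instances_to_add(cpu))
```

Checking the highest threshold first ensures that a reading above 90% triggers the larger step rather than the smaller one.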

Predictive Scaling: Forecasts future demand from historical traffic patterns and scales proactively. Good for applications with regular traffic patterns, such as a morning peak at 9am.
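
As a rough illustration of forecasting from history, the sketch below averages past load per hour of day and sizes the group for the coming hour. Cloud platforms use far more sophisticated forecasting models; the naive mean, the function names, and the assumed 100 requests per second per instance are all hypothetical.

```python
import math
from collections import defaultdict


def forecast_by_hour(history):
    """history: iterable of (hour_of_day, requests_per_second) samples.

    Returns the mean observed load for each hour that has samples.
    """
    samples = defaultdict(list)
    for hour, rps in history:
        samples[hour].append(rps)
    return {hour: sum(v) / len(v) for hour, v in samples.items()}


def instances_for_hour(history, hour, rps_per_instance=100.0, min_instances=1):
    """Instances to provision ahead of `hour`, based on the naive forecast."""
    expected_rps = forecast_by_hour(history).get(hour, 0.0)
    return max(min_instances, math.ceil(expected_rps / rps_per_instance))


if __name__ == "__main__":
    past = [(9, 900), (9, 1100), (3, 50)]
    print(instances_for_hour(past, 9))  # sized for the 9am peak
    print(instances_for_hour(past, 3))  # quiet hour: falls back to the floor
```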

Scheduled Scaling: Scale up in advance of known high-traffic periods (e.g., 1 hour before a promotion starts).
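
A scheduled rule reduces to a time-window check: raise capacity from a lead time before the event until the event ends. The sketch below assumes a single event and made-up capacity values (`baseline=2`, `boosted=10`); real platforms usually express this as cron-style scheduled actions.

```python
from datetime import datetime, timedelta


def scheduled_capacity(now: datetime, event_start: datetime, event_end: datetime,
                       lead_time: timedelta = timedelta(hours=1),
                       baseline: int = 2, boosted: int = 10) -> int:
    """Return the capacity to hold at `now` for one scheduled event."""
    # Boost starts `lead_time` before the event and lasts until it ends.
    if event_start - lead_time <= now < event_end:
        return boosted
    return baseline


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 12, 0)
    end = datetime(2026, 1, 1, 14, 0)
    print(scheduled_capacity(datetime(2026, 1, 1, 10, 30), start, end))
    print(scheduled_capacity(datetime(2026, 1, 1, 11, 0), start, end))
```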

Scaling Design Considerations

  • Cold start time: Faster application startup means faster scaling response (target < 30 seconds)
  • Health checks: Load balancers rely on health checks to determine if an instance is ready
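  • Scale-in protection: Set a minimum instance count so the group never scales in to 0
  • Sticky session problem: Keep session state in an external store (such as a cache or database) rather than in instance memory, so any instance can serve any request and scale-in does not drop sessions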
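
The externalized-session point can be illustrated with two instances that share one store instead of keeping sessions in their own memory. This is a toy sketch: `SessionStore` is a hypothetical in-process stand-in for an external service, and `Instance` is an invented class for the example.

```python
class SessionStore:
    """Stand-in for an external session store shared by all instances."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session


class Instance:
    """A stateless app instance: all session state lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        self.store.put(session_id, {"cart": cart})
        return cart


if __name__ == "__main__":
    store = SessionStore()
    a, b = Instance("a", store), Instance("b", store)
    a.add_to_cart("s1", "book")
    # A different instance sees the same session, so no stickiness is needed.
    print(b.add_to_cart("s1", "pen"))
```

Because neither instance holds session data itself, the load balancer can route each request anywhere, and terminating an instance during scale-in loses nothing.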

Copyright © 2026 Wood All Rights Reserved · FE Interview Hub