Latency vs Throughput
Explain the difference between latency and throughput and how to optimize each.
Definitions
Latency: The time from sending a request to receiving its response, typically measured in milliseconds. It measures the speed of an individual request.
Throughput: The number of requests a system can handle per unit time, typically measured in requests per second (RPS). It measures overall processing capacity.
Relationship
Little's Law: Throughput = Concurrency / Average Latency, where concurrency is the number of requests in flight at once. Raising concurrency or reducing latency therefore both improve throughput.
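As a quick worked example of Little's Law (the concrete numbers below are illustrative, not from the source):

```python
# Little's Law: throughput = concurrency / average latency.
# A hypothetical service keeping 100 requests in flight,
# each taking 50 ms on average:
concurrency = 100        # requests being processed at once
avg_latency_s = 0.050    # average latency in seconds

throughput_rps = concurrency / avg_latency_s
print(throughput_rps)    # 2000.0 requests per second
```

Halving the average latency to 25 ms at the same concurrency would double the throughput to 4000 RPS.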
Optimizing Latency
- Reduce unnecessary network round trips (RTT)
- Use caching to avoid repeated computation
- Optimize database queries (indexes)
- Deploy closer to users (CDN, edge computing)
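The caching point above can be sketched with Python's standard `functools.lru_cache`; the `slow_lookup` function and its 50 ms delay are invented stand-ins for a repeated computation such as a database query:

```python
import functools
import time

# Hypothetical slow operation; the cache lets repeat requests skip it.
@functools.lru_cache(maxsize=1024)
def slow_lookup(key: str) -> str:
    time.sleep(0.05)               # simulate a 50 ms database query
    return key.upper()

start = time.perf_counter()
slow_lookup("user:42")             # cold call: pays the full latency
cold = time.perf_counter() - start

start = time.perf_counter()
slow_lookup("user:42")             # warm call: served from the cache
warm = time.perf_counter() - start

print(f"cold={cold * 1000:.1f} ms, warm={warm * 1000:.3f} ms")
```

The warm call returns in microseconds because it never re-executes the body, which is exactly the latency win caching buys for repeated requests.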
Optimizing Throughput
- Horizontal scaling (add instances)
- Async processing (reduce blocking)
- Batch processing
- Connection pooling (reduce connection overhead)
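A minimal sketch of the batching idea, assuming a fixed per-call overhead (e.g., one network round trip, simulated here as 1 ms); the functions and numbers are illustrative:

```python
import time

PER_CALL_OVERHEAD_S = 0.001  # assumed 1 ms round-trip cost per call

def write_one(record):
    time.sleep(PER_CALL_OVERHEAD_S)   # one round trip per record

def write_batch(records):
    time.sleep(PER_CALL_OVERHEAD_S)   # one round trip for the whole batch

records = list(range(100))

start = time.perf_counter()
for r in records:
    write_one(r)
one_by_one = time.perf_counter() - start

start = time.perf_counter()
write_batch(records)
batched = time.perf_counter() - start

print(f"one-by-one: {one_by_one:.3f}s, batched: {batched:.3f}s")
```

Paying the round-trip overhead once per batch instead of once per record is why batching raises throughput, though each record may wait longer for its batch to fill.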
Trade-offs
The two cannot always be optimized together: batch processing, for example, improves throughput but increases latency for individual requests, which must wait for their batch.