
Latency vs Throughput


Explain the difference between latency and throughput and how to optimize each.

Definitions

Latency: The time from sending a request to receiving its response, typically measured in milliseconds. It measures the speed of an individual request.

Throughput: The number of requests a system can handle per unit of time, typically measured in requests per second (RPS). It measures overall processing capacity.
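
The two definitions above can be made concrete by timing a single operation. A minimal sketch, assuming a hypothetical `handleRequest` function standing in for real work:

```typescript
// Latency of one request: time from start to response.
// `handleRequest` is a hypothetical stand-in for real request handling.
function handleRequest(): void {
  for (let i = 0; i < 1_000_000; i++) {} // stand-in for real work
}

const start = performance.now(); // available in browsers and Node.js 16+
handleRequest();
const latencyMs = performance.now() - start; // latency of this one request
```

Throughput, by contrast, would be measured by counting how many such requests complete over a longer window.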

Relationship

Little's Law (L = λW) relates the two: Throughput = Concurrency / Average Latency. Increasing concurrency or reducing latency both raise throughput, up to the system's capacity limits.
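
The relationship can be worked through with illustrative numbers (the figures below are hypothetical):

```typescript
// Little's Law: throughput = concurrency / average latency.
function throughputRps(concurrency: number, avgLatencySeconds: number): number {
  return concurrency / avgLatencySeconds;
}

// 50 requests in flight, each taking 100 ms on average:
const rps = throughputRps(50, 0.1); // ≈ 500 requests per second
```

Halving average latency to 50 ms at the same concurrency would double throughput to about 1000 RPS.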

Optimizing Latency

  • Reduce unnecessary network round trips (RTT)
  • Use caching to avoid repeated computation
  • Optimize database queries (indexes)
  • Deploy closer to users (CDN, edge computing)
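
The caching point above can be sketched with a minimal in-memory cache; `expensiveLookup` is a hypothetical stand-in for a slow database query or network call:

```typescript
// Repeated lookups hit the Map instead of re-running the slow operation,
// cutting latency for every request after the first.
const cache = new Map<string, string>();

function expensiveLookup(key: string): string {
  // stand-in for a slow DB query or remote call
  return `value-for-${key}`;
}

function cachedLookup(key: string): string {
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // cache hit: no recomputation
  const value = expensiveLookup(key);
  cache.set(key, value);
  return value;
}
```

A real cache would also need an eviction policy (e.g. LRU) and an invalidation strategy so stale data is not served indefinitely.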

Optimizing Throughput

  • Horizontal scaling (add instances)
  • Async processing (reduce blocking)
  • Batch processing
  • Connection pooling (reduce connection overhead)
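
The batching idea above can be sketched as a micro-batcher that accumulates items and processes them in one bulk call; `processBatch` is a hypothetical stand-in for a single bulk operation such as a multi-row INSERT:

```typescript
// Collect items, then process them in one batch call: fewer round trips
// per item raises throughput, while each item waits for its batch.
function processBatch(items: number[]): number[] {
  // stand-in for one bulk operation handling many items at once
  return items.map((n) => n * 2);
}

class Batcher {
  private pending: number[] = [];

  add(item: number): void {
    this.pending.push(item); // queued, not yet processed
  }

  flush(): number[] {
    const batch = this.pending;
    this.pending = [];
    return processBatch(batch); // one call for the whole batch
  }
}
```

In practice `flush` would be triggered by a timer or a size threshold, which is exactly where the latency cost of batching comes from.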

Trade-offs

The two often cannot be optimized simultaneously. Batch processing, for example, raises throughput by amortizing per-request overhead, but increases latency for individual requests, since each one waits for its batch to fill or for a flush interval to elapse.

