What are the three pillars of observability? What are Metrics, Logs, and Traces each used for?
Observability vs Monitoring
Monitoring: You decide in advance which metrics and thresholds to watch, so it catches failure modes you already anticipated (known unknowns)
Observability: The ability to infer a system's internal state from its external outputs, so you can answer questions you did not anticipate (unknown unknowns)
Three Pillars
Metrics
Time-series data in numeric form, used to quantify system behavior.
- Characteristics: Low storage cost, efficient aggregation, good for alerts and dashboards
- Used for: CPU utilization, request rate (QPS), error rate, P99 latency
- Tools: Prometheus + Grafana, CloudWatch, Datadog
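As a rough illustration of why metrics aggregate cheaply, the sketch below reduces a window of raw request samples to three numbers: QPS, error rate, and P99 latency. The `RequestSample` type, the `summarize` function, and the sample data are hypothetical names made up for this example; real systems such as Prometheus typically compute percentiles from histograms rather than from raw samples.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestSample:
    """One observed request (hypothetical shape, for illustration only)."""
    latency_ms: float
    status: int  # HTTP status code


def summarize(samples, window_seconds):
    """Aggregate raw samples from one time window into QPS, error rate, and P99."""
    if not samples:
        return {"qps": 0.0, "error_rate": 0.0, "p99_ms": 0.0}
    total = len(samples)
    errors = sum(1 for s in samples if s.status >= 500)
    latencies = sorted(s.latency_ms for s in samples)
    # Nearest-rank percentile: the smallest latency such that at least
    # 99% of the samples are at or below it.
    rank = math.ceil(99 * total / 100)
    return {
        "qps": total / window_seconds,
        "error_rate": errors / total,
        "p99_ms": latencies[rank - 1],
    }


# 98 fast successful requests plus 2 slow server errors in a 10-second window.
samples = [RequestSample(latency_ms=float(i), status=200) for i in range(1, 99)]
samples += [RequestSample(500.0, 503), RequestSample(900.0, 500)]
summary = summarize(samples, window_seconds=10)
print(summary)
```

A hundred samples collapse into three numbers, which is what makes metrics cheap to store and easy to alert on. The trade-off is that the individual requests are no longer visible.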
Logs
Text records of events, capturing what happened, when, and with context.
- Characteristics: Information-rich but high storage cost; difficult to correlate across services unless each entry carries a shared request or trace ID
- Used for: Error details, user behavior, audit trails
- Tools: ELK Stack (Elasticsearch+Logstash+Kibana), Loki, Splunk
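One common way to make logs easier to search and correlate is to emit them as structured JSON with a shared identifier. The following is a minimal sketch using only Python's standard `logging` module. The logger name `checkout`, the `trace_id` and `order_id` fields, and the `JsonFormatter` class are assumptions chosen for illustration, not part of any specific logging stack.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed via `extra=` become attributes on the record.
            "trace_id": getattr(record, "trace_id", None),
            "order_id": getattr(record, "order_id", None),
        }
        return json.dumps(entry)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


stream = io.StringIO()  # stands in for stdout or a log shipper
log = build_logger(stream)
log.error("payment declined", extra={"trace_id": "abc123", "order_id": 42})

record = json.loads(stream.getvalue())
print(record["level"], record["message"], record["trace_id"])
```

Because every entry is a JSON object with a `trace_id`, a log backend can filter by that field to collect entries from different services that belong to the same request.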
Traces
Records of the complete path a single request takes across multiple services.
- Characteristics: Reveal performance bottlenecks and dependencies in distributed systems
- Used for: Finding which service caused request latency, service call chain analysis
- Tools: Jaeger, Zipkin, AWS X-Ray, Tempo
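To make the idea of a trace concrete, the sketch below models a trace as a list of spans linked by parent IDs and finds the span with the most "self time" (its duration minus the time spent in its direct children). The `Span` type, the service names, and the timings are invented for illustration. The calculation also assumes that sibling spans do not overlap in time, which does not hold for parallel calls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One unit of work within a trace (simplified, hypothetical shape)."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    service: str
    operation: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


def self_times(spans):
    """Map span_id to the time spent in that span excluding direct children."""
    child_total = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0) + s.duration_ms
    return {s.span_id: s.duration_ms - child_total.get(s.span_id, 0) for s in spans}


def bottleneck(spans):
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


# One request: gateway -> cart service, then gateway -> payment service -> database.
trace = [
    Span("a", None, "api-gateway", "GET /checkout", 0, 480),
    Span("b", "a", "cart-service", "load_cart", 10, 60),
    Span("c", "a", "payment-service", "charge", 70, 470),
    Span("d", "c", "payment-db", "SELECT orders", 80, 430),
]
slowest = bottleneck(trace)
print(slowest.service, slowest.operation)
```

The root span is the longest overall, but almost all of its time is spent waiting on children. Subtracting child time points at the database query as the place where the latency actually accrues.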
How They Complement Each Other
- Metrics tell you "something is wrong" (error rate rising)
- Logs tell you "what happened" (specific error messages)
- Traces tell you "where it went wrong" (which service, which database query)
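The three-step workflow above can be sketched as a single function that moves from metric to log to trace. This is a toy illustration: `investigate`, the dictionary shapes, and the sample data are hypothetical, and in practice each step would be a query against a separate backend.

```python
def investigate(error_rate, threshold, logs, traces):
    """Walk from a metric alert to the failing services for one request."""
    # Step 1 (metrics): is something wrong?
    if error_rate <= threshold:
        return None
    # Step 2 (logs): what happened? Take the first error entry.
    error_logs = [entry for entry in logs if entry["level"] == "ERROR"]
    if not error_logs:
        return None
    first = error_logs[0]
    # Step 3 (traces): where did it go wrong? Follow the trace ID from the log.
    spans = traces.get(first["trace_id"], [])
    failed = [span["service"] for span in spans if span["error"]]
    return {
        "message": first["message"],
        "trace_id": first["trace_id"],
        "failed_services": failed,
    }


logs = [
    {"level": "INFO", "message": "cart loaded", "trace_id": "t1"},
    {"level": "ERROR", "message": "payment declined: db timeout", "trace_id": "t2"},
]
traces = {
    "t2": [
        {"service": "api-gateway", "error": False},
        {"service": "payment-service", "error": False},
        {"service": "payment-db", "error": True},
    ],
}
finding = investigate(error_rate=0.07, threshold=0.01, logs=logs, traces=traces)
print(finding)
```

The link between the steps is the shared `trace_id`. Without it, the jump from a log line to the corresponding trace would have to rely on timestamps and guesswork.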
