categories.pipeline-orchestration Advanced
Change Data Capture (CDC)
Explain how CDC works and its use cases.
What Is CDC
Change Data Capture tracks data changes (INSERT/UPDATE/DELETE) in a database and streams those changes to downstream systems in near real-time.
How It Works
Most CDC tools (e.g., Debezium) read the database transaction log (MySQL Binlog, PostgreSQL WAL) rather than polling the database, resulting in minimal production load.
Flow
Database transaction log → CDC tool (Debezium) → Kafka topic → Downstream consumers (data warehouse, cache, search index)
Use Cases
- Database sync: Stream OLTP data to an analytics warehouse in near real-time.
- Cache invalidation: Automatically evict Redis cache entries when the database updates.
- Search index sync: Automatically update Elasticsearch when the database changes.
- Audit log: Maintain a complete history of all database changes.
CDC vs Scheduled Batch
CDC achieves near real-time sync (second-level latency); batch sync typically has hour-to-day delays.
✦ AI Mock Interview
Type your answer and get instant AI feedback
Sign in to use AI scoring
