Stream Processing: Windowing and Time Semantics
Explain time semantics and window types in stream processing.
Two Time Concepts
Event Time: When the event actually occurred (carried by the event itself). More accurate but requires handling out-of-order events.
Processing Time: When the system receives and processes the event. Simple to implement, but results can be inaccurate because they depend on network delays and retries rather than on when the event occurred.
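As a minimal sketch of why the two clocks can disagree, the snippet below puts one event into one-minute buckets twice: once by its own timestamp and once by its arrival time. The `Event` class, the `minute_bucket` helper, and the timestamps are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str
    event_time: float  # seconds, stamped by the producer (event time)


def minute_bucket(ts: float) -> int:
    """Return the start (in seconds) of the one-minute bucket containing ts."""
    return int(ts // 60) * 60


# Hypothetical scenario: the event occurred at t=59s but, because of a
# network delay, reached the processor at t=61s.
event = Event(key="user-1", event_time=59.0)
arrival_time = 61.0  # processing-time clock reading on arrival

event_time_bucket = minute_bucket(event.event_time)   # bucket starting at 0
processing_time_bucket = minute_bucket(arrival_time)  # bucket starting at 60
```

With event time the event is counted in the minute in which it happened; with processing time it is counted in the following minute.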
Watermark
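A watermark is a progress marker for event time. A watermark with timestamp T tells the system that events with event time earlier than T are assumed to have arrived, so windows ending at or before T can be computed. Because this is an estimate rather than a guarantee, the watermark usually trails the latest observed event time by a configurable delay to tolerate out-of-order arrivals. Events that still arrive behind the watermark are late and are either dropped or handled under a separate lateness policy.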
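One common heuristic is to let the watermark trail the largest event time seen so far by a fixed delay. The sketch below illustrates that idea only; the class name, method names, and the `max_delay` parameter are assumptions for this example and do not reproduce any framework's API.

```python
class BoundedDelayWatermark:
    """Watermark = (largest event time seen so far) - max_delay."""

    def __init__(self, max_delay: float) -> None:
        self.max_delay = max_delay
        self.max_event_time = float("-inf")

    def observe(self, event_time: float) -> float:
        """Record an event and return the watermark after seeing it."""
        self.max_event_time = max(self.max_event_time, event_time)
        return self.current()

    def current(self) -> float:
        return self.max_event_time - self.max_delay

    def is_late(self, event_time: float) -> bool:
        """An event is late if it falls behind the current watermark."""
        return event_time < self.current()

    def can_fire(self, window_end: float) -> bool:
        """A window [start, end) may be computed once the watermark reaches end."""
        return self.current() >= window_end


wm = BoundedDelayWatermark(max_delay=5)
wm.observe(10)  # watermark becomes 5
wm.observe(8)   # out of order but within the delay; watermark stays at 5
wm.observe(20)  # watermark advances to 15
```

A larger `max_delay` tolerates more disorder but makes every window wait longer before it produces a result, so the value is a trade-off between completeness and latency.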
Window Types
Tumbling Window: Fixed-size, non-overlapping windows (e.g., one window per 5 minutes).
Sliding Window: Fixed-size, overlapping windows that advance by a slide interval (e.g., every minute, compute over the last 5 minutes). Each event belongs to several windows.
Session Window: Dynamically groups events by inactivity gap (e.g., session ends after 30 minutes of user inactivity).
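The three window types can be sketched as plain assignment functions over numeric timestamps. This is an illustrative sketch, not any framework's implementation: the function names are hypothetical, windows are half-open `[start, end)` intervals aligned to multiples of the size or slide, and a session is assumed to stay open while the gap between consecutive events is at most `gap`.

```python
def tumbling_window(ts, size):
    """Return the single [start, end) window of length size that contains ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Return every [start, end) window of length size, with starts at
    multiples of slide, that contains ts (earliest window first)."""
    windows = []
    start = ts - ts % slide  # latest window start that is <= ts
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return windows[::-1]


def session_windows(timestamps, gap):
    """Group timestamps into sessions; a new session starts when the gap
    since the previous event exceeds gap (assumption for this sketch).
    Returns (first_event_time, last_event_time) per session."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= gap:
            sessions[-1][1] = ts  # extend the current session
        else:
            sessions.append([ts, ts])  # start a new session
    return [(first, last) for first, last in sessions]


# Timestamps in minutes, mirroring the examples in the text.
tumbling = tumbling_window(7, size=5)
sliding = sliding_windows(7, size=5, slide=1)
sessions = session_windows([0, 10, 25, 100, 110], gap=30)
```

With a 5-minute size and 1-minute slide, the event at minute 7 lands in five overlapping windows, while the tumbling assigner places it in exactly one. The session assigner splits the five events into two sessions because the 75-minute silence between minute 25 and minute 100 exceeds the 30-minute gap.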
Leading Frameworks
Apache Flink (widely regarded as the most feature-complete option for event-time processing), Apache Spark Structured Streaming, Kafka Streams.
