Kafka Streams and KTables
2 min read · Nov 11, 2022
Kafka Streams
- We use Kafka Streams to process data arriving in Kafka topics.
- Kafka Connect is useful when you only want to move data in and out of Kafka; when you need to modify that data, Kafka Streams comes in.
- Kafka Streams is a stream-processing engine.
- Event streams are like topics in brokers: an ordered sequence of events.
- You can think of a Kafka Streams application as a standalone application or microservice that takes data from Kafka and processes it.
Java Code for Kafka Stream
- Event streams contain events, and each event is a key-value pair.
- Keys are not unique in Kafka event streams.
Event Stream Java code
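A minimal sketch of such an application; the orders topic, the uppercasing step, and the String serdes are placeholder assumptions for illustration:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class EventStreamExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-stream-example");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Each event in the stream is a key-value pair; keys are not unique.
        KStream<String, String> orders = builder.stream("orders");

        // A simple stateless transformation: uppercase every value and write it back out.
        orders.mapValues(value -> value.toUpperCase())
              .to("orders-uppercased");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the application cleanly on shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```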
Topology
A Kafka Streams processor topology is a DAG of processing nodes and edges that represents the flow of the event stream.
Kafka Stream Topology
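As an illustration, the topology built by the DSL can be inspected with describe(), which prints the source, processor, and sink nodes of the DAG; the topic and operations below are hypothetical:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class TopologyExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, String> source = builder.stream("orders");
        source.filter((key, value) -> value != null)   // processor node
              .mapValues(String::toUpperCase)          // processor node
              .to("orders-uppercased");                // sink node

        // The builder produces a Topology: the DAG of nodes and edges described above.
        Topology topology = builder.build();
        System.out.println(topology.describe());
    }
}
```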
KTable
- A KTable can subscribe to only one topic at a time, unlike a KStream.
- A KTable represents the latest value for each record key.
- A KTable is backed by a state store.
- The store is a copy of the events the KTable is built from.
- A KTable doesn’t forward every change downstream by default; its cache is flushed every 30 seconds.
- GlobalKTable: holds all records from all partitions — typically used for smaller, more static data.
Java Code for KTable
KTable Java code
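A sketch of building a KTable and a GlobalKTable; the customers and countries topics, the store name, and the String serdes are assumptions for illustration:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class KTableExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // A KTable keeps only the latest value per key, backed by a local state store.
        KTable<String, String> customers = builder.table(
                "customers",
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("customers-store")
                        .withKeySerde(Serdes.String())
                        .withValueSerde(Serdes.String()));

        // A GlobalKTable holds records from all partitions of the topic on every instance,
        // which suits small, mostly static reference data.
        GlobalKTable<String, String> countries = builder.globalTable(
                "countries", Consumed.with(Serdes.String(), Serdes.String()));

        // The topology would be built and started as in the earlier examples.
    }
}
```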
Joins in KStream
To execute a join in Kafka Streams, the records on both sides must have the same key.
- Stream-Stream — combines two event streams into a new event — results in a stream
- Stream-Table — results in a stream
- Table-Table — results in a table
Java code for Joins
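A sketch of the three join flavours, assuming hypothetical orders, payments, customers, and addresses topics that all share the same key and use String serdes:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.StreamJoined;

public class JoinExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        Consumed<String, String> consumed = Consumed.with(Serdes.String(), Serdes.String());
        KStream<String, String> orders = builder.stream("orders", consumed);
        KStream<String, String> payments = builder.stream("payments", consumed);
        KTable<String, String> customers = builder.table("customers", consumed);
        KTable<String, String> addresses = builder.table("addresses", consumed);

        // Stream-stream join: both sides must share the same key and fall in the
        // same time window; the result is a stream.
        KStream<String, String> ordersWithPayments = orders.join(
                payments,
                (order, payment) -> order + " paid by " + payment,
                JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(5)),
                StreamJoined.with(Serdes.String(), Serdes.String(), Serdes.String()));

        // Stream-table join: the result is a stream.
        KStream<String, String> enrichedOrders = orders.join(
                customers,
                (order, customer) -> order + " placed by " + customer);

        // Table-table join: the result is a table.
        KTable<String, String> customerProfiles = customers.join(
                addresses,
                (customer, address) -> customer + " @ " + address);

        enrichedOrders.to("enriched-orders");
    }
}
```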
Aggregations in KStream
- Aggregations remember the state of the stream and can therefore perform stateful operations (count, max, sum, ..).
- Records are grouped by key; Kafka repartitions the data so that all records with the same key end up in the same partition. See the sketch after this list.
- Reduce is like aggregate, but has the restriction that the input and output must be the same type. Aggregate doesn’t have that restriction, but the key still has to be the same.
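A sketch of grouping followed by count, reduce, and aggregate; the order-amounts topic (customer id → order amount) is a hypothetical example:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class AggregationExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical topic: customer id -> order amount.
        KStream<String, Long> orderAmounts =
                builder.stream("order-amounts", Consumed.with(Serdes.String(), Serdes.Long()));

        // Grouping by key may repartition the data so every record with the same key
        // lands in the same partition before the stateful operation runs.
        // count: number of orders per customer.
        KTable<String, Long> orderCounts = orderAmounts
                .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
                .count();

        // reduce: input and output must be the same type (Long -> Long).
        KTable<String, Long> totalPerCustomer = orderAmounts
                .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
                .reduce(Long::sum);

        // aggregate: the result type can differ from the input type (Long values -> String).
        KTable<String, String> amountsPerCustomer = orderAmounts
                .groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
                .aggregate(
                        () -> "",
                        (key, amount, agg) -> agg + amount + ";",
                        Materialized.with(Serdes.String(), Serdes.String()));
    }
}
```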