Kafka Streams and KTables

Amulya Reddy Konda
2 min read · Nov 11, 2022

Kafka Streams

  • We use Kafka Streams to process data arriving on Kafka topics.
  • Kafka Connect is useful when you only want to move data in and out of Kafka; when you need to modify that data, Kafka Streams comes in.
  • Kafka Streams is a streaming engine.
  • Event streams map to topics on the brokers: each is a sequence of events.
  • You can think of a Kafka Streams application as a standalone application or microservice that takes data from Kafka and processes it.

Java Code for Kafka Stream

  • Event streams consist of events, each of which is a key-value pair.
  • Keys are not unique in Kafka event streams; many events can share the same key.

Event Stream Java code
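The original code is in a screenshot; as a minimal sketch under assumptions (the topic names `input-topic`/`output-topic`, the broker address, and the uppercase transform are all illustrative, not from the original), a Kafka Streams event-stream application might look like:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class EventStreamExample {

    // Builds the processing topology: read key-value events, transform, write out.
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        // Each event is a key-value pair; keys are not unique across the stream.
        KStream<String, String> events = builder.stream("input-topic");
        events.mapValues(value -> value.toUpperCase())
              .to("output-topic");
        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-stream-example");
        // Assumed broker address — replace with your cluster.
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        KafkaStreams streams = new KafkaStreams(buildTopology(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```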

Topology

A Kafka Streams processor topology is a DAG: processing nodes connected by edges that represent the flow of the event stream.

Kafka Stream Topology
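As a sketch of how the DSL builds that DAG (the topic names and transforms are assumptions), each DSL call below becomes a node, and `Topology#describe()` prints the sources, processors, and sinks and how they connect:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;

public class TopologyExample {

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("orders")       // source node
               .filter((key, value) -> value != null)  // processor node
               .mapValues(String::trim)                // processor node
               .to("clean-orders");                    // sink node
        return builder.build();
    }

    public static void main(String[] args) {
        // Prints the DAG: source -> filter -> mapValues -> sink.
        System.out.println(buildTopology().describe());
    }
}
```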

KTable

  • A KTable can subscribe to only one topic at a time, unlike a KStream.
  • A KTable represents the latest value for each key.
  • A KTable is backed by a state store.
  • The state store is a local copy of the events the KTable is built from.
  • A KTable doesn’t forward every change downstream by default; its cache is flushed every 30 seconds.
  • Global KTables hold all records from all partitions of a topic and are typically used for smaller, more static data.

Java Code for KTable

KTable Java code
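The original KTable code is in a screenshot; a minimal sketch of building a KTable and a GlobalKTable (the topic names and the `profiles-store` store name are assumptions) might look like:

```java
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class KTableExample {

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        // A KTable subscribes to a single topic and keeps the latest value
        // per key, backed by a local state store (named explicitly here).
        KTable<String, String> profiles = builder.table(
                "user-profiles",
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("profiles-store"));

        // A GlobalKTable holds records from ALL partitions of its topic on
        // every instance — useful for smaller, more static reference data.
        GlobalKTable<String, String> countries = builder.globalTable("countries");

        // Example read: stream the table's changelog to another topic.
        profiles.toStream().to("profiles-changelog");

        return builder.build();
    }
}
```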

Joins in KStream

To join in Kafka Streams, the records being joined must share the same key.

  1. Stream-Stream — combines two event streams into new events — results in a stream
  2. Stream-Table — results in a stream
  3. Table-Table — results in a table

Java code for Joins
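The joins code is in a screenshot; a sketch of the three join types (the topic names, the 5-minute join window, and the string value formats are assumptions) could be:

```java
import java.time.Duration;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class JoinExample {

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, String> orders   = builder.stream("orders");
        KStream<String, String> payments = builder.stream("payments");
        KTable<String, String> customers = builder.table("customers");
        KTable<String, String> addresses = builder.table("addresses");

        // 1. Stream-Stream: records with the same key are joined when they
        //    arrive within the time window -> result is a stream.
        KStream<String, String> orderPayments = orders.join(
                payments,
                (order, payment) -> order + "|" + payment,
                JoinWindows.of(Duration.ofMinutes(5)));
        orderPayments.to("order-payments");

        // 2. Stream-Table: each stream record is joined with the table's
        //    latest value for that key -> result is a stream.
        orders.join(customers, (order, customer) -> order + "|" + customer)
              .to("orders-enriched");

        // 3. Table-Table: two tables joined by key -> result is a table.
        KTable<String, String> customerAddresses =
                customers.join(addresses, (customer, address) -> customer + "@" + address);
        customerAddresses.toStream().to("customer-addresses");

        return builder.build();
    }
}
```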

Aggregations in KStream

  • Aggregations remember the state of the stream and can therefore compute results such as count, max, and sum.
  • Records are grouped by key; Kafka repartitions the data so that all records with the same key land in one partition.
  • Reduce is like aggregation, but reduce has the restriction that input and output must be the same type. Aggregations don’t have that restriction, although the key still has to be the same.
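A sketch of count, reduce, and aggregate on a grouped stream (the topic names and the Long-valued input are assumptions) illustrating the type restriction on reduce:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class AggregationExample {

    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, Long> amounts = builder.stream(
                "amounts", Consumed.with(Serdes.String(), Serdes.Long()));

        // groupByKey ensures all records with the same key are processed
        // together; Kafka repartitions if the key was changed upstream.

        // count: number of records seen per key.
        KTable<String, Long> counts = amounts.groupByKey().count();
        counts.toStream().to("amount-counts");

        // reduce: input and output must be the SAME type (Long -> Long here).
        KTable<String, Long> sums = amounts.groupByKey().reduce(Long::sum);
        sums.toStream().to("amount-sums");

        // aggregate: output type may differ from the input type
        // (Long values aggregated into a String summary here).
        KTable<String, String> summaries = amounts.groupByKey().aggregate(
                () -> "",
                (key, value, agg) -> agg + value + ";",
                Materialized.with(Serdes.String(), Serdes.String()));
        summaries.toStream().to("amount-summaries");

        return builder.build();
    }
}
```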
