2. EXAMPLE: MATECO
Mateco Business ‘Streams’
• Microsoft Dynamics 365
• CRM + ERP
• Invoicing
• Q.Rent
• Q.Planning
• Q.Service
• Q.Trade
• Integrations
Each ‘Stream’ has its own IT squad and its own set of applications.
3. WHY DO WE NEED MESSAGES/EVENTS?
IT Architecture with multiple teams
Make the teams as independent as possible
No central database
No distributed transactions
No blocking REST calls
But: Keep data consistent across teams’ services
‘Eventual Consistency’
Solution (sketched below)
Each service publishes an event whenever it updates its data
Other services subscribe to those events
When an event is received, the subscribing service updates its own copy of the data
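A minimal sketch of the publishing side, assuming Spring Kafka (also used by the demo in section 9); the CustomerService class, the customer-events topic, and the JSON payload are illustrative, not from the slides:

```java
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class CustomerService {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public CustomerService(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void updateCustomer(String customerId, String newName) {
        // 1. Update this service's own data (omitted here).
        // 2. Publish an event so other services can update their copies.
        kafkaTemplate.send("customer-events", customerId,
                "{\"type\":\"CustomerUpdated\",\"id\":\"" + customerId
                        + "\",\"name\":\"" + newName + "\"}");
    }
}
```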
4. EVENT STREAMING
Information is packaged as ‘events’, each carrying enough information for any consumer to handle it
The event is published to a central event store
The sender does not need to know which consumers will process it
Allows:
Replicate information across independent services, in near-real-time
Create new applications without disrupting existing ones (see the event sketch below)
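As an illustration of ‘enough information for any consumer’: a hypothetical event type that carries the full relevant state, so consumers never have to call back to the producer. All field names are assumptions:

```java
// A self-contained event: full state, not just an ID or a diff.
public record OrderShippedEvent(
        String orderId,          // which entity changed
        String customerId,       // context consumers may need
        String shippingAddress,  // complete state at the time of the event
        java.time.Instant occurredAt) {
}
```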
5. JMS TOPICS AND QUEUES VS KAFKA TOPICS
JMS
Topic: publish-subscribe
All subscribers receive all messages
Messages are only delivered to subscribers that are active at the time
Queue: send-receive
Messages are queued until a consumer consumes them
Allows horizontal scaling
Needs a separate queue for each receiver group
Kafka Topic
Stores all messages; retention can optionally be limited
Stores offsets on the server for each consumer group
Multiple consumers (or consumer groups) can each receive all messages
Consumers can restart from the beginning (sketched below)
New consumers can be added at any time
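A sketch of that replay behaviour with the plain Java consumer; the topic and group names are illustrative. A fresh consumer group with auto.offset.reset=earliest starts from the oldest retained message:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "replay-demo");
        // A new consumer group begins at the oldest retained message.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("customer-events"));
            while (true) { // sketch: poll forever, no shutdown handling
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```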
7. KAFKA ARCHITECTURE: BROKERS
Broker: Server for Partitions (see the sketch below)
• Partition leader (‘master’)
• Partition replicas
ZooKeeper
• Used for coordination between brokers
• Optional since Kafka 2.8 (KRaft mode)
• Older Kafka versions used ZooKeeper for connection management and to store consumer offsets
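To see which broker leads each partition, a sketch using the Kafka AdminClient (assumes Kafka clients 3.x for allTopicNames(); the topic name is illustrative):

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class DescribeTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description = admin
                    .describeTopics(List.of("customer-events"))
                    .allTopicNames().get()
                    .get("customer-events");
            // Each partition has one leader broker and zero or more replicas.
            description.partitions().forEach(p ->
                    System.out.printf("partition=%d leader=%s replicas=%s%n",
                            p.partition(), p.leader(), p.replicas()));
        }
    }
}
```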
8. MESSAGE GROUPS VS PARTITIONS
JMS
Message Group ID is an optional header
Guarantees that messages with the same Message Group ID are processed by the same thread
KAFKA
Each message can have a key
A topic is divided into partitions; each message is put into a partition based on a hash of its key (see the producer sketch below)
Messages without a key go to a random partition
Subscribers each get a fixed set of partitions assigned
Partitions are also used for horizontal scaling:
Partitions are distributed across servers
Consumers only need to connect to the brokers that own their partitions
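A sketch of keyed producing with the plain Java client; the default partitioner hashes the key, so both records for customer-42 land in the same partition (topic and key names are made up):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key => same partition => per-customer ordering.
            producer.send(new ProducerRecord<>("customer-events", "customer-42", "created"));
            producer.send(new ProducerRecord<>("customer-events", "customer-42", "updated"));
            // No key => the partitioner spreads records across partitions.
            producer.send(new ProducerRecord<>("customer-events", null, "audit ping"));
        }
    }
}
```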
9. KAFKA DEMO
Run Kafka on Kubernetes
Using Helm charts, e.g. https://artifacthub.io/packages/helm/bitnami/kafka
Alternatives: Confluent Cloud, Amazon MSK, …
Demo application using Spring Boot: https://www.baeldung.com/java-kafka-streams-vs-kafka-consumer
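The consuming side of such a demo could look like this minimal Spring Kafka listener (topic and group id are illustrative assumptions, not taken from the linked article):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class CustomerEventListener {

    // Spring Kafka subscribes this method to the topic and calls it
    // for every received record.
    @KafkaListener(topics = "customer-events", groupId = "crm-sync")
    public void onCustomerEvent(String event) {
        // Update this service's own copy of the data.
        System.out.println("Received: " + event);
    }
}
```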
10. KAFKA TOPIC CLEANUP POLICY
Kafka topics are stored in append-only segments; cleanup still happens, but per segment, not per message
Time-based retention
Size-based retention
Unlimited retention
Each topic has a cleanup policy that decides what happens when the retention expires:
Delete: delete the oldest segments
Compact: delete messages with duplicate keys, keeping only the latest value for each key
How to delete a key: write a record with a null value for it, a.k.a. a ‘tombstone’ (sketched below)
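A sketch of both ideas: creating a compacted topic with the AdminClient, then deleting a key with a tombstone. Topic names and settings are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.config.TopicConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompactionDemo {
    public static void main(String[] args) throws Exception {
        Properties adminProps = new Properties();
        adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(adminProps)) {
            // Compacted topic: keeps only the latest value per key.
            NewTopic topic = new NewTopic("customer-snapshots", 3, (short) 1)
                    .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                            TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(List.of(topic)).all().get();
        }

        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("customer-snapshots", "customer-42", "{\"name\":\"Acme\"}"));
            // Tombstone: a null value marks the key for deletion on compaction.
            producer.send(new ProducerRecord<>("customer-snapshots", "customer-42", null));
        }
    }
}
```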
11. KEY/VALUE SERIALIZATION
Message keys and values are binary, but can be serialized/deserialized by the client
Avro is a popular encoding format, more compact than JSON
You can generate Java classes from Avro schemas (see the configuration sketch below)
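One way to wire this up, assuming Confluent's Avro serializer and a Schema Registry; the kafka-avro-serializer dependency, the registry URL, and all names here are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class AvroProducerConfig {
    public static Properties avroProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // Serializes Avro-generated classes and registers their schemas
        // with the Schema Registry (class name as string to avoid a
        // compile-time dependency in this sketch).
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");
        return props;
    }
}
```

A producer created with these properties can then send instances of a class generated from an .avsc schema (e.g. by the Avro Maven or Gradle plugin) directly as message values.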
13. KAFKA STREAMS
Library for Event Processing
Write your event processing logic as a series of processors
Map, aggregate, reduce events
Write to tables, join with other tables
Kafka Streams makes it fault-tolerant and highly scalable (sketched below):
Load-balances the processors via Kafka partitioning
Redistributes the load between processors via intermediate topics
Tables can be replicated (global tables) or partitioned (local tables)
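A minimal topology sketch: count events per key into a table and write the changelog to another topic. The application id and topic names are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class EventCountApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-count-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("customer-events");
        // Aggregate into a table: one running count per key.
        KTable<String, Long> counts = events.groupByKey().count();
        counts.toStream().to("customer-event-counts",
                Produced.with(Serdes.String(), Serdes.Long()));

        // Kafka Streams assigns partitions across app instances and
        // restores local state automatically after a failure.
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```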