Fast data arrives in real time and potentially in high volume. Rapid processing, filtering and aggregation are required to ensure a timely reaction and up-to-date information in user interfaces. Doing so is a challenge; making it happen in a scalable and reliable fashion is even more interesting. This session introduces Apache Kafka as the scalable event bus that takes care of the events as they flow in, and Kafka Streams and KSQL for the streaming analytics. Java and Node applications are demonstrated that interact with Kafka and leverage Server Sent Events and WebSocket channels to update the Web UI in real time. User activity performed by the audience in the Web UI is processed by the Kafka-powered back end and results in live updates on all clients.
This presentation includes a demonstration of remote database synchronization through Twitter.
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December 2017)
1. What is Apache Kafka & Why is it Important?
The Event Fabric: bringing IT together
2. It would be so nice if I could publish my ideas and actions, accessible near instantly for everyone who is interested.
Heck, I do not even know these people and they may not know me [personally] – just my pearls of wisdom. And if they are late to the party, they can also check out the historic archives of my eloquence.
Without fretting about the numbers of readers involved and whether they are in the same time zone as me and online when I publish my messages – and which device they use.
4. • Decoupled communication
• 0, 1 or many followers
• Scalable number of messages (and parties)
• Reliable (mostly available, few messages lost)
• Full history
• Open: cross device, cross location
• Not sub-second, but near real-time fast
• Rate limited (#messages/minute)
• Size limited (140-280 characters)
• Format limited (text)
• Not for private interactions
• Not (really) for programmatic use
7. [Diagram labels: Oracle Database (ORDERS); Oracle Database (DVX_ORDERS); µ on Oracle Application Container Cloud; Oracle DBaaS Cloud; µ as locally running Node application]
10. What does the Twitter for System Driven Event Interaction look like?
• Decoupled communication – organized per topic
• 0, 1 or many Consumers per Topic
• Scalable number of messages (and parties)
• Reliable (distributed)
• Full history
• Open: libraries in many technologies & REST APIs
• Near real-time fast
• No Rate Limit
• No enforced size limit
• Anything goes (it’s all byte[])
• On premises or in cloud, private or trusted
• Very much for programmatic use
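Because the payload is just bytes, producer and consumer have to agree on a serialization format themselves. The following is a minimal sketch of one such agreement (UTF-8 encoded JSON), written in plain Python without any Kafka client; the function names and the sample tweet fields are assumptions made for illustration.

```python
import json


def serialize(event: dict) -> bytes:
    """Encode an event as UTF-8 JSON bytes (one possible wire format)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Decode bytes produced by serialize() back into a dict."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical event; the field names are illustrative only.
tweet = {"conference": "ukoug_tech17", "author": "someone", "text": "Hello Kafka"}
payload = serialize(tweet)      # what a producer would hand to the broker
restored = deserialize(payload)  # what a consumer would do on receipt
```

Any other format (Avro, Protobuf, plain text) would work equally well, as long as both sides use the same one.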
12. Messaging as we know it
• JMS, Oracle Advanced Queuing, IBM MQ, MS MQ, RabbitMQ, MQTT,
XMPP, WebSockets, Oracle Coherence, …
• Challenges
• Costs
• Scalability (size and speed)
• (lack of) Distribution (and therefore availability)
• Complexity of infrastructure
• Message delivery guarantees
• Lack of technology openness
• Deal with temporarily offline consumers
• Retain history
13. Introducing Apache Kafka
• Up to 2010: created at LinkedIn
• Message Bus | Event Broker
• High volume, low latency, highly reliable, cross technology
• Scalable, distributed, strict message ordering, ….
• 2011/2012: open sourced via the Apache Incubator, then an Apache Top-Level Project
• Kafka is used by many large corporations:
• Walmart, Cisco, Netflix, PayPal, LinkedIn, eBay, Spotify, Uber, Sift
Science, Zalando, The New York Times, Airbnb, Coursera, ING Bank,…
• And embraced by many software vendors & cloud providers
• Client libraries available for Node, Java, C/C++, Python, Ruby, PHP, Go,
Rust, .NET, Perl, Scala DSL, Clojure, Swift and more
18. CONSUMING
• Messages are available to consumers only when they have been committed
• Kafka does not push; consumers pull (unlike JMS)
• Read does not destroy (unlike JMS Topic)
• (Some) history available
• Offline consumers can catch up
• Consumers can re-consume from the past
• Delivery guarantees
• Ordering maintained (within a partition)
• At-least-once (per consumer) by default; at-most-once and exactly-once can be implemented
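The consuming model above can be illustrated with a small in-memory model. This is a sketch in plain Python, not the Kafka client API: `TopicLog`, `consume` and the group names are hypothetical, and a single list stands in for one topic partition. It shows that reads leave the log intact, that each consumer group tracks its own offset, and that committing only after processing yields at-least-once delivery.

```python
class TopicLog:
    """Hypothetical in-memory stand-in for one topic partition."""

    def __init__(self):
        self._messages = []   # append-only; reading never removes entries
        self._next = {}       # consumer group -> next offset to read

    def publish(self, message):
        self._messages.append(message)
        return len(self._messages) - 1   # offset of the new message

    def poll(self, group, max_messages=10):
        """Pull (offset, message) pairs starting at the group's position."""
        start = self._next.get(group, 0)
        batch = self._messages[start:start + max_messages]
        return list(enumerate(batch, start))

    def commit(self, group, offset):
        self._next[group] = offset + 1

    def seek(self, group, offset):
        """Rewind (or fast-forward) a group to re-consume from the past."""
        self._next[group] = offset


def consume(log, group, handler):
    """Process first, commit afterwards: a failure in between means redelivery."""
    handled = []
    for offset, message in log.poll(group):
        handler(message)
        log.commit(group, offset)
        handled.append(message)
    return handled


log = TopicLog()
for text in ["a", "b", "c"]:
    log.publish(text)

first_pass = consume(log, "ui", lambda m: None)        # ['a', 'b', 'c']
late_pass = consume(log, "analytics", lambda m: None)  # a late group still sees all
log.seek("ui", 1)
replay = consume(log, "ui", lambda m: None)            # ['b', 'c']
```

Committing before calling the handler instead would turn the same loop into at-most-once delivery.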
28. FAST DATA AND ACTIVE UI
• Handle influx
• Publish findings instantaneously
• Update UI & notify end user immediately
• Analyze in real time
• Decoupled components
• No data loss when a component is temporarily down
• Scalable with the volume of events and the number of clients
29. THE CASE AT HAND
• Show live tweet feed for conferences
• Show live tweet aggregates per conference
• Allow users to like tweets, and show live list of liked tweets
• Show a live list of top 3 liked tweets per conference
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; four clients]
30. DEMO - REAL TIME, CROSS CLOUD, CROSS TECHNOLOGY PUSH
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; four clients; you]
31. THE CASE AT HAND – STEP ONE
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; four clients. Requirement: show live tweet feed for conferences]
32. THE CASE AT HAND – STEP ONE AND TWO
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; four clients. Requirement: show live tweet feed for conferences]
34. THE CASE AT HAND
SERVER SENT EVENTS FOR PUSH BACK
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; Server Sent Event; four clients. Requirement: show live tweet feed for conferences]
35. SERVER SENT EVENT – SERVER SIDE
[Diagram: Server Sent Event; four clients]
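On the server side, a Server Sent Event is simply a block of text lines written to a long-lived HTTP response with content type `text/event-stream`, terminated by a blank line. The sketch below formats such a block in plain Python; the function name `format_sse` and the event name `tweet` are assumptions for illustration, and the actual demo server (Node) is not reproduced here.

```python
import json


def format_sse(data, event=None, event_id=None):
    """Build one Server Sent Event frame as text.

    Each field goes on its own 'name: value' line; an empty line ends the event.
    """
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines must be split over several 'data:' lines.
    for line in json.dumps(data).splitlines():
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


frame = format_sse({"text": "hi"}, event="tweet", event_id="1")
```

A server would write each frame to every open client response as new messages arrive from the topic; in the browser, an `EventSource` listener receives them.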
38. THE CASE AT HAND
TWEET LIKES – CLIENT TO SERVER TO ALL CLIENTS
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; SSE; four clients. Requirements: show live tweet feed for conferences; allow users to like tweets and show live list of liked tweets]
39. THE CASE AT HAND
WEB SOCKETS – FOR BI DIRECTIONAL PUSH
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; SSE; WebSockets; four clients. Requirements: show live tweet feed for conferences; allow users to like tweets and show live list of liked tweets]
41. THE CASE AT HAND
STREAMING ANALYSIS OF TWEET EVENTS
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; SSE; WebSockets; four clients. Requirements: show live tweet feed for conferences; allow users to like tweets and show live list of liked tweets; show live tweet aggregates per conference]
42. THE CASE AT HAND - STREAMING ANALYSIS OF TWEETS
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; Streaming Tweets Aggregation (µ); tweetAnalytics Topic; SSE; WebSockets; four clients. Requirements: show live tweet feed for conferences; allow users to like tweets and show live list of liked tweets; show live tweet aggregates per conference]
43. KAFKA STREAMS
• Real Time Event [Stream] Processing integrated into Kafka
• Aggregations & Top-N
• Time Windows
• Continuous Queries
• Latest State (event sourcing)
• Turn Stream (of changes) into Table (of most recent or current state)
• Part of the state can be quite old
• A Kafka Streams client will have state in memory
• The state can always be recreated from the topic partition log files
• Note: Kafka Streams is relatively new
• Only Java clients are supported
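Two of the ideas above, turning a stream of changes into a table of current state and aggregating per time window, can be sketched in a few lines. This is a plain Python model under stated assumptions, not the Kafka Streams API (which is Java); the function names, the `(key, value)` and `(timestamp, key)` event shapes and the 60 second window are all chosen for illustration.

```python
from collections import defaultdict


def to_table(changes):
    """Fold a stream of (key, value) changes into the latest value per key."""
    table = {}
    for key, value in changes:
        table[key] = value   # a later change overwrites the earlier state
    return table


def count_per_window(events, window_seconds):
    """Count (timestamp, key) events per key in fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# Stream of like-count changes per tweet -> table of current like counts.
latest = to_table([("tweet-1", 1), ("tweet-2", 1), ("tweet-1", 2)])

# Tweets with a timestamp in seconds -> counts per 60 second window.
events = [(0, "ukoug_tech17"), (30, "ukoug_tech17"),
          (61, "ukoug_tech17"), (65, "ukoug_apps17")]
windowed = count_per_window(events, 60)
```

Replaying the same changes from the start always rebuilds the same table, which is why in-memory state can be recreated from the log.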
45. EXAMPLE OF KAFKA STREAMS
[Diagram: a stream topology from a source Topic to a target Topic, with steps groupBy, Aggregate, Join, Map (Xform) and Publish]
• Input: TweetMessage (Conference, Text, Author, Hashtag)
• Set Conference as key
• Sum/Avg/Top3 by key (== conference)
• Latest Conference Details
• Round aggregate to nearest 100
• As JSON
• Topic: CountTweetsPerConference, and possibly per time window
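The topology on this slide can be approximated in plain Python to show what each step contributes. This is a sketch under assumptions, not the Java Kafka Streams code from the demo: the function name, the record field names and the conference details lookup are hypothetical, and a list stands in for the output topic.

```python
import json
from collections import Counter


def count_tweets_per_conference(tweets, conference_details):
    """Key by conference, count, join with details, round, and emit JSON."""
    # groupBy + aggregate: the conference is the key, the aggregate is a count.
    counts = Counter(tweet["conference"] for tweet in tweets)
    output = []
    for conference, count in sorted(counts.items()):
        record = {
            "conference": conference,
            "count": count,
            # map step: round to the nearest 100 (Python rounds halves to even)
            "approxCount": round(count, -2),
            # join step: look up the latest details for this key, if any
            "details": conference_details.get(conference),
        }
        output.append(json.dumps(record, sort_keys=True))  # publish as JSON
    return output


tweets = ([{"conference": "ukoug_tech17"}] * 149
          + [{"conference": "ukoug_apps17"}] * 51)
details = {"ukoug_tech17": {"city": "Birmingham"}}
published = count_tweets_per_conference(tweets, details)
```

A real stream processor would emit an updated record every time a count changes rather than once at the end of a finite list.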
48. THE CASE AT HAND - STREAMING ANALYSIS OF TWEET LIKES
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; Streaming Tweets Aggregation (µ); tweetAnalytics Topic; SSE; WebSockets; four clients. Requirements: show live tweet feed for conferences; allow users to like tweets and show live list of liked tweets; show live tweet aggregates per conference; show a live list of top 3 liked tweets per conference]
49. KSQL FOR DECLARATIVE STREAM ANALYTICS THROUGH CONTINUOUS QUERIES
create table tweetAnalytics as
select conference
,      count(*)
from   tweetsTopic
group by conference;

create stream retweets as
select *
from   tweetsTopic
where  text like 'RT%';
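What makes these queries continuous is that they never finish: every new event on the source produces an updated result. The sketch below mimics that behaviour for the two statements above using Python generators; it is an illustration of the semantics only, not KSQL itself, and the function names and sample tweets are assumptions.

```python
from collections import Counter


def tweet_analytics(tweets):
    """Emit the updated (conference, count) row after every incoming tweet."""
    counts = Counter()
    for tweet in tweets:
        counts[tweet["conference"]] += 1
        yield tweet["conference"], counts[tweet["conference"]]


def retweets(tweets):
    """Pass through only tweets whose text starts with 'RT' (like 'RT%')."""
    return (tweet for tweet in tweets if tweet["text"].startswith("RT"))


tweets = [
    {"conference": "ukoug_tech17", "text": "Kafka talk starting"},
    {"conference": "ukoug_apps17", "text": "RT @x: hello"},
    {"conference": "ukoug_tech17", "text": "RT @y: Kafka talk starting"},
]
updates = list(tweet_analytics(tweets))
only_retweets = list(retweets(tweets))
```

The table query yields a changelog of rows (the latest row per conference is the current state), while the stream query yields a filtered copy of the input.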
51. THE CASE AT HAND - STREAMING ANALYSIS OF TWEET LIKES
[Diagram: tweets on #ukoug17, #ukoug_tech17, #ukoug_apps17, #ukoug_jde17; Tweets Topic; Streaming Tweets Aggregation (µ); tweetAnalytics Topic; tweetLike Topic; Likes Aggregation (µ); Top3TweetLikesPerConference; SSE; WebSockets; four clients. Requirements: show live tweet feed for conferences; allow users to like tweets and show live list of liked tweets; show live tweet aggregates per conference; show a live list of top 3 liked tweets per conference]
53. RUNNING TOP 3 OF BEST LIKED TWEETS PER CONFERENCE
[Diagram: Server Sent Event]
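The running top 3 is an aggregation over like events, grouped by conference. A minimal Python sketch of that aggregation follows; it assumes each like arrives as a `(conference, tweet_id)` pair, which is a hypothetical event shape, and it recomputes the ranking from a finite list rather than updating it incrementally as a stream processor would.

```python
from collections import Counter, defaultdict


def top3_liked(like_events):
    """Return the three most liked tweets per conference.

    like_events: iterable of (conference, tweet_id) pairs, one per like.
    """
    likes = defaultdict(Counter)
    for conference, tweet_id in like_events:
        likes[conference][tweet_id] += 1
    return {conference: counter.most_common(3)
            for conference, counter in likes.items()}


like_events = (
    [("ukoug_tech17", "t4")] * 4
    + [("ukoug_tech17", "t1")] * 3
    + [("ukoug_tech17", "t2")] * 2
    + [("ukoug_tech17", "t3")] * 1
    + [("ukoug_apps17", "a1")] * 1
)
top3 = top3_liked(like_events)
```

Each time the top 3 for a conference changes, the new list would be published and pushed to all clients as a Server Sent Event.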
54. END TO END FLOW CLOUD ENABLED
[Diagram labels: API; Cache; EventHub CS; Tweets Aggregation (µ); LikesTweets UI (µ); Likes Aggregation (µ); API (µ); clients in Chrome and Firefox; Tweet Count; Likes Top3]
55. Key aspects of this demo – What Kafka can do for you
• Bridging Cloud(s) and on premises systems
• Providing decoupled interaction between microservices
• Performing Streaming Analysis
• Bridging technologies (Java, Node, …)
• Bridging the availability (no | one | multiple instances)
• Provide semi-push based synchronization
• Open
• Scalable
• Reliable & Available
• Fast
• Complete historical record
56. Oracle embracing Apache Kafka
• Event Hub Cloud Service = Managed Apache Kafka platform
• Managed Topics have been announced too
• Kafka as source for Golden Gate and ODI
• Data Pipeline with Data Hub (Apache Cassandra) & Event Hub
• Oracle Service Bus Kafka Adapter
• Integration Cloud
• Stream Analytics (aka Stream Explorer fka Oracle Event Processor)
• Oracle Native Container and Microservices Platform
• Fn Serverless Platform
• JET and ADF real time push based on Apache Kafka
• In general – the bridge between on premises and [public] Cloud
57. Summary
• Apache Kafka is emerging as platform of choice for message exchange in a world of
• Microservices
• CQRS and Data Source Synchronization
• Clouds
• Fast Data (IoT) and Streaming Analysis
• Real time data integration & distribution
• Oracle is rapidly embracing Apache Kafka on various levels
• Getting started with Apache Kafka is not very hard at all
• The platform is open source – and has broad client support (Java, Node, …)
• Many resources are available – tutorials, blog articles, demonstrations, presentation slides and recordings of conference sessions, samples on GitHub
58. Thank you!
• Blog: technology.amis.nl
• Email: lucas.jellema@amis.nl
• @lucasjellema
• lucas-jellema
• www.amis.nl, info@amis.nl
Editor's Notes
Session outline:
• Introducing the challenge: fast data, scalable and decoupled event handling, streaming analytics
• Introduction of Kafka
• Demo of producing to and consuming from Kafka in Java and Node.js clients
• Intro to the Kafka Streams API for streaming analytics
• Demo of streaming analytics from a Java client
• Intro of the Web UI: HTML5, WebSocket channel and SSE listener
• Demo of push from server to Web UI, in general
End to end flow:
• IFTTT picks up Tweets and pushes them to an API that hands them to a Kafka Topic.
• The Java application consumes these events, performs Streaming Analytics (grouped by hashtag, author and time window) and counts them; the aggregation results are produced to Kafka.
• The Node.js application consumes these aggregation results and pushes them to the Web UI.
• The Web UI displays the selected Tweets along with the aggregation results.
• In the Web UI, users can LIKE and RATE the tweets; each like or rating is sent to the server and produced to Kafka. These events are processed through Stream Analytics too and result in updated Like counts and Average Rating results, which are then pushed to all clients. This means that the audience can Tweet, see the tweet appear in the Web UI on their own device, rate and like, and see the ratings and like count update in real time.