Lambda Architecture: How we merged batch and real time
1. Lambda Architecture:
How We Merged Batch and Real-Time
Sewook Wee, Senior Engineering Lead
Sotos Matzanas, Tech Lead
June 27 2016
2. Our Goal
Our goal at Trulia is to give consumers an easy and enjoyable way to find their next home by providing data and insights to help them make the best decision.
5. Why Lambda Architecture
We needed a way to…
• Recalculate the full User Trait from the full event body at scale
• Read it back fast
• Add new metrics to old aggregates
• Refresh in near real time to catch up on the delta
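The read path implied by these requirements can be sketched as a batch aggregate merged with a real-time delta. This is a minimal illustration, not Trulia's implementation: it assumes a User Trait is a set of per-user counters, and the names batch_view and serve_trait are hypothetical.

```python
from collections import Counter

def batch_view(events):
    """Batch layer: recompute every user's trait from the full event body."""
    traits = {}
    for user_id, metric in events:
        traits.setdefault(user_id, Counter())[metric] += 1
    return traits

def serve_trait(user_id, batch_traits, speed_delta):
    """Read path: batch aggregate plus the delta accumulated since the batch."""
    merged = Counter(batch_traits.get(user_id, Counter()))
    merged.update(speed_delta.get(user_id, Counter()))  # adds counts per metric
    return dict(merged)

# Hypothetical events: (user id, metric name).
history = [("u1", "view"), ("u1", "view"), ("u1", "save"), ("u2", "view")]
today = [("u1", "view"), ("u2", "save")]

batch_traits = batch_view(history)
speed_delta = batch_view(today)  # same aggregation, applied only to the delta
trait_u1 = serve_trait("u1", batch_traits, speed_delta)
```

Because the batch layer recomputes from the full event body, adding a new metric only requires changing the aggregation and rerunning the batch; the speed layer covers only the events that arrived since.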
7. Our User Model
• We support both registered and unregistered users
• Registered users: user id + secondary id(s) (mobile, Web, email)
• User login: link and merge all known activity on all devices
8. Our Real-Time Complications
• User linkage can change while a new batch is being calculated
• New user linkages can appear during the day and are not reflected in the batch calculation
• We needed to plan for these and make sure the eventual User Trait reflects the state of a user as of right now
9. Simple Real-Time Case
[Diagram. Labels: Event (×3), Parse, Linkage Lookup, HBase, Store, Redis, user id. Legend: Transfers, Writes, Reads.]
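The simple case can be sketched as three steps: parse the event, resolve its id through the linkage lookup, and store the result under the user id. This is an illustration only: the dictionaries stand in for the HBase lookup table and the Redis store, and the event fields "id" and "action" are hypothetical.

```python
import json

linkage_lookup = {"cookie-abc": "user-42"}  # stand-in for the HBase lookup table
trait_store = {}                            # stand-in for the Redis store

def parse(raw_event):
    """Parse step: extract the id and the action from a raw JSON event."""
    event = json.loads(raw_event)
    return event["id"], event["action"]

def resolve_user_id(any_id):
    """Linkage lookup: map a secondary id to its user id, else keep the id as-is."""
    return linkage_lookup.get(any_id, any_id)

def process(raw_event):
    """Store step: count the action under the resolved user id."""
    any_id, action = parse(raw_event)
    user_id = resolve_user_id(any_id)
    counts = trait_store.setdefault(user_id, {})
    counts[action] = counts.get(action, 0) + 1
    return user_id

raw_events = [
    '{"id": "cookie-abc", "action": "view"}',
    '{"id": "user-42", "action": "save"}',
    '{"id": "device-new", "action": "view"}',
]
stored_under = [process(raw) for raw in raw_events]
```

Here the unlinked id "device-new" keeps its own bucket, which is the situation the linkage complications on the previous slide describe.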
10. Current Real-Time Design
[Diagram. Labels: Kafka Spout, Event (×3), Parse, Linkage Lookup, Yesterday’s Lookup (HBase), Today’s Lookup (HBase), Rebalance, user id, secondary id, Redis, Store, Control Event Spout, Control Bolt, Lookup Change (×3), Send as Control Events. Legend: Transfers, Writes, Reads.]
Rebalance Time @ Batch Completion Time: get all user ids + secondary ids for today
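One way to read the rebalance step is as a diff between yesterday's and today's lookup, with each difference sent as a control event that moves real-time state to its new owner. The sketch below is an assumption about that flow, not the actual topology: lookup_changes and apply_change are hypothetical names, and plain dictionaries stand in for HBase and Redis.

```python
def lookup_changes(yesterday, today):
    """Return (secondary_id, old_owner, new_owner) for every changed linkage.

    An id that was unlinked yesterday is treated as its own owner.
    """
    changes = []
    for secondary_id, new_owner in sorted(today.items()):
        old_owner = yesterday.get(secondary_id, secondary_id)
        if old_owner != new_owner:
            changes.append((secondary_id, old_owner, new_owner))
    return changes

def apply_change(store, change):
    """Control-bolt step: move counts kept under the old owner to the new owner.

    Simplification: state is kept per owner, so the whole bucket moves. Splitting
    a bucket shared by several ids would need per-secondary-id state.
    """
    _, old_owner, new_owner = change
    moved = store.pop(old_owner, {})
    target = store.setdefault(new_owner, {})
    for metric, count in moved.items():
        target[metric] = target.get(metric, 0) + count

yesterday = {"cookie-abc": "user-42"}
today = {"cookie-abc": "user-42", "device-xyz": "user-42"}
store = {"user-42": {"view": 2}, "device-xyz": {"view": 1, "save": 1}}

changes = lookup_changes(yesterday, today)
for change in changes:
    apply_change(store, change)
```

After the loop, the activity recorded under "device-xyz" during the day has been folded into "user-42".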
11. Transition to a New Epoch
• When rebalance of all ids is complete
• Completion of rebalance: no new user id has been rebalanced for 30 seconds
• Redis keys with a TTL act as a heartbeat that disappears if no new control events arrive
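The heartbeat idea can be sketched as follows: every control event refreshes a key with a 30-second TTL, and the rebalance counts as complete once the key has expired. To keep the example self-contained, a small in-memory class with an injectable clock stands in for Redis; the key name and function names are hypothetical.

```python
class ExpiringKeys:
    """Tiny in-memory stand-in for Redis keys with a TTL."""

    def __init__(self, clock):
        self._clock = clock          # callable returning the current time in seconds
        self._expires_at = {}

    def set_with_ttl(self, key, ttl_seconds):
        self._expires_at[key] = self._clock() + ttl_seconds

    def exists(self, key):
        expires_at = self._expires_at.get(key)
        return expires_at is not None and self._clock() < expires_at

HEARTBEAT_KEY = "rebalance:heartbeat"  # hypothetical key name
QUIET_PERIOD_SECONDS = 30              # from the slide: 30 seconds without a rebalance

def on_control_event(keys):
    """Each control event refreshes the heartbeat."""
    keys.set_with_ttl(HEARTBEAT_KEY, QUIET_PERIOD_SECONDS)

def rebalance_complete(keys):
    """Complete once the heartbeat has expired.

    Note: this is also true before the first control event; a real system
    would additionally need to know that the rebalance has started.
    """
    return not keys.exists(HEARTBEAT_KEY)

now = [0.0]                            # fake clock so the example runs instantly
keys = ExpiringKeys(clock=lambda: now[0])

on_control_event(keys)                 # t = 0
now[0] = 20.0
on_control_event(keys)                 # t = 20, heartbeat now expires at t = 50
now[0] = 45.0
still_running = not rebalance_complete(keys)
now[0] = 50.0
finished = rebalance_complete(keys)
```

With a real Redis, the refresh would correspond to setting the key with an expiry, so the key vanishes on its own when control events stop.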
12. Epoch Transitions
[Timeline diagram. Rows: Batch Process, Event Processing Epoch, Serving From.
Markers: Midnight; Batch Layer N Done; Rebalance Done for N; Batch Layer N + 1 Done; Rebalance Done for N + 1.
Labels: Batch Layer Events for Epoch N; Batch Layer Events for N + 1; Speed Layer Events Epoch N; Speed Layer Events for N + 1; Batch Layer Epoch N; Real-Time Layer Epoch N; Batch Layer Epoch N + 1; Real-Time Layer Epoch N + 1; Rebalance for N; Serving Batch N + Speed N; Serve N + 1.]
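One reading of the timeline is that serving switches to epoch N only after both the batch for N and the rebalance for N are done. The sketch below encodes that reading as a small tracker; the class and method names are hypothetical and the rule itself is an interpretation of the diagram.

```python
class EpochTracker:
    """Tracks which epoch is served, advancing only when an epoch is fully ready."""

    def __init__(self, serving_epoch):
        self.serving_epoch = serving_epoch
        self._batch_done = set()
        self._rebalance_done = set()

    def mark_batch_done(self, epoch):
        self._batch_done.add(epoch)
        self._advance()

    def mark_rebalance_done(self, epoch):
        self._rebalance_done.add(epoch)
        self._advance()

    def _advance(self):
        # Move forward while the next epoch has both its batch and its rebalance done.
        next_epoch = self.serving_epoch + 1
        while next_epoch in self._batch_done and next_epoch in self._rebalance_done:
            self.serving_epoch = next_epoch
            next_epoch += 1

    def serving(self):
        return f"batch {self.serving_epoch} + speed {self.serving_epoch}"

tracker = EpochTracker(serving_epoch=0)
tracker.mark_batch_done(1)
before_rebalance = tracker.serving()   # batch 1 is done, rebalance 1 is not
tracker.mark_rebalance_done(1)
after_rebalance = tracker.serving()
```

Between "batch done" and "rebalance done" the tracker keeps serving the previous epoch, which matches the gap shown between those two markers on the timeline.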
13. Our Input and its Size
• Hundreds of millions of events per day
• Billions of events per month
• 12 TB of events, and growing
• Hundreds of millions of User Traits calculated daily
• Millions calculated in real-time
14. As a Result
• Continuously add new features to build data-driven products
• Retroactively apply new features on old data
• A virtuous cycle of learning more, personalizing more, and learning again
• Delivery of data and insights to help consumers make the best decision
Sewook
Engage audience: Show of hands – how many of you have heard of Trulia before?
Trulia’s goal is to simplify the crazy experience of finding a home, by providing data and insights to help you make a better decision
It’s not just about finding the best home in town, but finding the best home for you.
To do that, we need to know our users, so we’ve formed a personalization team
Sewook
The personalization team works to understand what our users are looking for
We have built a personalization platform based on the Lambda Architecture, where we track users’ activity in real time, process it, and build a digital signature or profile
We call this digital profile of each user a user trait, which I’ll explain a bit more on the next slide
Sewook
This slide explains how we’ve built our user traits.
Essentially, we take the repository of consumer activity events, process and generate the user trait.
The simplest approach to processing the data is batch processing, but that takes time and during the batch cycles the user trait becomes stale.
The other extreme is full event-by-event real-time processing, which is cool, but historically we ran into other issues with it, such as full data reprocessing. That is why we landed on Lambda Architecture.
Sewook
We like Lambda because it has batch and real-time benefits
Through Lambda, we can recalculate the full trait from the full event body in each batch cycle at scale
Whenever we need to change business rules or cleanse old data, we can do it very easily
We can read back each individual trait quickly
In addition, we have a real-time layer where we can catch up the delta
Hand presentation to Sotos: Sotos here will explain exactly how our personalization platform looks
Sotos
I’m going to share how we implement Lambda Architecture, our specific needs and complications and what our current architecture looks like.
I will mostly focus on the real-time process, but first I will walk you through our batch layer and how we built the batch part of our Lambda Architecture.
Sotos
Before I dig into the complications of our real-time layer, let me explain our user model a bit
No matter what avenue a user comes from, we always build a user trait, even when all we have is a secondary ID; in cases where the user is registered, we also build a unique user trait
We discover linkages through our batch workflow and store all discovered linkages in a unique table per batch run
Sotos
The issue is that linkages may change day to day, or we might discover a new linkage. We need to account for all these cases.
Our real time platform needs to properly marry all activity for a user.
Sotos
Sotos
Sotos
Explain what Epoch is
1. Batch layer events for Epoch N
2. Midnight line
3. Batch layer epoch N blue
4. Speed layer events epoch N orange
5. Real-time layer epoch N, not rebalance N
6. Batch layer N done line
7. Rebalance for N
8. Rebalance done for N line
9. Serving Batch N