Netflix Studio spent $8 billion on content in 2018. With stakes this high, it is paramount to track changes to the core studio metadata, content spend, forecasting, and more, so that the business can make efficient and effective decisions. Embracing a Kappa architecture with Kafka enables us to build an enterprise-grade message bus. Making event processing the de facto paved path for syncing core entities gives every published change traceability and data quality verification as first-class citizens. This talk will also get into the nuts and bolts of the eventing and stream processing paradigm and why it is the best fit for our use case, versus alternative architectures with similar benefits. We will do a deep dive into the fascinating world of Netflix Studios and how eventing and stream processing are revolutionizing the world of movie productions and the production finance infrastructure.
3. Speaker Info
● Nitin Sharma, Content Finance Infrastructure @ Netflix
● Decade of work on Large Scale Distributed Systems
● Storage, Search, Messaging, Stream Processing, ML Infrastructure
4. How does Netflix make content?
Launch, Production, Creative, Forecast, Program, Deals, Post Production, Financial Reporting
Content Finance Infrastructure
6. We enable innovation for how content is financially planned, produced and accounted for.
Music Accounting
Content Programming
Content Payments & Accounting
Production Budgeting
Talent Payments
Content Forecasting (cash & expense)
Production Cost Reporting
Production Cashflow
Content Forecasting & Programming (CF)
Production Finance
Content Accounting (CA)
Content Finance Infrastructure
7. Production Finance ensures our productions are financially healthy.
● Budget: Estimated cost?
● Cashflow: How much do we need on an ongoing basis?
● Payment: Talent payments
● Cost Report: Snapshot of costs
8. Production Finance needs data from many other services in the Studio Ecosystem.
Cashflow
Payment
Schedule
Productions
Talent
10. Adding new dependencies is non-trivial and error-prone.
Productions
Cost Report
Schedule
Payments
Cashflow
{episodes}
{launch date}
{Cashflow}
11. Request-driven change communication causes chaos.
● Request-Driven Communication
○ Synchronous
○ Complex workflows
● Traceability
○ Source & metadata
● Uniformity
○ Is state consistent across the universe?
○ Non-uniform reconciliation strategy
○ Duplicate work
12. Eventing-centric message exchanges are better than request-centric ones.
● The LOG - Canonical stream of facts
● Decoupling
● Data change or trigger
● Traceability
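To make the log idea concrete, here is a minimal in-memory sketch, not the Netflix implementation. The names `EventLog` and `Consumer` and the sample events are hypothetical. It shows producers appending facts to one ordered log while each consumer reads at its own offset, so neither side calls the other directly.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class EventLog:
    """Append-only log: the canonical, ordered stream of facts."""
    _events: List[Dict[str, Any]] = field(default_factory=list)

    def append(self, event: Dict[str, Any]) -> int:
        """Append a fact and return its offset in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset: int) -> List[Dict[str, Any]]:
        """Return every event at or after `offset`; existing entries are never modified."""
        return self._events[offset:]


@dataclass
class Consumer:
    """Each consumer tracks its own position in the log."""
    name: str
    offset: int = 0
    seen: List[Dict[str, Any]] = field(default_factory=list)

    def poll(self, log: EventLog) -> None:
        batch = log.read_from(self.offset)
        self.seen.extend(batch)
        self.offset += len(batch)


log = EventLog()
cashflow = Consumer("cashflow")
cost_report = Consumer("cost_report")

log.append({"entity": "production", "id": "p-1", "change": "episodes=10"})
cashflow.poll(log)
log.append({"entity": "production", "id": "p-1", "change": "launch_date=2019-06-01"})
cashflow.poll(log)
cost_report.poll(log)  # a late joiner replays the same facts from offset 0
```

Because the log is the single source of facts, a consumer added later sees exactly the history that earlier consumers saw, which is what makes changes traceable.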
13. The Kafka ecosystem at Netflix enables teams to easily embrace eventing.
● Paved Path
● Kafka + Flink - Stream Processing as a Service
● Fault tolerant and Multi-Region
● Observability out of the box
● Easy bootstrapping of event listeners
14. Kafka is at the heart of the Netflix Studio Message Bus.
Producers → Data cleansing → Stream Processing → Consumers
15. We can easily produce events.
○ Event data & metadata
○ Normalized schema
■ Id of the entity, UUID, ts, type
■ (optional) payload
■ Standard across producers
○ Publisher Client
○ Event Sources
■ Application/Services
■ CDC Events (Source -> Sink)
Producers
Normalized Schema
Kafka Client
Producer Event Stream
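As an illustration of the normalized schema above, the following sketch models an event envelope with the listed fields (entity id, UUID, timestamp, type, optional payload). The class name `StudioEvent`, the field names, and the list-backed `publish` helper are assumptions made for this example; the real publisher client and schema are not shown in the slides.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class StudioEvent:
    """Hypothetical normalized envelope shared by every producer."""
    entity_id: str                              # id of the entity that changed
    event_type: str                             # kind of change
    payload: Optional[Dict[str, Any]] = None    # optional; consumers may fetch state later
    event_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))
    ts_ms: int = field(default_factory=lambda: int(time.time() * 1000))

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def publish(stream: list, event: StudioEvent) -> None:
    """Stand-in for a publisher client: appends the serialized event to a list."""
    stream.append(event.to_json())


producer_stream: list = []
publish(producer_stream, StudioEvent("production-42", "PRODUCTION_UPDATED"))
publish(
    producer_stream,
    StudioEvent("production-42", "PRODUCTION_UPDATED", payload={"episodes": 10}),
)
```

Keeping the envelope identical across producers means downstream processing can route and deduplicate on `entity_id` and `event_uuid` without knowing anything about the payload.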
16. We can process and order events.
○ Input:
■ Kafka - Multiple input streams
■ Unordered
○ Processing:
■ Flink
○ Output:
■ Ordered & Keyed Kafka Stream
■ Search Index
Unordered Producer Event Streams
Ordered & Enriched Streams
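The step from unordered input streams to an ordered, keyed output can be sketched as follows. This is a batch simplification in plain Python rather than a Flink job: the function names are hypothetical, and `zlib.crc32` stands in for whatever hash the real partitioner uses.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, int, str]  # (entity_id, ts, description)


def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash so the same entity id always lands in the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def order_and_key(streams: List[List[Event]], num_partitions: int) -> Dict[int, List[Event]]:
    """Merge several unordered streams into partitions ordered by timestamp."""
    partitions: Dict[int, List[Event]] = defaultdict(list)
    for stream in streams:
        for event in stream:
            partitions[partition_for(event[0], num_partitions)].append(event)
    for events in partitions.values():
        events.sort(key=lambda e: e[1])  # order by timestamp within each partition
    return dict(partitions)


unordered_streams = [
    [("p-1", 3, "cashflow updated"), ("p-2", 1, "schedule created")],
    [("p-1", 1, "production created"), ("p-1", 2, "episodes changed")],
]
ordered = order_and_key(unordered_streams, num_partitions=4)
```

Ordering is only guaranteed inside a partition, so keying by entity id is what lets a consumer see all changes to one entity in sequence.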
17. We enrich, flatten, and order entities.
○ Delayed Materialization
■ Circumvent ordering issues
○ Filter, Transform & Window
○ Enrich
■ F(Id, Entity) -> LatestState
■ Call Entity API for that Id
○ Config driven
○ Keyed Kafka
■ Partition Key
■ Order within partition
Enrich
Ordered & Enriched Streams
<id>
Payload
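Delayed materialization can be sketched as below: the enricher ignores whatever state the event carried and asks the owning service for the latest state of that id, so two events arriving out of order still yield the same, current payload. The `enrich` function, the `fake_entity_api` stand-in, and the sample data are all hypothetical.

```python
from typing import Any, Callable, Dict

EntityApi = Callable[[str], Dict[str, Any]]


def enrich(event: Dict[str, Any], fetch_latest: EntityApi) -> Dict[str, Any]:
    """F(Id, Entity) -> LatestState: attach the entity's current state to the event."""
    latest = fetch_latest(event["entity_id"])
    return {"entity_id": event["entity_id"], "type": event["type"], "payload": latest}


# Hypothetical in-memory stand-in for an entity service.
ENTITY_STORE = {"production-42": {"episodes": 12, "launch_date": "2019-06-01"}}


def fake_entity_api(entity_id: str) -> Dict[str, Any]:
    return dict(ENTITY_STORE[entity_id])


# The second event was produced first but arrived last.
out_of_order = [
    {"entity_id": "production-42", "type": "UPDATED", "ts": 2},
    {"entity_id": "production-42", "type": "UPDATED", "ts": 1},
]
enriched = [enrich(e, fake_entity_api) for e in out_of_order]
```

The trade-off is an extra API call per event in exchange for not having to reorder events before materializing state.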
18. We can easily consume events.
○ Spring Boot 2
○ Spring Cloud Kafka connector
○ Stream Name
■ Entity -> Stream name
○ At Least Once
■ Idempotent
○ Offset
■ Default vs latest vs earliest
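At-least-once delivery means a consumer may see the same event more than once, which is why the handler has to be idempotent. A minimal sketch, assuming each event carries a unique `uuid` and using a hypothetical `IdempotentConsumer` class with an in-memory set (a real service would persist the processed ids):

```python
from typing import Any, Dict, Set


class IdempotentConsumer:
    """Applies each event at most once, keyed on the event's UUID."""

    def __init__(self) -> None:
        self._processed: Set[str] = set()
        self.totals: Dict[str, int] = {}

    def handle(self, event: Dict[str, Any]) -> bool:
        """Apply the event once; return False if it was a redelivered duplicate."""
        if event["uuid"] in self._processed:
            return False
        entity_id = event["entity_id"]
        self.totals[entity_id] = self.totals.get(entity_id, 0) + event["amount"]
        self._processed.add(event["uuid"])
        return True


consumer = IdempotentConsumer()
first = {"uuid": "e-1", "entity_id": "payment-7", "amount": 100}
second = {"uuid": "e-2", "entity_id": "payment-7", "amount": 50}
deliveries = [first, second, first]  # the broker redelivers the first event
results = [consumer.handle(e) for e in deliveries]
```

Without the UUID check, the redelivered event would be counted twice and the payment total would drift from the source of truth.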
20. What if I want to add a new stream?
● Integrate new producer
● Add an enricher in the stream processing
● Add a sink
● Announce schema in registry
Enrich
Flatmap
Ordered & Enriched Streams
21. Eventing is the communication mechanism in the Netflix Studio Finance Ecosystem.
Schema Registry
Productions
Payments
Capitalization
Cashflow
Schedule
Full Entity State
22. Failure detection and recovery is a first-class citizen in design.
Productions
Cashflow
Schedule
Full Entity State
Live
Backfill
Slow
23. How do I know who has/hasn’t consumed what data?
Schedule
Payment
Forecast
Watchdog
● Has an event made it through the entire system? (Unified view)
● Has an event been consumed?
○ Offset Monitoring & Alerting
● Recon events
○ Kappa
○ Replay all events through Streaming
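Offset monitoring boils down to comparing, per partition, the end offset of the log with the offset each consumer group has committed. The sketch below uses hypothetical function names, group names, and numbers; it does not call any Kafka API.

```python
from typing import Dict, List


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition: log end offset minus the group's committed offset."""
    return {p: end - committed.get(p, 0) for p, end in end_offsets.items()}


def lagging_consumers(
    end_offsets: Dict[int, int],
    committed_by_group: Dict[str, Dict[int, int]],
    max_lag: int,
) -> List[str]:
    """Return the groups whose worst partition lag exceeds `max_lag`."""
    return sorted(
        group
        for group, committed in committed_by_group.items()
        if max(consumer_lag(end_offsets, committed).values(), default=0) > max_lag
    )


end_offsets = {0: 100, 1: 80}
committed_by_group = {
    "schedule": {0: 100, 1: 80},   # fully caught up
    "payment": {0: 60, 1: 80},     # 40 events behind on partition 0
    "forecast": {0: 99, 1: 79},    # 1 event behind on each partition
}
alerts = lagging_consumers(end_offsets, committed_by_group, max_lag=10)
```

A watchdog could run a check like this on a schedule and alert on the returned groups.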
24. Define Performance SLA based on Operational Insights.
○ Freshness SLA
■ Message Consumption Lag
○ Max Transfer Rate
■ Payload size - Compress
■ Message rate - Source, process, sink
○ Partitioning & Parallelism
○ Message Retention
○ X-Region Replication SLA
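Two of the measures above reduce to simple arithmetic: freshness is the gap between produce time and consume time, and transfer rate is message rate times payload size, which compression can lower. The sketch below uses hypothetical function names and a made-up payload, with `zlib` standing in for whichever codec is actually configured.

```python
import json
import zlib
from typing import Iterable


def freshness_lag_ms(produced_ts_ms: int, consumed_ts_ms: int) -> int:
    """Message consumption lag: time between produce and consume."""
    return consumed_ts_ms - produced_ts_ms


def meets_freshness_sla(lags_ms: Iterable[int], sla_ms: int) -> bool:
    """True when every observed lag is within the SLA."""
    return all(lag <= sla_ms for lag in lags_ms)


def required_bytes_per_sec(messages_per_sec: float, avg_payload_bytes: float) -> float:
    """Transfer rate a stage must sustain: message rate times payload size."""
    return messages_per_sec * avg_payload_bytes


# Repetitive JSON, such as many similar line items, usually compresses well.
payload = json.dumps({"line_items": [{"code": "crew", "amount": 1000}] * 200}).encode("utf-8")
compressed = zlib.compress(payload)

raw_rate = required_bytes_per_sec(500, len(payload))
compressed_rate = required_bytes_per_sec(500, len(compressed))
```

Measuring these per stage (source, process, sink) shows where a freshness SLA is being lost.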