JW Player is the world’s largest network-independent video platform, representing 5 percent of global internet video. One of the core services it offers video publishers is turn-key recommendations that can drive higher engagement among their viewers. This talk focuses on the challenges of building and improving recommendation algorithms at JW Player’s scale.
4. About JW Player
● Open source video player + video platform
● 5% of all video plays on the web
● Per month:
○ 40Bn plays
○ 100 TB events
● 15K Customers
5. Data Has Become Core to JW Player’s Strategy
[Timeline, from PLAYER to PLATFORM]
● The fastest online video player (2008)
● Video Management and Delivery (2011)
● Dashboards, Audience Measurement (2014)
● Data-driven products (e.g. Recommendations) (2016)
8. MVP Focused on Product Reqs and Scalability
● 20K requests per second
● Support legacy endpoints
○ Non-recommendations playlists
● Business rule features (e.g. sunrise, sunset, geo block; a filtering sketch follows this list)
● Include video metadata in response (conversions, manifest, etc.)
● Pass product “sniff test”
● Rudimentary A/B testing using click-through rates
○ Beat random
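To make the business-rule bullet concrete, here is a minimal sketch of the kind of candidate filtering it implies, assuming per-video sunrise/sunset publishing dates and a geo-block list; the field and function names are illustrative, not JW Player’s schema.

```python
# Sketch: business-rule filtering of candidate recommendations.
# Field names (sunrise, sunset, geo_blocked) are illustrative assumptions.
from datetime import datetime, timezone

def passes_business_rules(video, viewer_country, now=None):
    """Keep a candidate only if it is inside its publishing window and not geo-blocked."""
    now = now or datetime.now(timezone.utc)
    if video.get("sunrise") and now < video["sunrise"]:
        return False  # not yet published (before sunrise date)
    if video.get("sunset") and now > video["sunset"]:
        return False  # expired (after sunset date)
    if viewer_country in video.get("geo_blocked", set()):
        return False  # blocked in the viewer's country
    return True

candidates = [
    {"id": "a", "geo_blocked": {"DE"}},
    {"id": "b", "sunrise": datetime(2030, 1, 1, tzinfo=timezone.utc)},
]
print([v["id"] for v in candidates if passes_business_rules(v, "US")])  # ['a']
```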
9. Data Types At Our Disposal for Recommending
● Association-based recommendations (& Trending videos)
● Content-based recommendations
○ e.g. Title: “Top ten Snowboarding Destinations in Colorado”; description, keywords
10. We Layered Classic Algorithms That Were Easy to Implement
● Association → Association Rule Mining
○ Viewers who watched X also watched Y
● Content → BM25 (think tf-idf)
○ Elasticsearch
● Trending
○ Exp. weighted moving avg of plays (a sketch follows this list)
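As a rough illustration of the trending layer, here is a minimal sketch of an exponentially weighted moving average over daily play counts; the smoothing factor and data layout are assumptions, not JW Player’s actual values.

```python
# Sketch: exponentially weighted moving average (EWMA) of plays per video.
# The smoothing factor alpha and the daily bucketing are illustrative assumptions.

def trending_score(daily_plays, alpha=0.4):
    """EWMA over daily play counts, oldest day first; recent days weigh more."""
    score = float(daily_plays[0])
    for plays in daily_plays[1:]:
        score = alpha * plays + (1 - alpha) * score
    return score

# A video spiking today outranks one whose spike is three days old.
print(trending_score([100, 100, 5000]))   # recent spike -> high score
print(trending_score([5000, 100, 100]))   # old spike    -> lower score
```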
11. Example Recommendations
For the video “Top ten Snowboarding Destinations in Colorado, 2018”:
● Rec 1: “Best hotels in Boulder”
● Rec 2: “Amazing 1080”
● Rec 3: “Best ski slopes in Colorado”
● Rec 4: “Snowboarding is fun!”
● Rec 5: “Top Snowboarding schools”
● Rec 6: “Kardashian Katastrophe!”
● Rec 7: “Cats on Skis”
(Layers: similar titles, highly co-watched, trending)
13. Results: We Met Goals :-)
✓ 20K requests per second
✓ Support legacy endpoints
✓ Business rule features (e.g. sunset, sunrise, geo block)
✓ Include video metadata in response (conversions, manifest, etc.)
○ Use log-based architecture to sync from various sources
✓ Pass product “sniff test”
✓ Rudimentary A/B testing
○ Beat random when looking at Overlay Click-Through Rate
○ Bested competitors in customer-led A/B tests
14. Beyond the MVP
How can we drive more value to customers?
How can we continue to grow competitive advantage?
16. Wait, What Exactly Are We Improving?
● Click-Through Rate
● Completion Rate
● Ad Impressions
● Viewer Time
17. Viewer Time, the Unit of Online Currency
● Americans spend 2+ hrs on social media
● Our publishers are fighting for time
● Recommendations can drive viewer time by either:
○ More time per session
○ More sessions (higher retention)
18. First, We Need Ability to Run Experiments
● Keep viewers in a consistent variant (see the bucketing sketch below) to measure:
○ Time/session
○ Viewer retention
● A/B results (JW model vs random):
○ 50% more time per session on recommended content
○ 10% higher viewer retention (D1, D7)
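A minimal sketch of one common way to keep a viewer in a consistent variant: deterministic bucketing by hashing a stable viewer ID together with an experiment name. The function name and experiment label are illustrative assumptions, not JW Player’s implementation.

```python
# Sketch: deterministic A/B bucketing so a viewer always sees the same variant.
# Names (assign_variant, experiment label) are illustrative assumptions.
import hashlib

def assign_variant(viewer_id, experiment, variants=("control", "treatment")):
    """Hash viewer_id + experiment into a stable bucket in [0, 1] and pick a variant."""
    digest = hashlib.sha256(f"{experiment}:{viewer_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    return variants[0] if bucket < 0.5 else variants[1]

# The same viewer always lands in the same variant for a given experiment,
# so per-viewer metrics like time/session and D1/D7 retention stay consistent.
print(assign_variant("viewer-123", "jw-model-vs-random"))
```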
19. The Natural Itch to Test Stuff
We can now run experiments and understand their impact on viewer time.
Hypothesis: “If we boost recently produced content, recs will be more relevant”
Experiment: What happens to time spent?
20. Some of the Initial Tests That Were Tried
Recommendation algorithm (hypothesis) → time to get an experiment result:
● Swap in Word2Vec title similarity instead of tf-idf → 2 weeks
● Boost recent content → 3 weeks
● Try trending only → 1 week
● Try different ordering of layers → 2 weeks
21. Offline Testing = Faster Iteration
Fast iteration cycle: recommendation algorithm (hypothesis) → build signals and training data → build model → predict (model output) → evaluation, validation → improve features, model, data → run experiment
22. Choosing An Offline Performance Metric
● Time spent in a session aggregates behavior over a sequence of recommendations
○ Predicting that directly is hard
● Pick a closely related metric to measure the effectiveness of a single recommendation
○ Time watched, percent watched?
○ Probability of an “engaged watch”
23. Pairwise Empirical Engagement Rate (PEER Score)
For a pair (Video 1 → Video 2):
PEER Score = Wilson Score(% of Video 2 watches ≥ 30 seconds)
Metric for a list of recommended videos V:
nDCG(V), where PEER is the relevance metric
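A minimal sketch of how such a metric could be computed, assuming “Wilson Score” means the lower bound of the Wilson confidence interval for the engaged-watch proportion; the 30-second threshold comes from the slide, while the function names and example counts are illustrative.

```python
# Sketch: PEER-style relevance plus nDCG over a recommended list.
# Assumes "Wilson Score" = lower bound of the Wilson confidence interval
# for the proportion of engaged watches (>= 30 seconds); names are illustrative.
import math

def wilson_lower_bound(engaged, total, z=1.96):
    """Lower bound of the Wilson interval for the engaged/total proportion."""
    if total == 0:
        return 0.0
    p = engaged / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    spread = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - spread) / denom

def ndcg(relevances):
    """nDCG of a ranked list whose relevance scores are PEER values."""
    def dcg(scores):
        return sum(s / math.log2(i + 2) for i, s in enumerate(scores))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# PEER for each recommended video in ranked order, then nDCG of the list.
peer_scores = [wilson_lower_bound(80, 100), wilson_lower_bound(10, 100), wilson_lower_bound(40, 100)]
print(ndcg(peer_scores))
```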
26. A/B Testing Learnings: Publishers Matter
● Algorithm performance
○ Association vs Content
○ Optimal Training Window
● Publishers with viral events that affect results
○ Test results change with such events
● Publisher quirks
○ Player, Recommendations implementation
28. Deep Learning Makes Sense For Us
● Algorithmic Perspective
○ More Context
○ Personalization
○ Progress in deep learning for recs
● Implementation / Maintainability
○ Single Unified Model (for widely varying publishers)
○ Flexible inputs (Anything2Vec)
29. We’ve Taken Some Good Initial Steps
● Built and A/B tested a Tensorflow model that performs on par with our current algorithms
● Same context, unpersonalized
● AWS SageMaker used for training on GPUs, serving the model via Tensorflow Serving
● Trained using triplet loss to learn video embeddings (anchor, positive example, negative example; a sketch follows below)
[Diagram: triplet loss, from “FaceNet: A Unified Embedding for Face Recognition and Clustering” (2015)]
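A minimal sketch of a triplet loss over video embeddings in TensorFlow, in the spirit of the FaceNet-style training mentioned above; the encoder, embedding size, margin, and how anchor/positive/negative triplets are chosen are assumptions, not JW Player’s model.

```python
# Sketch: learning video embeddings with a triplet loss in TensorFlow.
# Embedding size, margin, and the toy encoder are illustrative assumptions.
import tensorflow as tf

EMBED_DIM = 64
MARGIN = 0.2

# A toy encoder mapping precomputed video features to a unit-norm embedding.
encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(EMBED_DIM),
    tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1)),
])

def triplet_loss(anchor, positive, negative, margin=MARGIN):
    """Pull the anchor toward the positive video, push it away from the negative."""
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))

optimizer = tf.keras.optimizers.Adam(1e-3)

@tf.function
def train_step(anchor_feats, pos_feats, neg_feats):
    """One training step on a batch of (anchor, positive, negative) feature triplets."""
    with tf.GradientTape() as tape:
        loss = triplet_loss(encoder(anchor_feats), encoder(pos_feats), encoder(neg_feats))
    grads = tape.gradient(loss, encoder.trainable_variables)
    optimizer.apply_gradients(zip(grads, encoder.trainable_variables))
    return loss
```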
30. Next Challenges
● Modeling
○ Score individual videos vs. learn to rank
○ How to choose positive & negative training samples?
○ Relevance metric for hyperparameter tuning
● Architecture
○ API traffic
○ Viewer profile service
○ Tensorflow is free, but scaling it is not
31. Takeaways
● “Just build” can work great for MVP recommender
● Offline testing critical for algorithmic improvement
● Finding the right offline metric is key