SlideShare a Scribd company logo
1 of 30
Modeling Impression Discounting In
Large-scale Recommender Systems
KDD 2014, presented by Mitul Tiwari
1
Pei Lee, Laks V.S. Lakshmanan
University of British Columbia
Vancouver, BC, Canada
Mitul Tiwari, Sam Shah
LinkedIn
Mountain View, CA, USA
2015/3/12
2
LinkedIn By The Numbers
300M members 2 new members/sec
Large-scale Recommender Systems
at LinkedIn3
 People You May Know  Skills Endorsements
Recommender Systems at LinkedIn
4
What is Impression Discounting?
5
 Examples
 People You May Know
 Skills Endorsement
 Problem Definition
5 days ago:
Recommended, but
not invited
Today:
If we recommend
again, would you
invite or not?
If you are not likely to invite, we better discount this recommendation.
1 day ago:
Recommended
again, but not invited
Example: People You May Know
Example: Skills Endorsement
 If you are likely to endorse, we will show a skills
suggestion again;
 Otherwise, we should leave this space for other skills
suggestions
Recommended Recommended Again
Problem Definition
 Impressions: recommendations shown to user
 Conversion: positive action - invite, endorse, etc.
 In natural language:
 We already impressed an item several times with no
conversion. How much should we discount that
item?
 More formally:
 For an user u and item i, given an impression history
(T1, T2 , …, Tn) between u and i, can we predict the
conversion rate for u on i?
Impression Discounting Framework
9
Outline
10
 Define an impression
 User, Item, Conversion Rate
 Features
 How many times the user saw this item?
 When is the last time the user saw this item? …
 Data analysis: impression features and conversion
relationship
 Impression discounting model fitting
 Experimental evaluation
Define an Impression
11
Features on Impressed Items
12
 ImpCount (frequency):
 the number of historical impressions before the current
impression, associated with the same (user, item)
 LastSeen (recency):
 the day difference between the last impression and the
current impression, associated with the same (user, item)
 Position:
 the offset of item in the recommendation list of user
 UserFreq:
 the interaction frequency of user in a recommender system
Data Analysis: PYMK Dataset
 Data: 1.09 B impression tuples
 Training dataset: 80% of data
 0.55 billion unique impressions
 0.87 billion total impressions
 20 millions invitations
 Testing dataset: 20% of data
 0.14 billion unique impressions (3.7%)
 0.22 billion total impressions (2.4%)
 5.2 millions invitations
Skills Endorsement Dataset
14
 Total dataset size: 190 million impression tuples
 Training dataset: 80% of data
 Testing dataset: 20% of data
Tencent SearchAds Dataset
15
 Publicly available for KDD Cup 2012 by the
Tencent search engine
 Total Size: 150 million impression sequences
 CTR of the Ad at the 1st, 2nd and 3rd
position: 4.8%, 2.7%, and 1.4%
Data Analysis: Conversion Rate
Changes with Impression Count
16
If we show an item to a
user repeatedly, the
conversion rate
decreases
17
 The conversion rate
decreases with both
ImpCount and
LastSeen
PYMK Invitation Rate Changes with
Impression Count and Last Seen
18
 Similar observation:
The conversion rate
also decreases with
both ImpCount and
LastSeen
Endorsement Rate Changes with
Impression Count and Last Seen
Impression Discounting Model Fitting
20
 Workflow:
 Model fitting process:
 Fundamental discounting functions
 Aggregation Model
 Offline and online
Conversion Rate with
ImpCount21
Discounting Functions
22
Conversion Rate with LastSeen
23
Aggregation Models
24
 Linear Aggregation
 Multiplicative Aggregation
f(Xi) is one of
discounting function
given earlier
Anti-Noise Regression Model
26
 The sparsity of observations increases variance in the
conversion rate.
 Typically happens when the number of observations are
relatively low.
Density Weighted Regression
27
 Add a weight with square error
 Original RMSE:
 Density weighted RMSE:
vi is assessed by the number of observations in a small
neighborhood of (Xi, yi)
Impression Discounting Framework
28
Offline Evaluation: Precision @k
29
 Precision at top k
 Precision improvement for different behavior sets:
More features yield
higher precision
Online Evaluation: Bucket Test on PYMK
30
 Use behavior set (LastSeen, ImpCount)
 On LinkedIn PYMK online system
Conclusion
31
 Impression Discounting
Model
 Learnt a discounting function
to capture ignored
impressions
 Linear or multiplicative
aggregation model
 Anti-noise regression model
 Offline and Online
evaluation (Bucket testing)
Questions?
32
 Related work in the paper
 Want to know more contact:
mtiwari@linkedin.com

More Related Content

What's hot

Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewYONG ZHENG
 
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsYONG ZHENG
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System ExplainedCrossing Minds
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableJustin Basilico
 
Movie lens movie recommendation system
Movie lens movie recommendation systemMovie lens movie recommendation system
Movie lens movie recommendation systemGaurav Sawant
 
How to build a recommender system?
How to build a recommender system?How to build a recommender system?
How to build a recommender system?blueace
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Recommendation system
Recommendation system Recommendation system
Recommendation system Vikrant Arya
 
Matrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsMatrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsLei Guo
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation SystemAnamta Sayyed
 
Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019Faisal Siddiqi
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringViet-Trung TRAN
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systemsFalitokiniaina Rabearison
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Sudeep Das, Ph.D.
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated RecommendationsHarald Steck
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixJaya Kawale
 
Kdd 2014 Tutorial - the recommender problem revisited
Kdd 2014 Tutorial -  the recommender problem revisitedKdd 2014 Tutorial -  the recommender problem revisited
Kdd 2014 Tutorial - the recommender problem revisitedXavier Amatriain
 

What's hot (20)

Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Context-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick ViewContext-aware Recommendation: A Quick View
Context-aware Recommendation: A Quick View
 
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender Systems
 
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Movie lens movie recommendation system
Movie lens movie recommendation systemMovie lens movie recommendation system
Movie lens movie recommendation system
 
How to build a recommender system?
How to build a recommender system?How to build a recommender system?
How to build a recommender system?
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Matrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsMatrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender Systems
 
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
 
Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019Netflix talk at ML Platform meetup Sep 2019
Netflix talk at ML Platform meetup Sep 2019
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 
[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems[Final]collaborative filtering and recommender systems
[Final]collaborative filtering and recommender systems
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
 
Kdd 2014 Tutorial - the recommender problem revisited
Kdd 2014 Tutorial -  the recommender problem revisitedKdd 2014 Tutorial -  the recommender problem revisited
Kdd 2014 Tutorial - the recommender problem revisited
 

Similar to Modeling Impression discounting in large-scale recommender systems

Towards Complex User Feedback and Presentation Context in Recommender Systems
Towards Complex User Feedback and Presentation Context in Recommender SystemsTowards Complex User Feedback and Presentation Context in Recommender Systems
Towards Complex User Feedback and Presentation Context in Recommender SystemsLadislav Peska
 
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...Michelle Zhou
 
Deductive, inductive, and abductive reasoning and their application in trans...
Deductive, inductive, and abductive reasoning and their application in  trans...Deductive, inductive, and abductive reasoning and their application in  trans...
Deductive, inductive, and abductive reasoning and their application in trans...Pragmatic Cohesion Consulting, LLC
 
Developing Web-scale Machine Learning at LinkedIn - From Soup to Nuts
Developing Web-scale Machine Learning at LinkedIn - From Soup to NutsDeveloping Web-scale Machine Learning at LinkedIn - From Soup to Nuts
Developing Web-scale Machine Learning at LinkedIn - From Soup to NutsKun Liu
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesRevolution Analytics
 
'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014
'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014
'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014ColomboCampsCommunity
 
Introduction to recommendation system
Introduction to recommendation systemIntroduction to recommendation system
Introduction to recommendation systemAravindharamanan S
 
Digital analytics lecture4
Digital analytics lecture4Digital analytics lecture4
Digital analytics lecture4Joni Salminen
 
DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.
DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.
DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.Kajal Mukhopadhyay, PhD
 
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxRokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxJadna Almeida
 
Rokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxRokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxJadna Almeida
 
Digital Business Models 101
Digital Business Models 101Digital Business Models 101
Digital Business Models 101Willy Braun
 
Building Analytics for Growth
Building Analytics for GrowthBuilding Analytics for Growth
Building Analytics for GrowthKareem Azees
 
Lean Startup Metrics & Analytics
Lean Startup Metrics & AnalyticsLean Startup Metrics & Analytics
Lean Startup Metrics & AnalyticsNicola Junior Vitto
 
Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010xplusone
 
Lean launch pad itp 3.9.2015
Lean launch pad itp 3.9.2015Lean launch pad itp 3.9.2015
Lean launch pad itp 3.9.2015Jen van der Meer
 
Artificial Intelligence at LinkedIn
Artificial Intelligence at LinkedInArtificial Intelligence at LinkedIn
Artificial Intelligence at LinkedInBill Liu
 

Similar to Modeling Impression discounting in large-scale recommender systems (20)

Towards Complex User Feedback and Presentation Context in Recommender Systems
Towards Complex User Feedback and Presentation Context in Recommender SystemsTowards Complex User Feedback and Presentation Context in Recommender Systems
Towards Complex User Feedback and Presentation Context in Recommender Systems
 
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...
“Big Picture”: Mixed-Initiative Visual Analytics of Big Data (VINCI 2013 Keyn...
 
Deductive, inductive, and abductive reasoning and their application in trans...
Deductive, inductive, and abductive reasoning and their application in  trans...Deductive, inductive, and abductive reasoning and their application in  trans...
Deductive, inductive, and abductive reasoning and their application in trans...
 
Developing Web-scale Machine Learning at LinkedIn - From Soup to Nuts
Developing Web-scale Machine Learning at LinkedIn - From Soup to NutsDeveloping Web-scale Machine Learning at LinkedIn - From Soup to Nuts
Developing Web-scale Machine Learning at LinkedIn - From Soup to Nuts
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success Rates
 
'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014
'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014
'Metrics That Matter': Gabrielle Benefield @ Colombo Agile Con 2014
 
Introduction to recommendation system
Introduction to recommendation systemIntroduction to recommendation system
Introduction to recommendation system
 
GT_feed
GT_feedGT_feed
GT_feed
 
Digital analytics lecture4
Digital analytics lecture4Digital analytics lecture4
Digital analytics lecture4
 
DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.
DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.
DMA MAC Presentation: Kajal Mukhopadhyay, Ph.D.
 
Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...
Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...
Linking Evaluation and Cost-Benefit Analysis in Criminal Justice: A Practical...
 
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptxRokach-GomaxSlides.pptx
Rokach-GomaxSlides.pptx
 
Rokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptxRokach-GomaxSlides (1).pptx
Rokach-GomaxSlides (1).pptx
 
Digital Business Models 101
Digital Business Models 101Digital Business Models 101
Digital Business Models 101
 
Building Analytics for Growth
Building Analytics for GrowthBuilding Analytics for Growth
Building Analytics for Growth
 
Conversion Rate Optimisation Master Class - Khalid Saleh, Invesp
Conversion Rate Optimisation Master Class - Khalid Saleh, InvespConversion Rate Optimisation Master Class - Khalid Saleh, Invesp
Conversion Rate Optimisation Master Class - Khalid Saleh, Invesp
 
Lean Startup Metrics & Analytics
Lean Startup Metrics & AnalyticsLean Startup Metrics & Analytics
Lean Startup Metrics & Analytics
 
Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010Cross channel attribution overview feb 2010
Cross channel attribution overview feb 2010
 
Lean launch pad itp 3.9.2015
Lean launch pad itp 3.9.2015Lean launch pad itp 3.9.2015
Lean launch pad itp 3.9.2015
 
Artificial Intelligence at LinkedIn
Artificial Intelligence at LinkedInArtificial Intelligence at LinkedIn
Artificial Intelligence at LinkedIn
 

More from Mitul Tiwari

Large scale social recommender systems at LinkedIn
Large scale social recommender systems at LinkedInLarge scale social recommender systems at LinkedIn
Large scale social recommender systems at LinkedInMitul Tiwari
 
Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...
Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...
Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...Mitul Tiwari
 
Large scale social recommender systems and their evaluation
Large scale social recommender systems and their evaluationLarge scale social recommender systems and their evaluation
Large scale social recommender systems and their evaluationMitul Tiwari
 
Metaphor: A system for related searches recommendations
Metaphor: A system for related searches recommendationsMetaphor: A system for related searches recommendations
Metaphor: A system for related searches recommendationsMitul Tiwari
 
Related searches at LinkedIn
Related searches at LinkedInRelated searches at LinkedIn
Related searches at LinkedInMitul Tiwari
 
Structural Diversity in Social Recommender Systems
Structural Diversity in Social Recommender SystemsStructural Diversity in Social Recommender Systems
Structural Diversity in Social Recommender SystemsMitul Tiwari
 
Organizational Overlap on Social Networks and its Applications
Organizational Overlap on Social Networks and its ApplicationsOrganizational Overlap on Social Networks and its Applications
Organizational Overlap on Social Networks and its ApplicationsMitul Tiwari
 
Large-scale Social Recommendation Systems: Challenges and Opportunity
Large-scale Social Recommendation Systems: Challenges and OpportunityLarge-scale Social Recommendation Systems: Challenges and Opportunity
Large-scale Social Recommendation Systems: Challenges and OpportunityMitul Tiwari
 
Building Data Driven Products at Linkedin
Building Data Driven Products at LinkedinBuilding Data Driven Products at Linkedin
Building Data Driven Products at LinkedinMitul Tiwari
 
Social Network Analysis at LinkedIn
Social Network Analysis at LinkedInSocial Network Analysis at LinkedIn
Social Network Analysis at LinkedInMitul Tiwari
 

More from Mitul Tiwari (10)

Large scale social recommender systems at LinkedIn
Large scale social recommender systems at LinkedInLarge scale social recommender systems at LinkedIn
Large scale social recommender systems at LinkedIn
 
Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...
Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...
Big Data Ecosystem at LinkedIn. Keynote talk at Big Data Innovators Gathering...
 
Large scale social recommender systems and their evaluation
Large scale social recommender systems and their evaluationLarge scale social recommender systems and their evaluation
Large scale social recommender systems and their evaluation
 
Metaphor: A system for related searches recommendations
Metaphor: A system for related searches recommendationsMetaphor: A system for related searches recommendations
Metaphor: A system for related searches recommendations
 
Related searches at LinkedIn
Related searches at LinkedInRelated searches at LinkedIn
Related searches at LinkedIn
 
Structural Diversity in Social Recommender Systems
Structural Diversity in Social Recommender SystemsStructural Diversity in Social Recommender Systems
Structural Diversity in Social Recommender Systems
 
Organizational Overlap on Social Networks and its Applications
Organizational Overlap on Social Networks and its ApplicationsOrganizational Overlap on Social Networks and its Applications
Organizational Overlap on Social Networks and its Applications
 
Large-scale Social Recommendation Systems: Challenges and Opportunity
Large-scale Social Recommendation Systems: Challenges and OpportunityLarge-scale Social Recommendation Systems: Challenges and Opportunity
Large-scale Social Recommendation Systems: Challenges and Opportunity
 
Building Data Driven Products at Linkedin
Building Data Driven Products at LinkedinBuilding Data Driven Products at Linkedin
Building Data Driven Products at Linkedin
 
Social Network Analysis at LinkedIn
Social Network Analysis at LinkedInSocial Network Analysis at LinkedIn
Social Network Analysis at LinkedIn
 

Recently uploaded

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxTasha Penwell
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfrahulyadav957181
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 

Recently uploaded (20)

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptxThe Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
The Power of Data-Driven Storytelling_ Unveiling the Layers of Insight.pptx
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 

Modeling Impression discounting in large-scale recommender systems

  • 1. Modeling Impression Discounting In Large-scale Recommender Systems KDD 2014, presented by Mitul Tiwari 1 Pei Lee, Laks V.S. Lakshmanan University of British Columbia Vancouver, BC, Canada Mitul Tiwari, Sam Shah LinkedIn Mountain View, CA, USA 2015/3/12
  • 2. 2 LinkedIn By The Numbers 300M members 2 new members/sec
  • 3. Large-scale Recommender Systems at LinkedIn3  People You May Know  Skills Endorsements
  • 5. What is Impression Discounting? 5  Examples  People You May Know  Skills Endorsement  Problem Definition
  • 6. 5 days ago: Recommended, but not invited Today: If we recommend again, would you invite or not? If you are not likely to invite, we better discount this recommendation. 1 day ago: Recommended again, but not invited Example: People You May Know
  • 7. Example: Skills Endorsement  If you are likely to endorse, we will show a skills suggestion again;  Otherwise, we should leave this space for other skills suggestions Recommended Recommended Again
  • 8. Problem Definition  Impressions: recommendations shown to user  Conversion: positive action - invite, endorse, etc.  In natural language:  We already impressed an item several times with no conversion. How much should we discount that item?  More formally:  For an user u and item i, given an impression history (T1, T2 , …, Tn) between u and i, can we predict the conversion rate for u on i?
  • 10. Outline 10  Define an impression  User, Item, Conversion Rate  Features  How many times the user saw this item?  When is the last time the user saw this item? …  Data analysis: impression features and conversion relationship  Impression discounting model fitting  Experimental evaluation
  • 12. Features on Impressed Items 12  ImpCount (frequency):  the number of historical impressions before the current impression, associated with the same (user, item)  LastSeen (recency):  the day difference between the last impression and the current impression, associated with the same (user, item)  Position:  the offset of item in the recommendation list of user  UserFreq:  the interaction frequency of user in a recommender system
  • 13. Data Analysis: PYMK Dataset  Data: 1.09 B impression tuples  Training dataset: 80% of data  0.55 billion unique impressions  0.87 billion total impressions  20 millions invitations  Testing dataset: 20% of data  0.14 billion unique impressions (3.7%)  0.22 billion total impressions (2.4%)  5.2 millions invitations
  • 14. Skills Endorsement Dataset 14  Total dataset size: 190 million impression tuples  Training dataset: 80% of data  Testing dataset: 20% of data
  • 15. Tencent SearchAds Dataset 15  Publicly available for KDD Cup 2012 by the Tencent search engine  Total Size: 150 million impression sequences  CTR of the Ad at the 1st, 2nd and 3rd position: 4.8%, 2.7%, and 1.4%
  • 16. Data Analysis: Conversion Rate Changes with Impression Count 16 If we show an item to a user repeatedly, the conversion rate decreases
  • 17. 17  The conversion rate decreases with both ImpCount and LastSeen PYMK Invitation Rate Changes with Impression Count and Last Seen
  • 18. 18  Similar observation: The conversion rate also decreases with both ImpCount and LastSeen Endorsement Rate Changes with Impression Count and Last Seen
  • 19. Impression Discounting Model Fitting 20  Workflow:  Model fitting process:  Fundamental discounting functions  Aggregation Model  Offline and online
  • 22. Conversion Rate with LastSeen 23
  • 23. Aggregation Models 24  Linear Aggregation  Multiplicative Aggregation f(Xi) is one of discounting function given earlier
  • 24. Anti-Noise Regression Model 26  The sparsity of observations increases variance in the conversion rate.  Typically happens when the number of observations are relatively low.
  • 25. Density Weighted Regression 27  Add a weight with square error  Original RMSE:  Density weighted RMSE: vi is assessed by the number of observations in a small neighborhood of (Xi, yi)
  • 27. Offline Evaluation: Precision @k 29  Precision at top k  Precision improvement for different behavior sets: More features yield higher precision
  • 28. Online Evaluation: Bucket Test on PYMK 30  Use behavior set (LastSeen, ImpCount)  On LinkedIn PYMK online system
  • 29. Conclusion 31  Impression Discounting Model  Learnt a discounting function to capture ignored impressions  Linear or multiplicative aggregation model  Anti-noise regression model  Offline and Online evaluation (Bucket testing)
  • 30. Questions? 32  Related work in the paper  Want to know more contact: mtiwari@linkedin.com

Editor's Notes

  1. Good afternoon! I am Mitul Tiwari. Today I am going to talk about Modeling Impression Discounting in Large-scale Recommender Systems. This is joint work with Pei Lee, Laks, and Sam Shah. Most of the work was done when Pei Lee was interning at LinkedIn last summer.
  2. Let me start with giving some context. LinkedIn is the largest professional network with more than 300M members, and its growing very fast with more than 2 members joining LinkedIn per second.
  3. Here are a couple of examples of large scale recommender systems at LinkedIn. People You May Know: recommending other people to connect with. Daily we process 100s of tera-bytes of data and 100s of billions of edges to recommend connections for each of 300M members. Suggested skills endorsements: is another example where we recommend a member and skill combination for endorsements.
  4. There are many such large scale recommender systems at LinkedIn. For example, Jobs recommendations, news articles suggestions, companies to follow, groups to join, similar profile, related search query etc.
  5. Let me describe what we mean by impression discounting using a couple of examples and formulate the problem.
  6. Let me start with the case of People You May Know or PYMK. Let’s say 5 days ago a Deepak was recommended to me to invite and connect on LinkedIn, and I ignored that recommendation. And Deepak is recommended to me again in PYMK 1 day ago. Now today PYMK systems needs to figure out whether to recommend Deepak again to me or not. Basically, if I am not likely to invite Deepak to connect then better discount this recommendation from my PYMK recommendations. Note that this is just an example since I am already connected to Deepak and I work very closely with him 
  7. Here is another example of skills endorsement suggestions. We have very small real-estate on the website to display endorsement suggestions. If we recommended certain skills in the past for some of your connections and you did not take any action on those suggestion. If you are not likely to endorse with those skills then it’s better if we suggest some other skills to endorse.
  8. Let me define certain terms. Impressions are recommendation that are seen by users. Conversion is a positive action taken on a recommendation. For example, in case of PYMK, invitation to connect. Or in case of Skills Endorsements, endorsing a connection for a skill. In Impression discounting problem, we have given a history of impressions or recommendations seen by users, and we would like to discount or remove the items from the recommendation set if it’s not likely to lead to conversion.
  9. Our goal is to build an impression discounting framework that can be used as a feature in the recommendation engine model or a plugin that can be applied on top of the existing recommendation engine. Our proposed impression discounting model only relies on the historical impression records, and can be treated as a plugin for existing recommendation engines. We use the dotted rectangle to circle the newly built recommender system with impression discounting, which produces discounted impressions with a higher overall conversion rate.
  10. Here is the outline of the rest of my talk. Let me define impression more in detail in terms of user, item, conversion, and behavior features. Then I will describe the exploratory data analysis work to find out the correlation between impression features and conversion rate Then will describe the impression discounting model fitting And finally will conclude with offline and online experimental evaluations
  11. An impression can be seen as a tuple with multiple fields such as User; (2) Item; (3) Conversion; (4) Features such as how many times this item was shown to the user; when was the last time this item was shown to the user, etc (5) Timestamp; (6) Recommendation score of the item to the use
  12. We can derive various features from impressions such as: (1) Impression count: the number of times the item was seen by the user; (2) Last seen: when was the last time the item was seen; (3) at what position the item was seen; (4) how frequently user
  13. Here are some data analysis results that shows that conversation rate drops as the number of the same item shown to the user increase The decrease in conversion rate is much more steep in case of endorsements Also note that in case of PYMK the decrease is more gradual. That mean, a recommendation in PYMK still has chance of acceptance even after multiple impressions
  14. Given this data and after doing correlation studies between various behavior features and conversion rate We would like to learn discounting functions that fits this data
  15. In the plot we show various discounting functions that we use to fit the relationship be tween conversion rate and the number of times an item was shown to a user.
  16. These are various discounting functions that we use to fit the data
  17. After fitting the best discounting function for each of the feature, we combined these functions using linear or multiplicative aggregation
  18. Note that in certain cases we didn’t have a lot of data points. For example, a very few items were shown to users more than 40 times and there is a lot of variance because of small number of data points. We developed an anti-noise regression technique to give lower weightage to such data points in the cost functions
  19. We adopted this technique from density based clustering to give lower weight to certain observations in the cost function Throw away information that you don’t need – analogy – denoising auto encoders in deep learning
  20. Our proposed impression discounting model only relies on the historical impression records, and can be treated as a plugin for existing recommendation engines. We use the dotted rectangle to circle the newly built recommender system with impression discounting, which produces discounted impressions with a higher overall conversion rate.
  21. Talked about modeling impression discounting in large scale recommender systems at LinkedIn such as People You May Know and Skills Endorsement Suggestions.