5. What and why?
● Social network products @Mendeley
● ERASM Eurostars project
● Make the newsfeed more interesting
● First task: who to follow
6. What and why?
● Multiple datasets
  – 2M users, many active
  – ~100M documents
  – author keywords, social tags
  – 50M physical PDFs
  – 15M <user,document> pairs per month
  – user-document matrix currently ~250M non-zeros
7. What and why?
● Approach as an item similarity problem
  – with a constrained target item set
● Possible state of the art:
  – matrix factorization and then neighbourhood
  – something dynamic (?)
  – pimped old skool neighbourhood method
  – SLIM
● Use some side data
8. SLIM
● Learn sparse item similarity weights
● No explicit neighbourhood or metric
● L1 regularization gives sparsity
● Bound-constrained least squares:
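Written per item (modulo notation, this is the objective from [1]): for each item j with user-interaction column $\mathbf{r}_j$, learn a sparse weight column $\mathbf{w}_j$ by solving

```latex
\min_{\mathbf{w}_j}\;
  \tfrac{1}{2}\,\lVert \mathbf{r}_j - R\,\mathbf{w}_j \rVert_2^2
  + \tfrac{\beta}{2}\,\lVert \mathbf{w}_j \rVert_2^2
  + \lambda\,\lVert \mathbf{w}_j \rVert_1
\quad \text{s.t.} \quad \mathbf{w}_j \ge 0,\; w_{jj} = 0
```

The L1 term gives the sparsity, and the bounds ($\mathbf{w}_j \ge 0$, zero diagonal) rule out negative weights and the trivial self-similarity solution.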
10. SLIM
● Good:
  – Outperforms MF methods on implicit ratings data [1]
  – Easy extensions to include side data [2]
● Not so good:
  – Reported to be slow beyond small datasets [1]

[1] X. Ning and G. Karypis, SLIM: Sparse Linear Methods for Top-N Recommender Systems, Proc. IEEE ICDM, 2011.
[2] X. Ning and G. Karypis, Sparse Linear Methods with Side Information for Top-N Recommendations, Proc. ACM RecSys, 2012.
11. From SLIM to regression
● Avoid constraints
● Regularized regression, learned with SGD
● Easy to implement
● Faster on large datasets
12. From SLIM to regression
[Diagram: for each item j, fit rj ≈ R´ × wj over users, where rj is column j of the user-item matrix R and wj is column j of the weight matrix W]
model.fit(R´, rj)
wj = model.coef_
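A minimal sketch of this per-item regression using scikit-learn's `SGDRegressor`, as the slide suggests. Assumptions of mine: elastic-net regularization standing in for the L1 + L2 terms, R´ taken to be R with column j zeroed out, and illustrative regularization values; this is not the mrec implementation.

```python
import numpy as np
from scipy.sparse import csc_matrix
from sklearn.linear_model import SGDRegressor

def fit_similarities(R, l1_reg=0.001, l2_reg=0.0001):
    """Learn a sparse item-item weight matrix column by column.

    R: users x items matrix of implicit feedback counts.
    Returns W, where column j holds the similarity weights for item j.
    """
    R = csc_matrix(R)
    n_items = R.shape[1]
    # map (l1_reg, l2_reg) onto SGDRegressor's elastic-net parameters
    alpha = l1_reg + l2_reg
    l1_ratio = l1_reg / alpha
    W = np.zeros((n_items, n_items))
    for j in range(n_items):
        rj = np.asarray(R[:, j].todense()).ravel()      # target column r_j
        Rp = R.copy()
        Rp.data[Rp.indptr[j]:Rp.indptr[j + 1]] = 0.0    # R' = R with column j zeroed
        model = SGDRegressor(penalty='elasticnet', alpha=alpha,
                             l1_ratio=l1_ratio, fit_intercept=False,
                             random_state=0)
        model.fit(Rp, rj)
        W[:, j] = model.coef_                           # wj = model.coef_
    return W
```

Zeroing column j before fitting plays the role of the $w_{jj}=0$ constraint: the model cannot predict an item from itself. Each column is independent, which is what makes the loop trivially parallelizable.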
16. Results on Mendeley data
● Stacked readership counts, keyword counts
● 5M docs/keywords, 1M users, 140M non-zeros
● Constrained to ~100k target users
● Python implementation on top of scikit-learn
  – trivially parallelized with IPython
  – eats CPU but easy on AWS
● 10% improvement over nearest neighbours
18. Tuning regularization constants
● Relate directly to business logic
  – want the sparsest similarity lists that are not too sparse
  – “too sparse” = # items with < k similar items
● Grid search with a small sample of items
● Empirically corresponds well to optimising recommendation accuracy on a validation set
  – but faster and easier
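The “too sparse” criterion above is cheap to compute from the learned weights. A sketch (the function name is mine):

```python
import numpy as np

def num_too_sparse(W, k):
    """Count items whose similarity list has fewer than k non-zero weights.

    W: items x items weight matrix, column j holding weights for item j.
    """
    nnz_per_item = (W != 0).sum(axis=0)   # non-zeros in each item's column
    return int((nnz_per_item < k).sum())
```

Grid search then amounts to fitting a small sample of items at each candidate setting and keeping the largest L1 constant (i.e. the sparsest lists) for which `num_too_sparse` stays under budget.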
19. Software release: mrec
● Wrote our own small framework
● Includes parallelized SLIM
● Support for evaluation
● BSD licence
● https://github.com/mendeley/mrec
● Please use it or contribute!
21. Real World Requirements
● Cheap to compute
● Explicable
● Easy to tweak and combine with business rules
● Not a black box
● Works without ratings
● Handles multiple data sources
● Offers something to anonymous users
22. Cold Start?
● Can't work miracles for new items
● Serve new users ASAP
● Fine to run a special system for either case
● Most content leaks
● … or is leaked
● All serious commercial content is annotated
23. Degrees of Freedom for k-NN
● Input numbers from mining logs
● Temporal “modelling” (e.g. fake users)
● Data pruning (scalability, popularity bias, quality)
● Preprocessing (tf-idf, log/sqrt, …)
● Hand-crafted similarity metric
● Hand-crafted aggregation formula
● Postprocessing (popularity matching)
● Diversification
● Attention profile
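Two of these knobs, tf-idf preprocessing and the choice of similarity metric, fit in a few lines. A minimal cosine k-NN sketch (function name and defaults are mine, not from the deck):

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.metrics.pairwise import cosine_similarity

def knn_similarities(R, k=5):
    """tf-idf reweight the user-item matrix, then cosine k-NN per item.

    R: users x items matrix of implicit counts.
    Returns an items x items matrix keeping the top-k neighbours per row.
    """
    X = TfidfTransformer().fit_transform(R.T)   # items x users, tf-idf weighted
    S = cosine_similarity(X)                    # item-item cosine similarities
    np.fill_diagonal(S, 0.0)                    # drop self-similarity
    for i, row in enumerate(S):
        # keep only the k largest similarities in each row (ties kept)
        cutoff = np.sort(row)[-k] if k < len(row) else row.min()
        S[i, row < cutoff] = 0.0
    return S
```

Every line here is a tunable decision — swap tf-idf for log/sqrt scaling, cosine for another metric, or add popularity matching as a postprocess — which is exactly the point of the slide.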