Semantic Equivalence of e-Commerce Queries
Aritra Mandal, Daniel Tunkelang, and Zhe Wu
eBay Inc.
Search query != search intent.
● Information retrieval researchers worry about queries that map to multiple intents.
jaguar: [image] or [image]?
● Practitioners worry more about multiple queries that map to the same intent.
lightning to 3.5mm
iphone to aux
Equivalent queries should yield equivalent experiences.
Recall? CTR? Conversion Rate? ...
Opportunity to increase recall while preserving precision.
Similar but not equivalent intent.
High-level strategy to leverage query equivalence.
Map queries to vectors.
Store in nearest-neighbor database.
(i.e., optimize for user or business outcome)
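The vector lookup step can be sketched as follows. This is a minimal illustration, not the system described in the talk: the `QueryIndex` class is a hypothetical brute-force stand-in for a nearest-neighbor database, the vectors are made up, and the 0.98 threshold is borrowed from the cosine cutoff shown on a later slide.

```python
import numpy as np

def normalize(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    vectors = np.asarray(vectors, dtype=float)
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

class QueryIndex:
    """Toy in-memory stand-in for a nearest-neighbor database (hypothetical)."""

    def __init__(self, queries, vectors):
        self.queries = list(queries)
        self.matrix = normalize(vectors)

    def neighbors(self, vector, threshold=0.98):
        """Return (query, cosine) pairs at or above the threshold, best first."""
        unit = normalize([vector])[0]
        scores = self.matrix @ unit
        order = np.argsort(-scores)
        return [(self.queries[i], float(scores[i]))
                for i in order if scores[i] >= threshold]

# Made-up three-dimensional vectors, for illustration only.
index = QueryIndex(
    ["lightning to 3.5mm", "iphone to aux", "hdmi cable"],
    [[0.13, 0.81, 0.10], [0.12, 0.80, 0.11], [0.90, 0.05, 0.30]],
)
matches = index.neighbors([0.13, 0.81, 0.10])
```

A production system would presumably replace the brute-force scan with an approximate nearest-neighbor index; the scan is used here only to keep the sketch self-contained.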
Two strategies for recognizing equivalent queries.
● Surface Similarity
○ Variation in inflection, word order, compounding, noise words.
black tshirts for men = mens black t-shirt
● Behavioral Similarity
○ Queries lead to engagement with equivalent or similar results.
lightning to 3.5mm = iphone to aux
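Surface similarity can be illustrated with a crude canonicalization pass. The sketch below is an assumption-laden toy, not the method from the talk: the noise-word list, the plural-stripping rule, and the `canonical_form` helper are all hypothetical. It shows why surface normalization catches the first example pair but not the second.

```python
import re

# Illustrative noise-word list; an assumption, not taken from the talk.
NOISE_WORDS = {"for", "the", "a", "an", "of", "with"}

def canonical_form(query):
    """Reduce a query to a sorted tuple of crudely normalized tokens."""
    # Compounding: join hyphenated words, so "t-shirt" becomes "tshirt".
    text = query.lower().replace("-", "")
    tokens = re.findall(r"[a-z0-9.]+", text)
    # Noise words: drop tokens that carry no intent.
    tokens = [t for t in tokens if t not in NOISE_WORDS]
    # Inflection: naive plural stripping (a real stemmer would do better).
    tokens = [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]
    # Word order: sort so that token order does not matter.
    return tuple(sorted(tokens))

same = canonical_form("black tshirts for men") == canonical_form("mens black t-shirt")
different = canonical_form("lightning to 3.5mm") == canonical_form("iphone to aux")
```

Here `same` is true and `different` is false: the second pair shares almost no tokens, which is why behavioral signals are needed to recognize it.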
Introducing the “bag of documents” model.
Query vectors are centroids of associated product vectors
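[Figure: each query ("black tshirts for men", "mens black t-shirt") is mapped to the vectors of its associated products (e.g., [0.13, 0.81, … ], [0.09, 0.75, … ], …), which are averaged into a query centroid (e.g., [0.11, 0.79, … ] and [0.12, 0.78, … ]). The two centroids have cos > 0.98.]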
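The centroid computation can be sketched in a few lines. This is a minimal illustration assuming an unweighted mean; the product vectors are made-up three-dimensional values, and the helper names `centroid` and `cosine` are hypothetical.

```python
import numpy as np

def centroid(product_vectors):
    """Query vector = unweighted mean of the associated product vectors."""
    return np.mean(np.asarray(product_vectors, dtype=float), axis=0)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Made-up vectors for products engaged with after each query.
products_q1 = [[0.2, 0.9, 0.1], [0.1, 0.8, 0.3]]
products_q2 = [[0.2, 0.8, 0.2], [0.1, 0.9, 0.3]]

q1 = centroid(products_q1)
q2 = centroid(products_q2)
similarity = cosine(q1, q2)
equivalent = similarity > 0.98  # threshold from the slide
```

Whether the original model weights products by engagement volume is not stated in the slides; a weighted mean would be a small change to `centroid`.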
Works well, but only for head and torso queries.
● Offline approach works for queries with enough engagement history.
● Would be expensive to compute aggregates of result vectors online.
● Still, head and torso queries tend to represent a large fraction of traffic.
Train online sentence transformer model for tail queries.
● Train using (query1, query2, similarity) triples from offline model.
● Oversample similar query pairs to increase sensitivity where it matters.
● Fine-tune a pre-trained micro-BERT sentence transformer model.
● Concatenate the output of a query classifier to the query keywords.
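The data-preparation side of these steps might look like the sketch below. Everything specific here is an assumption: the `high` cutoff, the `oversample` factor, the `[SEP]` separator, the category label, and the similarity scores are all made up, and the fine-tuning step itself is omitted.

```python
import random

def build_training_triples(pairs, high=0.9, oversample=3, seed=0):
    """Repeat highly similar pairs so the model sees them more often.

    pairs: iterable of (query1, query2, similarity) from the offline model.
    """
    triples = []
    for q1, q2, sim in pairs:
        copies = oversample if sim >= high else 1
        triples.extend([(q1, q2, sim)] * copies)
    random.Random(seed).shuffle(triples)
    return triples

def with_category(query, category):
    """Append a query classifier's predicted category to the query keywords."""
    return f"{query} [SEP] {category}"

# Made-up similarity scores standing in for offline-model output.
offline_pairs = [
    ("lightning to 3.5mm", "iphone to aux", 0.99),
    ("lightning to 3.5mm", "black tshirts for men", 0.05),
]
triples = build_training_triples(offline_pairs)
model_input = with_category("iphone to aux", "Cell Phone Accessories")
```

The resulting triples could then feed a regression-style similarity loss when fine-tuning a sentence transformer.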
Architecture for Online Query Similarity Model
Results
Model              Dataset Name      Pearson’s correlation
query-sim-ecom     eBay Internal     0.87
query-sim-ecom     ESCI query-query  0.85
all-MiniLM-L12-v2  ESCI query-query  0.68
Examples from ESCI of queries with low surface but high behavioral similarity:
Query 1                      Query 2      cosine
hdmi to galaxy s8            s9 hdmi      0.9993
movie money                  prop money   0.9995
cassette adapter for iphone  tape to aux  0.9993
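For readers unfamiliar with the metric, Pearson's correlation between a model's predicted cosines and the target similarities can be computed as sketched below. The numbers are made up for illustration and are unrelated to the reported results.

```python
import numpy as np

def pearson(predicted, target):
    """Pearson's correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(predicted, target)[0, 1])

# Made-up scores for four query pairs.
predicted = [0.95, 0.40, 0.88, 0.10]  # cosine from the online model
target = [0.99, 0.35, 0.80, 0.05]     # similarity from the offline model
r = pearson(predicted, target)
```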
Summary
● Queries with equivalent intent should yield equivalent experiences.
● Query similarity can increase recall while preserving precision.
● Signals can come from either surface or behavioral similarity.
● Offline bag-of-documents model: queries as means of product vectors.
● Fine-tune online micro-BERT sentence transformer model for tail queries.
● It just works!

