SlideShare a Scribd company logo
1 of 57
Retrieval and Feedback Models for Blog Feed Search SIGIR 2008 Singapore Jonathan Elsas, Jaime Arguello, Jamie Callan & Jaime Carbonell LTI/SCS/CMU
Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Background
What is a Blog?
What is a Feed? <xml> <feed> <entry> <author>Peter …</> <title>Good, Evil…</> <content>I’ve said…</> </entry> <entry> <author>Peter …</> <title>Agreeing…</> <content>Some peo…</> </entry> …
Blog-Feed Correspondence Blog Feed Post Entry HTML XML
Why are Blogs important? ,[object Object],[http://www.technorati.com/about/]
The Task
Feed Search at TREC ,[object Object],[object Object],[object Object],(a.k.a. Blog Distillation)
Feed Search at TREC ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Represent Ongoing Information  Needs Frequently Very General
Challenges in Feed Search
Challenges in Feed Search ,[object Object],entries time feed
[object Object],[object Object],Challenges in Feed Search entries time feed
Challenges in Feed Search ,[object Object],time Space Exploration topic NASA China’s plans for the moon shuttle launch My dog Mars rover Boeing
Challenges in Feed Search ,[object Object],[object Object],Space Exploration time topic
Challenges in Feed Search ,[object Object],[object Object],time
Challenges in Feed Search ,[object Object],[Mac] [Music] [Food] [Wine] …  post regularly about new  products ,  features , or  application software  of Apple Mac computers. …  describing  songs ,  biographies  of musicians, musical  styles  and their  influences  of music on people are discussed. … such as  tastings ,  reviews , food  matching  or  pairing , and  oenophile news  and  events . …  describing experiences  eating  cuisines,  culinary delights , recipes ,  nutrition plans .
Our Approach
Feeds: ,[object Object],[object Object],[object Object],Information Needs: General & Ongoing Challenges Our Approach Retrieval Models Feedback Models
Retrieval Models ,[object Object],[object Object],[object Object]
Large Document (Feed) Model [Q] <?xml… … </…> `<?xml… … </…> <?xml… … </…> <?xml… <feed> <entry> <entry> <entry> <entry> <entry> … </…> <?xml… … </…> <?xml… … </…> <?xml… … </…> <?xml… <feed> <entry> <entry> <entry> <entry> <entry> … </…> Feed Document  Collection Ranked Feeds Rank by Indri’s standard retrieval model [Metzler and Croft, 2004; 2005]
Large Document (Feed) Model ,[object Object],[object Object],[object Object],[object Object],[object Object],Feed Entry E E Entry Entry E
Small Document (Entry) Model Ranked Entries [Q] <entry> <entry> <entry> <entry> <?xml… <entry> Entry Document  Collection <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> Ranked Feeds document = entry Apply some rank aggregation function Rank By
Small Document (Entry) Model ,[object Object],[object Object],[object Object],ReDDE Federated Search Algortihm [Si & Callan, 2003]
Entry Centrality ,[object Object],[object Object],time topic
Small Document (Entry) Model ,[object Object],[object Object],[object Object],[object Object],[object Object],Not only improves speed,  Also performance Q
Retrieval Model Results
Retrieval Model Results ,[object Object],[object Object],[object Object]
Retrieval Model Results Mean Average Precision Large Document (Feed) Model Small Document (Entry) Models
Retrieval Model Results Mean Average Precision Uniform Log(Feed Length) Uniform Log Prior Map 0.188
Retrieval Model Results Mean Average Precision Uniform Log(Feed Length) Uniform n/a
Feedback Models ,[object Object],[object Object],[object Object]
Query Expansion (PRF) [Q] BLOG06 Collection Related Terms from top K documents [Q + Terms] [Lavrenko & Croft, 2001]
Query Expansion Example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Photography] PRF photography nude erotic art girl free teen fashion women
Feedback Model Results Mean Average Precision None PRF
Query Expansion (Wikipedia PRF) [Q] BLOG06 Collection [Q + Terms] [Lavrenko & Croft, 2001] Wikipedia [Diaz & Metzler, 2006] Related Terms from top K documents
Query Expansion Example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Photography] PRF photography nude erotic art girl free teen fashion women Wikipedia PRF photography director special film art camera music cinematographer photographic
Feedback Model Results Mean Average Precision None PRF Wiki. PRF
Query Expansion (Wikipedia Link) [Q] BLOG06 Collection [Q + Terms] Wikipedia Related Terms from  link structure
Wikipedia Link-Based Query Expansion
Wikipedia Link-Based Expansion Wikipedia … Q
Wikipedia Link-Based Expansion … Relevance Set,  Top R = 100 Working Set,  Top W = 1000 Q Wikipedia
Wikipedia Link-Based Expansion … Wikipedia Q Relevance Set,  Top R = 100 Working Set,  Top W = 1000
Wikipedia Link-Based Expansion Relevance Set,  Top R = 100 Working Set,  Top W = 1000 … Wikipedia Extract anchor text from Working Set  that link to the  Relevance Set . Q
Wikipedia Link-Based Expansion Relevance Set,  Top R = 500 Working Set,  Top W = 1000 … Wikipedia Extract anchor text from Working Set  that link to the  Relevance Set . Q Combines relevance and popularity Relevance: An anchor phrase that links to a high ranked article gets a high score Popularity: An anchor phrase that links many times to a mid-ranked articles also gets high score
Query Expansion Example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Photography] PRF photography nude erotic art girl free teen fashion women Ideal digital photography depth of field photographic film photojournalism cinematography
Feedback Model Results Mean Average Precision None PRF Wiki. PRF Wiki. Link
Conclusion ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Thank You! Student Travel Grant funding from:    ACM SIGIR,    Amit Singhal,    Microsoft Research
Entry Centrality GM Derivation where Entry Generation Likelihood: |E|
Query Expansion Examples ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Music] PRF Music Country Download Free MP3 Mp3andmore Lyric Listen Song
Query Expansion Examples ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Scottish Independence] PRF scotland independence party convention politics snp national people scot
Query Expansion Examples ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[Machine Learning] PRF learn machine credit card karaoke journal sex model sew
Query Generality Characteristics ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Relevance Set Cohesiveness … Relevance Set,  Top R = 100 Wikipedia Cohesiveness = |  L in  | |  L in  U  L out  |
Relevant Set Cohesiveness
Is it the Queries? ,[object Object],[object Object],[object Object],But, none of these measures predict whether wikipedia expansions helps…

More Related Content

Similar to Retrieval and Feedback Models for Blog Feed Search

Word embeddings as a service - PyData NYC 2015
Word embeddings as a service -  PyData NYC 2015Word embeddings as a service -  PyData NYC 2015
Word embeddings as a service - PyData NYC 2015François Scharffe
 
Семантический поиск - что это, как работает и чем отличается от просто поиска
Семантический поиск - что это, как работает и чем отличается от просто поискаСемантический поиск - что это, как работает и чем отличается от просто поиска
Семантический поиск - что это, как работает и чем отличается от просто поискаVitebsk Miniq
 
Data scientist enablement dse 400 week 3 roadmap
Data scientist enablement   dse 400   week 3 roadmapData scientist enablement   dse 400   week 3 roadmap
Data scientist enablement dse 400 week 3 roadmapDr. Mohan K. Bavirisetty
 
Scalable Learning Technologies for Big Data Mining
Scalable Learning Technologies for Big Data MiningScalable Learning Technologies for Big Data Mining
Scalable Learning Technologies for Big Data MiningGerard de Melo
 
CM UTaipei Kaggle Share
CM UTaipei Kaggle ShareCM UTaipei Kaggle Share
CM UTaipei Kaggle Share志明 陳
 
AI and Python: Developing a Conversational Interface using Python
AI and Python: Developing a Conversational Interface using PythonAI and Python: Developing a Conversational Interface using Python
AI and Python: Developing a Conversational Interface using Pythonamyiris
 
Web Scale Information Extraction tutorial ecml2013
Web Scale Information Extraction tutorial ecml2013Web Scale Information Extraction tutorial ecml2013
Web Scale Information Extraction tutorial ecml2013Anna Lisa Gentile
 
data-science-pdf-16588.pdf
data-science-pdf-16588.pdfdata-science-pdf-16588.pdf
data-science-pdf-16588.pdfvkharish18
 
1 ASSIGNMENT 1 REVIEWING RESEARCH AND MAKIN.docx
1  ASSIGNMENT 1   REVIEWING RESEARCH AND MAKIN.docx1  ASSIGNMENT 1   REVIEWING RESEARCH AND MAKIN.docx
1 ASSIGNMENT 1 REVIEWING RESEARCH AND MAKIN.docxoswald1horne84988
 
Dynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsDynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsPaul Hofmann
 
Data scientist enablement dse 400 week 4 roadmap
Data scientist enablement   dse 400   week 4 roadmap Data scientist enablement   dse 400   week 4 roadmap
Data scientist enablement dse 400 week 4 roadmap Dr. Mohan K. Bavirisetty
 
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)jeffreylancaster
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Avkash Chauhan
 
Data-Driven Growth: Lies, Lawyers & Outsized Results
Data-Driven Growth: Lies, Lawyers & Outsized ResultsData-Driven Growth: Lies, Lawyers & Outsized Results
Data-Driven Growth: Lies, Lawyers & Outsized ResultsHull
 
Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)
Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)
Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)Turing Fest
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...Daniel Zivkovic
 
MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014Welocalize
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebConstantin Orasan
 

Similar to Retrieval and Feedback Models for Blog Feed Search (20)

Word embeddings as a service - PyData NYC 2015
Word embeddings as a service -  PyData NYC 2015Word embeddings as a service -  PyData NYC 2015
Word embeddings as a service - PyData NYC 2015
 
Семантический поиск - что это, как работает и чем отличается от просто поиска
Семантический поиск - что это, как работает и чем отличается от просто поискаСемантический поиск - что это, как работает и чем отличается от просто поиска
Семантический поиск - что это, как работает и чем отличается от просто поиска
 
Data scientist enablement dse 400 week 3 roadmap
Data scientist enablement   dse 400   week 3 roadmapData scientist enablement   dse 400   week 3 roadmap
Data scientist enablement dse 400 week 3 roadmap
 
Scalable Learning Technologies for Big Data Mining
Scalable Learning Technologies for Big Data MiningScalable Learning Technologies for Big Data Mining
Scalable Learning Technologies for Big Data Mining
 
Fiddling with flickr
Fiddling with flickrFiddling with flickr
Fiddling with flickr
 
CM UTaipei Kaggle Share
CM UTaipei Kaggle ShareCM UTaipei Kaggle Share
CM UTaipei Kaggle Share
 
AI and Python: Developing a Conversational Interface using Python
AI and Python: Developing a Conversational Interface using PythonAI and Python: Developing a Conversational Interface using Python
AI and Python: Developing a Conversational Interface using Python
 
Pratical Deep Dive into the Semantic Web - #smconnect
Pratical Deep Dive into the Semantic Web - #smconnectPratical Deep Dive into the Semantic Web - #smconnect
Pratical Deep Dive into the Semantic Web - #smconnect
 
Web Scale Information Extraction tutorial ecml2013
Web Scale Information Extraction tutorial ecml2013Web Scale Information Extraction tutorial ecml2013
Web Scale Information Extraction tutorial ecml2013
 
data-science-pdf-16588.pdf
data-science-pdf-16588.pdfdata-science-pdf-16588.pdf
data-science-pdf-16588.pdf
 
1 ASSIGNMENT 1 REVIEWING RESEARCH AND MAKIN.docx
1  ASSIGNMENT 1   REVIEWING RESEARCH AND MAKIN.docx1  ASSIGNMENT 1   REVIEWING RESEARCH AND MAKIN.docx
1 ASSIGNMENT 1 REVIEWING RESEARCH AND MAKIN.docx
 
Dynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & StatisticsDynamic Search Using Semantics & Statistics
Dynamic Search Using Semantics & Statistics
 
Data scientist enablement dse 400 week 4 roadmap
Data scientist enablement   dse 400   week 4 roadmap Data scientist enablement   dse 400   week 4 roadmap
Data scientist enablement dse 400 week 4 roadmap
 
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
Elsevier/Maryland Publishing Connect - 14_0331 (pdf)
 
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)Creating AnswerBot with Keras and TensorFlow (TensorBeat)
Creating AnswerBot with Keras and TensorFlow (TensorBeat)
 
Data-Driven Growth: Lies, Lawyers & Outsized Results
Data-Driven Growth: Lies, Lawyers & Outsized ResultsData-Driven Growth: Lies, Lawyers & Outsized Results
Data-Driven Growth: Lies, Lawyers & Outsized Results
 
Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)
Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)
Ed Fry — Data-Driven Growth: Lies, Lawyers & Outsized Results (Turing Fest 2018)
 
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
All in AI: LLM Landscape & RAG in 2024 with Mark Ryan (Google) & Jerry Liu (L...
 
MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic Web
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 

Recently uploaded (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

Retrieval and Feedback Models for Blog Feed Search

  • 1. Retrieval and Feedback Models for Blog Feed Search SIGIR 2008 Singapore Jonathan Elsas, Jaime Arguello, Jamie Callan & Jaime Carbonell LTI/SCS/CMU
  • 2.
  • 4. What is a Blog?
  • 5. What is a Feed? <xml> <feed> <entry> <author>Peter …</> <title>Good, Evil…</> <content>I’ve said…</> </entry> <entry> <author>Peter …</> <title>Agreeing…</> <content>Some peo…</> </entry> …
  • 6. Blog-Feed Correspondence Blog Feed Post Entry HTML XML
  • 7.
  • 9.
  • 10.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 19.
  • 20.
  • 21. Large Document (Feed) Model [Q] <?xml… … </…> `<?xml… … </…> <?xml… … </…> <?xml… <feed> <entry> <entry> <entry> <entry> <entry> … </…> <?xml… … </…> <?xml… … </…> <?xml… … </…> <?xml… <feed> <entry> <entry> <entry> <entry> <entry> … </…> Feed Document Collection Ranked Feeds Rank by Indri’s standard retrieval model [Metzler and Croft, 2004; 2005]
  • 22.
  • 23. Small Document (Entry) Model Ranked Entries [Q] <entry> <entry> <entry> <entry> <?xml… <entry> Entry Document Collection <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> <entry> <entry> <entry> <entry> <?xml… <entry> Ranked Feeds document = entry Apply some rank aggregation function Rank By
  • 24.
  • 25.
  • 26.
  • 28.
  • 29. Retrieval Model Results Mean Average Precision Large Document (Feed) Model Small Document (Entry) Models
  • 30. Retrieval Model Results Mean Average Precision Uniform Log(Feed Length) Uniform Log Prior Map 0.188
  • 31. Retrieval Model Results Mean Average Precision Uniform Log(Feed Length) Uniform n/a
  • 32.
  • 33. Query Expansion (PRF) [Q] BLOG06 Collection Related Terms from top K documents [Q + Terms] [Lavrenko & Croft, 2001]
  • 34.
  • 35. Feedback Model Results Mean Average Precision None PRF
  • 36. Query Expansion (Wikipedia PRF) [Q] BLOG06 Collection [Q + Terms] [Lavrenko & Croft, 2001] Wikipedia [Diaz & Metzler, 2006] Related Terms from top K documents
  • 37.
  • 38. Feedback Model Results Mean Average Precision None PRF Wiki. PRF
  • 39. Query Expansion (Wikipedia Link) [Q] BLOG06 Collection [Q + Terms] Wikipedia Related Terms from link structure
  • 42. Wikipedia Link-Based Expansion … Relevance Set, Top R = 100 Working Set, Top W = 1000 Q Wikipedia
  • 43. Wikipedia Link-Based Expansion … Wikipedia Q Relevance Set, Top R = 100 Working Set, Top W = 1000
  • 44. Wikipedia Link-Based Expansion Relevance Set, Top R = 100 Working Set, Top W = 1000 … Wikipedia Extract anchor text from Working Set that link to the Relevance Set . Q
  • 45. Wikipedia Link-Based Expansion Relevance Set, Top R = 500 Working Set, Top W = 1000 … Wikipedia Extract anchor text from Working Set that link to the Relevance Set . Q Combines relevance and popularity Relevance: An anchor phrase that links to a high ranked article gets a high score Popularity: An anchor phrase that links many times to a mid-ranked articles also gets high score
  • 46.
  • 47. Feedback Model Results Mean Average Precision None PRF Wiki. PRF Wiki. Link
  • 48.
  • 49. Thank You! Student Travel Grant funding from: ACM SIGIR, Amit Singhal, Microsoft Research
  • 50. Entry Centrality GM Derivation where Entry Generation Likelihood: |E|
  • 51.
  • 52.
  • 53.
  • 54.
  • 55. Relevance Set Cohesiveness … Relevance Set, Top R = 100 Wikipedia Cohesiveness = | L in | | L in U L out |
  • 57.