Introduction to basics of
Search and Relevancy
with Apache Solr

 FEATURING: Mark Bennett, CTO
Agenda

            • Prerequisites: Browser Tricks
            • Web “Command Line”
            • The DisMax Parser
            • Boosting Formula
            • Explaining “Explain”
            • Check Your Index!
            • Q&A
            • Resources / About NIE


12/2/2009                         Lucid Imagination, Inc.   2
Prerequisite:
            Some Browser Tricks




Browsers Matter – install them all!


Firefox:
             • Default XML Rendering
               • (also some versions of IE)
             • Lots of Plugins

IE and Safari:
             • Better “Explain” copy & paste: maintains line breaks
             • Better table copy and paste




Larger Firefox “Command Line”




Customize the Firefox URL box as a command line in 3 easy steps
 1. Toolbar: Right Click
 2. Customize… Add New Toolbar
 3. URL bar ->CLICK and DRAG




Turn off Solr HTTP Caching



            • Change in solrconfig.xml
              • In the <httpCaching> section, set never304="true" so Solr stops sending HTTP 304 (Not Modified) responses
            • Turn it back on before you deploy!




Understanding Solr’s
            “Web Command Line”




The “Web Command Line”
      CLI CONCEPT                       SOLR EQUIVALENT
  • Command Prompt                      URL bar
  • -o or --foo bar                     ? or & and =
  • (spaces)                            +
  • some punctuation                    %nn
  • output                              XML or HTML
  • Command line “adapter”              Curl
        • Script files can call URLs
        • Not built into Windows – try cygwin
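
The mapping in the table above can be sketched in Python. This is a minimal illustration that assumes only the standard library's urllib.parse; the query strings are made-up examples, not from the talk.

```python
from urllib.parse import quote_plus, urlencode

# Spaces become "+", and reserved punctuation becomes %nn escapes.
encoded_phrase = quote_plus("apache solr")
encoded_punct = quote_plus("price:[10 TO 20]")

# "?" starts the parameter list, "&" separates parameters, "=" binds a value.
base_url = "http://localhost:8983/solr/select"
params = [("q", "apache solr"), ("debugQuery", "true")]
url = base_url + "?" + urlencode(params)

print(encoded_phrase)
print(encoded_punct)
print(url)
```

Pasting the printed URL into the browser's URL bar, or passing it to curl, is the "command line" step.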

Solr “Command Line”

            • Typical Base URL
              • http://localhost:8983/solr/select?...
            • Basic Input (not counting dismax)
              • q = query, fq = filter query
              • df = default field
              • qt = query type (standard / dismax)
            • Controlling Output (lots more!!!)
              • debugQuery = true
              • wt = “what type” (actually “writer type”)
                • standard/XML, xslt (with tr=), javabin, json…
              • fl = *,score (which fields)
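
As a rough sketch, the parameters listed above can be assembled into a request URL with a small helper. The function name solr_select_url is hypothetical (not part of Solr); the parameter names and the base URL come from the slide.

```python
from urllib.parse import urlencode

def solr_select_url(query, base="http://localhost:8983/solr/select", **options):
    """Build a select URL; `options` are passed through as Solr parameters."""
    params = {"q": query}
    params.update(options)
    # urlencode escapes "*" and "," as %2A and %2C; Solr decodes them again.
    return base + "?" + urlencode(params)

url = solr_select_url(
    "solr",
    qt="dismax",         # query type
    debugQuery="true",   # include parsed query and explain output
    wt="json",           # writer type
    fl="*,score",        # all stored fields plus the score
)
print(url)
```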

Example: search for “solr”

 http://localhost:8983/solr/select?q=solr&debugQuery=true
With Firefox you get XML output you can expand and collapse.

With MSIE* and Safari, not so much.

              * Some versions

Detailed Debug & Explain Output


http://localhost:8983/solr/select?q=solr&debugQuery=true
              <str name="parsedquery">text:solr</str>
              …
            <lst name="explain">
             <str name="SOLR1000">
               0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
                 1.4142135 = tf(termFreq(text:solr)=2)
                 3.6026897 = idf(docFreq=1, numDocs=26)
                 0.125 = fieldNorm(field=text, doc=13)
             </str>
            </lst>
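
The "product of" in the output above can be checked by hand. A minimal sketch: the idf and fieldNorm values are copied from the explain output rather than recomputed, and the tf step assumes the square-root rule that the 1.4142135 figure for termFreq=2 implies.

```python
import math

term_freq = 2        # termFreq(text:solr)=2
idf = 3.6026897      # copied from the explain output
field_norm = 0.125   # copied from the explain output

tf = math.sqrt(term_freq)             # 1.4142135...
field_weight = tf * idf * field_norm  # fieldWeight = tf * idf * fieldNorm

print(round(tf, 7), round(field_weight, 7))
```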



A look at the
            DisMax query parser




Solr DisMax: Defined

            • What is it?
               • Dis-junction: the query text is tried against multiple fields
               • Max-imum: the best-matching field sets the score
            • How do you get it?
              •   Configured in:
                    •   solrconfig.xml and schema.xml
              •   Called with:
                    •   qt=dismax
              •   Adjusted with:
                    •   mm, bf, qf, pf, qs, ps, tie


Solr DisMax: Pros and Cons

   General Benefits
     • Multiple Fields
     • Multiple Relevancy Rules
     • Great for Freshness / Popularity
   Issues to be Aware of
     • Tie-in between schema.xml & solrconfig.xml
     • Trouble with some CJK (Chinese, Japanese, Korean)
     • Limited wildcard / field / range support
     • Difficult to customize and debug
     • Trouble with shingles
     • Understand mm!

About the “dis” and the “max”

   Distributed across multiple fields
     • Break up the query into words
     • Each word becomes a clause for each field
     • Like an OR but with extra credit
   Takes the Maximum of each set
     • Word 1 had highest score in Title
     • Word 2 very dense in the doc body
     • Adds in a tie-breaker if the word matches in multiple fields
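
The "max plus tie-breaker" rule can be sketched as a small function. The name dismax_score is hypothetical; the sample scores and the 0.01 tie value are the per-field numbers from the single-word explain output shown later in this deck.

```python
def dismax_score(field_scores, tie=0.0):
    """Best field score, plus `tie` times the sum of the other field scores."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)

# Scores for one word in the text, name, features and sku fields.
scores = [0.026233677, 0.17808011, 0.03710002, 0.44520026]
combined = dismax_score(scores, tie=0.01)
print(round(combined, 7))
```

With tie=0.0 only the best field counts; with tie=1.0 the result is a plain sum of all matching fields.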




Coming soon: Extended DisMax

   Improvements
     • Flexible case Boolean ops: AND/and, OR/or
     • Auto-escape punctuation & -> &, etc.
     • Improved Proximity Boosting (via word bigrams)
     • Other changes in stop words, relevancy calc, URL arguments
   How to get it
     • Post 1.4 patch, planned for 1.5
     • Details + Patch in JIRA: SOLR-1553
          http://issues.apache.org/jira/browse/SOLR-1553
     • TBD: change URL option qt=edismax (or qt=dismax )


Boosting Formulas




Boost Functions in Dismax
   High Level Feature
     • Numeric functions for scoring
        • sum(), product(), sqrt(), log(), etc.

     • Boost on recent dates, user popularity
   Good Combination: Reverse-Ordinal & Reciprocal
     • Position in index : ord(), reverse is: rord()
     • Larger y for smaller x: recip()
   How to get it
     • URL parameter bf = “boost function”
     • Configured in solrconfig.xml
     • See http://wiki.apache.org/solr/FunctionQuery

“Freshness”: Boosting Recent Dates
WIKI EXAMPLE:
  recip( rord(creationDate), 1, 1000, 1000 )
      slope        m      1
      numerator    a   1000
      intercept    c   1000   (aka "b")

  Linear(x,m,c)  = mx+c
  recip(x,m,a,c) = a / (mx+c)

              Position   N-Position    Linear
     Date      ord()       rord()     (x,m,c)   recip(x,m,a,c)
  1/1/2000        1         120        1120        0.89286
  2/1/2000        2         119        1119        0.89366
  3/1/2000        3         118        1118        0.89445
       …          …           …           …              …
  1/1/2005       61          60        1060        0.94340
       …          …           …           …              …
  1/1/2009      109          12        1012        0.98814
  2/1/2009      110          11        1011        0.98912
  3/1/2009      111          10        1010        0.99010
  4/1/2009      112           9        1009        0.99108
  5/1/2009      113           8        1008        0.99206
  6/1/2009      114           7        1007        0.99305
  7/1/2009      115           6        1006        0.99404
  8/1/2009      116           5        1005        0.99502
  9/1/2009      117           4        1004        0.99602
 10/1/2009      118           3        1003        0.99701
 11/1/2009      119           2        1002        0.99800
 12/1/2009      120           1        1001        0.99900

[Chart: recip(x,m,a,c) plotted by date; y-axis from 0.880 to 1.000]
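
The table values can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: recip mirrors the formula a / (mx+c), and rord_from_ord is a hypothetical helper that assumes 120 distinct dates as in the table; it is not how Solr computes rord() internally.

```python
def recip(x, m, a, c):
    """Reciprocal function from the slide: a / (m*x + c)."""
    return a / (m * x + c)

def rord_from_ord(ord_value, total):
    """Reverse ordinal for a 1-based ordinal among `total` distinct values."""
    return total - ord_value + 1

TOTAL_DATES = 120  # assumption: the table's 120 monthly dates

for label, ord_value in [("1/1/2000", 1), ("1/1/2005", 61), ("12/1/2009", 120)]:
    x = rord_from_ord(ord_value, TOTAL_DATES)
    print(label, x, round(recip(x, 1, 1000, 1000), 5))

newest = recip(1, 1, 1000, 1000)    # most recent date, rord = 1
oldest = recip(120, 1, 1000, 1000)  # oldest date, rord = 120
```

Newer documents get a value close to 1.0 and older ones decay slowly, which is why the boost nudges rankings toward fresh content without overwhelming the text score.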
Sifting through
            Solr’s “Explain” output




DisMax Example for “solr”
INPUT:
http://localhost:8983/solr/select?q=solr&debugQuery=true&qt=dismax

DEBUG OUTPUT: (1 OF 2)

   <str name="parsedquery">
   +DisjunctionMaxQuery((id:solr^10.0 | text:solr^0.5 | cat:solr^1.4 |
      manu:solr^1.1 | name:solr^1.2 | features:solr | sku:solr^1.5)~0.01)
      DisjunctionMaxQuery((manu_exact:solr^1.9 | features:solr^1.1 |
      text:solr^0.2 | manu:solr^1.4 | name:solr^1.5)~0.01)
      FunctionQuery((top(ord(popularity)))^0.5)
      FunctionQuery((1000.0/(1.0*float(top(rord(price)))+1000.0))^0.3)
   </str>

DisMax explain output for a single word query

<lst name="explain">
 <str name="SOLR1000">
0.74609417 = (MATCH) sum of:
  0.4476144 = (MATCH) max plus 0.01 times others of:
    0.026233677 = (MATCH) weight(text:solr^0.5 in 13), product of:
      0.04119147 = queryWeight(text:solr^0.5), product of:
        0.5 = boost
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
        1.4142135 = tf(termFreq(text:solr)=2)
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.125 = fieldNorm(field=text, doc=13)
    0.17808011 = (MATCH) weight(name:solr^1.2 in 13), product of:
      0.09885953 = queryWeight(name:solr^1.2), product of:
        1.2 = boost
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      1.8013449 = (MATCH) fieldWeight(name:solr in 13), product of:
        1.0 = tf(termFreq(name:solr)=1)
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.5 = fieldNorm(field=name, doc=13)
    0.03710002 = (MATCH) weight(features:solr in 13), product of:
      0.08238294 = queryWeight(features:solr), product of:
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      0.45033622 = (MATCH) fieldWeight(features:solr in 13), product of:
        1.0 = tf(termFreq(features:solr)=1)
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.125 = fieldNorm(field=features, doc=13)
    0.44520026 = (MATCH) weight(sku:solr^1.5 in 13), product of:
      0.12357441 = queryWeight(sku:solr^1.5), product of:
        1.5 = boost
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      3.6026897 = (MATCH) fieldWeight(sku:solr in 13), product of:
        1.0 = tf(termFreq(sku:solr)=1)
        3.6026897 = idf(docFreq=1, numDocs=26)
        1.0 = fieldNorm(field=sku, doc=13)
  0.22311316 = (MATCH) max plus 0.01 times others of:
    0.040810023 = (MATCH) weight(features:solr^1.1 in 13), product of:
      0.09062123 = queryWeight(features:solr^1.1), product of:
        1.1 = boost
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      0.45033622 = (MATCH) fieldWeight(features:solr in 13), product of:
        1.0 = tf(termFreq(features:solr)=1)
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.125 = fieldNorm(field=features, doc=13)
    0.01049347 = (MATCH) weight(text:solr^0.2 in 13), product of:
      0.016476588 = queryWeight(text:solr^0.2), product of:
        0.2 = boost
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
        1.4142135 = tf(termFreq(text:solr)=2)
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.125 = fieldNorm(field=text, doc=13)
    0.22260013 = (MATCH) weight(name:solr^1.5 in 13), product of:
      0.12357441 = queryWeight(name:solr^1.5), product of:
        1.5 = boost
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.022867065 = queryNorm
      1.8013449 = (MATCH) fieldWeight(name:solr in 13), product of:
        1.0 = tf(termFreq(name:solr)=1)
        3.6026897 = idf(docFreq=1, numDocs=26)
        0.5 = fieldNorm(field=name, doc=13)
  0.06860119 = (MATCH) FunctionQuery(top(ord(popularity))), product of:
    6.0 = ord(popularity)=6
    0.5 = boost
    0.022867065 = queryNorm
  0.0067654043 = (MATCH) FunctionQuery(1000.0/(1.0*float(top(rord(price)))+1000.0)), product of:
    0.9861933 = 1000.0/(1.0*float(rord(price)=14)+1000.0)
    0.3 = boost
    0.022867065 = queryNorm
 </str>
</lst>
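
The top-level "sum of" can be checked with a short calculation. A minimal sketch: the two "max plus 0.01 times others" subtotals are copied from the output as-is, while the two boost-function contributions are recomputed as value × boost × queryNorm. Variable names are illustrative only.

```python
QUERY_NORM = 0.022867065

# Subtotals of the two "max plus 0.01 times others" clauses, copied from the output.
first_dismax_clause = 0.4476144
second_dismax_clause = 0.22311316

# Boost functions: function value * boost * queryNorm.
popularity_part = 6.0 * 0.5 * QUERY_NORM          # ord(popularity)=6, boost 0.5
price_value = 1000.0 / (1.0 * 14 + 1000.0)        # rord(price)=14
price_part = price_value * 0.3 * QUERY_NORM       # boost 0.3

total = first_dismax_clause + second_dismax_clause + popularity_part + price_part
print(round(popularity_part, 8), round(price_part, 10), round(total, 7))
```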



“Explain” example:

...
0.026233677 = (MATCH) weight(text:solr^0.5 in 13), product of:
  0.04119147 = queryWeight(text:solr^0.5), product of:
    0.5 = boost
    3.6026897 = idf(docFreq=1, numDocs=26)
    0.022867065 = queryNorm
  0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
    1.4142135 = tf(termFreq(text:solr)=2)
    3.6026897 = idf(docFreq=1, numDocs=26)
    0.125 = fieldNorm(field=text, doc=13)
0.17808011 = (MATCH) weight(name:solr^1.2 in 13), product of:
  0.09885953 = queryWeight(name:solr^1.2), product of:
    1.2 = boost
    3.6026897 = idf(docFreq=1, numDocs=26)
    0.022867065 = queryNorm
  1.8013449 = (MATCH) fieldWeight(name:solr in 13), product of:
    1.0 = tf(termFreq(name:solr)=1)
    3.6026897 = idf(docFreq=1, numDocs=26)
    0.5 = fieldNorm(field=name, doc=13)
0.03710002 = (MATCH) weight(features:solr in 13), product of:
  0.08238294 = queryWeight(features:solr), product of:
    3.6026897 = idf(docFreq=1, numDocs=26)
    0.022867065 = queryNorm
  0.45033622 = (MATCH) fieldWeight(features:solr in 13), product of:
    1.0 = tf(termFreq(features:solr)=1)
    3.6026897 = idf(docFreq=1, numDocs=26)
    0.125 = fieldNorm(field=features, doc=13)
...
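
Each weight(...) entry above is the product of a query-side part and a field-side part. The helper below is a hypothetical sketch of that arithmetic, using only numbers from the output; idf and queryNorm are taken as given, and tf assumes the square-root rule implied by the tf lines.

```python
import math

def term_weight(boost, idf, query_norm, term_freq, field_norm):
    """weight = queryWeight * fieldWeight, as laid out in the explain output."""
    query_weight = boost * idf * query_norm                  # boost * idf * queryNorm
    field_weight = math.sqrt(term_freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

IDF = 3.6026897
QUERY_NORM = 0.022867065

text_weight = term_weight(0.5, IDF, QUERY_NORM, term_freq=2, field_norm=0.125)
name_weight = term_weight(1.2, IDF, QUERY_NORM, term_freq=1, field_norm=0.5)
# features:solr has no explicit boost, which behaves like a boost of 1.0.
features_weight = term_weight(1.0, IDF, QUERY_NORM, term_freq=1, field_norm=0.125)

print(round(text_weight, 8), round(name_weight, 8), round(features_weight, 8))
```

Note how name scores highest of the three: it has a single occurrence, but its larger fieldNorm (a shorter field) and higher boost outweigh the two occurrences in text.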


Solr’s XSLT “debugger”
            http://localhost:8983/solr/select?
               q=solr
               &debugQuery=true
               &wt=xslt
               &tr=example.xsl
               &fl=*,score
               &qt=dismax




Another way to view Explain data


   • Solr 1.4 has Solritas
       • Various features, including toggle explain display
       • “Some assembly required…”

   http://www.lucidimagination.com/blog/2009/11/04/solritas-solr-1-4s-hidden-gem/




Checking your Index and IDF




Checking what got Indexed

   Bad Index = Bad Search
     • Check Upper / lower case and Punctuation
     • Bad Fields / Meta Data = Bad Facets, Filters, Sorting
   Use built-in Schema Browser:
     • Check each field
     • Common words = low IDF (“Inverse Document Frequency”)




Check IDF w/ the Schema Browser
Start at the Admin Screen:
http://localhost:8983/solr/admin


Schema Browser
   • select a field
   • change # to see more




About NIE
            New Idea Engineering




NIE Resources




Newsletter & Whitepapers:
www.ideaeng.com/current

Search Dev Newsgroup:
www.SearchDev.org

Blogs:
EnterpriseSearchBlog.com
SearchComponentsOnline.com




Finish Line / Q & A

            Review & Questions




                  Mark Bennett mbennett@ideaeng.com
                          main 408-446-3460
                           cell 408-829-6513



Q&A

            These slides and a recorded presentation are available at

            bit.ly/SolrRelevancy

More Related Content

What's hot

Mastering solr
Mastering solrMastering solr
Mastering solrjurcello
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development TutorialErik Hatcher
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query ParsingErik Hatcher
 
Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Murshed Ahmmad Khan
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Make your gui shine with ajax solr
Make your gui shine with ajax solrMake your gui shine with ajax solr
Make your gui shine with ajax solrlucenerevolution
 
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Lucidworks
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache SolrChristos Manios
 
Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)Alexandre Rafalovitch
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr WorkshopJSGB
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)Erik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginsearchbox-com
 
Enterprise Search Solution: Apache SOLR. What's available and why it's so cool
Enterprise Search Solution: Apache SOLR. What's available and why it's so coolEnterprise Search Solution: Apache SOLR. What's available and why it's so cool
Enterprise Search Solution: Apache SOLR. What's available and why it's so coolEcommerce Solution Provider SysIQ
 
Apache Solr - Enterprise search platform
Apache Solr - Enterprise search platformApache Solr - Enterprise search platform
Apache Solr - Enterprise search platformTommaso Teofili
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksErik Hatcher
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache SolrEdureka!
 

What's hot (20)

Mastering solr
Mastering solrMastering solr
Mastering solr
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
 
Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!
 
JSON in Solr: from top to bottom
JSON in Solr: from top to bottomJSON in Solr: from top to bottom
JSON in Solr: from top to bottom
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Make your gui shine with ajax solr
Make your gui shine with ajax solrMake your gui shine with ajax solr
Make your gui shine with ajax solr
 
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component plugin
 
Enterprise Search Solution: Apache SOLR. What's available and why it's so cool
Enterprise Search Solution: Apache SOLR. What's available and why it's so coolEnterprise Search Solution: Apache SOLR. What's available and why it's so cool
Enterprise Search Solution: Apache SOLR. What's available and why it's so cool
 
Solr Presentation
Solr PresentationSolr Presentation
Solr Presentation
 
Apache Solr - Enterprise search platform
Apache Solr - Enterprise search platformApache Solr - Enterprise search platform
Apache Solr - Enterprise search platform
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis Tricks
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache Solr
 

Viewers also liked

Practical Text Mining with SQL using Relational Databases
Practical Text Mining with SQL using Relational DatabasesPractical Text Mining with SQL using Relational Databases
Practical Text Mining with SQL using Relational DatabasesRalph Winters
 
Understanding and visualizing solr explain information - Rafal Kuc
Understanding and visualizing solr explain information - Rafal KucUnderstanding and visualizing solr explain information - Rafal Kuc
Understanding and visualizing solr explain information - Rafal Kuclucenerevolution
 
OUG Ireland Meet-up - Updates from Oracle Open World 2016
OUG Ireland Meet-up - Updates from Oracle Open World 2016OUG Ireland Meet-up - Updates from Oracle Open World 2016
OUG Ireland Meet-up - Updates from Oracle Open World 2016Brendan Tierney
 
OUG Ireland Meet-up 12th January
OUG Ireland Meet-up 12th JanuaryOUG Ireland Meet-up 12th January
OUG Ireland Meet-up 12th JanuaryBrendan Tierney
 
Proposal for nested document support in Lucene
Proposal for nested document support in LuceneProposal for nested document support in Lucene
Proposal for nested document support in LuceneMark Harwood
 
Grouping and Joining in Lucene/Solr
Grouping and Joining in Lucene/SolrGrouping and Joining in Lucene/Solr
Grouping and Joining in Lucene/Solrlucenerevolution
 
apprenticeship-levy-summary-5may2016 (1)
apprenticeship-levy-summary-5may2016 (1)apprenticeship-levy-summary-5may2016 (1)
apprenticeship-levy-summary-5may2016 (1)David Ritchie
 
Justin J. Dunne Resume
Justin J. Dunne ResumeJustin J. Dunne Resume
Justin J. Dunne ResumeJustin Dunne
 
shared-ownership-21_FINAL
shared-ownership-21_FINALshared-ownership-21_FINAL
shared-ownership-21_FINALChristoph Sinn
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduceBhupesh Chawda
 
Overview of running R in the Oracle Database
Overview of running R in the Oracle DatabaseOverview of running R in the Oracle Database
Overview of running R in the Oracle DatabaseBrendan Tierney
 
An Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDBAn Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDBRainforest QA
 
Innovate Analytics with Oracle Data Mining & Oracle R
Innovate Analytics with Oracle Data Mining & Oracle RInnovate Analytics with Oracle Data Mining & Oracle R
Innovate Analytics with Oracle Data Mining & Oracle RCapgemini
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the TradeCarlos Sierra
 
Oracle Advanced Analytics
Oracle Advanced AnalyticsOracle Advanced Analytics
Oracle Advanced Analyticsaghosh_us
 
Overview of Hadoop and HDFS
Overview of Hadoop and HDFSOverview of Hadoop and HDFS
Overview of Hadoop and HDFSBrendan Tierney
 
Gitora, Version Control for PL/SQL
Gitora, Version Control for PL/SQLGitora, Version Control for PL/SQL
Gitora, Version Control for PL/SQLGerger
 

Viewers also liked (20)

Practical Text Mining with SQL using Relational Databases
Practical Text Mining with SQL using Relational DatabasesPractical Text Mining with SQL using Relational Databases
Practical Text Mining with SQL using Relational Databases
 
Understanding and visualizing solr explain information - Rafal Kuc
Understanding and visualizing solr explain information - Rafal KucUnderstanding and visualizing solr explain information - Rafal Kuc
Understanding and visualizing solr explain information - Rafal Kuc
 
OUG Ireland Meet-up - Updates from Oracle Open World 2016
OUG Ireland Meet-up - Updates from Oracle Open World 2016OUG Ireland Meet-up - Updates from Oracle Open World 2016
OUG Ireland Meet-up - Updates from Oracle Open World 2016
 
OUG Ireland Meet-up 12th January
OUG Ireland Meet-up 12th JanuaryOUG Ireland Meet-up 12th January
OUG Ireland Meet-up 12th January
 
Proposal for nested document support in Lucene
Proposal for nested document support in LuceneProposal for nested document support in Lucene
Proposal for nested document support in Lucene
 
Grouping and Joining in Lucene/Solr
Grouping and Joining in Lucene/SolrGrouping and Joining in Lucene/Solr
Grouping and Joining in Lucene/Solr
 
apprenticeship-levy-summary-5may2016 (1)
apprenticeship-levy-summary-5may2016 (1)apprenticeship-levy-summary-5may2016 (1)
apprenticeship-levy-summary-5may2016 (1)
 
Justin J. Dunne Resume
Justin J. Dunne ResumeJustin J. Dunne Resume
Justin J. Dunne Resume
 
shared-ownership-21_FINAL
shared-ownership-21_FINALshared-ownership-21_FINAL
shared-ownership-21_FINAL
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduce
 
1z0 591
1z0 5911z0 591
1z0 591
 
Final Paper
Final PaperFinal Paper
Final Paper
 
An Introduction To Map-Reduce
An Introduction To Map-ReduceAn Introduction To Map-Reduce
An Introduction To Map-Reduce
 
Overview of running R in the Oracle Database
Overview of running R in the Oracle DatabaseOverview of running R in the Oracle Database
Overview of running R in the Oracle Database
 
An Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDBAn Introduction to Map/Reduce with MongoDB
An Introduction to Map/Reduce with MongoDB
 
Innovate Analytics with Oracle Data Mining & Oracle R
Innovate Analytics with Oracle Data Mining & Oracle RInnovate Analytics with Oracle Data Mining & Oracle R
Innovate Analytics with Oracle Data Mining & Oracle R
 
Oracle Performance Tools of the Trade
Oracle Performance Tools of the TradeOracle Performance Tools of the Trade
Oracle Performance Tools of the Trade
 
Oracle Advanced Analytics
Oracle Advanced AnalyticsOracle Advanced Analytics
Oracle Advanced Analytics
 
Overview of Hadoop and HDFS
Overview of Hadoop and HDFSOverview of Hadoop and HDFS
Overview of Hadoop and HDFS
 
Gitora, Version Control for PL/SQL
Gitora, Version Control for PL/SQLGitora, Version Control for PL/SQL
Gitora, Version Control for PL/SQL
 

An Introduction to Basics of Search and Relevancy with Apache Solr

  • 1. Introduction to Basics of Search and Relevancy with Apache Solr
      Featuring: Mark Bennett, CTO
  • 2. Agenda
      • Prerequisites: Browser Tricks
      • Web “Command Line”
      • The DisMax Parser
      • Boosting Formula
      • Explaining “Explain”
      • Check Your Index!
      • Q&A
      • Resources / About NIE
      12/2/2009 Lucid Imagination, Inc.
  • 3. Prerequisite: Some Browser Tricks
  • 4. Browsers Matter – install them all!
      Firefox:
      • Default XML rendering (also some versions of IE)
      • Lots of plugins
      IE and Safari:
      • Better “Explain” copy & paste (maintains line breaks)
      • Better table copy and paste
  • 5. Larger Firefox “Command Line”
      Customize the Firefox URL box as a command line in 3 easy steps:
      1. Toolbar: right-click
      2. Customize… Add New Toolbar
      3. URL bar -> click and drag
  • 6. Turn off Solr HTTP Caching
      • Change in solrconfig.xml
          • Disable the http304 section
      • Turn it back on before you deploy!
  • 7. Understanding Solr’s “Web Command Line”
  • 8. The “Web Command Line”
      CLI CONCEPT               SOLR EQUIVALENT
      Command prompt            URL bar
      -o or --foo bar           ? or & and =
      (spaces)                  +
      some punctuation          %nn
      output                    XML or HTML
      Command line “adapter”    Curl
                                • Script files can call URLs
                                • Not built into Windows – try cygwin
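The encoding rows of the table above (spaces become `+`, some punctuation becomes `%nn`) can be tried outside the browser. The following is a minimal sketch using only Python's standard library; the helper name `encode_value` and the sample query strings are made up for illustration and do not come from the slides.

```python
from urllib.parse import quote_plus


def encode_value(text: str) -> str:
    """Encode one parameter value the way a URL query string expects it.

    Spaces become "+", and characters that are special in a URL
    (such as & : ") become %nn hex escapes.
    """
    return quote_plus(text)


# Hypothetical sample queries, chosen only to show the escaping rules.
for sample in ("apache solr", "ipod & mp3", 'name:"solr"'):
    print(sample, "->", encode_value(sample))
```

Escaping matters most for `&` and `=`, since unescaped they would be read as parameter separators rather than as part of the query text.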
  • 9. Solr “Command Line”
      • Typical Base URL
          • http://localhost:8983/solr/select?...
      • Basic Input (not counting dismax)
          • q = query, fq = filter query
          • df = default field
          • qt = query type (standard / dismax)
      • Controlling Output (lots more!!!)
          • debugQuery = true
          • wt = “what type” (actually “writer type”)
              • standard/XML, xslt (with tr=), javabin, json…
          • fl = *,score (which fields)
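To show how these parameters combine into one request, here is a small sketch that assembles a select URL from the base URL on the slide. The function name `build_select_url` is hypothetical, and the parameter values are examples only; nothing is sent to a server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL as given on the slide; adjust host and port for your own install.
BASE_URL = "http://localhost:8983/solr/select"


def build_select_url(q: str, **params: str) -> str:
    """Return a select URL with q first, followed by any extra parameters."""
    query = {"q": q}
    query.update(params)
    # urlencode escapes each value and joins the pairs with "&".
    return BASE_URL + "?" + urlencode(query)


# Example: return all stored fields plus the score, as JSON, with debug output.
url = build_select_url("solr", fl="*,score", wt="json", debugQuery="true")
print(url)

# Decoding the query string again recovers the original values.
print(parse_qs(urlsplit(url).query))
```

The resulting string can be pasted into the browser URL bar or passed to curl.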
  • 10. Example: search for “solr”
      http://localhost:8983/solr/select?q=solr&debugQuery=true
      • With Firefox you get XML output you can expand and collapse
      • With MSIE* and Safari, not so much
      * Some versions
  • 11. Detailed Debug & Explain Output
      http://localhost:8983/solr/select?q=solr&debugQuery=true

      <str name="parsedquery">text:solr</str>
      …
      <lst name="explain">
        <str name="SOLR1000">
          0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
            1.4142135 = tf(termFreq(text:solr)=2)
            3.6026897 = idf(docFreq=1, numDocs=26)
            0.125 = fieldNorm(field=text, doc=13)
        </str>
      </lst>
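The "product of" line in the explain output can be checked by hand. The sketch below multiplies the three factors; the idf and fieldNorm values are copied from the output rather than derived, and the tf value is computed as the square root of the term frequency, which matches the 1.4142135 shown for termFreq=2.

```python
import math

# Factors taken from the explain output for document SOLR1000.
term_freq = 2
tf = math.sqrt(term_freq)   # shown as 1.4142135 in the explain output
idf = 3.6026897             # copied from the output, not recomputed here
field_norm = 0.125          # copied from the output

# fieldWeight is the product of the three factors.
field_weight = tf * idf * field_norm
print(field_weight)
```

The result agrees with the 0.6368716 reported by Solr up to floating-point rounding.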
  • 12. A look at the DisMax query parser
  • 13. Solr DisMax: Defined
      • What is it?
          • Dis-joint text (multiple fields)
          • Max-imum match (score)
          • (The name is short for Lucene's DisjunctionMaxQuery)
      • How do you get it?
          • Configured in: solrconfig.xml and schema.xml
          • Called with: qt=dismax
          • Adjusted with: mm, bf, qf, pf, qs, ps, tie
  • 14. Solr DisMax: Pros and Cons
      General Benefits
      • Multiple fields
      • Multiple relevancy rules
      • Great for freshness / popularity
      Issues to be Aware of
      • Tie-in between schema.xml & solrconfig.xml
      • Trouble with some CJK (Chinese, Japanese, Korean)
      • Limited wildcard / field / range support
      • Difficult to customize and debug
      • Trouble with shingles
      • Understand mm!
  • 15. About the “dis” and the “max”
      Distributed across multiple fields
      • Break up the query into words
      • Each word becomes a clause against each field
      • Like an OR but with extra credit
      Takes the maximum of each set
      • Word 1 had its highest score in the title
      • Word 2 is very dense in the doc body
      • Adds in a tie breaker if a word matches in multiple fields
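The "maximum plus tie breaker" rule can be written out in a few lines. This is a sketch of the combination step only, not Solr's implementation: the function name `dismax_score` and the per-field scores are hypothetical, and the default tie of 0.01 is chosen to match the `~0.01` that appears in the parsed DisMax query on slide 21.

```python
def dismax_score(field_scores: dict, tie: float = 0.01) -> float:
    """Combine one word's per-field scores: best field plus tie * the rest."""
    if not field_scores:
        return 0.0
    best = max(field_scores.values())
    others = sum(field_scores.values()) - best
    return best + tie * others


# Hypothetical scores for a single query word in three fields.
scores = {"title": 0.8, "body": 0.5, "sku": 0.1}

print(dismax_score(scores))            # best field dominates
print(dismax_score(scores, tie=0.0))   # pure maximum
print(dismax_score(scores, tie=1.0))   # plain sum, like an ordinary OR
```

With tie at 0 only the best field counts; with tie at 1 the result is the plain sum of all fields. Small values in between give the "OR with extra credit" behavior described above.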
  • 16. Coming soon: Extended DisMax
      Improvements
      • Flexible case Boolean ops: AND/and, OR/or
      • Auto-escape punctuation & -> &, etc.
      • Improved proximity boosting (via word bigrams)
      • Other changes in stop words, relevancy calc, URL arguments
      How to get it
      • Post 1.4 patch, planned for 1.5
      • Details + patch in JIRA: SOLR-1553
        http://issues.apache.org/jira/browse/SOLR-1553
      • TBD: change URL option qt=edismax (or qt=dismax)
  • 17. Boosting Formulas
  • 18. Boost Functions in DisMax
      High Level Feature
      • Numeric functions for scoring
          • sum(), product(), sqrt(), log(), etc.
      • Boost on recent dates, user popularity
      Good Combination: Reverse-Ordinal & Reciprocal
      • Position in index: ord(), reverse is: rord()
      • Larger y for smaller x: recip()
      How to get it
      • URL parameter bf = “boost function”
      • Configured in solrconfig.xml
      • See http://wiki.apache.org/solr/FunctionQuery
  • 19. “Freshness”: Boosting Recent Dates
      WIKI EXAMPLE: recip( rord(creationDate), 1, 1000, 1000 )
      Linear (x,m,c) = mx + c
      recip(x,m,a,c) = a / (mx + c)
      Parameters: slope m = 1, numerator a = 1000, intercept c = 1000 (aka "b")

      Date         Position   N-Position   Linear     recip(x,m,a,c)
                   ord()      rord()       (x,m,c)
      1/1/2000     1          120          1120       0.89286
      2/1/2000     2          119          1119       0.89366
      3/1/2000     3          118          1118       0.89445
      …            …          …            …          …
      1/1/2005     61         60           1060       0.94340
      …            …          …            …          …
      1/1/2009     109        12           1012       0.98814
      2/1/2009     110        11           1011       0.98912
      3/1/2009     111        10           1010       0.99010
      4/1/2009     112        9            1009       0.99108
      5/1/2009     113        8            1008       0.99206
      6/1/2009     114        7            1007       0.99305
      7/1/2009     115        6            1006       0.99404
      8/1/2009     116        5            1005       0.99502
      9/1/2009     117        4            1004       0.99602
      10/1/2009    118        3            1003       0.99701
      11/1/2009    119        2            1002       0.99800
      12/1/2009    120        1            1001       0.99900

      [Chart of the recip() values; y-axis from 0.880 to 1.000]
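The table values follow directly from the formula a / (mx + c). As a quick check, here is a sketch that reimplements the arithmetic in plain Python; this `recip` is a local stand-in for illustration, not a call into Solr, and the rord values are taken from the table rows.

```python
def recip(x: float, m: float, a: float, c: float) -> float:
    """Reciprocal function from the slide: a / (m*x + c)."""
    return a / (m * x + c)


# rord(creationDate) for the oldest, a middle, and the newest document.
for rord in (120, 60, 1):
    boost = recip(rord, m=1, a=1000, c=1000)
    print(rord, round(boost, 5))
```

Because a and c are both 1000, the boost stays in a narrow band just below 1.0: the newest document gets about 0.999 and the oldest about 0.893, so freshness nudges the ranking rather than dominating it.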
  • 20. Sifting through Solr’s “Explain” output
  • 21. DisMax Example for “solr” • INPUT: http://localhost:8983/solr/select?q=solr&debugQuery=true&qt=dismax • DEBUG OUTPUT (1 of 2):

      <str name="parsedquery">
        +DisjunctionMaxQuery((id:solr^10.0 | text:solr^0.5 | cat:solr^1.4 | manu:solr^1.1 | name:solr^1.2 | features:solr | sku:solr^1.5)~0.01)
        DisjunctionMaxQuery((manu_exact:solr^1.9 | features:solr^1.1 | text:solr^0.2 | manu:solr^1.4 | name:solr^1.5)~0.01)
        FunctionQuery((top(ord(popularity)))^0.5)
        FunctionQuery((1000.0/(1.0*float(top(rord(price)))+1000.0))^0.3)
      </str>
  • 22. DisMax explain output for a single word query

      <lst name="explain">
      <str name="SOLR1000">
      0.74609417 = (MATCH) sum of:
        0.4476144 = (MATCH) max plus 0.01 times others of:
          0.026233677 = (MATCH) weight(text:solr^0.5 in 13), product of:
            0.04119147 = queryWeight(text:solr^0.5), product of:
              0.5 = boost
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
              1.4142135 = tf(termFreq(text:solr)=2)
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.125 = fieldNorm(field=text, doc=13)
          0.17808011 = (MATCH) weight(name:solr^1.2 in 13), product of:
            0.09885953 = queryWeight(name:solr^1.2), product of:
              1.2 = boost
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            1.8013449 = (MATCH) fieldWeight(name:solr in 13), product of:
              1.0 = tf(termFreq(name:solr)=1)
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.5 = fieldNorm(field=name, doc=13)
          0.03710002 = (MATCH) weight(features:solr in 13), product of:
            0.08238294 = queryWeight(features:solr), product of:
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            0.45033622 = (MATCH) fieldWeight(features:solr in 13), product of:
              1.0 = tf(termFreq(features:solr)=1)
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.125 = fieldNorm(field=features, doc=13)
          0.44520026 = (MATCH) weight(sku:solr^1.5 in 13), product of:
            0.12357441 = queryWeight(sku:solr^1.5), product of:
              1.5 = boost
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            3.6026897 = (MATCH) fieldWeight(sku:solr in 13), product of:
              1.0 = tf(termFreq(sku:solr)=1)
              3.6026897 = idf(docFreq=1, numDocs=26)
              1.0 = fieldNorm(field=sku, doc=13)
        0.22311316 = (MATCH) max plus 0.01 times others of:
          0.040810023 = (MATCH) weight(features:solr^1.1 in 13), product of:
            0.09062123 = queryWeight(features:solr^1.1), product of:
              1.1 = boost
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            0.45033622 = (MATCH) fieldWeight(features:solr in 13), product of:
              1.0 = tf(termFreq(features:solr)=1)
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.125 = fieldNorm(field=features, doc=13)
          0.01049347 = (MATCH) weight(text:solr^0.2 in 13), product of:
            0.016476588 = queryWeight(text:solr^0.2), product of:
              0.2 = boost
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
              1.4142135 = tf(termFreq(text:solr)=2)
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.125 = fieldNorm(field=text, doc=13)
          0.22260013 = (MATCH) weight(name:solr^1.5 in 13), product of:
            0.12357441 = queryWeight(name:solr^1.5), product of:
              1.5 = boost
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.022867065 = queryNorm
            1.8013449 = (MATCH) fieldWeight(name:solr in 13), product of:
              1.0 = tf(termFreq(name:solr)=1)
              3.6026897 = idf(docFreq=1, numDocs=26)
              0.5 = fieldNorm(field=name, doc=13)
        0.06860119 = (MATCH) FunctionQuery(top(ord(popularity))), product of:
          6.0 = ord(popularity)=6
          0.5 = boost
          0.022867065 = queryNorm
        0.0067654043 = (MATCH) FunctionQuery(1000.0/(1.0*float(top(rord(price)))+1000.0)), product of:
          0.9861933 = 1000.0/(1.0*float(rord(price)=14)+1000.0)
          0.3 = boost
          0.022867065 = queryNorm
      </str>
      </lst>
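As a sanity check on how the pieces of this explain output combine, the following Python sketch recomputes the top-level score for document SOLR1000 from the numbers printed on the slide. The helper `dismax` and the variable names are illustrative assumptions, not Solr code; only the numeric values come from the explain output.

```python
QUERY_NORM = 0.022867065   # queryNorm from the explain output
TIE = 0.01                 # the "~0.01" tie-breaker in the parsed query


def dismax(weights, tie=TIE):
    """'max plus 0.01 times others': best clause + tie * sum of the rest."""
    best = max(weights)
    return best + tie * (sum(weights) - best)


# Per-field weights of the first DisjunctionMaxQuery (text, name, features, sku).
first_clause = dismax([0.026233677, 0.17808011, 0.03710002, 0.44520026])

# Per-field weights of the second DisjunctionMaxQuery (features, text, name).
second_clause = dismax([0.040810023, 0.01049347, 0.22260013])

# Function queries: value * boost * queryNorm.
popularity = 6.0 * 0.5 * QUERY_NORM                        # ord(popularity)=6, boost 0.5
price = (1000.0 / (1.0 * 14 + 1000.0)) * 0.3 * QUERY_NORM  # rord(price)=14, boost 0.3

total = first_clause + second_clause + popularity + price
print(f"{first_clause:.5f} + {second_clause:.5f} + {popularity:.5f} + {price:.5f} = {total:.5f}")
```

The four terms land on the 0.4476144, 0.22311316, 0.06860119 and 0.0067654043 lines of the output, and their sum agrees with the reported 0.74609417 up to float rounding.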
  • 23. “Explain” example (callouts highlight tf(termFreq(text:solr)=2) and idf(docFreq=1, numDocs=26)):

      ...
      0.026233677 = (MATCH) weight(text:solr^0.5 in 13), product of:
        0.04119147 = queryWeight(text:solr^0.5), product of:
          0.5 = boost
          3.6026897 = idf(docFreq=1, numDocs=26)
          0.022867065 = queryNorm
        0.6368716 = (MATCH) fieldWeight(text:solr in 13), product of:
          1.4142135 = tf(termFreq(text:solr)=2)
          3.6026897 = idf(docFreq=1, numDocs=26)
          0.125 = fieldNorm(field=text, doc=13)
      0.17808011 = (MATCH) weight(name:solr^1.2 in 13), product of:
        0.09885953 = queryWeight(name:solr^1.2), product of:
          1.2 = boost
          3.6026897 = idf(docFreq=1, numDocs=26)
          0.022867065 = queryNorm
        1.8013449 = (MATCH) fieldWeight(name:solr in 13), product of:
          1.0 = tf(termFreq(name:solr)=1)
          3.6026897 = idf(docFreq=1, numDocs=26)
          0.5 = fieldNorm(field=name, doc=13)
      0.03710002 = (MATCH) weight(features:solr in 13), product of:
        0.08238294 = queryWeight(features:solr), product of:
          3.6026897 = idf(docFreq=1, numDocs=26)
          0.022867065 = queryNorm
        0.45033622 = (MATCH) fieldWeight(features:solr in 13), product of:
          1.0 = tf(termFreq(features:solr)=1)
          3.6026897 = idf(docFreq=1, numDocs=26)
          0.125 = fieldNorm(field=features, doc=13)
      ...
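Each "product of:" line in the excerpt is plain multiplication, which the Python sketch below walks through for the text:solr^0.5 clause. The variable names are illustrative; the numbers are copied from the explain output, and the square-root relationship between termFreq and tf is inferred from the printed value 1.4142135 for termFreq=2.

```python
import math

# Values copied from the explain output for text:solr^0.5 in doc 13.
boost = 0.5
idf = 3.6026897
query_norm = 0.022867065
term_freq = 2
field_norm = 0.125

tf = math.sqrt(term_freq)                  # 1.4142135 = tf(termFreq(text:solr)=2)

query_weight = boost * idf * query_norm    # query side: how much this clause matters
field_weight = tf * idf * field_norm       # document side: how well doc 13 matches
clause_weight = query_weight * field_weight

print(f"queryWeight={query_weight:.8f}")
print(f"fieldWeight={field_weight:.7f}")
print(f"weight={clause_weight:.9f}")
```

Note that idf enters twice, once on the query side and once on the document side, so rare terms are rewarded quadratically in this scoring model.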
  • 24. Solr’s XSLT “debugger” http://localhost:8983/solr/select?q=solr&debugQuery=true&wt=xslt&tr=example.xsl&fl=*,score&qt=dismax
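When the parameter list gets long, it can be easier to assemble the URL in a script than to edit it by hand in the browser. A minimal Python sketch, assuming the same local example server as the slide; the helper name `solr_select_url` is hypothetical, and only the standard library's `urllib.parse.urlencode` is used.

```python
from urllib.parse import urlencode


def solr_select_url(base, params):
    """Build a /select URL from a dict of request parameters.

    '*' and ',' are left unescaped so that fl=*,score stays readable.
    """
    return f"{base}/select?{urlencode(params, safe='*,')}"


# Parameters from the slide, in the same order.
params = {
    "q": "solr",
    "debugQuery": "true",   # include parsed query and explain output
    "wt": "xslt",           # render the response through an XSLT stylesheet
    "tr": "example.xsl",    # the stylesheet to apply
    "fl": "*,score",        # all stored fields plus the relevancy score
    "qt": "dismax",
}

url = solr_select_url("http://localhost:8983/solr", params)
print(url)
```

Keeping the parameters in a dict also makes it simple to toggle one at a time, for example dropping wt and tr to get the raw XML back.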
  • 25. Another way to view Explain data • Solr 1.4 has Solritas • Various features, including a toggle for the explain display • “Some assembly required…” http://www.lucidimagination.com/blog/2009/11/04/solritas-solr-1-4s-hidden-gem/
  • 26. Checking your Index and IDF
  • 27. Checking what got Indexed • Bad Index = Bad Search • Check upper/lower case and punctuation • Bad fields / metadata = bad facets, filters, sorting • Use the built-in Schema Browser: • Check each field • Common words = low IDF (“Inverse Document Frequency”)
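To see why common words end up with a low IDF, here is a minimal Python sketch. It assumes the classic Lucene formula idf = 1 + ln(numDocs / (docFreq + 1)); the function name `classic_idf` and the document counts are illustrative, and a given Solr version or custom Similarity may compute IDF differently.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Assumed classic formula: 1 + ln(num_docs / (doc_freq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


NUM_DOCS = 27  # hypothetical index size

# docFreq = number of documents that contain the term.
for doc_freq in (1, 5, 13, 26):
    print(f"docFreq={doc_freq:>2}  idf={classic_idf(doc_freq, NUM_DOCS):.4f}")
```

A term found in a single document scores far higher than one found in nearly every document. Under this formula, docFreq=1 with 27 documents gives about 3.6027, which is close to the idf value in the explain output earlier in these slides; that may mean the index counted 27 document slots, but this is a guess rather than something the slides state.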
  • 28. Check IDF with the Schema Browser • Start at the Admin screen: http://localhost:8983/solr/admin • Open the Schema Browser • Select a field • Change # to see more
  • 29. About NIE (New Idea Engineering)
  • 30. NIE Resources • Newsletter & Whitepapers: www.ideaeng.com/current • Search Dev Newsgroup: www.SearchDev.org • Blogs: EnterpriseSearchBlog.com, SearchComponentsOnline.com
  • 31. Finish Line / Q & A • Review & Questions • Mark Bennett, mbennett@ideaeng.com • main 408-446-3460 • cell 408-829-6513
  • 32. Q&A • These slides and a recorded presentation are available at bit.ly/SolrRelevancy