Presented at DrupalCamp Atlanta by Brad Blake, Senior Developer at Phase2 Technology. Brad walks you through using ApacheSolr in this technically detailed presentation.
2. The Good
➡ Ships with Drupal
➡ AND/OR, exact phrase matching
➡ Many extension modules, including facets, word stemming
➡ Indexes default display, but can configure search_index display
3. The Bad
➡ Memory/CPU intensive for searching and indexing
➡ Doesn’t scale well
➡ Dead End ( advanced search, but not intuitive )
➡ Exact keyword matching
4. What is SOLR?
➡ Open-Source Search Platform
➡ Apache Lucene search library
➡ Standalone Search Server
➡ Runs within servlet container ( Tomcat )
➡ XML/JSON APIs
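
Because SOLR is a standalone server with an HTTP API, any client can query it by building a URL. The sketch below is illustrative only: the host, port, and core name (`drupal`) are assumptions, not values from the talk. The `q`, `rows`, and `wt` parameters belong to Solr's standard `/select` handler.

```python
from urllib.parse import urlencode

# Assumed for illustration: a local Solr instance with a core named "drupal".
SOLR_BASE = "http://localhost:8983/solr/drupal"


def build_select_url(keywords, rows=10, response_format="json"):
    """Return a URL for Solr's /select handler.

    q    - the search keywords
    rows - how many results to return
    wt   - response writer ("json" or "xml")
    """
    params = {"q": keywords, "rows": rows, "wt": response_format}
    return f"{SOLR_BASE}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("drupal search"))
```

Switching `response_format` to `"xml"` requests the same results in Solr's XML format.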
5. Why Use SOLR?
➡ Performance
➡ Optimized for search
➡ Distributed Search / Index Replication
➡ Reduces load on Drupal site
➡ Faster = Better
6. Facets
➡ Way to filter search results by categorized information
➡ Open-Ended search
➡ More clicks, more relevant
➡ Blocks: Context
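
At the Solr level, facets come down to a few request parameters plus a list of counts in the response. The following is a minimal sketch, not the apachesolr module's code: the field name `bundle` and the helper names are hypothetical. `facet`, `facet.field`, `facet.mincount`, and `fq` are standard Solr parameters, and by default Solr's JSON response returns each facet field as a flat `[value, count, value, count, ...]` list.

```python
from urllib.parse import urlencode


def facet_params(keywords, facet_fields, active_filters=()):
    """Build a query string that asks Solr for facet counts.

    facet_fields   - fields to count values for (hypothetical names)
    active_filters - (field, value) pairs the user has already clicked
    """
    params = [("q", keywords), ("facet", "true"), ("facet.mincount", "1")]
    params += [("facet.field", field) for field in facet_fields]
    # Each clicked facet narrows the results through a filter query (fq).
    params += [("fq", f'{field}:"{value}"') for field, value in active_filters]
    return urlencode(params)


def pair_facet_counts(flat):
    """Turn Solr's flat [value, count, ...] list into a {value: count} dict."""
    return dict(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    print(facet_params("solr", ["bundle"], [("bundle", "article")]))
    print(pair_facet_counts(["article", 12, "page", 3]))
```

Each click adds one more `fq`, which is why more clicks yield more relevant results.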
7. Why Use SOLR?
➡ Document Handling ( PDF / Word, etc)
➡ Better Search Algorithms
➡ Full-text search
➡ Word Stemming, Splitting, etc
➡ Highly Configurable
22. Theme
➡ Default search-result.tpl.php
➡ $result contains a lot of information
➡ Contains data valid at the time of indexing
➡ Not all fields are present
23. Theme
➡ Can add fields to result object through query_alter
➡ Find names at /admin/reports/apachesolr
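
In Solr itself, the stored fields returned with each result are chosen by the `fl` (field list) request parameter, so adding fields to the result object presumably amounts to extending that list before the query is sent. The sketch below shows only that parameter-level effect. It is not the Drupal hook, and the function name and the field name `ss_example_field` are hypothetical.

```python
def add_result_fields(params, extra_fields):
    """Return a copy of the query params with extra fields appended to `fl`.

    params       - dict of Solr request parameters
    extra_fields - index field names to return with each result
    """
    current = [name for name in params.get("fl", "").split(",") if name]
    for field in extra_fields:
        if field not in current:  # avoid listing a field twice
            current.append(field)
    updated = dict(params)  # leave the caller's dict untouched
    updated["fl"] = ",".join(current)
    return updated


if __name__ == "__main__":
    query = {"q": "solr", "fl": "id,title"}
    print(add_result_fields(query, ["ss_example_field"]))
```

Only fields that are stored in the index can be returned this way, which is why the names should be looked up in the report page first.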
25. Attachments
➡ Install Apache Tika locally or in SOLR
➡ http://drupal.org/project/apachesolr_attachments
➡ Indexes attachments and extracts text if possible.
➡ Read the README
26. Attachments
➡ Creates File attachment ‘node’ in the index.
➡ Can theme these to link back to parent node.
➡ hook_apachesolr_attachment_index_alter($document, $node, $file) {}
27. Search API
➡ http://drupal.org/project/search_api
➡ Search framework to search on any entity, with any backend
28. The Good
➡ Abstraction of search engine layer ( DB / Mongo / SOLR, etc. )
➡ Views integration
➡ Faceted search
➡ Ability for multiple backends
➡ Set up additional search pages
29. The Good
➡ Many extension modules ( AJAX, Autocomplete, Attachments, etc. )
➡ Control over which fields are indexed ( Index level, not Content Type level )
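
The two ideas above, a swappable backend and a per-index field list, can be pictured with a small sketch. This is not Search API's actual class structure. Every name here (`SearchBackend`, `InMemoryBackend`, `SearchIndex`) is hypothetical, and the in-memory backend stands in for a real DB, Mongo, or SOLR service.

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Anything that can store and search items (DB, Mongo, SOLR, ...)."""

    @abstractmethod
    def index(self, item_id, fields):
        """Store the given fields for one item."""

    @abstractmethod
    def search(self, keywords):
        """Return the ids of items matching all keywords."""


class InMemoryBackend(SearchBackend):
    """Toy backend: substring match over whatever fields were indexed."""

    def __init__(self):
        self._items = {}

    def index(self, item_id, fields):
        self._items[item_id] = fields

    def search(self, keywords):
        words = keywords.lower().split()
        return sorted(
            item_id
            for item_id, fields in self._items.items()
            if all(any(w in str(v).lower() for v in fields.values()) for w in words)
        )


class SearchIndex:
    """An index decides which fields are sent to its backend."""

    def __init__(self, backend, indexed_fields):
        self.backend = backend
        self.indexed_fields = set(indexed_fields)

    def add(self, item_id, entity):
        # Only the fields configured on this index reach the backend.
        kept = {k: v for k, v in entity.items() if k in self.indexed_fields}
        self.backend.index(item_id, kept)

    def search(self, keywords):
        return self.backend.search(keywords)


if __name__ == "__main__":
    idx = SearchIndex(InMemoryBackend(), ["title"])
    idx.add(1, {"title": "Solr basics", "body": "facets"})
    print(idx.search("solr"), idx.search("facets"))
```

Swapping `InMemoryBackend` for another `SearchBackend` subclass leaves `SearchIndex` unchanged, which is the point of the abstraction.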
30. The Bad
➡ Not compatible with all versions of SOLR
➡ Sorts module not as good as apachesolr module’s
➡ Performance depends on backend used
➡ Field weights not as granular
➡ Takes getting used to