Apache Solr Changes the Way You Build Sites
1. APACHE SOLR CHANGES THE
WAY YOU BUILD SITES
How to build dynamic navigation for dynamic content
Jacob Singh and Peter Wolanin
Drupalcon Paris, September 3rd 2009
7. INFORMATION ARCHITECTURE
• Is the science and art of guessing
what your users want to see or do
on your site and helping them get
there
•Often done without actually
consulting visitors or proper
understanding of the target market
63. A TAILOR MADE NAVIGATION
FOR EVERY USER
You want it, right?
64. The Apache Solr Project
• Stable and proven.
– Used by Netflix, CNET, CitySearch, StubHub!, GameSpot, AOL
– Full time maintainers
– VERY Active mailing list (about 1k messages per month)
• Fast: written in Java.
• Uses Lucene: the top open source search
library.
• Distributed: scales out in multiple directions.
65. Apache Solr Search Integration
• Very active project on drupal.org.
• Takes advantage of latest Solr features.
• Exposes an API to modify search and display
behavior.
• Supported by engineers at Acquia.
• All Acquia code improvements have been
contributed back to the Drupal.org project.
• Many of the Drupalcon sponsors and attendees
are already involved and using it.
66. Feature highlights
• Taxonomy, user, and language facets.
• Node type faceting, weighting, and exclusion.
• Node property (e.g. sticky) and date weighting.
• Date facets on content creation or change.
• OG facets (optional sub-module).
• Node access respected (optional sub-module).
• More-like-this content recommendations.
• Customizable (see drupal.org module browsing
features).
68. You Can Run It Yourself. Easy!
1. Get a dedicated server or a VPS and get Solr loaded on it.
2. Find a Java server administrator or get some books.
3. Get the Drupal module, install the PHP library, and configure it.
4. Replace the stock Solr configuration files with Drupal ones.
5. Learn about Solr replication and configuring it.
6. Set up log management, alerting, monitoring, etc.
7. Implement regular upgrades or patches to Solr, which will require getting your Java
development environment set up and sometimes building from source.
8. Keep up to date with the Drupal module.
9. Implement a security regime to protect data transfer (i.e. so spammers can’t add Viagra ads
to your search results)
10. Harden your servers, set up firewalls and IP-based, password-based, or other security.
11. Figure out how to handle updates and versioning of Solr and your schema.
12. *Recommended: Get on the solr-user and solr-developer mailing lists to get updates and
alerts on the Apache Solr project. Don’t worry, it’s only 50 or so mails a day if you don’t
count the commit messages.
69. Or... use Acquia Search
• Sign up on acquia.com.
• Free 30 day trial subscriptions for anyone.
• You must be running a Drupal 6.x site, with PHP
5.2.0+ (5.1.4+ possible as well).
• Use Acquia Drupal or install our search module
package.
• It leaves Drupal core search intact, so you can
go back anytime.
• Convert your site and start impressing users!
• We will worry about everything else.
70. How Acquia Search Works
[Diagram labels: Your webserver; Search master server; Search slave servers; Acquia Network. Flows: authenticated request with content to index; index replication; authenticated search request; results. Security: SSL, HMAC.]
71. Proving the platform
• Benchmarking our servers, on the search server
itself, most searches run in < 200 ms, even
under high load.
72. Who Is This For?
• Small and medium size sites - easy access to
enterprise search for every Drupal site.
– No hardware, no experience, fast setup, low cost.
• Large sites and Acquia partners - the same
solution you’d deploy, but faster and easier.
– Don’t consume your engineering resources.
– Why load your own servers?
– We handle the security and availability.
– Impress your users and clients.
I’m going to talk about Apache Solr, a revolutionary search technology
Which provides relevant and fast search results
that can be filtered
and sorted
and provides brilliant content recommendations
Changing the way you think about
Classic Information Architecture and how you structure your menus and site navigation
I’ve built a few websites
Usually, we start with something called IA
What is IA?
* Start with who your visitors are and what they want to do
* Take a look at the inventory of content the site provides
* Group that content into categories which make sense. This is often called card sorting, mapping, etc.
* Name the groups, and they become menus and highlights (navigation)
What’s wrong with these conventions? Our forefathers have used them since time immemorial (1996)
The web is a lot more complicated now.
Content comes from a lot more sources (other sites and users)
And websites do more things
Also...
It isn’t 1996 anymore
In a short time your content may start looking like... {flip}
And your menus become more like {flip}
And now your site is hard to organize and impossible to use.
You made up archetypes, but your real visitor base is more varied and unique than that.
Did you think of the user who wanted handmade paper?
Which category do you think handmade paper is in?
It literally took me 7 clicks to find Drupal on Dmoz
No one except for someone desperately trying to prove a point during a presentation will spend this long
This is the most important
And the main reason I’m speaking to you today.
To deal with this paradigm shift, this generation of the internet has a few new devices / patterns to address the bloat of content. They all seek to handle the issue of unpredictable content and unknown users.
Search is Web 0.5
Remember, Yahoo made millions with lists of websites to visit
Google made billions letting people find the websites they wanted
Most people building websites just think of search as a checkbox on their requirements list
I’ve taken a totally biased sampling from Dries’s blog of large and newish Drupal sites
I took a sample of some of the top sites on the web.
Search was largely abandoned by site owners because the quality of results was not good enough.
It was too slow.
Users learned not to trust it, and in turn, site builders learned not to prioritize it.
If the same word doesn’t mean the same thing, your intentions, delivery and content are worthless.
When a user wants something from your website, they are looking for a keyword.
You, as a site builder, tried to think of them and make links
Users leave only when the content they want doesn’t exist. Never before
In this case, the user chose to search for Drupal Work, not Drupal Jobs...
The disadvantage of most search engines is that this doesn’t work. The user is just presented with a deluge of information they are not looking for and no obvious way to refine the set.
The advantage of a good search engine is that the user can use whatever vocabulary they want and will find what they are looking for.
What you see here is called “Faceted Search.” Because Solr is aware not just of the text of your nodes, but all of their metadata, it can provide a much richer way to filter down to just what you are looking for.
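To make the idea concrete, here is a minimal sketch of what a faceted request to Solr could look like. The parameters `q`, `fq`, `facet`, `facet.field` and `facet.mincount` are standard Solr request parameters; the helper name `build_facet_query` and the field names `type` and `language` are assumptions chosen for illustration, not the fields the Drupal module necessarily uses.

```python
from urllib.parse import urlencode, parse_qs


def build_facet_query(keywords, facet_fields, filters=None):
    """Build the query string for a Solr select request with facet counts."""
    params = [("q", keywords), ("facet", "true"), ("facet.mincount", "1")]
    # One facet.field parameter per metadata field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    # Each fq restricts the result set to documents matching that field value.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)


# Hypothetical field names: the user searched "drupal work" and clicked the "job" facet.
query = build_facet_query("drupal work", ["type", "language"], {"type": "job"})
print(query)
```

The user keeps their own vocabulary in `q`, while each click on a facet link simply adds another `fq` filter to the same request.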
Newer is better in this case. No one wants taken jobs.
Solr allows you to sort
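As a rough sketch of "newer is better", sorting is expressed with Solr's standard `sort` request parameter. The helper name `build_sorted_query` and the field name `created` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_sorted_query(keywords, sort_field, descending=True):
    """Build a Solr query string that orders results by one field."""
    direction = "desc" if descending else "asc"
    # Solr expects "<field> <direction>" as the value of the sort parameter.
    return urlencode({"q": keywords, "sort": f"{sort_field} {direction}"})


# Hypothetical field name: newest job postings first.
newest_first = build_sorted_query("drupal jobs", "created")
print(newest_first)
```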
Just as vocabulary is important, spelling is too.
Solr spellcheck does not come from a dictionary
It uses the actual content in your index
If you have a funny name for your product, spellcheck can still suggest it
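The idea of suggesting corrections from indexed content rather than a dictionary can be illustrated with a small sketch. This is not Solr's actual spellcheck algorithm; it uses Python's standard `difflib` as a stand-in, and the function names and sample documents are assumptions.

```python
import difflib
import re


def build_vocabulary(documents):
    """Collect every distinct word that appears in the indexed documents."""
    words = set()
    for text in documents:
        words.update(re.findall(r"[a-z0-9]+", text.lower()))
    return words


def suggest(term, vocabulary, cutoff=0.75):
    """Return the closest indexed word, or None if the term is already indexed."""
    term = term.lower()
    if term in vocabulary:
        return None
    matches = difflib.get_close_matches(term, sorted(vocabulary), n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical indexed content; "acquia" would not be found in an English dictionary.
indexed = ["Acquia Search is hosted Solr for Drupal", "Drupal jobs in Paris"]
vocab = build_vocabulary(indexed)
print(suggest("aquia", vocab))
```

Because the vocabulary is built from the index, an unusual product name becomes a valid suggestion as soon as it appears in your content.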
Just the tip of the iceberg in terms of customization
Buytaert.net
reduce width of browser + increase font
Let’s stop trying to think for our users
Let’s give them tools that allow them to think the way they want
AND find what they are looking for.
Now, I’m going to hand over the floor to Peter Wolanin who has been the driving force behind recent development of the Apache Solr module. He and I are Acquia’s experts on the Solr server.
He’ll be speaking to you about the Solr project in a little more detail, the communities involved and show you some of the really amazing features we’ve got planned.
All the Drupal module code is on drupal.org and available to everyone.
Solr is an Apache Foundation project, available free under the Apache 2.0 license.
Yes, it’s doable, but using Acquia’s hosted service allows:
1. Small to medium sites to get rolling in 15 minutes with no special knowhow or hardware
2. Large sites to not worry about scaling or securing yet another service and the opportunity cost that comes with it.
The Acquia Search module configures the Apache Solr Drupal module and handles authentication.
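The "How Acquia Search Works" slide mentions HMAC-authenticated requests. The general technique can be sketched as follows. This is not Acquia's actual scheme: the message layout, the timestamp window, and the names `sign_request` and `verify_request` are all assumptions made for illustration.

```python
import hashlib
import hmac
import time


def sign_request(secret_key, path, body, timestamp):
    """Compute an HMAC-SHA256 signature over the request details."""
    # Assumed message layout: timestamp, path and body joined by newlines.
    message = f"{timestamp}\n{path}\n{body}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key, path, body, timestamp, signature, max_age=300, now=None):
    """Check the signature and reject requests older than max_age seconds."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > max_age:
        return False
    expected = sign_request(secret_key, path, body, timestamp)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)


# Hypothetical shared key and request.
key = b"shared-secret"
sig = sign_request(key, "/select", "q=drupal", 1000)
print(verify_request(key, "/select", "q=drupal", 1000, sig, now=1010))
```

Only a party holding the shared key can produce a valid signature, so a request whose content was altered in transit fails verification.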
If you have an Acquia Subscription, you will have search.
The Beta, which starts today, is totally free. Also, ApacheSolr does not stop core search, which means you can fall back to standard Drupal search any time. No Risk!
Setup takes about 5-10 minutes. What are you waiting for!?
Put pictures of us up. Bring your laptop. Right Now!
Learn more about the admin interface and the boosting features