Open Source Search for the Enterprise
Charlie Hull
Managing Director, Flax
3rd November 2010
OVUM Briefing, Search Across the Enterprise
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch
Who are Flax?
Search engine specialists with decades of experience
Developers, innovators and strategists
Based in Cambridge, UK
Technology agnostic – but open source exponents
Recently selected as UK Authorized Partner by Lucid Imagination
Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge
Recently asked to present at British Computer Society and Lucene Revolution conferences
What is open source?
“Open-source software (OSS) is computer
software that is available in source code form
for which the source code and certain other
rights normally reserved for copyright holders
are provided under a software license that
permits users to study, change, and improve
the software. […] Some open source software is
available within the public domain” (Wikipedia)
Myths about open source
It's the work of amateur developers
If I use open source, I have to open up my software/servers/network to all and sundry
Open source software isn't reliable or scalable
It's free
It's unsupported
Open source search software
Xapian
- The successor to Muscat
- Bayesian probabilistic ranking
- C/C++ with language bindings
- Highly accurate & scalable
Apache Lucene and Solr
- Flexible licensing
- Vector space model
- Java and other languages
- Well known and supported
And more...
Apache Lucene and Solr are trademarks of The Apache Software Foundation
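To make the "vector space model" bullet concrete, here is a minimal Python sketch of TF-IDF weighting with cosine similarity. It is an illustration only, not the scoring formula of any engine named on this slide: real engines add further factors, and the function names and the whitespace tokenisation are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document, plus the IDF table."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    # Smoothed IDF (an assumption of this sketch) so every known term keeps a positive weight.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenised]
    return vectors, idf


def cosine(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tf_idf_vectors(docs)
    # Query terms that never occur in the collection are ignored.
    q_vec = {t: tf * idf[t] for t, tf in Counter(query.lower().split()).items() if t in idf}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling `rank("press cuttings", docs)` on a small list of strings returns the documents ordered by how closely their term vectors point in the same direction as the query vector. A probabilistic ranker would instead score each document by an estimate of its likelihood of being relevant.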
Some examples
http://www.nla-clipshare.com
Newspaper Licensing Agency – NLA Clipshare
20 million newspaper stories
6500 users
Content from every major newspaper (and
most regionals)
Used by journalists, clippings agencies,
media monitors
Replacing internal systems at major
newspapers
One of very few ways to search content
from all the papers within hours of
publication
Some examples
Financial Times – press cuttings
Web Service for easy integration
XML source data
Faceted search
Area filters (whole article, body, headline,
byline or any combination)
Synonyms, spelling suggestions
Built from scratch in a fortnight
Designed as a prototype, scaled to
production use without significant change
http://presscuttings.ft.com
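As a rough illustration of the "area filters" and "faceted search" bullets, the Python sketch below restricts a term match to chosen areas of an article and then counts the results per facet value. It is not the Financial Times implementation: the sample articles, the field names and the plain substring matching are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical sample data; real articles would come from the XML source.
ARTICLES = [
    {"headline": "Bank raises rates", "byline": "A. Smith",
     "body": "The central bank moved today.", "section": "Markets"},
    {"headline": "Tech firms rally", "byline": "B. Jones",
     "body": "Shares rose after the bank decision.", "section": "Companies"},
    {"headline": "Rates outlook", "byline": "A. Smith",
     "body": "Analysts expect a pause.", "section": "Markets"},
]


def search(articles, term, areas=("headline", "byline", "body")):
    """Return articles where the term occurs in any of the selected areas."""
    term = term.lower()
    return [a for a in articles if any(term in a[area].lower() for area in areas)]


def facet_counts(results, field):
    """Count how many results fall under each value of the given field."""
    return Counter(a[field] for a in results)
```

Searching the whole article for "bank" and then calling `facet_counts(results, "section")` gives the per-section counts a user would click on to narrow the results, while passing `areas=("headline",)` limits the match to headlines only.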
Some examples
Durrants Ltd. Media monitoring platform
Thousands of client search profiles
Hundreds of thousands of articles per day
Complex publication hierarchy
Established pipeline
Solution
Flexible query language allows OCR
errors, punctuation, fuzzy matching,
weighting
Supports features of previous engine
Scalable master-slave architecture
Accuracy improved in some cases from 95%
rejected to 95% accepted
Hardware budget 15% of previous system
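The idea of a query that tolerates OCR errors and punctuation can be sketched in a few lines of Python. This is a stand-in technique, not the query language used in the Durrants system: it uses the standard library's `difflib.SequenceMatcher` similarity ratio, and the function name and the 0.8 threshold are assumptions chosen for this example.

```python
from difflib import SequenceMatcher


def fuzzy_contains(text, term, threshold=0.8):
    """Return True if any word in the text is similar enough to the term."""
    term = term.lower()
    for word in text.lower().split():
        # Ignore punctuation attached to the word by the OCR or the typesetting.
        word = word.strip(".,;:!?\"'()")
        if SequenceMatcher(None, word, term).ratio() >= threshold:
            return True
    return False
```

With this sketch a profile term such as "cambridge" still matches text where OCR has read "m" as "rn". Lowering the threshold accepts more damaged words at the price of more false matches, which is the trade-off a production engine has to tune.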
Some examples
(Unnamed multinational radio suppliers)
Intranet search
12 million documents
Multiple formats – Office, PDF, HTML...
User and group-based security (LDAP)
Faceted search
Users can 'tag' interesting documents – for
example to identify a 'reference' version
Open source chosen because of significant
cost advantage – commercial solutions
uneconomic at this scale
A look at Lucene & Solr
Among the top 15 open source projects
Installations at over 4,000 companies
Downloads have grown nearly 10x over the past three
years
Over 7,000 downloads a day.
Lucid Imagination:
USA based
Employs 9 out of 15 top Lucene committers
Offers training, consulting and up to 24x7
support
Developing value-add software
Flax are UK partners & resellers
Lucid Works Enterprise
Who are Lucid working with?
Some Lucene & Solr numbers
LinkedIn – 30 million users
Internet Archive – a billion indexed pages
Salesforce.com – 8 terabytes of searchable data
Twitter – a billion queries a day
Why open source search?
Flexible, extendable
Powerful & scalable
Lower cost, especially when planning for growth
Commercial support available as necessary
- Freedom to innovate
Looking to the future
More and more content including social media
Multiple delivery platforms
Search-powered applications
Cloud computing
More use of entity extraction & sentiment analysis
Search no longer a bolt-on, but a
platform for innovation
Open source no longer an outsider,
but the obvious choice
Thank you!
Any questions?
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch

 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 

Flax ovum search-across_the_enterprise

  • 1. Open Source Search for the Enterprise Charlie Hull Managing Director, Flax 3rd November 2010 OVUM Briefing, Search Across the Enterprise charlie@flax.co.uk www.flax.co.uk/blog +44 (0) 8700 118334 Twitter: @FlaxSearch
  • 2. Search engine specialists with decades of experience Developers, innovators and strategists Based in Cambridge, UK Technology agnostic – but open source exponents Recently selected as UK Authorized Partner by Lucid Imagination Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge Recently asked to present at British Computer Society and Lucene Revolution conferences Who are Flax?
  • 3. “Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia) What is open source?
  • 5. It's the work of amateur developers Myths about open source
  • 6. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Myths about open source
  • 7. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Open source software isn't reliable or scalable Myths about open source
  • 8. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Open source software isn't reliable or scalable It's free Myths about open source
  • 9. It's the work of amateur developers If I use open source, I have to open up my software/servers/network to all and sundry Open source software isn't reliable or scalable It's free It's unsupported Myths about open source
  • 10. Open source search software - Flexible licensing - Vector space model - Java and other languages - Well known and supported. Apache Lucene and Solr are trademarks of The Apache Software Foundation
  • 11. Open source search software - The successor to Muscat - Bayesian probabilistic ranking - C/C++ with language bindings - Highly accurate & scalable - Flexible licensing - Vector space model - Java and other languages - Well known and supported. Apache Lucene and Solr are trademarks of The Apache Software Foundation
  • 12. Open source search software - The successor to Muscat - Bayesian probabilistic ranking - C/C++ with language bindings - Highly accurate & scalable - Flexible licensing - Vector space model - Java and other languages - Well known and supported. And more... Apache Lucene and Solr are trademarks of The Apache Software Foundation
  • 13. Some examples http://www.nla-clipshare.com Newspaper Licensing Agency – NLA Clipshare 20 million newspaper stories 6500 users Content from every major newspaper (and most regionals) Used by journalists, clippings agencies, media monitors Replacing internal systems at major newspapers
  • 14. Some examples http://www.nla-clipshare.com Newspaper Licensing Agency – NLA Clipshare 20 million newspaper stories 6500 users Content from every major newspaper (and most regionals) Used by journalists, clippings agencies, media monitors Replacing internal systems at major newspapers One of very few ways to search content from all the papers within hours of publication
  • 18. Some examples Financial Times – press cuttings Web Service for easy integration XML source data Faceted search Area filters (whole article, body, headline, byline or any combination) Synonyms, spelling suggestions http://presscuttings.ft.com
  • 19. Some examples Financial Times – press cuttings Web Service for easy integration XML source data Faceted search Area filters (whole article, body, headline, byline or any combination) Synonyms, spelling suggestions Built from scratch in a fortnight Designed as a prototype, scaled to production use without significant change http://presscuttings.ft.com
  • 21. Some examples Durrants Ltd. Media monitoring platform Thousands of client search profiles Hundreds of thousands of articles per day Complex publication hierarchy Established pipeline Solution Flexible query language allows OCR errors, punctuation, fuzzy matching, weighting Supports features of previous engine Scalable master-slave architecture
  • 22. Some examples Durrants Ltd. Media monitoring platform Thousands of client search profiles Hundreds of thousands of articles per day Complex publication hierarchy Established pipeline Solution Flexible query language allows OCR errors, punctuation, fuzzy matching, weighting Supports features of previous engine Scalable master-slave architecture Accuracy improved in some cases from 95% rejected to 95% accepted Hardware budget 15% of previous system
  • 23. Some examples (Unnamed multinational radio suppliers) Intranet search 12 million documents Multiple formats – Office, PDF, HTML... User and group-based security (LDAP) Faceted search Users can 'tag' interesting documents – for example to identify a 'reference' version
  • 24. Some examples (Unnamed multinational radio suppliers) Intranet search 12 million documents Multiple formats – Office, PDF, HTML... User and group-based security (LDAP) Faceted search Users can 'tag' interesting documents – for example to identify a 'reference' version Open source chosen because of significant cost advantage – commercial solutions uneconomic at this scale
  • 25. A look at Lucene & Solr Among the top 15 open source projects Installations at over 4,000 companies Downloads have grown nearly 10x over the past three years Over 7,000 downloads a day.
  • 26. A look at Lucene & Solr Among the top 15 open source projects Installations at over 4,000 companies Downloads have grown nearly 10x over the past three years Over 7,000 downloads a day. USA based Employs 9 out of 15 top Lucene committers Offers training, consulting and up to 24x7 support Developing value-add software
  • 27. A look at Lucene & Solr Among the top 15 open source projects Installations at over 4,000 companies Downloads have grown nearly 10x over the past three years Over 7,000 downloads a day. USA based Employs 9 out of 15 top Lucene committers Offers training, consulting and up to 24x7 support Developing value-add software Flax are UK partners & resellers
  • 29. Who are Lucid working with?
  • 30. Some Lucene & Solr numbers LinkedIn – 30 million users Internet Archive – a billion indexed pages Salesforce.com – 8 terabytes of searchable data Twitter – a billion queries a day
  • 31. Why open source search? Flexible, extendable
  • 32. Why open source search? Flexible, extendable Powerful & scalable
  • 33. Why open source search? Flexible, extendable Powerful & scalable Lower cost, especially when planning for growth
  • 34. Why open source search? Flexible, extendable Powerful & scalable Lower cost, especially when planning for growth Commercial support available as necessary
  • 35. Why open source search? Flexible, extendable Powerful & scalable Lower cost, especially when planning for growth Commercial support available as necessary - Freedom to innovate
  • 36. Looking to the future
  • 37. Looking to the future More and more content including social media
  • 38. Looking to the future More and more content including social media Multiple delivery platforms
  • 39. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications
  • 40. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing
  • 41. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing More use of entity extraction & sentiment analysis
  • 42. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing More use of entity extraction & sentiment analysis Search no longer a bolt-on, but a platform for innovation
  • 43. Looking to the future More and more content including social media Multiple delivery platforms Search-powered applications Cloud computing More use of entity extraction & sentiment analysis Search no longer a bolt-on, but a platform for innovation Open source no longer an outsider, but the obvious choice
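
The Financial Times example (slides 18 and 19) lists faceted search and area filters (whole article, body, headline, byline or any combination). As a rough illustration of what those two features mean, here is a minimal Python sketch. It is not how the Flax system, Lucene or Solr implement them: the sample articles, the field names and the `search` and `facet_counts` helpers are all hypothetical, and matching is a plain case-insensitive substring test rather than a real index lookup.

```python
from collections import Counter

# Hypothetical sample data; the field names are assumptions for illustration only.
ARTICLES = [
    {"headline": "Bank raises rates", "byline": "A. Smith",
     "body": "The central bank raised interest rates.", "section": "Markets"},
    {"headline": "Tech shares rally", "byline": "B. Jones",
     "body": "Shares rose after the bank announcement.", "section": "Companies"},
    {"headline": "Rates and housing", "byline": "A. Smith",
     "body": "Mortgage costs follow interest rates.", "section": "Markets"},
]

ALL_AREAS = ("headline", "byline", "body")


def search(articles, term, areas=ALL_AREAS):
    """Return articles where `term` occurs in any of the chosen areas.

    This is a naive case-insensitive substring match, not a ranked
    full-text query.
    """
    term = term.lower()
    return [a for a in articles
            if any(term in a[area].lower() for area in areas)]


def facet_counts(results, field):
    """Count how many results carry each value of `field` (one facet)."""
    return Counter(a[field] for a in results)


if __name__ == "__main__":
    # Area filter: restrict the match to headlines only.
    headline_hits = search(ARTICLES, "bank", areas=("headline",))
    print(len(headline_hits))

    # Whole-article search, then facet the results by section.
    hits = search(ARTICLES, "rates")
    print(facet_counts(hits, "section"))
```

Narrowing `areas` restricts where a term may match, while `facet_counts` summarises a result set by one field so that a user could drill down on it. A production engine would compute both from its index rather than by scanning every document.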