Elasticsearch is a powerful open source search engine built on
Apache Lucene. It can index millions of records and search and
analyze them in real time. Elasticsearch tools are already used by leading
companies such as FourSquare, GitHub, OpenDataSoft, and Dailymotion.
Alter Way and Elasticsearch invite you to discover the Elasticsearch suite,
now available in version 1.0 and ready for production!
6. Elasticsearch in 1 slide
• More than 6 million downloads
• 450,000 New Downloads per Month
• 1000s of Mission Critical Implementations
• Top Investors: Benchmark Capital, Index
Ventures
• Seasoned Executive Team
– Founded by Creator of Elasticsearch
– Seasoned Executives from SpringSource
8. Big Data in Today's Business and Technology
Environment: some significant figures
• 2.7 Zettabytes of data exist in the digital universe today. (1 Zettabyte = 1 billion Terabytes)
• 235 Terabytes of data had been collected by the U.S. Library of Congress by April
2011.
• Facebook stores, accesses, and analyzes 30+ Petabytes of user generated data.
• Akamai analyzes 75 million events per day to better target advertisements.
• Walmart handles more than 1 million customer transactions every hour, which is
imported into databases estimated to contain more than 2.5 petabytes of data.
• The largest AT&T database boasts titles including the largest volume of data in one
unique database (312 terabytes) and the second largest number of rows in a unique
database (1.9 trillion), which comprises AT&T's extensive calling records.
• Hadoop :
– 94% of Hadoop users perform analytics on large volumes of data not possible
before
– 88% analyze data in greater detail;
– while 82% can now retain more of their data.
9. The Rapid Growth of Unstructured Data
• YouTube users upload 48 hours of new video every minute of the
day.
• 500+ new websites are created every minute of the day.
• Brands and organizations on Facebook receive 34,722 Likes every
minute of the day.
• 100 terabytes of data uploaded daily to Facebook.
• According to Twitter's own research in early 2012, it sees roughly
175 million tweets every day, and has more than 465 million
accounts.
• 30 Billion pieces of content shared on Facebook every month.
Data production will be 44 times greater in 2020 than it was in 2009.
10. Big Data & Real Business Issues
• 25+ % of decision‐makers surveyed predict that data volumes in their
companies will rise by more than 60% by the end of 2014, with the
average of all respondents anticipating a growth of no less than 42 %.
• 40% projected growth in global data generated per year vs. 5% growth in
global IT spending.
• According to estimates, the volume of business data worldwide, across all
companies, doubles every 1.2 years.
– Poor data can cost businesses 20%–35% of their operating revenue.
– Bad data or poor data quality costs US businesses $600 billion annually.
• 75+ % of decision-makers surveyed anticipate significant impacts in the
domain of storage systems as a result of the “Big Data” phenomenon.
• We anticipate a new challenge: being able to search and analyze all
that data … in real time!
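The growth figures quoted on these slides can be put on a common yearly basis with a little arithmetic. The sketch below assumes constant compound growth, which the slides do not state; the function names are illustrative only.

```python
def annual_growth_from_multiple(multiple, years):
    """Constant yearly growth rate that yields `multiple` after `years` years."""
    return multiple ** (1.0 / years) - 1.0


def annual_growth_from_doubling(doubling_years):
    """Constant yearly growth rate implied by a given doubling time."""
    return 2.0 ** (1.0 / doubling_years) - 1.0


# "44 times greater in 2020 than it was in 2009" spans 11 years.
rate_44x = annual_growth_from_multiple(44, 2020 - 2009)
# "doubles every 1.2 years" (business data worldwide).
rate_doubling = annual_growth_from_doubling(1.2)

print(f"44x over 11 years -> {rate_44x:.0%} per year")
print(f"doubling every 1.2 years -> {rate_doubling:.0%} per year")
```

Under that assumption, the 44-fold figure works out to roughly 41% per year, in line with the 40% yearly growth quoted above, while a 1.2-year doubling time corresponds to roughly 78% per year. The two figures describe different data populations, so they need not agree.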
12. StartUp
search = like % ?
SELECT
doc.*, pays.*
FROM
doc, pays
WHERE
doc.pays_code = pays.code AND
doc.date_doc > to_date('2011-12', 'yyyy-mm') AND
doc.date_doc < to_date('2012-01', 'yyyy-mm') AND
lower(pays.libelle) = 'france' AND
lower(doc.commentaire) LIKE '%produit%' AND
lower(doc.commentaire) LIKE '%david%';
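For comparison, here is a minimal sketch of how the same search might be expressed as an Elasticsearch query body, built as a plain Python dict. It rests on several assumptions not found in the slides: the country label is denormalized into each document as a hypothetical `pays_libelle` field (replacing the SQL join), `date_doc` is mapped as a date, and `commentaire` is an analyzed text field. Note that a `match` query works on whole analyzed tokens, so it is not a strict equivalent of `LIKE '%...%'` substring matching.

```python
import json


def build_search_body(text, date_from, date_to, country):
    """Build an Elasticsearch query DSL body as a plain dict (no client needed)."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Full-text match on analyzed tokens; every word must appear.
                    {"match": {"commentaire": {"query": text, "operator": "and"}}},
                    # Hypothetical denormalized field replacing the join on pays.
                    {"match": {"pays_libelle": country}},
                    # Exclusive bounds, mirroring the > and < of the SQL version.
                    {"range": {"date_doc": {"gt": date_from, "lt": date_to}}},
                ]
            }
        }
    }


body = build_search_body("produit david", "2011-12-01", "2012-01-01", "france")
print(json.dumps(body, indent=2))
```

The resulting JSON could be sent to the `_search` endpoint of an index. The index name and document layout are left open here because the slides do not specify them.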
30. Elasticsearch 1.0: a
production-ready
solution!
Revolutionizing Data Search
and Analytics
31. Purpose of Elasticsearch
• Organize data and make it easily accessible
– Through powerful search and analytics
– Easily consumable (even for non-data scientists)
– Elegantly handles extremely large data volumes
– Delivers results in real time
• Technology stack agnostic
• Used across all market verticals
32. Features of Elasticsearch
• Structured & unstructured search
• Advanced analytics capabilities
• Unmatched performance
• Real-time results
• Highly scalable
• User friendly installation and maintenance
42. User Raves
Chris Cowan @uhduh
I’m in love with @elasticsearch! I want to use it for everything right now!
Alain Richardt @alaincxs
Moving from #solr to #Elasticsearch is like upgrading from a Reliant Robin to a McLaren
F1
Pete Connolly @peteconnolly
Two really useful and productive days of training from @kimchy and @uboness all
about #elasticsearch. Best training course in years
Cyril Lacôte @clacote
#ElasticSearch is the s*&t. Amazingly simple and powerful. Open source is awesome.
That's made my day.
Logan Lowell @fractaloop
Tweaking @elasticsearch for huge indexes can be fun. I'm very glad the IRC channel is
so helpful too.
44. 1: Training
Core Elasticsearch Training
• Two day classroom training
• Delivered by Elasticsearch developers
1. Worldwide Public Courses
2. Onsite Training Course
62. Combining Hadoop & ElasticSearch
• REST based
• Memory and I/O efficient
• Adaptive I/O
• Map/Reduce API support
• Pig support
• Hive support
• = elasticsearch-hadoop
63. It’s up to you to decide what to build with ES
72. Conclusion
• It is time to revolutionize the way you get value from
your data: bring Elasticsearch to your applications!
• The ELK stack (Elasticsearch, Logstash, Kibana) in
version 1.0 is ready for production!
• Get expert guidance to benefit from best
practices and support at every stage of your project:
design, development, production