Night Owl
Log Monitoring using Elasticsearch and Hadoop

Boyd Meier (bmeier@pros.com)
Hadoop Meetup – October 16, 2013

© COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
Problem

Application Performance Monitoring
● Many servers
● Many applications
● Many log formats
● Many places to go look for information
● What if we could just look in one place and see everything?

Advanced Analysis
● The logs are too low-level
● The servers need the existing capacity
● The amount of data to be analyzed is huge
● Some analysis needs to be across multiple servers
● What if we want to change the analysis algorithms?
● How can we do analysis in the most flexible way possible?

Proactive Support
● See problems coming before they become crises
● Watch for errors and exceptions
● Track performance of the application
● Track usage of the application
● Enable checks we haven’t thought of yet

Some Analysis Questions
● What errors happen, and how often?
● Who did what, when?
● How long did it take to do a task?
● What else was happening on the server?
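Once events from every server sit in one place, questions like these reduce to simple aggregations. The following is a minimal Python sketch, not the pipeline used in this talk; the event fields (`ts`, `user`, `task`, `phase`, `level`, `error`) and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, already-parsed events; the field names are illustrative only.
EVENTS = [
    {"ts": "2013-09-01T10:00:00", "user": "alice", "task": "import", "phase": "start"},
    {"ts": "2013-09-01T10:00:42", "user": "alice", "task": "import", "phase": "end"},
    {"ts": "2013-09-01T10:01:00", "user": "bob", "level": "ERROR", "error": "TimeoutException"},
    {"ts": "2013-09-01T10:02:00", "user": "bob", "level": "ERROR", "error": "TimeoutException"},
    {"ts": "2013-09-01T10:03:00", "user": "alice", "level": "ERROR", "error": "NullPointerException"},
]


def error_counts(events):
    """What errors happen, and how often?"""
    return Counter(e["error"] for e in events if e.get("level") == "ERROR")


def task_durations(events):
    """How long did each (user, task) pair take, in seconds?"""
    starts = {}
    durations = {}
    for e in events:
        if "task" not in e:
            continue  # not a task start/end event
        key = (e["user"], e["task"])
        when = datetime.strptime(e["ts"], "%Y-%m-%dT%H:%M:%S")
        if e["phase"] == "start":
            starts[key] = when
        elif e["phase"] == "end" and key in starts:
            durations[key] = (when - starts.pop(key)).total_seconds()
    return durations


print(error_counts(EVENTS).most_common())
print(task_durations(EVENTS))
```

At the data volumes described later, the same grouping logic would run as a distributed job rather than in a single process.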

Constraints
● Very little budget – as much free stuff as possible
● Can’t use client machines
● Communications need to be secure
● Large amounts of data (GB/day/client)
● Minimize support’s dependence on client IT

Approach

Hadoop
● We have a lot of data (~2 GB/day with 3 clients)
● We need to process it in reasonable time
● We can’t afford a big machine for this
● We have lots of old machines lying around
● Sounds like a job for the elephant! But what about queries?

Elasticsearch
● Query performance on base Hadoop is painful
● Ad-hoc queries are required
● Hadoop integration
● Cluster deployment
● Looks promising! How do we get the data into the server?
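To make "ad-hoc queries" concrete, here is a hedged Python sketch that only builds a search request body of the kind Elasticsearch 0.90 accepts: a `query_string` query plus a `date_histogram` facet. The field names `level`, `client`, and `@timestamp` and the helper name are assumptions, not taken from the talk. Later Elasticsearch releases replaced facets with aggregations, so this shape applies to the 0.90 era only.

```python
import json


def build_error_histogram_query(query_text, interval="hour"):
    """Build a 0.90-style search body: a free-text query plus counts per time bucket.

    The field name "@timestamp" is an assumption about how events are indexed.
    """
    return {
        "size": 0,  # only the facet counts are wanted, not the matching documents
        "query": {"query_string": {"query": query_text}},
        "facets": {
            "events_over_time": {
                "date_histogram": {"field": "@timestamp", "interval": interval}
            }
        },
    }


# Hypothetical ad-hoc question: errors for one client, bucketed by hour.
body = build_error_histogram_query("level:ERROR AND client:acme")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to an index's `_search` endpoint; no network call is made in this sketch.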

Logstash
● Handle many sources, not just logs
● Fan-in architecture to server
● Compressed, SSL encrypted data
● Can offload some logic on the client if desired
● Massively configurable
● Output to Elasticsearch
● Great! Now how about visualization?
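The core idea behind handling many sources is that every line, whatever its format, becomes a structured event with common fields before it is shipped. The Python sketch below illustrates that idea only. It is not Logstash's implementation or configuration, and the regular expressions, field names, and sample lines are hypothetical.

```python
import json
import re

# Hypothetical patterns for two log formats: a web access log and an application log.
ACCESS_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
)
APP_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+) (?P<message>.*)$'
)


def normalize(line, source):
    """Turn one raw log line into a common event dict, whatever its format."""
    m = ACCESS_RE.match(line)
    if m:
        return {"source": source, "type": "access", "timestamp": m.group("ts"),
                "status": int(m.group("status")), "message": m.group("request")}
    m = APP_RE.match(line)
    if m:
        return {"source": source, "type": "app", "timestamp": m.group("ts"),
                "level": m.group("level"), "message": m.group("message")}
    # Keep lines that match no pattern so nothing is silently dropped.
    return {"source": source, "type": "unparsed", "message": line}


SAMPLES = [
    ('10.0.0.1 - - [16/Oct/2013:19:00:00 -0500] "GET /index.html HTTP/1.1" 200 1043', "web01"),
    ("2013-10-16 19:00:01 ERROR Connection timed out", "app01"),
    ("free-form text with no known format", "app02"),
]

for raw, src in SAMPLES:
    print(json.dumps(normalize(raw, src)))
```

Keeping unparsed lines as their own event type is one possible design choice; it lets new formats show up in searches before a pattern exists for them.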

Kibana
● Backed by Elasticsearch
● Supports dynamic queries
● View information over time
● Built-in support for Logstash
● Configurable, shareable dashboards

Hadoop Processing
● Pig scripts process the data
● Wonderdog from InfoChimps integrates Pig with Elasticsearch
– There are issues:
• Cluster stability when using Wonderdog
• The Wonderdog Pig interface has not been updated in a while
• Currently evaluating the elasticsearch-hadoop project from Elasticsearch.org
● Analysis results are stored in Elasticsearch for ease of access

Demo

Configuration Details

Software
● Ubuntu 12.04.2 LTS (Precise)
● Cloudera CDH 4.3.1
– Hadoop 2.0.0
– HBase 0.94
– Hive 0.10
– Pig 0.11
● Elasticsearch 0.90.3
● Logstash 1.1.12
● Kibana 3 M3

Hardware Architecture
● 27-node cluster of commodity machines
● 42 TB of disk space
● Connected via a 10-gigabit switch
● Each machine has:
– 8 GB RAM
– 2 TB SATA HDD
– Gigabit Ethernet

Performance
● Over the month of September:

– 188 million events ingested from 3 clients
– 57.5 GB storage used (1.92 GB / day)
● At that rate, 42 TB is enough space for:

– 142 billion events
– 60 years of data from these clients
– 1 year of data from 180 clients at the same volume per client
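The projection above is straightforward arithmetic on the September figures. The Python sketch below redoes it under stated assumptions: a 30-day month, 1 TB counted as 1000 GB, and the full 42 TB available for event storage with no allowance for replication or other overhead. The results land in the same range as the rounded figures on the slide; exact values shift with those assumptions.

```python
# Measured over September (from the slide).
EVENTS_PER_MONTH = 188e6
GB_PER_MONTH = 57.5
CLIENTS = 3

# Assumptions, not from the slide: 30-day month, decimal terabytes,
# and all 42 TB usable for event storage.
DAYS_PER_MONTH = 30
CAPACITY_GB = 42 * 1000

gb_per_day = GB_PER_MONTH / DAYS_PER_MONTH
months_of_capacity = CAPACITY_GB / GB_PER_MONTH
years_of_capacity = months_of_capacity / 12
events_at_capacity = months_of_capacity * EVENTS_PER_MONTH
# Same volume per client: years for 3 clients equals client-years / 3.
client_years = years_of_capacity * CLIENTS

print(f"{gb_per_day:.2f} GB/day")
print(f"{years_of_capacity:.1f} years for {CLIENTS} clients")
print(f"{events_at_capacity / 1e9:.0f} billion events")
print(f"{client_years:.0f} clients for one year")
```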

Resources
● Elasticsearch - http://www.elasticsearch.org/overview/
• http://github.com/elasticsearch/elasticsearch

● Logstash - http://www.elasticsearch.org/overview/logstash/
• https://github.com/logstash/logstash

● Kibana - http://www.elasticsearch.org/overview/kibana/
• https://github.com/elasticsearch/kibana

● ES – Hadoop - http://www.elasticsearch.org/overview/hadoop/
• http://github.com/elasticsearch/elasticsearch-hadoop

● Cloudera - http://www.cloudera.com/

World Headquarters
3100 Main Street, Suite #900
Houston, TX 77002
Phone: +1 713-335-5151
Sales: +1 855-846-0641
Fax: +1 713-335-8144

PROS Germany GmbH
Feringastrasse 6
85774 Unterfoehring
Munich
Tel.: +49 89 99216 270
Fax: +49 89 99216 200

European Headquarters - United Kingdom
Lakeside House
1 Furzeground Way
Stockley Park
Heathrow
UB11 1BD
Phone: +44 (0) 208 622 3555
Fax: +44 208 622 3230

Regional Office - Austin, TX
3600 Parmer Lane, Suite 205
Austin, Texas 78727

Regional Office - Cary, North Carolina
1000 Centre Green Way, #200
Cary, NC 27513
Phone: +1 919-228-6334


Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 

Night Owl by Boyd Meier of PROS

  • 1. Night Owl
      Log Monitoring using Elasticsearch and Hadoop
      Boyd Meier (bmeier@pros.com)
      Hadoop Meetup – October 16, 2013
      © COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
  • 2. Problem
  • 3. Application Performance Monitoring
      ● Many servers
      ● Many applications
      ● Many log formats
      ● Many places to go look for information
      ● What if we could just look in one place and see everything?
  • 4. Advanced Analysis
      ● The logs are too low-level
      ● The servers need the existing capacity
      ● The amount of data to be analyzed is huge
      ● Some analysis needs to be across multiple servers
      ● What if we want to change the analysis algorithms?
      ● How can we do analysis in the most flexible way possible?
  • 5. Proactive Support
      ● See problems coming before they become crises
      ● Watch for errors and exceptions
      ● Track performance of the application
      ● Track usage of the application
      ● Enable checks we haven’t thought of yet
  • 6. Some Analysis Questions
      ● What errors happen, and how often?
      ● Who did what, when?
      ● How long did it take to do a task?
      ● What else was happening on the server?
  • 7. Constraints
      ● Very little budget – as much free stuff as possible
      ● Can’t use client machines
      ● Communications need to be secure
      ● Large amounts of data (GB/day/client)
      ● Minimize support’s dependence on client IT
  • 8. Approach
  • 9. Hadoop
      ● We have a lot of data (~2 GB/day with 3 clients)
      ● We need to process it in reasonable time
      ● We can’t afford a big machine for this
      ● We have lots of old machines lying around
      ● Sounds like a job for the elephant! But what about query?
  • 10. Elasticsearch
      ● Query performance on base Hadoop is painful
      ● Ad-hoc queries are required
      ● Hadoop integration
      ● Cluster deployment
      ● Looks promising! How do we get the data into the server?
  • 11. Logstash
      ● Handle many sources, not just logs
      ● Fan-in architecture to server
      ● Compressed, SSL encrypted data
      ● Can offload some logic on the client if desired
      ● Massively configurable
      ● Output to Elasticsearch
      ● Great! Now how about visualization?
  • 12. Kibana
      ● Backed by Elasticsearch
      ● Supports dynamic queries
      ● View information over time
      ● Built-in support for Logstash
      ● Configurable, shareable dashboards
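As a rough illustration of the "dynamic queries" and "information over time" points, the sketch below builds the kind of search request body that could answer "What errors happen, and how often?" from slide 6. It is not taken from the talk. It assumes the facets API offered by the Elasticsearch 0.90 line and the `@timestamp` field name that Logstash conventionally writes; the query text and the facet name `events_over_time` are made up for the example.

```python
import json


def errors_over_time_request(query_text="ERROR OR Exception", interval="hour"):
    """Build a search request body that counts matching events per time bucket.

    Assumptions (not from the slides): events carry an "@timestamp" field,
    and the cluster accepts facet requests as in Elasticsearch 0.90.x.
    """
    return {
        # Only the per-bucket counts are wanted, not the matching documents.
        "size": 0,
        # Free-text query in Lucene query-string syntax.
        "query": {"query_string": {"query": query_text}},
        "facets": {
            # Hypothetical facet name; buckets the matches by time interval.
            "events_over_time": {
                "date_histogram": {"field": "@timestamp", "interval": interval}
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to a _search endpoint.
    print(json.dumps(errors_over_time_request(), indent=2))
```

The function only builds the body and does not contact a cluster, so it can be inspected or adapted before being sent to a `_search` endpoint.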
  • 14. Hadoop Processing
      ● Pig scripts process the data
      ● Wonderdog from InfoChimps to integrate Pig and Elasticsearch
          – There are issues:
              • Cluster stability using Wonderdog
              • Wonderdog Pig interface has not been updated in a while
              • Currently evaluating elasticsearch-hadoop project from Elasticsearch.org
      ● Analysis results are stored in Elasticsearch for ease of access
  • 15. Demo
  • 16. Configuration Details
  • 18. Software
      ● Ubuntu 12.04.2 LTS (Precise)
      ● Cloudera CDH 4.3.1
          – Hadoop 2.0.0
          – HBase 0.94
          – Hive 0.10
          – Pig 0.11
      ● Elasticsearch 0.90.3
      ● Logstash 1.1.12
      ● Kibana 3 M3
  • 19. Hardware Architecture
      ● 27 node cluster of commodity machines
      ● 42 TB of disk space
      ● Connected via 10 gigabit switch
      ● Each machine has:
          – 8 GB RAM
          – 2 TB SATA HDD
          – Gigabit Ethernet
  • 20. Performance
      ● Over the month of September:
          – 188 million events ingested from 3 clients
          – 57.5 GB storage used (1.92 GB / day)
      ● At that rate, 42 TB is enough space for:
          – 142 billion events
          – 60 years of data from these clients
          – 1 year of data from 180 clients at the same volume per client
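The capacity projection on the Performance slide can be reproduced with a few lines of arithmetic. The sketch below uses only the figures on the slide. It assumes 1 TB = 1000 GB, a 30-day month, and no replication or index overhead, none of which the slide states.

```python
# Figures taken from the Performance slide.
EVENTS_IN_SEPTEMBER = 188_000_000  # events ingested from 3 clients
STORAGE_USED_GB = 57.5             # storage used over the month
CLIENTS = 3
CLUSTER_CAPACITY_TB = 42

# Assumptions, not stated on the slide.
DAYS_IN_SEPTEMBER = 30
GB_PER_TB = 1000
DAYS_PER_YEAR = 365


def capacity_estimate():
    """Project how long 42 TB lasts at the September ingest rate."""
    gb_per_day = STORAGE_USED_GB / DAYS_IN_SEPTEMBER
    events_per_day = EVENTS_IN_SEPTEMBER / DAYS_IN_SEPTEMBER
    days_of_capacity = CLUSTER_CAPACITY_TB * GB_PER_TB / gb_per_day
    years = days_of_capacity / DAYS_PER_YEAR
    return {
        "gb_per_day": gb_per_day,
        "total_events": events_per_day * days_of_capacity,
        "years_for_current_clients": years,
        # Same total volume spread over one year instead of `years` years.
        "clients_for_one_year": years * CLIENTS,
    }


if __name__ == "__main__":
    for name, value in capacity_estimate().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the daily rate, the roughly 60 years of retention, and the roughly 180 client-years agree with the slide. The event total comes out near 137 billion rather than 142 billion, which may reflect different rounding or unit conventions in the original calculation.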
  • 21. Resources
      ● Elasticsearch - http://www.elasticsearch.org/overview/
          • http://github.com/elasticsearch/elasticsearch
      ● Logstash - http://www.elasticsearch.org/overview/logstash/
          • https://github.com/logstash/logstash
      ● Kibana - http://www.elasticsearch.org/overview/kibana/
          • https://github.com/elasticsearch/kibana
      ● ES – Hadoop - http://www.elasticsearch.org/overview/hadoop/
          • http://github.com/elasticsearch/elasticsearch-hadoop
      ● Cloudera - http://www.cloudera.com/
  • 22. World Headquarters
      3100 Main Street, Suite #900
      Houston, TX 77002
      Phone: +1 713-335-5151
      Sales: +1 855-846-0641
      Fax: +1 713-335-8144

      PROS Germany GmbH
      Feringastrasse 6
      85774 Unterfoehring
      Munich
      Tel.: +49 89 99216 270
      Fax: +49 89 99216 200

      European Headquarters - United Kingdom
      Lakeside House
      1 Furzeground Way
      Stockley Park
      Heathrow UB11 1BD
      Phone: +44 (0) 208 622 3555
      Fax: +44 208 622 3230

      Regional Office - Austin, TX
      3600 Parmer Lane, Suite 205
      Austin, Texas 78727

      Regional Office - Cary, North Carolina
      1000 Centre Green Way, #200
      Cary, NC 27513
      Phone: +1 919-228-6334