MySQL Performance monitoring using Statsd and Graphite
Art van Scheppingen
Head of Database Engineering
1. Who are we?
2. What monitoring tools do we use?
3. What are StatsD, Collectd and Graphite?
4. How MySQL logs to StatsD
5. Graphing examples
6. Challenges
7. Questions?
Overview
Who are we?
Who is Spil Games?
• Company founded in 2001
• 350+ employees worldwide
• 180M+ unique visitors per month
• Over 50M registered users
• 45 portals in 19 languages
• Casual games
• Social games
• Real-time multiplayer games
• Mobile games
• 35+ MySQL clusters
• 60k queries per second (3.5 billion qpd)
Facts
Geographic Reach
180 Million Monthly Active Users (*)
Source: (*) Google Analytics, August 2012
Girls, Teens and Family
spielen.com
juegos.com
gamesgames.com
games.co.uk
Brands
Monitoring
We use(d) many many many monitoring tools so far!
• Opsview/Nagios (mainly availability)
• Cacti (using Baron Schwartz/Percona templates)
• MONyog
• Good ol’ RRD
Existing monitoring systems we use(d)
Opsview/Nagios
• Strong points:
  • Easy to create (Nagios) plugins
  • Slaves for scaling out
• Weak points:
  • Stats gathering through polling
  • Low granularity (1 to 5 minutes)
  • Difficult URIs for graphs
Cacti
• Strong points:
  • Awesome Percona templates
  • Great overviews and graphs
• Weak points:
  • Hard to add new metrics (to 90+ servers)
  • Not scalable
  • Low granularity (1 to 5 minutes)
  • Hard to correlate
MONyog
• Strong points:
  • Easy to set up
  • Compare any server with another
  • Compare configurations
• Weak points:
  • “Closed source”
  • Not scalable
  • Jack of all trades
Poll limitations
• Limited to a set interval
• Data gets averaged out
• (Host) checks are run serially
• Slowdowns in a run mean no/less data
• Scaling: add more masters/slaves
• Setting up an SSH connection is slow
Difficult to add a new metric
host065
bash-3.2# netstat -s | grep "listen queue"
    26 times the listen queue of a socket overflowed

host066
bash-3.2# netstat -s | grep "listen queue"
    33 times the listen queue of a socket overflowed
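
A counter like this one can be pushed as a metric instead of being read by hand on every host. The following is only a minimal sketch, assuming a StatsD daemon listening on UDP localhost:8125 and a hypothetical metric name `prod.syseng.<host>.listen_queue_overflows`; neither the function names nor the metric name come from the talk.

```python
import re
import socket


def parse_listen_overflows(netstat_output):
    """Return the overflow count found in `netstat -s` text, or None if absent."""
    match = re.search(r"(\d+) times the listen queue of a socket overflowed",
                      netstat_output)
    return int(match.group(1)) if match else None


def gauge_line(host, count):
    # Hypothetical metric name; the "|g" suffix marks a StatsD gauge.
    return "prod.syseng.%s.listen_queue_overflows:%d|g" % (host, count)


def send_udp(line, address=("localhost", 8125)):
    # Fire-and-forget UDP datagram: the sender does not wait for a reply.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), address)
    finally:
        sock.close()


sample = "    26 times the listen queue of a socket overflowed\n"
line = gauge_line("host065", parse_listen_overflows(sample))
print(line)
```

Calling `send_udp(line)` from a small cron job or daemon on each host would then deliver the value without any change to the monitoring server.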
Other things you can’t do!
Statsd + Collectd + Graphite
What are they?
• Highly scalable real-time graphing system
• Collects numeric time-series
• Backend daemon Carbon
  • Carbon-cache: receives data
  • Carbon-aggregator: aggregates data
  • Carbon-relay: replication and sharding
• RRD or Whisper database
What is Graphite?
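
To make "receives data" concrete: Carbon accepts datapoints over a plaintext line protocol, one `<path> <value> <timestamp>` line per datapoint. The sketch below formats such lines; the host name `graphitehost` and the metric path are placeholders, and the helper names are illustrative rather than part of any Graphite client library.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one datapoint in Carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_to_carbon(lines, host="graphitehost", port=2003):
    # Placeholder host name; 2003 is the usual plaintext listener port.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


print(carbon_line("prod.syseng.mysql.db1.status.questions", 42, 1366550446), end="")
```

`send_to_carbon([...])` is not called here because it needs a reachable Carbon instance.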
• Each metric is in its own bucket
• Periods make folders
  • prod.syseng.mmm.<hostname>.admin_offline
• Metric types
  • Counters
  • Gauge
• Retention can be set using a regex
  • [mysql]
  • pattern = ^prod.syseng.mysql..*$
  • retentions = 2s:1d,1m:3d,5m:7d,1h:5y
Graphite’s capabilities
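
A retention string reads as a list of `precision:duration` archives, so its cost per metric can be worked out by hand. The sketch below does that arithmetic for the retention shown on this slide; the 12 bytes per datapoint figure is an assumption about Whisper's on-disk format and ignores file headers, so treat the byte count as a rough estimate.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 365 * 86400}


def to_seconds(spec):
    """Convert a value such as '2s' or '5y' to seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def points_per_archive(retentions):
    """Return (precision, duration, datapoints) for each archive."""
    result = []
    for archive in retentions.split(","):
        precision, duration = archive.split(":")
        points = to_seconds(duration) // to_seconds(precision)
        result.append((precision, duration, points))
    return result


archives = points_per_archive("2s:1d,1m:3d,5m:7d,1h:5y")
total_points = sum(points for _, _, points in archives)
for precision, duration, points in archives:
    print(precision, duration, points)
# Assumption: roughly 12 bytes per stored datapoint, headers not counted.
print(total_points, "points, about", total_points * 12, "bytes per metric")
```

The 2-second archive alone accounts for 43,200 of the 93,336 points, which is why high-resolution retention is best limited to the metrics that need it.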
• Unix daemon that gathers system statistics
• Over 90 (input/output) plugins
• Plugin to send metrics to Graphite/Carbon
• Very useful for system metrics
What is Collectd?
• Front-end proxy for Graphite/Carbon (by Etsy)
• Node.js daemon (also other languages)
• Receives UDP (on localhost)
• Buffers metrics locally
• Periodically flushes data to Graphite/Carbon (TCP)
• Client libraries available in about any language
• Send any metric you like!
What is StatsD?
• StatsD functions
  • update_stats
  • increment/decrement
  • set
  • gauge
  • timers
StatsD functions
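
Each of these functions boils down to a short text datagram of the form `<name>:<value>|<type>`. The sketch below shows that mapping; the Python function names are illustrative and do not belong to a particular client library, and `update_stats` is assumed to be a counter update with an arbitrary delta.

```python
TYPE_SUFFIX = {"counter": "c", "gauge": "g", "timer": "ms", "set": "s"}


def statsd_line(name, value, metric_type):
    """Build one StatsD datagram payload: <name>:<value>|<type>."""
    return "%s:%s|%s" % (name, value, TYPE_SUFFIX[metric_type])


def update_stats(name, delta):
    # Assumed behaviour: a counter update with an arbitrary delta.
    return statsd_line(name, delta, "counter")


def increment(name, delta=1):
    return update_stats(name, delta)


def decrement(name, delta=1):
    return update_stats(name, -delta)


def gauge(name, value):
    return statsd_line(name, value, "gauge")


def timing(name, milliseconds):
    return statsd_line(name, milliseconds, "timer")


def unique_set(name, member):
    return statsd_line(name, member, "set")


for payload in (increment("prod.app1.pages_rendered"),
                gauge("prod.app1.page_concurrency", 10),
                timing("prod.app1.rendering_time", 12),
                unique_set("prod.app1.unique_users", 42)):
    print(payload)
```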
PHP:
$statsd = new StatsD();
// Counter: one more page rendered
$statsd->increment("prod.app1.pages_rendered", 1);
// Gauge: current value, not accumulated
$statsd->gauge("prod.app1.page_concurrency", 10);
// Set: counts unique values
$statsd->set("prod.app1.unique_users", $userid);
…
// Timer: elapsed time in milliseconds
$start = microtime(true);
serve_out_content_to_clients();
$statsd->timing("prod.app1.rendering_time", (microtime(true) - $start) * 1000);

Library:
https://github.com/etsy/statsd/blob/master/examples/php-example.php
StatsD PHP code examples
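
The timing pattern above (take a start time, run the work, send the difference in milliseconds) can be wrapped so it is hard to forget the second half. A minimal Python sketch, assuming a `send` callable that delivers the payload to StatsD; both `timed` and `send` are hypothetical names.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(name, send):
    """Measure the wrapped block and send its duration in milliseconds."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000
        send("%s:%d|ms" % (name, elapsed_ms))


sent = []
with timed("prod.app1.rendering_time", sent.append):
    time.sleep(0.01)  # stands in for the real work
print(sent)
```

Because the payload is sent in the `finally` clause, the duration is also reported when the wrapped code raises.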
Our Graphite cluster(s)
[Diagram: client requesting graphs; loadbalancer (port 443); Graphite rendering cluster; Carbon relay; loadbalancer (port 2003); storage clusters DEV, SYSENG, SERVICES1 and SERVICES2; Server-1, Server-2, Server-n. Node counts shown: 8 nodes, 3 nodes, 2 nodes.]
Graphite Storage Clusters
Collectd
[Diagram: Collectd gathers data through plugins (CPU, DISK, LOAD, …) and sends it to Carbon over TCP at a 30 second interval.]
StatsD
[Diagram: application-level metrics (# of logins, cache hit/miss) and MySQL_Statsd (STATUS, INNODB STATUS) are sent over UDP to StatsD on localhost:8125; StatsD sends to Carbon over TCP at a 2 second interval.]
Global scale?
MySQL + StatsD
How do we use them?
• MySQL plugin for Collectd
  • Sends SHOW STATUS
  • No INNODB STATUS
  • Plugin not flexible
• DBI plugin for Collectd
  • Metrics based on columns
• Different granularity needed
• Separate daemon (with persistent connection)
• StatsD is easy as ABC
Why use StatsD over Collectd?
• Written in Python
• Gathers data every 0.5 seconds
• Sends to StatsD (localhost) after every run
• Easy to set up: no configuration
• Persistent connection
• Baron Schwartz’ InnoDB status parser (Cacti poller)
• Other interesting metrics and counters
  • Information Schema
  • MySQL 5.5/5.6 Performance Schema
  • MariaDB specific
  • Galera specific
MySQL StatsD daemon
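
The daemon itself is not shown in the talk, so the following is only a rough sketch of the kind of loop described here, not the actual implementation. It assumes a `fetch_status` callable that returns the current status counters as a dict (for example from `SHOW GLOBAL STATUS` over a persistent connection) and a `send` callable that delivers one StatsD payload; both are hypothetical, as is the choice to send deltas as counters.

```python
import time


def status_deltas(previous, current):
    """Return the change of every numeric status value between two samples."""
    deltas = {}
    for name, value in current.items():
        if name in previous and isinstance(value, (int, float)):
            deltas[name] = value - previous[name]
    return deltas


def to_statsd_lines(prefix, deltas):
    # Counters ("|c") let StatsD add up the deltas within a flush interval.
    return ["%s.%s:%d|c" % (prefix, name.lower(), delta)
            for name, delta in sorted(deltas.items()) if delta > 0]


def run(fetch_status, send, prefix, interval=0.5, iterations=3):
    """Poll fetch_status() every `interval` seconds and send the deltas."""
    previous = fetch_status()
    for _ in range(iterations):
        time.sleep(interval)
        current = fetch_status()
        for line in to_statsd_lines(prefix, status_deltas(previous, current)):
            send(line)
        previous = current


# Two canned samples stand in for a real MySQL connection.
samples = iter([{"Questions": 100, "Com_select": 40},
                {"Questions": 160, "Com_select": 70}])
sent = []
run(lambda: next(samples), sent.append, "prod.syseng.mysql.db1.status",
    interval=0, iterations=1)
print(sent)
```

Sending deltas rather than raw totals keeps the graphs meaningful across server restarts, at the cost of losing whatever changed while the daemon was down.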
MySQL StatsD overview
[Diagram: MySQLCollector runs SHOW STATUS, SHOW INNODB STATUS and SHOW VARIABLES over a persistent connection and sends the results to StatsD, flushed every 0.5 seconds.]
• Perl (Net::Statsd)
• Sends any status change to StatsD (localhost)
• Non-blocking (thanks to UDP)
• Draw as infinite in Graphite
MySQL Multi Master patch
use Net::Statsd;
$Net::Statsd::HOST = 'localhost'; # Default
$Net::Statsd::PORT = 8125; # Default

…

# ONLINE -> HARD_OFFLINE
unless ($ping && $mysql) {
    Net::Statsd::update_stats('prod.syseng.mmm.'.$host.'.hard_offline', 1);
    FATAL sprintf("State of host '%s' changed from %s to HARD_OFFLINE (ping: %s, mysql: %s)", $host, $state, ($ping ? 'OK' : 'not OK'), ($mysql ? 'OK' : 'not OK'));
    $agent->state('HARD_OFFLINE');
}

…
MMM Perl code example
• Deployments
• User initiated actions
  • Logins
  • High scores
  • Comments / ratings
  • Images uploaded
  • Payments
• Application metrics
  • Error counts
  • Cache statistics (cache hit/miss)
  • Request timers
  • Image sizes
Other metrics
Start graphing!
Now it starts to get interesting!
• Identify your KPIs
• Don’t graph everything
  • More graphs == less overview
• Combine metrics
• Stack clusters
What is important for you?
• Include other metrics into your graphs
  • Deployments
  • Failover(s)
• Combine application metrics with your database
• Other influences
  • Solar flares
  • Start of the new Maya calendar
Correlate!
• URI based rendering API
• Support for wildcards
  • stats.prod.syseng.mysql.*.status.com_select
  • sumSeries(stats.prod.syseng.mysql.*.status.com_select)
  • aliasByNode(stats.prod.syseng.mysql.*.status.com_select, 4)
• Many functions
  • Nth percentile
  • Holt-Winters Forecast
  • Timeshift
Graphite Graphing Engine
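
Because the rendering API is URI based, a graph is just a query string with one `target` parameter per series. The sketch below assembles such a URL and shows which path node `aliasByNode(..., 4)` refers to (nodes are counted from 0); the host name `graphitehost`, the node name `db1` and the helper names are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def render_url(base, targets, params):
    """Build a Graphite render URL with one `target` parameter per series."""
    query = [("target", target) for target in targets] + sorted(params.items())
    return base.rstrip("/") + "/render/?" + urlencode(query)


def node_at(metric_path, index):
    """Return the dot-separated node that aliasByNode(path, index) would pick."""
    return metric_path.split(".")[index]


series = "stats.prod.syseng.mysql.*.status.com_select"
targets = ["sumSeries(%s)" % series, "aliasByNode(%s,4)" % series]
url = render_url("https://graphitehost", targets,
                 {"width": 722, "height": 357,
                  "from": "00:00_20130415", "until": "23:59_20130421"})
print(url)
print(node_at("stats.prod.syseng.mysql.db1.status.com_select", 4))
```

With the wildcard in place, the first target draws one summed line for the whole group, while the second draws one line per host, each labelled with its host name.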
Graphite Aggregator
syseng => {
           nodes => ["databasehost1", "databasehost2"],
           copying_relay_instances => 8,
           hashing_relay_instances => 8,
           cache_instances => 8,
           aggregation => {
               0 => {
                   name => "mysql",
                   pattern => '.*.mysql..*',
                   send_raw => 1,
               },
           }
       }


stats.<env>.syseng.mysql.cluster1.status.questions.all (2) =
    sum stats.<env>.syseng.mysql.*.status.questions
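
The aggregation rule reads as `output (interval in seconds) = method input_pattern`: every 2 seconds, all series matching the input pattern are summed into the `.all` series. The sketch below mimics that for a single interval, assuming `*` matches exactly one dot-separated node; the function names and the sample values are made up for illustration.

```python
import re


def rule_regex(input_pattern):
    """Translate a pattern in which '*' stands for exactly one path node."""
    parts = [re.escape(part) for part in input_pattern.split("*")]
    return re.compile("^" + "[^.]+".join(parts) + "$")


def aggregate_sum(input_pattern, datapoints):
    """Sum the values of all metrics whose name matches the pattern."""
    regex = rule_regex(input_pattern)
    return sum(value for name, value in datapoints.items() if regex.match(name))


# Made-up values received within one 2-second interval.
bucket = {
    "stats.prod.syseng.mysql.db1.status.questions": 120,
    "stats.prod.syseng.mysql.db2.status.questions": 80,
    "stats.prod.syseng.mysql.db1.status.com_select": 50,
}
print(aggregate_sum("stats.prod.syseng.mysql.*.status.questions", bucket))
```

Pre-aggregating like this means a cluster-wide graph reads one series instead of summing one series per host at render time.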
Graphite web interface
Graphite Example URL
https://graphitehost/render/?width=722&height=357&_salt=1366550446.553&rightDashed=1&target=alias%28sumSeries%28stats.prod.services.profilar.request.total.count.*%29%2C%22Number%20of%20profile%20requests%22%29&target=alias%28secondYAxis%28sumSeries%28stats_counts.prod.syseng.mysql.<node1>.status.questions%2C%20stats_counts.prod.syseng.mysql.<node2>.status.questions%29%29%2C%22Number%20of%20queries%20profiles%20cluster%22%29&from=00%3A00_20130415&until=23%3A59_20130421
Other examples: MMM
Other examples: timeshift
Other examples: multiple weeks
Challenges
The road ahead
• MySQL_statsd rewrite necessary (not open source yet)
• No alerting through Graphite (yet)
• Machine learning
• Eternal hunger for more metrics
• Abuse of the system
What challenges do we have?
• Persistent connections + repeatable read
  • History list skyrocketed
• Too many metrics slow down graphing
• Too many metrics can kill a host
• EstatsD for Erlang
What lessons have we learned?
Questions…
• Graphite:
http://graphite.readthedocs.org/en/latest/
• Collectd:
https://collectd.org/
• StatsD on Github by Etsy:
https://github.com/etsy/statsd/wiki
• Etsy on StatsD:
http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
Practical links
• Presentation can be found at:
http://spil.com/perconasc2013
• If you wish to contact me:
art@spilgames.com
• Don’t forget to rate my talk!
Thank you!

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 

MySQL Performance Monitoring

  • 1. MySQL Performance Monitoring using StatsD and Graphite. Art van Scheppingen, Head of Database Engineering
  • 2. Overview
      1. Who are we?
      2. What monitoring tools do we use?
      3. What are StatsD, Collectd and Graphite?
      4. How MySQL logs to StatsD
      5. Graphing examples
      6. Challenges
      7. Questions?
  • 3. Who are we? Who is Spil Games?
  • 4. Facts
      • Company founded in 2001
      • 350+ employees worldwide
      • 180M+ unique visitors per month
      • Over 50M registered users
      • 45 portals in 19 languages
      • Casual games
      • Social games
      • Real-time multiplayer games
      • Mobile games
      • 35+ MySQL clusters
      • 60k queries per second (3.5 billion qpd)
  • 5. Geographic Reach
      • 180 million monthly active users (*)
      • Source: (*) Google Analytics, August 2012
  • 6. Brands
      • Girls, Teens and Family
      • spielen.com
      • juegos.com
      • gamesgames.com
      • games.co.uk
  • 7. Monitoring: We use(d) many, many, many monitoring tools so far!
  • 8. Existing monitoring systems we use(d)
      • Opsview/Nagios (mainly availability)
      • Cacti (using Baron Schwartz/Percona templates)
      • MONyog
      • Good ol’ RRD
  • 9. Opsview/Nagios
      • Strong points:
          • Easy to create (Nagios) plugins
          • Slaves for scaling out
      • Weak points:
          • Stats gathering through polling
          • Low granularity (1 to 5 minutes)
          • Difficult URIs for graphs
  • 10. Cacti
      • Strong points:
          • Awesome Percona templates
          • Great overviews and graphs
      • Weak points:
          • Hard to add new metrics (to 90+ servers)
          • Not scalable
          • Low granularity (1 to 5 minutes)
          • Hard to correlate
  • 11. MONyog
      • Strong points:
          • Easy to set up
          • Compare any server with another
          • Compare configurations
      • Weak points:
          • “Closed source”
          • Not scalable
          • Jack of all trades
  • 12. Poll limitations
      • Limited to a set interval
      • Data gets averaged out
      • (Host) checks are run serially
      • A slowdown in a run means less or no data
      • Scaling: add more masters/slaves
      • Setting up an SSH connection is slow
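To illustrate the "data gets averaged out" point, here is a minimal sketch with invented numbers (they are not measurements from the talk): a 10-second burst inside a 5-minute poll interval barely moves the stored average, while a per-second collector would see the full peak.

```python
# Hypothetical per-second query rates over one 5-minute poll interval.
# All numbers are invented for illustration only.
baseline_qps = 1000
spike_qps = 20000
samples = [baseline_qps] * 300          # 300 seconds at the baseline rate
samples[120:130] = [spike_qps] * 10     # a 10-second burst in the middle


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


five_minute_average = average(samples)  # roughly what a 5-minute poller stores
peak = max(samples)                     # what a per-second collector would see

print(f"5-minute average: {five_minute_average:.1f} qps")
print(f"peak:             {peak} qps")
```

The burst is twenty times the baseline, yet the 5-minute figure rises by well under a factor of two, which is why short stalls are easy to miss at low granularity.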
  • 13. Difficult to add a new metric
      host065
      bash-3.2# netstat -s | grep "listen queue"
          26 times the listen queue of a socket overflowed

      host066
      bash-3.2# netstat -s | grep "listen queue"
          33 times the listen queue of a socket overflowed
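As a rough sketch of how a one-off value like this could become a metric, the snippet below parses the `netstat -s` line shown on the slide and formats it as a StatsD gauge. The metric name `prod.syseng.net.host065.listen_overflows` is hypothetical and not taken from the talk.

```python
import re

# Matches the line shown on the slide and captures the cumulative count.
LISTEN_OVERFLOW_RE = re.compile(
    r"(\d+) times the listen queue of a socket overflowed"
)


def parse_listen_overflows(netstat_output):
    """Return the overflow count found in `netstat -s` text, or None."""
    match = LISTEN_OVERFLOW_RE.search(netstat_output)
    return int(match.group(1)) if match else None


def to_statsd_gauge(metric, value):
    """Format a value in the StatsD gauge wire format: name:value|g."""
    return f"{metric}:{value}|g"


sample = "    26 times the listen queue of a socket overflowed\n"
count = parse_listen_overflows(sample)
# Hypothetical metric path, chosen only to resemble the naming used in the talk.
line = to_statsd_gauge("prod.syseng.net.host065.listen_overflows", count)
print(line)
```

The value is sent as a gauge because `netstat -s` already reports a running total; the difference between two readings can be derived later when graphing.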
  • 14. Other things you can’t do!
  • 15. StatsD + Collectd + Graphite: What are they?
  • 16. What is Graphite?
      • Highly scalable real-time graphing system
      • Collects numeric time-series
      • Backend daemon Carbon
          • Carbon-cache: receives data
          • Carbon-aggregator: aggregates data
          • Carbon-relay: replication and sharding
      • RRD or Whisper database
  • 17. Graphite’s capabilities
      • Each metric is in its own bucket
      • Periods in the metric name act as folder separators
          • prod.syseng.mmm.<hostname>.admin_offline
      • Metric types
          • Counters
          • Gauge
      • Retention can be set using a regex
          [mysql]
          pattern = ^prod.syseng.mysql..*$
          retentions = 2s:1d,1m:3d,5m:7d,1h:5y
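The retention string on this slide can be unpacked with a little arithmetic. The sketch below counts the datapoints each archive holds; it assumes a year is 365 days and, for the size estimate, about 12 bytes per stored datapoint with file headers ignored. Both are assumptions of this example, not figures from the talk.

```python
import re

# Seconds per unit; a 365-day year is an assumption of this sketch.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 365 * 86400}


def parse_duration(text):
    """Convert a string such as '2s' or '5y' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhdy])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def archive_points(retentions):
    """Return a list of (precision_seconds, datapoints) per archive."""
    archives = []
    for archive in retentions.split(","):
        precision, duration = archive.strip().split(":")
        step = parse_duration(precision)
        archives.append((step, parse_duration(duration) // step))
    return archives


archives = archive_points("2s:1d,1m:3d,5m:7d,1h:5y")
total_points = sum(points for _, points in archives)
for step, points in archives:
    print(f"{step:>5}s precision: {points} points")
# Assumed size of 12 bytes per datapoint, headers ignored.
print("total points:", total_points, "approx. bytes:", total_points * 12)
```

Under these assumptions, the 2-second archive for a single day holds about as many points as the hourly archive for five years, so fine-grained retention dominates the per-metric storage cost.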
  • 18. What is Collectd?
      • Unix daemon that gathers system statistics
      • Over 90 (input/output) plugins
      • Plugin to send metrics to Graphite/Carbon
      • Very useful for system metrics
  • 19. What is StatsD?
      • Front-end proxy for Graphite/Carbon (by Etsy)
      • NodeJS daemon (also available in other languages)
      • Receives UDP (on localhost)
      • Buffers metrics locally
      • Periodically flushes data to Graphite/Carbon (TCP)
      • Client libraries available in almost any language
      • Send any metric you like!
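Because StatsD accepts plain-text datagrams over UDP, a client can be very small. The following is a minimal sketch rather than any particular client library: it builds payloads in the `name:value|type` format and sends them to port 8125 on localhost, the default port mentioned elsewhere in the deck. The metric names are borrowed from the PHP slide; the helper names are made up for this example.

```python
import socket


def format_metric(name, value, metric_type):
    """Build one StatsD payload, e.g. 'a.b:1|c' (c=counter, g=gauge, ms=timer)."""
    return f"{name}:{value}|{metric_type}"


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget send; returns False instead of raising on socket errors."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload.encode("ascii"), (host, port))
        return True
    except OSError:
        # Monitoring should never break the application, so errors are swallowed.
        return False
    finally:
        sock.close()


counter = format_metric("prod.app1.pages_rendered", 1, "c")
gauge = format_metric("prod.app1.page_concurrency", 10, "g")
timer = format_metric("prod.app1.rendering_time", 320, "ms")
for payload in (counter, gauge, timer):
    send_metric(payload)
```

UDP means the sender does not wait for an acknowledgement, so the application is not slowed down when StatsD is unavailable; the trade-off is that individual datagrams can be lost.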
  • 20. StatsD functions
      • update_stats
      • increment/decrement
      • set
      • gauge
      • timers
  • 21. StatsD PHP code examples
      PHP:
          $statsd = new StatsD();
          $statsd->increment("prod.app1.pages_rendered", 1);
          $statsd->gauge("prod.app1.page_concurrency", 10);
          $statsd->set("prod.app1.unique_users", $userid);
          …
          $start = microtime(true);
          serve_out_content_to_clients();
          $statsd->timing("prod.app1.rendering_time", (microtime(true) - $start) * 1000);

      Library:
          https://github.com/etsy/statsd/blob/master/examples/php-example.php
  • 22. Our Graphite cluster(s) [diagram; labels: Client requesting graphs, Loadbalancer (port 443), Graphite Rendering Cluster, Server-1, Server-2, Server-n, Loadbalancer (port 2003), Carbon relay, DEV, SYSENG, SERVICES1, SERVICES2, 8 nodes, 3 nodes, 2 nodes]
  • 24. Collectd [diagram; labels: Collectd, Gather data plugins (CPU, DISK, LOAD, …), Carbon TCP, 30 second interval]
  • 25. StatsD [diagram; labels: StatsD, Application Level, # OF LOGINS, CACHE HIT/MISS, STATUS, INNODB STATUS, MySQL_Statsd, localhost:8125 UDP, Carbon TCP, 2 second interval]
  • 27. MySQL + StatsD: How do we use them?
  • 28. Why use StatsD over Collectd?
      • MySQL plugin for Collectd
          • Sends SHOW STATUS
          • No INNODB STATUS
          • Plugin not flexible
      • DBI plugin for Collectd
          • Metrics based on columns
      • Different granularity needed
      • Separate daemon (with persistent connection)
      • StatsD is easy as ABC
  • 29. MySQL StatsD daemon
      • Written in Python
      • Gathers data every 0.5 seconds
      • Sends to StatsD (localhost) after every run
      • Easy to set up: no configuration
      • Persistent connection
      • Baron Schwartz’ InnoDB status parser (Cacti poller)
      • Other interesting metrics and counters
          • Information Schema
          • MySQL 5.5/5.6 Performance Schema
          • MariaDB specific
          • Galera specific
  • 30. MySQL StatsD overview [diagram; labels: MySQLCollector, SHOW STATUS, SHOW INNODB STATUS, SHOW VARIABLES, Persistent connection, StatsD, Flushed every 0.5 seconds]
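The daemon described on these slides is not public (a later slide notes it is not open source yet), so the following is only a sketch of the general idea, not the author's implementation: take two `SHOW STATUS` snapshots, compute how much each counter grew, and emit StatsD counter lines. The snapshots are hard-coded dictionaries and the host name `db1` is hypothetical.

```python
def counter_deltas(previous, current):
    """Return per-counter increases between two status snapshots.

    Counters that went backwards (for example after a server restart)
    are skipped rather than reported as negative values.
    """
    deltas = {}
    for name, value in current.items():
        if name in previous and value >= previous[name]:
            deltas[name] = value - previous[name]
    return deltas


def to_statsd_lines(prefix, deltas):
    """Format deltas as StatsD counter payloads, sorted by status name."""
    return [
        f"{prefix}.{name.lower()}:{delta}|c"
        for name, delta in sorted(deltas.items())
    ]


# Hard-coded stand-ins for two consecutive SHOW STATUS results.
previous = {"Questions": 1000, "Com_select": 700}
current = {"Questions": 1030, "Com_select": 721}

lines = to_statsd_lines(
    "prod.syseng.mysql.db1.status", counter_deltas(previous, current)
)
for line in lines:
    print(line)
```

Sending deltas as counters lets StatsD sum them per flush interval, so the polling interval of the collector and the storage precision in Graphite do not have to match.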
  • 31. MySQL Multi Master patch
      • Perl (Net::Statsd)
      • Sends any status change to StatsD (localhost)
      • Non-blocking (thanks to UDP)
      • Draw as infinite in Graphite
  • 32. MMM Perl code example
      use Net::Statsd;
      $Net::Statsd::HOST = 'localhost'; # Default
      $Net::Statsd::PORT = 8125; # Default

      …

      # ONLINE -> HARD_OFFLINE
      unless ($ping && $mysql) {
          Net::Statsd::update_stats('prod.syseng.mmm.'.$host.'.hard_offline', 1);
          FATAL sprintf("State of host '%s' changed from %s to HARD_OFFLINE (ping: %s, mysql: %s)", $host, $state, ($ping? 'OK' : 'not OK'), ($mysql? 'OK' : 'not OK'));
          $agent->state('HARD_OFFLINE');
      }

      …
  • 33. Other metrics
      • Deployments
      • User initiated actions
          • Logins
          • High scores
          • Comments / ratings
          • Images uploaded
          • Payments
      • Application metrics
          • Error counts
          • Cache statistics (cache hit/miss)
          • Request timers
          • Image sizes
  • 34. Start graphing! Now it starts to get interesting!
  • 35. What is important for you?
      • Identify your KPIs
      • Don’t graph everything
      • More graphs == less overview
      • Combine metrics
      • Stack clusters
  • 36. Correlate!
      • Include other metrics into your graphs
          • Deployments
          • Failover(s)
      • Combine application metrics with your database
      • Other influences
          • Solar flares
          • Start of the new Maya calendar
  • 37. Graphite Graphing Engine
      • URI based rendering API
      • Support for wildcards
          • stats.prod.syseng.mysql.*.status.com_select
          • sumSeries(stats.prod.syseng.mysql.*.status.com_select)
          • aliasByNode(stats.prod.syseng.mysql.*.status.com_select, 4)
      • Many functions
          • Nth percentile
          • Holt-Winters Forecast
          • Timeshift
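Since graphs are requested through a URI, building one programmatically is mostly a matter of URL-encoding the target expressions. Here is a small sketch using only the Python standard library; the host name `graphitehost` follows the example URL in this deck, while the `render_url` helper, the alias text and the `-7d` time range are choices made for this example.

```python
from urllib.parse import urlencode


def render_url(base, targets, params):
    """Build a Graphite render URL from target expressions and extra parameters."""
    query = [("target", target) for target in targets]
    query.extend(sorted(params.items()))
    return f"{base}/render/?{urlencode(query)}"


targets = [
    # Sum of SELECT counters over all hosts matched by the wildcard.
    'alias(sumSeries(stats.prod.syseng.mysql.*.status.com_select),"Selects")',
    # One line per host, labelled by path element 4 (the host name).
    "aliasByNode(stats.prod.syseng.mysql.*.status.com_select, 4)",
]
url = render_url(
    "https://graphitehost",
    targets,
    {"from": "-7d", "width": 722, "height": 357},
)
print(url)
```

Each `target` parameter becomes one series (or one group of series) in the rendered graph, so several expressions can be overlaid by repeating the parameter.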
  • 38. Graphite Aggregator
      syseng => {
          nodes => ["databasehost1", "databasehost2"],
          copying_relay_instances => 8,
          hashing_relay_instances => 8,
          cache_instances => 8,
          aggregation => {
              0 => {
                  name => "mysql",
                  pattern => '.*.mysql..*',
                  send_raw => 1,
              },
          }
      }

      stats.<env>.syseng.mysql.cluster1.status.questions.all (2) =
          sum stats.<env>.syseng.mysql.*.status.questions
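The aggregation rule on this slide sums the `questions` metric of every host into one cluster-wide series every 2 seconds. As a simplified model of that behaviour (not how carbon-aggregator is implemented internally), the sketch below groups datapoints into 2-second buckets and adds up the values from all hosts; the sample values are invented.

```python
from collections import defaultdict


def sum_across_hosts(datapoints, interval=2):
    """Sum values from all hosts that fall into the same time bucket.

    `datapoints` is an iterable of (host, unix_timestamp, value) tuples.
    Returns a dict mapping bucket start timestamp to the summed value.
    """
    totals = defaultdict(float)
    for _host, timestamp, value in datapoints:
        bucket = timestamp - (timestamp % interval)
        totals[bucket] += value
    return dict(totals)


# Invented sample values for the two nodes named on the slide.
points = [
    ("databasehost1", 1366550446, 120.0),
    ("databasehost2", 1366550447, 80.0),
    ("databasehost1", 1366550448, 130.0),
    ("databasehost2", 1366550448, 70.0),
]
cluster_series = sum_across_hosts(points)
for bucket, total in sorted(cluster_series.items()):
    print(bucket, total)
```

Aggregating at write time means a cluster total is stored as a single series, so a graph of it does not need to read and sum every per-host file on each request.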
  • 39. Graphite web interface
  • 40. Graphite Example URL
      https://graphitehost/render/?width=722&height=357&_salt=1366550446.553&rightDashed=1&target=alias%28sumSeries%28stats.prod.services.profilar.request.total.count.*%29%2C%22Number%20of%20profile%20requests%22%29&target=alias%28secondYAxis%28sumSeries%28stats_counts.prod.syseng.mysql.<node1>.status.questions%2C%20stats_counts.prod.syseng.mysql.<node2>.status.questions%29%29%2C%22Number%20of%20queries%20profiles%20cluster%22%29&from=00%3A00_20130415&until=23%3A59_20130421
  • 41. Graphite Example URL
      https://graphitehost/render/?width=722&height=357&_salt=1366550446.553&rightDashed=1&target=alias%28sumSeries%28stats.prod.services.profilar.request.total.count.*%29%2C%22Number%20of%20profile%20requests%22%29&target=alias%28secondYAxis%28sumSeries%28stats_counts.prod.syseng.mysql.<node1>.status.questions%2C%20stats_counts.prod.syseng.mysql.<node2>.status.questions%29%29%2C%22Number%20of%20queries%20profiles%20cluster%22%29&from=00%3A00_20130415&until=23%3A59_20130421
  • 44. Other examples: multiple weeks
  • 46. What challenges do we have?
      • MySQL_statsd rewrite necessary (not open source yet)
      • No alerting through Graphite (yet)
      • Machine learning
      • Eternal hunger for more metrics
      • Abuse of the system
  • 47. What lessons have we learned?
      • Persistent connections + repeatable read
          • History list skyrocketed
      • Too many metrics slow down graphing
      • Too many metrics can kill a host
      • EstatsD for Erlang
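The slide does not say how the history list problem was solved. One common approach, assumed here, is to end the transaction after every polling run so that a long-lived connection under REPEATABLE READ does not keep an old read view open. The sketch uses stand-in classes instead of a real database driver so that it runs on its own; `poll_once` and the query are hypothetical.

```python
class FakeCursor:
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def execute(self, sql):
        self.last_sql = sql

    def fetchall(self):
        return [(0,)]

    def close(self):
        pass


class FakeConnection:
    """Stand-in for a DB-API connection; it only counts commits."""

    def __init__(self):
        self.commits = 0

    def cursor(self):
        return FakeCursor()

    def commit(self):
        self.commits += 1


def poll_once(connection, queries):
    """Run the monitoring queries, then commit to release any open read view."""
    cursor = connection.cursor()
    results = {}
    try:
        for name, sql in queries.items():
            cursor.execute(sql)
            results[name] = cursor.fetchall()
    finally:
        cursor.close()
        # Assumption: autocommit is off, as is the default in many DB-API drivers.
        connection.commit()
    return results


connection = FakeConnection()
queries = {"trx_count": "SELECT COUNT(*) FROM information_schema.innodb_trx"}
for _ in range(3):
    results = poll_once(connection, queries)
print(connection.commits, results)
```

Enabling autocommit on the monitoring connection, or using a less strict isolation level for it, would be alternative ways to reach the same goal.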
  • 49. Practical links
      • Graphite: http://graphite.readthedocs.org/en/latest/
      • Collectd: https://collectd.org/
      • StatsD on Github by Etsy: https://github.com/etsy/statsd/wiki
      • Etsy on StatsD: http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
  • 50. Thank you!
      • Presentation can be found at: http://spil.com/perconasc2013
      • If you wish to contact me: art@spilgames.com
      • Don’t forget to rate my talk!