6. Who decides what to monitor?
Predefine checks
Check only what is needed
OR
Let the system you monitor decide what to
expose
Alert based on that data
7. Push or pull?
Push: you need to know and configure where your server is
What if you have 2, 3, or 10 servers?
What about dev?
Pull: Let any server take the metrics
Use service discovery to know what to
fetch
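As a rough illustration of the pull model, Prometheus's file-based service discovery reads a JSON list of target groups and scrapes whatever is listed there. The sketch below generates such a list; the host names, the port, and the `env` label are made-up values, and the inventory would normally come from your own source of truth.

```python
import json


def build_file_sd(hosts, port, labels):
    """Return a file_sd-style list holding one target group for the given hosts."""
    targets = [f"{host}:{port}" for host in sorted(hosts)]
    return [{"targets": targets, "labels": dict(labels)}]


# Hypothetical inventory, for illustration only.
groups = build_file_sd({"db2.example.org", "db1.example.org"}, 9104, {"env": "dev"})
print(json.dumps(groups, indent=2))
```

The monitored hosts need no knowledge of the Prometheus server: adding a host to the list is enough for it to be scraped.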
13. Performance
Prometheus is designed to fetch data at an interval measured in SECONDS
Designed to handle lots of metrics
New storage engine in 2.0 to scale even better and support use cases like Kubernetes
14. Data Centric
A Metric in Prometheus has metadata:
mysql_global_status_handlers_total{handler="tmp_write"} 1122
And lots of functions to filter, change, or remove that metadata while fetching it.
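To make the "name, labels, value" structure concrete, here is a minimal Python sketch that splits one sample line into its three parts. It is a simplification written for this example, not the official parser: it ignores timestamps and does not unescape label values.

```python
import re

# Simplified patterns for one sample line: name, optional {labels}, value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Return (metric name, labels dict, float value) for one sample line."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'mysql_global_status_handlers_total{handler="tmp_write"} 1122'
)
print(name, labels, value)
```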
15. Metric types
Counters (always go up)
Gauges (go up and down)
Histograms (aggregate by buckets)
Summaries (percentiles) (useless most of the time)
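The difference between the first three types can be sketched in a few lines of Python. These classes are toy stand-ins written for this example, not the API of any Prometheus client library; the bucket bounds are arbitrary.

```python
import bisect


class Counter:
    """Only ever goes up."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        self.value += amount


class Gauge:
    """Can be set to any value, up or down."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket; buckets are 'less than or equal' bounds."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        self.counts = [0] * (len(self.upper_bounds) + 1)  # last slot is +Inf
        self.sum = 0.0

    def observe(self, value):
        # First bucket whose upper bound is >= value.
        self.counts[bisect.bisect_left(self.upper_bounds, value)] += 1
        self.sum += value

    def cumulative(self):
        """Return (upper bound, cumulative count) pairs, ending with +Inf."""
        total, out = 0, []
        for bound, count in zip(self.upper_bounds + [float("inf")], self.counts):
            total += count
            out.append((bound, total))
        return out


latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.5, 2.0):
    latency.observe(seconds)
print(latency.cumulative())
```

Each bucket count includes every smaller bucket, which is what lets the server estimate percentiles later from bucket counts alone.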
16. A word about
Prometheus vs Graphite
Prometheus does not see a metric as an "event".
A metric keeps its current value until it is replaced.
You cannot see when a metric was added to Prometheus.
For events, Prometheus refers you to Elasticsearch.
19. Prometheus uses HTTP and PLAIN TEXT.
HTTP(S) is supported
Basic auth / token also
(exposing HTTPS is not Prometheus's job)
20. $ curl http://127.0.0.1:9090/metrics
# HELP prometheus_notifications_queue_length The number of alert notifications in the queue.
# TYPE prometheus_notifications_queue_length gauge
prometheus_notifications_queue_length 0
# HELP prometheus_notifications_sent_total Total number of alerts successfully sent.
# TYPE prometheus_notifications_sent_total counter
prometheus_notifications_sent_total{alertmanager="127.0.0.1:9093"} 4.796464e+06
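A page like the one above can be served by a very small HTTP handler. The following Python sketch uses only the standard library; `read_target`, the `demo_` metric prefix, and the port are hypothetical names chosen for the example. Values are read again on every request, so nothing is stored between scrapes.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_target():
    """Hypothetical stand-in for querying the real target on every scrape."""
    return {"queue_length": 0}


def render(samples):
    """Render gauge samples in the plain-text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE demo_{name} gauge")
        lines.append(f"demo_{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(read_target()).encode("utf-8")  # fresh read, nothing cached
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on 127.0.0.1:<port>/metrics."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and then running `curl http://127.0.0.1:8000/metrics` should return the rendered text.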
26. Exporters
Exporters expose metrics with an HTTP API
They connect to the real target
Bindings available for many languages
Exporters do not save data; they are not "proxies" and don't "cache" anything
40. MySQL Replication
MySQL Master <-> MySQL Master
MySQL Master -> MySQL Slave
MySQL Master -> MySQL Slave -> MySQL Slave
MySQL Masters -> MySQL Slaves -> MySQL Slaves -> MySQL Slaves
MySQL Master -> MySQL Slaves
41. pt-heartbeat
pt-heartbeat is a daemon that updates a row with the current timestamp on a MySQL server every second.
On the replica, you can read the timestamp and compute NOW - timestamp to get the real lag.
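+--------------------------+-----------+
| ts                       | server_id |
+--------------------------+-----------+
| 20170817T16:55:01.001030 |         1 |
+--------------------------+-----------+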
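The lag computation itself is a single subtraction. Here is a minimal Python sketch, assuming the heartbeat timestamp has already been read from the replica and that both servers have synchronized clocks; clamping negative values to zero is a choice made for this example, not something pt-heartbeat prescribes.

```python
from datetime import datetime, timedelta


def replication_lag_seconds(heartbeat_ts, replica_now):
    """Lag = current time on the replica minus the last replicated heartbeat."""
    return max(0.0, (replica_now - heartbeat_ts).total_seconds())


# Illustrative values: the heartbeat from the table, read 2.5 seconds later.
heartbeat = datetime(2017, 8, 17, 16, 55, 1, 1030)
now = heartbeat + timedelta(seconds=2, milliseconds=500)
lag = replication_lag_seconds(heartbeat, now)
print(lag)
```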
65. Alertmanager
When Prometheus has an alert, it sends it to Alertmanager.
Every minute by default, as long as the alert is ongoing
Alertmanager, a separate daemon, does the rest of the work
67. grouping alerts
5 nodes are down. Do you want 5 emails?
group_by: ['alertname', 'cluster', 'service']
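The effect of `group_by` can be pictured as bucketing alerts by the values of the listed labels. The Python sketch below illustrates the idea only; it is not how Alertmanager is implemented, and the alert labels are invented.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the tuple of values of the group_by labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


# Five hypothetical "node down" alerts that differ only by instance.
alerts = [
    {"alertname": "NodeDown", "cluster": "eu", "service": "mysql", "instance": f"db{i}"}
    for i in range(1, 6)
]
groups = group_alerts(alerts, ["alertname", "cluster", "service"])
print(len(groups))
```

All five alerts share the same three label values, so they fall into one group and would produce a single notification.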
68. inhibition
Datacenter is on fire. Do you want to know that
switches, hosts, services are down?
source_match:
  severity: 'critical'
target_match:
  severity: 'warning'
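The rule above reads: while an alert matching `source_match` is firing, mute alerts matching `target_match`. A simplified Python sketch of that logic follows. The `equal` parameter mirrors the `equal` field that Alertmanager inhibit rules also accept, although the slide does not show it; the alert names and the `dc` label are invented for the example.

```python
def matches(alert, matcher):
    """True when every label in the matcher has the same value on the alert."""
    return all(alert.get(key) == value for key, value in matcher.items())


def inhibit(alerts, source_match, target_match, equal=()):
    """Drop target alerts while a source alert with the same `equal` labels fires."""
    sources = [a for a in alerts if matches(a, source_match)]
    kept = []
    for alert in alerts:
        muted = matches(alert, target_match) and any(
            all(alert.get(label) == source.get(label) for label in equal)
            for source in sources
            if source is not alert
        )
        if not muted:
            kept.append(alert)
    return kept


alerts = [
    {"alertname": "DatacenterDown", "severity": "critical", "dc": "a"},
    {"alertname": "SwitchDown", "severity": "warning", "dc": "a"},
    {"alertname": "HostDown", "severity": "warning", "dc": "b"},
]
kept = inhibit(alerts, {"severity": "critical"}, {"severity": "warning"}, equal=("dc",))
print([a["alertname"] for a in kept])
```

Only the warning from the datacenter that is actually down is muted; the warning from datacenter `b` still gets through.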
71. Alerting routes
You can send alerts to defined people based on routes
Everything to logs mailbox
Critical alerts
  Network to net team SMS
  Svc to app team SMS
Warning alerts
  Network to net team mail
  Svc to app team mail
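The routing tree above can be approximated by an ordered list of matchers. This Python sketch only illustrates the decision logic; the `team` label and the receiver names are hypothetical, and a real setup would express this in the Alertmanager configuration instead.

```python
# Ordered (matcher, receiver) pairs; the first match wins.
ROUTES = [
    ({"severity": "critical", "team": "network"}, "net-team-sms"),
    ({"severity": "critical", "team": "app"}, "app-team-sms"),
    ({"severity": "warning", "team": "network"}, "net-team-mail"),
    ({"severity": "warning", "team": "app"}, "app-team-mail"),
]


def receivers_for(alert):
    """Everything goes to the logs mailbox, plus at most one team receiver."""
    receivers = ["logs-mailbox"]
    for matcher, receiver in ROUTES:
        if all(alert.get(key) == value for key, value in matcher.items()):
            receivers.append(receiver)
            break
    return receivers


print(receivers_for({"alertname": "SwitchDown", "severity": "critical", "team": "network"}))
```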
72. High Availability: Prometheus
Run several Prometheus servers with the same config
They do not talk to each other
All of them fetch the same data
They monitor each other
73. High availability: Alertmanager
Have multiple alertmanagers with the same
config
Prometheis send alerts to all alertmanagers
Alertmanagers talk to each other to avoid sending the same notification twice
75. One tool does one job...
Prometheus will collect data
Alertmanager will send notifications
Exporters will expose data
Grafana will graph data
76. Grafana
Open Source (Apache 2.0)
Web app
Specialized in visualization
Pluggable
Multiple datasources: prometheus, graphite,
influxdb...
Has an API!
77. History of Grafana
Grafana is a fork of Kibana 3; it used to be JS-driven.
Now fully featured: requires a database, supports multiple projects/users, etc.
85. Multiple Prometheus instances
You can add multiple Prometheus instances in Grafana
You can add a dropdown at the top to select which one you want to use
Use case: Prometheus HA, local Prometheus (with access mode=direct)
86. Creating Grafana Dashboards
Takes time
Requires deep knowledge of the tools
Improved over time
Easy to share (json + online library)
87. Percona Grafana Dashboard
Percona open-sourced its Grafana dashboards
Covering MySQL, Mongo and Linux monitoring
Part of a bigger picture, PMM, but usable standalone
Open Source (AGPL!)
https://github.com/percona/grafana-dashboards
88. Installing Percona Graphs
Method 1
Enable File dashboards in Grafana
Clone grafana-dashboards to the configured
location (or make a package)
Method 2
Use the Grafana API to upload the JSON files.
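Method 2 could look roughly like the Python sketch below. It builds a POST to Grafana's `/api/dashboards/db` endpoint with a bearer token but does not send it. The Grafana URL, the token, and the dashboard body are placeholders, and exported dashboards may need extra preparation before Grafana accepts them.

```python
import json
import urllib.request


def build_upload_request(grafana_url, api_token, dashboard):
    """Build (but do not send) a request that creates or overwrites a dashboard."""
    # id=None asks Grafana to create the dashboard instead of updating by id.
    payload = {"dashboard": dict(dashboard, id=None), "overwrite": True}
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; a real dashboard would be loaded from one of the JSON files.
request = build_upload_request(
    "http://127.0.0.1:3000", "HYPOTHETICAL_TOKEN", {"title": "MySQL Overview", "panels": []}
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would perform the upload.
```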
89. MySQL Setup
You'll need mysqld_exporter, with a MySQL user
MySQL 5.1+
Performance Schema for the full set of metrics
mysqld_exporter -collect.binlog_size=true -collect.info_schema.processlist=true
95. We don't need all of them?
Because Grafana is just viz, you can import only the ones you want (e.g. exclude Mongo)
You can import any extra dashboard you need later
104. Jsonnet library to create dashboards
Brand new, still WIP
https://github.com/grafana/grafonnet-lib
106. Conclusions
Prometheus and Grafana are first-class monitoring tools
Totally different approach from other tools
Embeddable into your apps
Percona dashboards get your graphs ready in no time with minimal effort