2. ME
[Book cover: High Performance MySQL, 3rd Edition. Optimization, Backups, Replication, and more. Covers Version 5.5. Baron Schwartz, Peter Zaitsev & Vadim Tkachenko]
• Cofounder of @VividCortex
• Author of High Performance MySQL
• @xaprb on Twitter
• baron@vividcortex.com
• http://www.linkedin.com/in/xaprb
3. RANT, RECAPPED
• The sky is falling
• Tools drive processes, and we need better tools designed for methods
• Pay attention to CAPS (Capacity, Availability, Performance, Scalability)
• Monitoring tools need to be a lot smarter
• Measure and monitor “work getting done”
4. HARD CAPACITY
• Disk volume
• CPU Cycles
• max_connections
• File descriptors, sockets, TCP port
numbers, etc.
• %used, absolute quantity available
5. SOFT CAPACITY
• Neil Gunther’s Universal Scalability
Law
• %used, absolute quantity available
• Throughput, concurrency, errors
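A minimal sketch of the Universal Scalability Law named above, for readers who want to see how soft capacity shows up as a throughput peak. The coefficient values are hypothetical and chosen only for illustration; in practice alpha (contention) and beta (coherency delay) would be fitted to measured throughput at several concurrency levels.

```python
import math


def usl_throughput(n, alpha, beta, lam=1.0):
    """Modeled throughput at concurrency n.

    USL form: X(n) = lam * n / (1 + alpha*(n - 1) + beta*n*(n - 1))
    lam is the throughput at n = 1.
    """
    return lam * n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak_concurrency(alpha, beta):
    """Concurrency where modeled throughput peaks (requires beta > 0)."""
    return math.sqrt((1 - alpha) / beta)


# Hypothetical coefficients, not measurements from any real server.
alpha, beta = 0.05, 0.001
for n in (1, 8, 16, 32, 64):
    print(n, round(usl_throughput(n, alpha, beta), 2))
print("modeled peak near N =", round(usl_peak_concurrency(alpha, beta), 1))
```

Past the modeled peak, adding concurrency lowers throughput, which is why soft capacity cannot be read from a simple %used gauge.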
6. AVAILABILITY
• Availability is absence of downtime
• %used, absolute quantity available
• Throughput, concurrency, errors
• MTBF, MTTR, MTTD, %availability
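The availability metrics above can be sketched from a list of incidents. The `Incident` record and the exact definitions used here are assumptions (teams define MTTR and MTBF in slightly different ways): MTTD is measured from failure start to detection, MTTR from failure start to resolution, and MTBF as uptime divided by the number of failures.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical record; all times are seconds since the start of the period.
    started: float
    detected: float
    resolved: float


def availability_metrics(incidents, period_seconds):
    """Return MTTD, MTTR, MTBF (seconds) and % availability for one period."""
    n = len(incidents)
    downtime = sum(i.resolved - i.started for i in incidents)
    uptime = period_seconds - downtime
    return {
        "mttd": sum(i.detected - i.started for i in incidents) / n if n else 0.0,
        "mttr": downtime / n if n else 0.0,
        "mtbf": uptime / n if n else float("inf"),
        "availability_pct": 100.0 * uptime / period_seconds,
    }
```

For example, `availability_metrics([Incident(0, 60, 600)], 86400)` describes a day with one ten-minute outage that took a minute to detect.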
7. TASK PERFORMANCE
• Task performance is consistently fast
response time.
• Measure an SLA in percentile
response time per task, over
observation intervals
• %used, absolute quantity available
• Throughput, concurrency, errors
• MTBF, MTTR, MTTD, %availability
• Response time, 95% response time
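A small sketch of "percentile response time per task, over observation intervals". The sample format `(timestamp, task, response_time)`, the 60-second interval, and the function names are assumptions made for illustration; the percentile uses the nearest-rank method.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 1..100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p95_by_task_and_interval(samples, interval_seconds=60):
    """Group (timestamp, task, response_time) samples and return the
    95th-percentile response time per (task, interval index)."""
    buckets = defaultdict(list)
    for ts, task, rt in samples:
        buckets[(task, int(ts // interval_seconds))].append(rt)
    return {key: percentile(rts, 95) for key, rts in buckets.items()}


def sla_violations(p95s, threshold_seconds):
    """List the (task, interval) pairs whose p95 exceeds the SLA threshold."""
    return sorted(key for key, p95 in p95s.items() if p95 > threshold_seconds)
```

Evaluating each interval separately keeps a bad minute from being averaged away by a good hour.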
8. RESOURCE PERFORMANCE
• Resource performance is ability to run
tasks consistently fast.
• %used, absolute quantity available
• Throughput, concurrency, errors
• MTBF, MTTR, MTTD, %availability
• Response time, 95% response time
• Throughput, concurrency, busy time,
total response time, backlog/queue
9. SCALABILITY
• Universal Scalability Law again
• %used, absolute quantity available
• Throughput, concurrency, errors
• MTBF, MTTR, MTTD, %availability
• Response time, 95% response time
• Throughput, concurrency, busy time,
total response time, backlog/queue
10. STALL DETECTION
• Overloaded or underperforming?
• %used, absolute quantity available
• Throughput, concurrency, errors
• MTBF, MTTR, MTTD, %availability
• Response time, 95% response time
• Throughput, concurrency, busy time,
total response time, backlog/queue
• Utilization, saturation, errors, sources
of load/demand
12. WHAT NOT TO DO
• Don’t use top-N lists from Google
• Don’t just do what’s included in some
Nagios plugin
13. №1
TOP 10 LIST
1. MySQL availability
2. Presence of insecure users and databases
3. Aborted connects
4. Error log
5. Deadlocks
6. Change in server configuration
7. Slow query log
8. Slave lag
9. Percentage of maximum allowed connections
10. Percentage of full table scans
26. AVAILABILITY
• Ability to connect and run a query?
• Uptime is small?
• Replication is running?
27. PERFORMANCE
• You can get throughput (Queries) and concurrency (Threads_running) from MySQL
• But in a Nagios check, no context to know whether they’re good or bad
• You generally can’t get response time, busy time, utilization, backlog, etc
• You can aggregate thread states, thread times, users, databases, query abstracts...
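As a rough illustration of the first bullet, throughput can be derived from two samples of the `Queries` counter, and concurrency read from `Threads_running`. This sketch assumes the two dicts were collected elsewhere (for example from `SHOW GLOBAL STATUS`) a known number of seconds apart; the function name is hypothetical.

```python
def throughput_and_concurrency(before, after, elapsed_seconds):
    """Derive queries/second and current concurrency from two status samples.

    before/after: dicts mapping status variable names to values (strings or
    ints). Returns None if the counter went backwards, which would suggest
    the server restarted between samples.
    """
    delta = int(after["Queries"]) - int(before["Queries"])
    if delta < 0 or elapsed_seconds <= 0:
        return None
    return {
        "queries_per_second": delta / elapsed_seconds,
        "threads_running": int(after["Threads_running"]),
    }
```

The numbers alone still carry no context about whether they are good or bad, which is the point of the second bullet.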
29. THOU SHALT NOT
• Cache hit ratios
• Thread cache hit ratio
• Buffer pool cache hit ratio
• Table cache hit ratio
• Key cache hit ratio
• Query cache hit ratio
• Rates of “bad” queries
• % temp tables on disk
• % full table scans
• % slow queries
• Unfixable things
• Replication delay
30. WHY NOT?
• Those are properties of the workload and application
• They are not conditions to alert/warn about
• They are not fixable / actionable in the service
39. №1 ALERT!!!!!
CRIT
* Disk /dev/sda2 full
* Replication stopped
* Oldest transaction 86400 seconds
* 4999 threads in status “Waiting for table metadata lock”
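The conditions in this example alert can be sketched as a single check. The input names and the default thresholds below are hypothetical, picked only to show the shape of an alert built on actionable conditions; they are not recommendations.

```python
def critical_alerts(disk_used_pct, replication_running,
                    oldest_trx_seconds, metadata_lock_waiters,
                    max_trx_seconds=3600, max_lock_waiters=100):
    """Return a list of CRIT messages; an empty list means nothing to page on."""
    alerts = []
    if disk_used_pct >= 100:
        alerts.append("Disk full")
    if not replication_running:
        alerts.append("Replication stopped")
    if oldest_trx_seconds > max_trx_seconds:
        alerts.append(f"Oldest transaction {oldest_trx_seconds} seconds")
    if metadata_lock_waiters > max_lock_waiters:
        alerts.append(
            f"{metadata_lock_waiters} threads waiting for table metadata lock"
        )
    return alerts
```

Each condition is something an operator can act on directly, in contrast to workload properties such as cache hit ratios.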
40. HOLLER AT ME
QUESTIONS?
@XAPRB / BARON@VIVIDCORTEX.COM
41. RESOURCES
• Chapter 3 of High Performance MySQL, 3rd Edition
• Percona White Papers
  • Causes of Downtime in Production MySQL Servers
  • Preventing MySQL Emergencies
  • Goal-Driven Performance Optimization
  • Forecasting MySQL Scalability with the Universal Scalability Law
• Method R: Optimizing Oracle Performance, Cary Millsap
• The Goal, Eli Goldratt
• The USE Method (Brendan Gregg) & his new book
• Guerrilla Capacity Planning, Neil J. Gunther
• Fundamental Performance & Scalability Instrumentation