DEEP INTO YOUR
APPLICATION ...
PERFORMANCE & PROFILING
Fabien Arcellier @farcellier
ABOUT @FARCELLIER
Technical Architect, Developer, Lifelong learner at Octo Technology
Favourite subjects: DevOps, Performance & Software craftsmanship
WHAT'S ON THE MENU
What does profiling an application mean?
How does it work?
Applying it to a real-world application: memcached
PROFILING IN A FEW WORDS ...
Software profiling is a form of dynamic program analysis that measures, for example:
the space or time complexity of a program
the usage of particular instructions
the frequency and duration of function calls, ...
@copyright wikipedia
TO GET THIS SORT OF REPORT ...
TO GET A BETTER VIEW OF WHAT HAPPENS
ON YOUR HARDWARE, ...
@copyright highscalability
TO IMPROVE YOUR APPLICATION
PERFORMANCE, ...
@copyright macifcourseaularge
You need measurements to continuously improve your application's performance.
TO UNDERSTAND YOUR
APPLICATION, ...
You want to understand what is consuming your CPU.
TO MONITOR YOUR SERVER, ...
[Flame graph with frames: app, __libc_start_main, main, dot, mat_mul]
You want to understand what your CPUs are doing.
AT THE BEGINNING THERE IS A
PROGRAM ...
int main(void)
{
  return 0;
}
int func1(void) {
  return 0;
}
Use gcc to compile and link it
gcc app.c -o app
WITH A SIMPLE SYMBOL TABLE ...
readelf - displays information about ELF files
readelf -s app
45: 0000000000400580     2 FUNC    GLOBAL DEFAULT   13 __libc_csu_fini
46: 00000000004004f8    11 FUNC    GLOBAL DEFAULT   13 func1
...
57: 0000000000601040     0 NOTYPE  GLOBAL DEFAULT   25 _end
58: 0000000000400400     0 FUNC    GLOBAL DEFAULT   13 _start
59: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   25 __bss_start
60: 00000000004004ed    11 FUNC    GLOBAL DEFAULT   13 main
...
00000000004004ed: virtual address of the symbol
11: size of the symbol in bytes
FUNC: type of the symbol
main: name of the symbol
HOW DOES IT WORK?
60: 00000000004004ed    11 FUNC    GLOBAL DEFAULT   13 main
CAPTURE EVENTS AND ASSOCIATE
THEM WITH SYMBOLS
Generally, we can distinguish three types of profilers:
Instrumented profiling
Sampling profiling
Event-based profiling (Java, .NET, ...)
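Associating an event with a symbol boils down to a range lookup: a captured address belongs to the function whose interval [value, value + size) contains it. The sketch below does this against the text printed by readelf -s. The name resolve_symbol is made up for illustration, and real profilers read the ELF tables directly rather than parsing text.

```shell
# Hypothetical helper, not part of any tool: prints the name of the FUNC
# symbol whose address range contains the hex address given as $1.
# Reads `readelf -s` output on standard input.
resolve_symbol() {
  addr=$((0x$1))
  while read -r num value size type bind vis ndx name; do
    # Skip headers and anything that is not a function symbol.
    [ "$type" = "FUNC" ] || continue
    start=$((0x$value))
    end=$((start + size))
    if [ "$addr" -ge "$start" ] && [ "$addr" -lt "$end" ]; then
      echo "$name"
      return 0
    fi
  done
  return 1
}

# Usage: readelf -s app | resolve_symbol 4004f0
```

With the table shown earlier, address 4004f0 falls inside main: main starts at 0x4004ed and is 11 bytes long, so it ends at 0x4004f8, which is exactly where func1 begins.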
INSTRUMENTED PROFILING
gprof, Callgrind, ...
Pros
Captures all events
Fine granularity
Cons
Slower than raw execution (about 20 times slower for Callgrind)
Intrusive (modifies the generated assembly or emulates a virtual processor)
What they capture and what they show may differ
TOOLING - CALLGRIND
Callgrind is a call-graph analyzer that comes with Valgrind.
Valgrind is a virtual machine that uses just-in-time (JIT) compilation techniques.
EXAMPLE WITH A MATRIX CALCULATION
You can instrument an execution with Callgrind and explore the result in KCachegrind.
SAMPLING PROFILING
perf, OProfile, Intel VTune, ...
Pros
Only about 5 to 10% slower than raw execution
Runs on any code
Cons
Some events are invisible: whatever happens between two samples is never recorded
SANDBOX - WRITE MY OWN
SAMPLING PROFILER
To understand how simple a sampling profiler is, write your
own thread dump using gdb.
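gstack() {
  # Dump the backtrace of every thread of the process whose PID is $1.
  tmp=$(mktemp)  # mktemp instead of the deprecated, Debian-specific tempfile
  echo "thread apply all bt" >"$tmp"
  gdb -batch -nx -q -x "$tmp" -p "$1"
  rm -f "$tmp"
}
Run it at a regular interval to see where your program is spending its time
while sleep 1; do gstack @pid@; done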
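Once the samples are collected, a flat profile is just a count of how often each function sits on top of the stack. Here is a minimal sketch, assuming gdb's usual frame layout (a frame number, an optional address followed by "in", then the function name). The name top_frames is invented here and the helper is not part of gdb or FlameGraph.

```shell
# Hypothetical helper: reads concatenated gdb backtraces on stdin and prints
# "<samples> <function>" for the innermost frame (#0) of each backtrace,
# most frequent first.
top_frames() {
  awk '
    /^#0 / {
      # "#0  0x... in func (...)" or, without an address, "#0  func (...)"
      name = ($2 ~ /^0x/ && $3 == "in") ? $4 : $2
      count[name]++
    }
    END { for (n in count) print count[n], n }
  ' | sort -rn
}

# Usage: top_frames < stack.txt
```

The function that collects the most samples is where the program was most often caught running, which is a reasonable first guess for where the time goes.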
TOOLING - PERF & FLAMEGRAPH
perf has been available since Linux 2.6 (Ubuntu 11.10 and Red Hat 6)
It offers a common interface to hardware counters
FlameGraph is actively developed by Brendan Gregg
EXAMPLE WITH A MATRIX CALCULATION
[Flame graph with frames: app, __libc_start_main, main, dot, mat_mul]
No time is recorded for mat_new, even though it is called 3 times.
It likely returns too quickly to be caught by a sample.
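One way to see why mat_new can vanish: with a sampler firing at frequency f, the expected number of samples that land in a function is its total run time multiplied by f. The numbers below are purely illustrative assumptions, since the talk gives neither the sampling rate nor the duration of mat_new.

```latex
% Illustrative assumptions: f = 1000 samples per second,
% 3 calls to mat_new lasting 2 microseconds each.
\[
  E[\text{samples in } \texttt{mat\_new}]
    = f \cdot t_{\text{total}}
    = 1000\,\mathrm{s^{-1}} \times \left(3 \times 2 \times 10^{-6}\,\mathrm{s}\right)
    = 0.006
\]
```

Under these assumptions, fewer than one run in a hundred would record even a single sample in mat_new.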
FLAMEGRAPH INSTALLATION
git clone https://github.com/brendangregg/FlameGraph.git
cd FlameGraph
sudo ln -s "$PWD/flamegraph.pl" /usr/bin/flamegraph.pl
sudo ln -s "$PWD/stackcollapse-perf.pl" /usr/bin/stackcollapse-perf.pl
sudo ln -s "$PWD/stackcollapse-jstack.pl" /usr/bin/stackcollapse-jstack.pl
sudo ln -s "$PWD/stackcollapse-gdb.pl" /usr/bin/stackcollapse-gdb.pl
WHAT HAPPENS INSIDE
MEMCACHED?
COMPILE MEMCACHED
git clone https://github.com/memcached/memcached.git
cd memcached
./autogen.sh && ./configure && make
WHAT'S HIDDEN INSIDE THE MEMCACHED
BINARY?
readelf -s ./memcached
...
434: 000000000040edf0    10 FUNC    GLOBAL DEFAULT   13 slabs_rebalancer_res
435: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND setuid@@GLIBC_2
436: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND event_base_loop
437: 0000000000412fd0   315 FUNC    GLOBAL DEFAULT   13 pause_threads
438: 00000000004135e0    10 FUNC    GLOBAL DEFAULT   13 STATS_LOCK
439: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getaddrinfo@@GLIBC_2
440: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strerror@@GLIBC_2
441: 000000000040f550   201 FUNC    GLOBAL DEFAULT   13 do_item_unlink
442: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND event_init
443: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND sleep@@GLIBC_2
444: 0000000000412b40   247 FUNC    GLOBAL DEFAULT   13 assoc_delete
...
WHAT HAPPENS WHEN I WRITE 100
RECORDS TO MEMCACHED?
Run a test with Valgrind (not production-friendly)
Capture CPU usage with gdb
Capture CPU usage with perf_event
Capture cache misses with perf_event
MEMCACHED - PROFILING WITH
CALLGRIND
Understand what happens internally by following the execution trace.
valgrind --tool=callgrind --instr-atstart=no ./memcached
In another terminal
callgrind_control -i on
php memcache-set.php
callgrind_control -i off
MEMCACHED - PROFILING WITH
CALLGRIND
kcachegrind callgrind.out.@pid@
MEMCACHED - PROFILING WITH GDB
./memcached &
while sleep 0.1; do gstack @pid@; done > stack.txt
In another terminal
php memcache-set.php
Stop the sampling loop, then generate the flame graph
stackcollapse-gdb.pl < stack.txt | flamegraph.pl > gdb_graph.svg
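flamegraph.pl consumes folded stacks: one line per distinct stack, with frames joined by semicolons from root to leaf, followed by a sample count. The sketch below is a much-simplified stand-in for stackcollapse-gdb.pl, written only to show the transformation. The name fold_stacks is hypothetical, and the sketch assumes gdb's default frame layout.

```shell
# Hypothetical, simplified stand-in for stackcollapse-gdb.pl.
# Reads gdb backtraces on stdin, prints "root;...;leaf <count>" lines.
fold_stacks() {
  awk '
    function emit() {
      if (stack != "") folded[stack]++
      stack = ""
    }
    /^#[0-9]+ / {
      name = ($2 ~ /^0x/ && $3 == "in") ? $4 : $2
      if ($1 == "#0") emit()                         # frame #0 opens a new backtrace
      stack = (stack == "") ? name : name ";" stack  # callers are prepended
    }
    END {
      emit()
      for (s in folded) print s, folded[s]
    }
  ' | LC_ALL=C sort
}

# Usage: fold_stacks < stack.txt | flamegraph.pl > gdb_graph.svg
```

Identical stacks collapse into a single line whose count becomes the width of the corresponding tower in the flame graph.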
MEMCACHED - PROFILING WITH PERF
We capture events to build the call graph
perf record -g ./memcached
In another terminal
php memcache-set.php
To show an interactive report
perf report
To print the report as plain text
perf report --stdio
MEMCACHED - PROFILING CPU CYCLES
WITH PERF
perf script | stackcollapse-perf.pl | flamegraph.pl > graph_stack_missing.sv
Flamegraph
Some kernel information is missing.
MEMCACHED - PROFILING CPU
CYCLES WITH PERF - WITH KERNEL
STACK TRACES
./memcached &
sudo perf record -g -p @pid@
In another terminal
php memcache-set.php
Generate the flame graph
perf script | stackcollapse-perf.pl | flamegraph.pl > graph.svg
Flamegraph
MEMCACHED - PROFILING CACHE
MISSES WITH PERF
./memcached &
sudo perf record -e cache-misses -g -p @pid@
SYSTEM - WHAT IS YOUR SYSTEM
DOING?
sudo perf record -a -g
USE FLAMEGRAPH WITH JAVA
You can export a flamegraph from jstack output
Logstash contention flamegraph
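The same sample-then-fold approach can be sketched for a JVM. The wrapper below is hypothetical (java_flamegraph is not an existing tool) and assumes that jstack, stackcollapse-jstack.pl and flamegraph.pl are on the PATH, and that stackcollapse-jstack.pl accepts several concatenated thread dumps on standard input.

```shell
# Hypothetical wrapper: samples a JVM with jstack and renders a flame graph.
# $1: PID of the JVM, $2: number of samples (default 100), $3: output file.
# Assumes jstack, stackcollapse-jstack.pl and flamegraph.pl are on the PATH.
java_flamegraph() {
  pid=$1
  samples=${2:-100}
  out=${3:-java_graph.svg}
  i=0
  while [ "$i" -lt "$samples" ]; do
    jstack "$pid"
    sleep 0.1
    i=$((i + 1))
  done | stackcollapse-jstack.pl | flamegraph.pl > "$out"
}

# Usage: java_flamegraph @pid@ 100 java_graph.svg
```

As with the gdb loop, the sampling interval and the number of samples trade precision against the disturbance caused to the running process.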
GOING FURTHER
Perf wiki
Callgrind docs
Brendan Gregg's website
How profilers lie: the cases of gprof and KCachegrind
Intel VTune
TO SUMMARIZE
Prefer:
perf when you are looking for a bottleneck or want to watch what is happening on a machine
callgrind when you want to understand what happens in the code and runtime overhead is not a concern