Sysdig is a new dynamic tracer for Linux, inspired by strace, dtrace, and tcpdump. It is very useful as a super fast strace replacement and for systemwide performance, security, and other diagnostics.
6. that’s cool, but…
1 million syscalls, as fast as possible
worst case for any tracer
# dd if=/dev/zero of=/dev/null bs=1k count=1M
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 0.332905 s, 3.2 GB/s
# strace -o /dev/null !!
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 18.2365 s, 58.9 MB/s
~55x overhead (18.2 s vs 0.33 s)
21. filters
fd.name      FD full name. If the fd is a file, this field contains the full path.
             If the FD is a socket, this field contain the connection tuple.
proc.apid    the pid of one of the process ancestors.
evt.latency  delta between an exit event and the correspondent enter event.
(...)
# sysdig -l | grep -Ec '^[a-z0-9_.]+'
88
23. back to that dd again…
# sysdig proc.name=not_dd > /dev/null & dd if=/dev/zero of=/dev/null bs=1k count=1M ; killall sysdig
[1] 24070
1048576+0 records in
1048576+0 records out
1073741824 bytes (1.1 GB) copied, 0.981408 s, 1.1 GB/s
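To put the three dd runs side by side, the slowdown factors can be worked out from the elapsed times dd printed. This is only a sketch of the arithmetic: the helper name `slowdown` is made up for illustration, and the numbers are the single runs shown on the slides, not averaged measurements.

```shell
# slowdown BASE TRACED: how many times slower the traced run was.
# (hypothetical helper; both arguments are elapsed seconds reported by dd)
slowdown() {
    LC_ALL=C awk -v base="$1" -v traced="$2" \
        'BEGIN { printf "%.1f\n", traced / base }'
}

slowdown 0.332905 18.2365    # dd under strace -o /dev/null
slowdown 0.332905 0.981408   # dd with sysdig running a filter that matches nothing
```

So on this worst-case workload strace costs roughly 55x, while sysdig with a non-matching filter costs roughly 3x.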
24. output formatting
same as filters (mostly)
# sysdig -p '%user.name %proc.name %fd.name: %evt.res' evt.failed = true
ubuntu cat /etc/shadow: EACCES
ubuntu cat /usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo: ENOENT
ubuntu cat /usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo: ENOENT
ubuntu cat /usr/share/locale/en_US/LC_MESSAGES/libc.mo: ENOENT
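Because the formatted output is plain text with the result code as the last field, it pipes nicely into the usual tools. A minimal sketch, assuming the `-p` format above; `count_failures` and `trace.scap` are names invented for the example:

```shell
# count_failures: tally failed events by error code.
# Reads lines like "user proc path: ERRNO" on stdin (the -p format above)
# and prints "COUNT ERRNO", most frequent first.
count_failures() {
    awk 'NF { count[$NF]++ } END { for (e in count) print count[e], e }' |
        LC_ALL=C sort -rn
}

# Possible usage against a saved trace (trace.scap is a hypothetical file):
#   sysdig -r trace.scap -p '%user.name %proc.name %fd.name: %evt.res' evt.failed=true | count_failures
```

Reading from a saved trace rather than a live capture means the pipeline ends on its own, so awk gets to print its totals.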
25. bottleneck in a haystack
# sysdig -p '%evt.latency.s.%evt.latency.ns %evt.dir%evt.type %fd.name' fd.type contains ip and fd.sport != 22
(...)
0.000000000 >sendto 192.168.1.118:36220->46.28.247.84:80
0.000114365 <sendto 192.168.1.118:36220->46.28.247.84:80
0.000000000 >recvfrom 192.168.1.118:36220->46.28.247.84:80
0.000005090 <recvfrom 192.168.1.118:36220->46.28.247.84:80
0.000000000 >close 192.168.1.118:36220->46.28.247.84:80
0.000001587 <close 192.168.1.118:36220->46.28.247.84:80
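To actually find the needle, the latency column can be sorted. A rough sketch, assuming the output format shown above (latency first, then direction and event type); the name `slowest_events` is made up here. Only exit events (`<`) carry a latency, so enter events are dropped.

```shell
# slowest_events [N]: print the N exit events with the highest latency
# (default 3). Expects "LATENCY <type fd" lines on stdin.
slowest_events() {
    local n=${1:-3}
    awk '$2 ~ /^</' | LC_ALL=C sort -k1,1nr | head -n "$n"
}
```

Fed the six lines above, `slowest_events 1` picks out the `sendto` exit at 0.000114365 s.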
26. shit’s on fire, yo
sysdig -w -> .scap file -> sysdig -r (as many times as needed)
capture trace file, restore service
analyze trace at your leisure
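The capture-now, analyze-later workflow might look roughly like this. It is a sketch rather than a recipe: `capture_while`, `replay_trace` and `incident.scap` are made-up names, and stopping sysdig with a plain `kill` mirrors the `killall sysdig` used with dd earlier. There is a small window before sysdig starts capturing, so the first events of a very short command may be missed.

```shell
# capture_while FILE CMD...: record a trace to FILE while CMD runs,
# then stop sysdig and return CMD's exit status.
capture_while() {
    local trace_file=$1; shift
    local status=0
    sysdig -w "$trace_file" &
    local sysdig_pid=$!
    "$@" || status=$?
    kill "$sysdig_pid" 2>/dev/null || true
    wait "$sysdig_pid" 2>/dev/null || true
    return "$status"
}

# replay_trace FILE [FILTER...]: read a saved trace back, optionally filtered.
replay_trace() {
    local trace_file=$1; shift
    sysdig -r "$trace_file" "$@"
}

# Possible usage (as root; incident.scap is a hypothetical file name):
#   capture_while incident.scap sleep 30
#   replay_trace incident.scap evt.failed=true
#   replay_trace incident.scap proc.name=cat
```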
27. lies, damn lies and benchmarks
sysdig -w -> .scap file -> sysdig -r (as many times as needed)
do a single benchmark run
analyze/postprocess lots of ways
29. chisel all the things!
# sysdig -cl | grep -c ^[a-z]
37
# find /usr/share/sysdig/chisels/ -name '*.lua' | wc -l
42
the extra ones are utilities to use in chisels
(json, ANSI terminal, etc.)
30. chisels: performance
bottlenecks Slowest system calls
fileslower Trace slow file I/O
netlower Trace slow network I/O
proc_exec_time Show process execution time
scallslower Trace slow syscalls
topscalls Top system calls by number of calls
topscalls_time Top system calls by time
yup, netlower is a typo ;)
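Chisels combine well with saved traces: one capture, several views. A small sketch, assuming a trace recorded earlier with `sysdig -w`; `run_chisels` and `bench.scap` are invented names, and the chisel names are taken from the list above.

```shell
# run_chisels FILE CHISEL...: run each chisel against the same saved trace,
# printing a header line before each chisel's output.
run_chisels() {
    local trace_file=$1; shift
    local chisel
    for chisel in "$@"; do
        printf '== %s ==\n' "$chisel"
        sysdig -r "$trace_file" -c "$chisel"
    done
}

# Possible usage (bench.scap is a hypothetical file name):
#   run_chisels bench.scap topscalls topscalls_time bottlenecks
```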
31. chisels: security
list_login_shells  List the login shell IDs
shellshock_detect  print shellshock attacks
spy_users          Display interactive user activity
power corrupts,
absolute power is even more fun
32. All right gentlemen,
we need some system info
lsof, ps, netstat
with time travel
http://draios.com/ps-lsof-netstat-time-travel/
34. version 0.1.91
do you feel lucky?
• some syscalls not yet implemented (no args)
• it did crash once (fixed immediately though)
• PID namespaces ignored
• root/privileged user only
• one sysdig process at a time
way better than strace though