This document summarizes a presentation about using DTrace on OS X. It introduces DTrace as a dynamic tracing tool for user and kernel space. It discusses the D programming language used for writing DTrace scripts, including data types, variables, operators, and actions. Example one-liners and scripts are provided to demonstrate syscall tracking, memory allocation snooping, and hit tracing. The presentation outlines some past security work using DTrace and similar dynamic tracing tools. It concludes with proposing future work like more kernel and USDT tracing as well as Python bindings for DTrace.
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
1. DTrace + OS X = Fun
Andrzej Dyjak (@dyjakan)
Confidence 2015, Kraków
www.census-labs.com
2. > AGENDA
• Part 1: Introduction
I. What is DTrace?
II. D language
III. Past work
IV. Similar projects
• Part 2: Usage
I. One-liners
II. Scripts
III. Future work
IV. References
4. > What is DTrace?
„DTrace is a comprehensive dynamic tracing facility
(...) that can be used by administrators and
developers on live production systems to examine
the behavior of both user programs and of the
operating system itself. DTrace enables you to
explore your system to understand how it works,
track down performance problems across many
layers of software, or locate the cause of aberrant
behavior.”
To put it simply: a deliberately limited debugger / DBI engine for
user and kernel modes.
9. BONUS: USDT (User-Level Statically
Defined Tracing)
„(…) providing debug macros that can be
customized and placed throughout the
code.”
Debugging / analysis capabilities can be
improved even more.
10. > D language
• Data types
• Variables
• Built-ins
• Operators
• Control statements
• Actions & subroutines
• Default providers
11. > Data types
• char, short, int, long, long long, float,
double, long double
• Aliases (like int32_t)
• You can dereference pointers and walk
structure chains
• You can cast things
14. > Operators
• Arithmetic
• Relational (apply also to strings, e.g. as a
predicate /execname == "foobar"/)
• Logical (XOR is ^^)
• Bitwise (XOR is ^)
• Assignment
• Increment / Decrement
21. > Past work (in the context of
security)
• BlackHat 2008 (and some others)
– „RE:Trace - Applied Reverse Engineering on
OS X” by Tiller Beauchamp and David Weston
• Infiltrate 2013
– „Destructive D-Trace” by nemo
22. > Similar projects (among others)
• SystemTap (Red Hat)
– Very similar to DTrace, kinda like a response from
Red Hat for Linux
– For an interesting use case see
http://census-labs.com/news/2014/11/06/systemtap-unbound-overflow/
• Detours (Microsoft)
– „Software package for re-routing Win32 APIs
underneath applications.”
– Similar in functionality, differs in the implementation,
e.g.
http://blogs.msdn.com/b/oldnewthing/archive/2011/09/21/10214405.aspx
30. > Tracking input
• I’ve covered this on my blog for read()
• However, mmap() is often used
instead and this led to an interesting
problem
• Also, this can be reimplemented for
network input as well
36. > Memory allocation snooping
• Implementation of a simple tool that
imitates output of ltrace for memory
allocation functions from libc
But there are more possible scenarios, e.g.:
• Heap layout analysis
• Snooping into custom memory allocators
• Tracking kernel memory allocations
45. > Hit tracing
• Kinda like a code coverage but the end-goal
is different
• Two modes of operation:
– Shallow would mark functions within module
– Deep would mark instructions within a function
• Output is pre-processed and lands in IDA for
graph colorization
• Similar to
http://dvlabs.tippingpoint.com/blog/2008/07/17/mindshare-hit-tracing-in-windbg
46. > Future work
• More kernel work
• More USDT work (V8?)
• Python-based DTrace consumer (a.k.a.
Python bindings)
I’m open to ideas, don’t be shy and mail me.
47. > References
• http://dtrace.org/blogs/
• https://wikis.oracle.com/display/DTrace/Documentation
• http://dtracebook.com
• http://dtracehol.com
• http://phrack.org/issues/63/3.html
• „Dynamic Instrumentation of Production
Systems” Cantrill, Shapiro, Leventhal
• Apple TN2124, DTrace entry
DTrace was designed and implemented by Bryan Cantrill, Mike Shapiro, and Adam Leventhal, and was released in 2005. It has since been open-sourced and is currently supported on Solaris, Mac OS X, FreeBSD, NetBSD, and (partially) the Linux kernel. OS X has shipped DTrace since 2007 (version 10.5, Leopard), where it is also part of the Instruments testing suite.
Stability was a core design assumption: that is why there is no overhead when probes are disabled, and also why DTrace is limited in functionality.
DTrace mascot.
PROVIDER gives us the general functionality. MODULE sets the module we're focusing on (e.g. a specific dylib). FUNCTION specifies a function within the module; this poses a limitation, i.e. you can't trace binaries that have their symbols stripped, and of course the same applies to unusual 'calls' like JMPing into a code chunk instead of calling it, which will be invisible to DTrace. NAME gives us some idea of the semantic meaning (e.g. entry/return, BEGIN/END); when tracing a function you can also specify an offset within the function (at this offset you can e.g. peek into the memory pointed to by some register) or leave NAME blank to trace all the instructions within it. PREDICATE acts as a conditional, and ACTIONS are what happens when the probe fires.
NOTE: For PROVIDER:MODULE:FUNCTION:NAME you can use wildcards, e.g. *open* will trace any function with ‘open’ string in it.
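To make the probe anatomy concrete, here is a minimal sketch of a single clause. The process name "Preview" is only an illustration, and the wildcard is assumed to match whatever open-like syscalls exist on your system.

```d
/*
 * PROVIDER : MODULE : FUNCTION : NAME
 * syscall  : (any)  : *open*   : entry
 */
syscall::*open*:entry
/execname == "Preview"/                 /* PREDICATE: only fire for this process name */
{
    /* ACTIONS: run every time the probe fires and the predicate holds */
    printf("%s called %s\n", execname, probefunc);
}
```

An empty field (here MODULE) acts as a wildcard, so the clause matches the probe in any module.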
You can use DTrace quickly through one-liners, and for more challenging tasks you can switch to scripts. D scripts can also be embedded into e.g. bash scripts to gain additional possibilities like argument parsing. Note: dtrace requires root privileges (interaction with kernel mode + destructive actions).
Example of dtrace script and a one-liner. Talk about dtrace provider and its BEGIN and END probe (they fire on starting and ending of a dtrace script).
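As a rough illustration of the difference, the sketch below is a script that counts syscalls per process. The comment shows a one-liner that does roughly the same thing. The aggregation name @calls is arbitrary.

```d
#!/usr/sbin/dtrace -s
/*
 * Roughly the same as this one-liner (run as root):
 *   dtrace -n 'syscall:::entry { @[execname] = count(); }'
 */
#pragma D option quiet

dtrace:::BEGIN
{
    /* Fires once, when the script starts. */
    printf("Counting syscalls, press Ctrl-C to stop...\n");
}

syscall:::entry
{
    /* Aggregate: number of syscalls per process name. */
    @calls[execname] = count();
}

dtrace:::END
{
    /* Fires once, when the script ends (e.g. on Ctrl-C). */
    printa("%-24s %@d\n", @calls);
}
```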
The dtrace command invokes the compiler for the D language, which outputs D Intermediate Format (DIF) that is sent to the kernel part. As previously mentioned, DTrace is pretty strict about correctness and guarantees safety, with no additional overhead when probes are disabled (in fact, a system with disabled probes is identical to a system without DTrace at all).
A stand-alone consumer is also possible, e.g. Python bindings. DTrace providers are kernel modules that talk to the DTrace kernel module through an API.
You can put static probes inside an application yourself, recompile it, and improve analysis by dynamically activating the USDT probes when required (negligible overhead when disabled).
Examples of the possibilities: the JavaScript provider uses USDT to instrument the Mozilla JavaScript engine (SpiderMonkey). It provides probes for function calls, object creation, garbage collection, and code execution. Basically, you can use this provider to trace the operation of JavaScript code. I did not test it, so I'm not sure if this is still 'a thing', but the mere possibility is enough (e.g. wouldn't a V8 equivalent be awesome? Maybe.)
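A minimal sketch of what defining your own USDT probes could look like. The provider name myapp and both probe names are hypothetical, invented purely for illustration.

```d
/*
 * myapp_provider.d (hypothetical provider definition)
 *
 * Generate a C header with:   dtrace -h -s myapp_provider.d
 * The header exposes macros such as MYAPP_REQUEST_START(id, url)
 * and MYAPP_REQUEST_START_ENABLED() to place in the application code.
 *
 * Once the application is rebuilt, the probes can be enabled on demand:
 *   dtrace -p <PID> -n 'myapp$target:::request-start
 *       { printf("%d %s\n", arg0, copyinstr(arg1)); }'
 */
provider myapp {
    probe request__start(int, char *);   /* seen by consumers as request-start */
    probe request__done(int);            /* seen by consumers as request-done */
};
```

The double underscore in a probe name is turned into a hyphen on the consumer side, which is why the one-liner in the comment refers to request-start.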
D is a C-like language, as you will see in a second.
C-like syntax and functionality. Pointer dereferencing and traversing structure chains. Also, character escape sequences are the same as in C (e.g. \n for a newline).
We do not declare the data type of a variable explicitly. Associative arrays are keyed by tuples. We can declare a variable without initialization. When we zero out a variable, it is freed.
Talk about global / clause / thread locality (when would we use it? Mention later examples)
External variables are a way to access kernel variables in your DTrace script. You need to prepend the variable name with a backtick character to access them.
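The variable scopes can be sketched as follows. All variable names and the traced process name are illustrative. External (backtick) variables are left out because their names depend on the kernel in use.

```d
#pragma D option quiet

BEGIN
{
    reads = 0;                       /* global scalar, type inferred from the assignment */
}

syscall::read:entry
/execname == "Preview"/
{
    self->ts = timestamp;            /* thread-local: survives until read:return on this thread */
    last_fd[pid, tid] = arg0;        /* global associative array, keyed by a tuple */
    reads++;
}

syscall::read:return
/self->ts/
{
    this->elapsed = timestamp - self->ts;   /* clause-local: valid only inside this clause */
    printf("read(fd=%d) took %d ns\n", last_fd[pid, tid], this->elapsed);

    /* Assigning zero frees the underlying storage. */
    self->ts = 0;
    last_fd[pid, tid] = 0;
}

END
{
    printf("total reads: %d\n", reads);
}
```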
Also worth mentioning: DTrace supports structs and unions along with typedefs. And even bit fields!
Talk about each and every one. curpsinfo points to a psinfo_t struct, curlwpsinfo points to an lwpsinfo_t, and curthread points to a kthread_t; these are internal structures describing the current process and thread. Give examples of args usage, e.g. file descriptors from your scripts. Note: args for C++ methods can be tricky to access, because it is up to the C++ compiler to organize the arguments, and you need to know how they are organized before tracing (i.e. which argument is the this pointer, and whether there are any compiler bonus args). ppid is the parent PID.
Also note that there are more built-ins; it's worth checking them out for yourself.
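A small sketch of several built-ins used together, assuming the script is attached to a process with -p or -c so that $target is set:

```d
#pragma D option quiet

syscall::write:entry
/pid == $target/
{
    /* execname, pid, ppid and tid are built-ins; arg0 and arg2 are
       the first and third syscall arguments (fd and length for write). */
    printf("%s[%d] (ppid %d, tid %d): write(fd=%d, len=%d)\n",
        execname, pid, ppid, tid, arg0, arg2);
}
```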
All of these are C-like. Nothing much to say on this slide. Run through them and say e.g. 'the usual + - * /' etc.
This is due to guaranteed safety. Loops and ifs can too easily lead to a never-ending story (= break the system), which would violate the core assumption of safe usage on production systems. However, there are reserved keywords for loops, ifs, gotos, et cetera, but they never saw an actual implementation (or a release; who knows what Sun/Oracle and later Joyent did).
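In practice, branching is usually emulated with predicates and the ternary operator. A minimal sketch, where the 4096-byte threshold is arbitrary and $target is assumed to be set via -p or -c:

```d
#pragma D option quiet

/* "if" branch: the read succeeded */
syscall::read:return
/pid == $target && errno == 0/
{
    printf("read ok: %d bytes (%s)\n", arg0,
        arg0 > 4096 ? "large" : "small");    /* ternary instead of if/else */
}

/* "else" branch: the same probe with the complementary predicate */
syscall::read:return
/pid == $target && errno != 0/
{
    printf("read failed: errno %d\n", errno);
}
```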
stack() / ustack() – self-explanatory: display the kernel-mode and user-mode stack.
tracemem() – dumps memory to the screen (peek-a-boo!)
alloca() – dynamic memory allocation inside the DTrace script
bcopy() – can be used to copy data into a newly allocated buffer
copyin() / copyinstr() / copyinto() – used to peek into data from user-mode processes (e.g. working with user-mode data requires these in order to transfer it to the kernel)
msgsize() / strlen() – sometimes we want to measure something
Talk for a second about pragmas, like quiet or destructive.
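A minimal sketch combining the quiet pragma with a few of the actions and subroutines above. The 64-byte dump size and the stack depth of 5 are arbitrary choices, and $target is assumed to be set via -p or -c.

```d
#pragma D option quiet

syscall::open:entry
/pid == $target/
{
    /* copyinstr() pulls the NUL-terminated path from user space. */
    printf("open(\"%s\")\n", copyinstr(arg0));
    ustack(5);                          /* top 5 user-mode stack frames */
}

syscall::write:entry
/pid == $target && arg2 >= 64/
{
    /* copyin() copies raw bytes from user space into scratch memory;
       tracemem() needs a constant size, so only the first 64 bytes are dumped. */
    this->buf = copyin(arg1, 64);
    tracemem(this->buf, 64);
}
```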
stop() – stops the process at point XYZ
raise() – sends a signal, similar to the kill command
copyout() / copyoutstr() – allow data modification; nemo used them to tamper with function call arguments (he did so for x86, where function call args are usually all passed via the stack; on x64, where the first 6 args are passed via registers, it might work if the argument is a pointer rather than a value => take the arg from the register and mangle the memory it points to)
system() – executes an application
breakpoint() – sets a kernel-mode breakpoint (sucks if you don't have a debugger connected)
panic() – induces a kernel panic at a specific point
chill() – causes DTrace to spin for N nanoseconds; this is interesting when testing race conditions (you can slow down execution on purpose just to win races more often)
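A minimal sketch of a destructive action, assuming $target is set via -p or -c. The path /etc/hosts is just an example trigger.

```d
#pragma D option destructive    /* required, otherwise stop() is rejected */
#pragma D option quiet

syscall::open:entry
/pid == $target && copyinstr(arg0) == "/etc/hosts"/
{
    printf("pid %d is about to open /etc/hosts, stopping it\n", pid);
    stop();     /* the process stays stopped until it receives SIGCONT */
}
```

At this point you can attach a debugger to the stopped process, or resume it with kill -CONT.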
There are more providers, and some of them might be interesting to _you_, e.g. the tcp/ip/udp providers might be interesting to sysadmins and network operators.
syscall – for tracing syscalls
pid – for tracing specific processes
objc – Apple-specific (more in 'man dtrace'), for tracing Objective-C functionality
fbt – function boundary tracing; you can use it to trace functions in the kernel (a usage example for vulnerability analysis is in the 'guide to kernel exploitation' book)
proc – monitoring of process creation and termination (nicely used in a recent 'launchd' blog post by wuntee)
Also, mention the PHP and Python providers as interesting (e.g. with the Python provider you can watch the internals of the Python script rather than the Python VM), and the JS provider in conjunction with the previously mentioned Firefox USDT.
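As one small example of a provider from this list, here is a sketch that uses the proc provider to log process creation and termination system-wide. The output layout is arbitrary.

```d
#pragma D option quiet

proc:::exec-success
{
    /* Fires after a successful exec, so execname is already the new image. */
    printf("%-6d %-6d exec %s\n", pid, ppid, execname);
}

proc:::exit
{
    printf("%-6d %-6d exit %s\n", pid, ppid, execname);
}
```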
Examples of active providers probes list (with counting!).
The pid provider is fine: it shows 0 because nothing uses it by default. fbt is huge because you can probe most of the kernel functions.
Tiller and David's talk touches on vulnerability research (ease of analysis + HIDS + code coverage). They also introduce the „RE:Trace framework”, a mixture of DTrace and Ruby bindings; however, I was not able to find it on the Internet. Also, I'm not a jeweller, so Ruby bindings are a no-no for me.
Nemo's talk is mostly about rootkit-like functionality implemented via DTrace (e.g. hiding files). I'm not so sure this is the best course of action for gaining persistence, but he presents interesting examples, mostly tampering with syscalls.
More generally, there are plenty of resources about general DTrace usage; I list the most interesting ones in the References section.
Regarding SystemTap: there is even compatibility between DTrace and SystemTap when it comes to USDT probes.
Regarding provided link: Detours is the reason behind „mov edi, edi” hot-patch point inside of Microsoft’s DLLs. So it does introduce slight overhead even when disabled (as opposed to dtrace).
Go!
We’re snooping into Preview
These are logs for Safari execution
Googling for ideas is still a work in progress at Google.
Global arrays for flagging FDs and MMAPs
Marking opened FD and printing out logging information.
Marking mmaps. Hm, but where's our tracemem() at mmap()'s return?
What? tracemem() at munmap()? Well, as opposed to read(), we can't peek into memory at mmap()'s return even though the pointer is already valid. I found out that DTrace can't peek into memory that was not previously touched, and that seems to be the case here. Dumping at munmap() has some downsides (the memory could have been altered by that point), but it's better than nothing, and I have actually used it successfully when tracking input for some OS X applications.
Just to be nice, we're freeing closed FDs. This concludes input tracking; for working examples go to my blog and read the latest post (even though it covers read(), it's basically the same).
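The flow described in these notes can be sketched roughly as follows. This is a reconstruction under assumptions, not the script from the slides: the array names fds and maps, the 128-byte dump size, and the choice to track every successfully opened file are all illustrative. $target is assumed to be set via -p or -c.

```d
#pragma D option quiet

/* Remember the path pointer so it can be read at return. */
syscall::open:entry
/pid == $target/
{
    self->pathp = arg0;
}

/* Flag the descriptor and log the open. */
syscall::open:return
/self->pathp && errno == 0/
{
    fds[arg0] = 1;
    printf("open(\"%s\") = %d\n", copyinstr(self->pathp), arg0);
}

syscall::open:return
/self->pathp/
{
    self->pathp = 0;
}

/* Only mappings backed by a flagged descriptor are interesting. */
syscall::mmap:entry
/pid == $target/
{
    self->len = fds[arg4] ? arg1 : 0;
}

/* The pages are untouched here, so just remember address and length. */
syscall::mmap:return
/self->len && errno == 0/
{
    maps[arg0] = self->len;
    printf("mmap(len=%d) = 0x%x\n", self->len, arg0);
    self->len = 0;
}

/* By munmap() the application has touched the memory, so dump it now. */
syscall::munmap:entry
/pid == $target && maps[arg0] >= 128/
{
    printf("munmap(0x%x): first 128 bytes\n", arg0);
    tracemem(copyin(arg0, 128), 128);
}

syscall::munmap:entry
/pid == $target && maps[arg0]/
{
    maps[arg0] = 0;
}

/* Free the flag when the descriptor is closed. */
syscall::close:entry
/pid == $target && fds[arg0]/
{
    fds[arg0] = 0;
}
```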
Heap layout analysis is useful when you're performing heap exploitation (e.g. can you somehow influence the heap layout? How reliably? You can often tinker with the heap without getting 100% reliability, and then it's good to know how often your object lands in the range where you want it to be).
Custom memory allocators are also very interesting, mainly because snooping only into the system API for memory allocations wouldn't tell you much (the application makes its own pools); if you study the mechanism of the allocator, though, you can insert probes at the appropriate functions via the pid provider and get meaningful information.
For kernel memory allocation we would need to use the FBT provider in order to snoop into the BSD wrappers, or just go straight into the dragon's den and snoop into the Mach internals.
Note that in the return probe arg1 holds the return value rather than the original argument.
For the other allocation functions (valloc, calloc, realloc, reallocf) the probes would look similar, so there is no point in going through all of them; however, I did include them for the sake of completeness.
Sidenote: what’s the difference between realloc() and reallocf()? When reallocf() fails it frees source buffer (this call is FreeBSD specific to which OS X is closely related).
Freeing is as simple as it gets (not really, but for current version this is how we do things).
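A minimal sketch of the malloc()/free() part of such a snooper, printing ltrace-like lines. This is an illustration rather than the tool from the slides. The module field is left empty, so it may match malloc and free in more than one loaded library. $target is assumed to be set via -p or -c.

```d
#pragma D option quiet

pid$target::malloc:entry
{
    self->in_malloc = 1;
    self->size = arg0;          /* requested size */
}

pid$target::malloc:return
/self->in_malloc/
{
    /* In a pid-provider return probe, arg1 is the return value. */
    printf("malloc(%d) = 0x%x\n", self->size, arg1);
    self->in_malloc = 0;
    self->size = 0;
}

pid$target::free:entry
{
    printf("free(0x%x)\n", arg0);
}
```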
Output example. However we can pipe this into villoc and get visualizations!
Merging with villoc is an ongoing project; we needed to discuss a couple of things, e.g. whether memory allocation on OS X is thread-safe (apparently it is, since it's a POSIX requirement) and which faults we want to detect. In any case, a working alpha version is available on my GitHub.
This is a work in progress (mainly due to the IDA side of the tool). It should soon be available on my GitHub.
The typical end goal of code coverage is to shrink the input pool for fuzzing application XYZ. I instead want to mark which code was touched for a very specific input, in order to speed up my analysis inside a tool like IDA or Hopper. Yes, I am aware of IDA's mac_servers for debugging integration; I've had some problems with them.
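The two hit-tracing modes could be sketched with the pid provider as below. This is an assumption about how such a tracer might look, not the tool itself. a.out refers to the main executable of the target, and main is a placeholder for whichever function you want to trace deeply.

```d
#pragma D option quiet

/* Shallow mode: mark every function entered within the module. */
pid$target:a.out::entry
{
    @shallow[probefunc] = count();
}

/* Deep mode: an empty NAME enables a probe on every instruction of the
   function; probename is then the offset within it (or entry/return). */
pid$target:a.out:main:
{
    @deep[probefunc, probename] = count();
}

END
{
    printa("func %s hits=%@d\n", @shallow);
    printa("insn %s+%s hits=%@d\n", @deep);
}
```

The printed lines would still need post-processing before they can drive graph colorization in a disassembler.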
Regarding Python: When and if I finish python-based dtrace consumer I will open source it (if you’re interested you can follow me on twitter or github).