# whoami
• Senior Security Researcher @ Intel
• Maintainer of Xen’s introspection subsystem
• Maintainer of LibVMI
• Hypervisor-agnostic introspection library (Xen, KVM, etc.)
• Lots of convenient APIs for doing introspection
• Background in malware research & black-box binary analysis
# Outline
1. Intro & Motivation
2. VM introspection
3. VM forking nuts & bolts
4. Fuzzing on Xen
• Harnessing & coverage tracing
• VMs with PCI-passthrough (IOMMU) devices
• Doublefetch detection
# Motivation
• Time-tested approach to software validation
• Conceptually straightforward
• In practice can be difficult depending on what you want to fuzz
• How do you create a coverage trace for the kernel?
• How do you recover fast enough for fuzzing to be effective?
• How do you ensure system is in the proper state?
• How do you fuzz kernel-internal interfaces?
• How do you detect more than just “crashes”?
# Kernel fuzzers do exist
• syzkaller
• Linux syscall fuzzer with built-in coverage guidance
• kAFL
• KVM based using AFL, coverage via Intel PT & PML
• Chocolate Milk
• Custom bootloader & hypervisor, all in Rust
# Why make another one?
• These platforms are tightly coupled to their use-case
• We wanted something stable but also flexible to build on
• Preferring code that’s upstream to cut down on time it takes to
maintain custom patches & debugging things when they break
• Xen’s VMI subsystem is still experimental but fits the bill
• Also allows us to consider new types of fuzzing approaches
• Also allows us to target new use-cases
# VM introspection
• Inspect VM internals from an external perspective
• Very similar to kernel debugging & memory forensics
• We can pause the VM at any event that traps to the VMM
• EPT faults
• Breakpoints
• CPUID
• Singlestep (MTF)
• Can do it both with in-guest help or without
# Why VM forking?
• We need a way to restore VMs to a start point quickly after each
fuzz cycle
• Restoring from a save-file can take up to 2s
• Even from a fast SSD or tmpfs
• For fuzzing to be effective we need to be faster than that
• Xen has a long-forgotten, half abandoned subsystem:
• Memory sharing!
• We can use it to create forks in a fast & lightweight manner!
# VM forking overview
1. Create VM with an empty EPT (i.e. no memory)
2. Specify its parent VM
3. Copy vCPU parameters from parent
4. When VM is started it will page-fault back to Xen each time it tries
to access memory not yet mapped
5. Populate pages on-demand in the page-fault handler
• Read & execute accesses are populated with a shared entry
• Write accesses are deduplicated
# VM forking details
• It’s a bit different from fork() on Linux
• The parent domain currently remains paused while forks are active
• This was fine for our use-case
• For a full domain split, all the parent pages need to be made shared
• Pages that can’t be made shared would need an extra copy
• Doable, was out-of-scope for now
• Forks can be further forked!
• Pages are searched for recursively
# VM forking details
• VM forks can run with only CPU & memory
• No disk
• No networking
• No I/O
• No interrupts!
• It’s possible to launch QEMU to start backend services
• Patches implementing this are posted but not yet upstream
• Launching & resetting QEMU is slow
• Not a priority since it’s not required for fuzzing
# VM forks: resetting
• No need to keep creating forks for every fuzz iteration
• We can just reset a previously forked VM
• Re-copy vCPU settings from parent
• Keep memory shared entries in place
• Future iterations will be that much faster
• Throw away deduplicated memory
• Reset speed depends on how much memory needs to be freed here
• During fuzzing it’s usually very few pages
# VM forking speed
VM fork creation time:
~745 μs ~= 1300 VM/s
VM fork reset time:
~111 μs ~= 9000 reset/s
Measured on i5-8350U
# Harnessing
• Fuzzer needs to know where the target code starts & stops
• Need to manually mark it
• Harness needs to trap to the hypervisor
• Should not have side-effects
• Code needs to execute normally between start & stop harness
• Code needs to consume some input
• We need to know where the input is so we can fuzz it
# Harnessing
• CPUID instruction always traps to the VMM
• We use a magic CPUID leaf as our marker
• No side-effects on target code; without the fuzzer this is effectively a NOP
• Call harness() before and after target code
• Just printk info before the first harness!
# Harnessing
• Parent VM displays information about the target (the address of the buffer we’ll fuzz) on its virtual serial console
• Parent VM will trap to the VMM on CPUID
• Detect whether it’s the start signal (magic value) and pause the Parent VM
• Increment RIP so the vCPU resumes just after the CPUID
# Coverage tracing
• Fuzzer (AFL) needs to know when new code-paths are discovered
• By default AFL requires you to recompile your target
• Instruments each branch with hooks
• We don’t want to recompile the whole kernel
• We want to minimize the modifications we make to the target
• Just adding the calls to harness() and displaying relevant information
• During fuzzing code will run in a VM fork & the only visibility we
have is when it traps to VMM
# Coverage tracing with VMI
• We can read & write the VM fork’s memory from the VMM!
1. Configure VM fork to trap breakpoints to the VMM
2. Read & disassemble code from start point (RIP)
3. Find next control-flow instruction
4. Replace it with breakpoint
5. Resume vCPU
6. Breakpoint traps, remove breakpoint and enable singlestep (MTF)
7. MTF traps, disable MTF, goto Step 2
• Works in nested setups as well (tested with Xen inside VMware)!
# Detecting crashes
• Breakpoint the kernel’s crash handlers
• Defined as “sink” points
• Breakpoints trap to the hypervisor, if any of them execute report
“crash”
• Good base targets to sink:
• panic()
• oops_begin()
• page_fault() or its new name asm_exc_page_fault()
# Putting it all together
1. Setup parent VM: trap on first call to harness()
2. Create first fork: breakpoint the sinks
3. Create second fork: fuzz, execute & collect coverage trace!
Parent VM -> Sink VM -> Fuzz VM
# Coverage tracing with Intel Processor Trace
• Disassembly, breakpoints & singlestepping are expensive
• We can go faster if the silicon collects the info for us
• Designate memory location (up to 4GB) as PT buffer
• VM forks’ execution will be recorded there
• Need to decode custom PT buffer format to reconstruct coverage
• Can be tedious, and existing decoders aren’t designed for high-speed fuzzing
• Open Source community to the rescue: https://github.com/nyx-fuzz/libxdc
• Does not work in nested setups, and only traces a single address space
# Alternative harnessing
• What if we can’t recompile our target to add the harness()?
• We can use a debugger to add breakpoints as our harness!
• Run with GDB, set breakpoint before & after target code
• Fuzzer needs to know original instruction before it was
breakpointed (really just the first byte)
• When breakpoint traps to the VMM, replace breakpoint with
original content
• Fuzz!
# PCI-passthrough devices & fuzzing
• Making sure your target code is in the right state can be difficult
• Kernel modules may only fully initialize if physical device is present
• We can attach device to parent VM!
• Kernel module fully initializes & actively drives device
• Harness & fork works just the same!
• Only parent VM has access to device
• VM fork can’t corrupt device
• VM fork can’t access the device
# Detecting doublefetches
• We can define any condition as a “crash”
• Detecting doublefetch conditions is very difficult
• Sometimes introduced by the compiler so source review is not sufficient
• We are already hooked into the VMM pagefault handler
• We can detect doublefetches using EPT
1. Remove R/W permissions from page suspected of being doublefetched from
2. When an access faults, record page offset, reset permission & singlestep
3. In singlestep handler remove permissions & continue
4. If next access fault is at the same offset: doublefetch detected!
# Code released as open-source (MIT)
VM forking is upstream in Xen 4.14
Kernel Fuzzer for Xen Project (kfx):
https://github.com/intel/kernel-fuzzer-for-xen-project