DDTrace

Distribution Statement A: Approved for public release; distribution is unlimited.
Distributed Tracing
Graeme Jenkinson @gcjenkinson
University of Cambridge
dtrace.conf
May 25, 2016
San Francisco, CA
© 2016. All rights reserved.
This material is based upon work sponsored by the Air Force Research Laboratory (AFRL) and the Defense Advanced Research Projects
Agency (DARPA) under Contract No. FA8650-15-C-7558. The views expressed are those of the authors and do not reflect the official policy
or position of the Department of Defense or the U.S. Government.

NAME
ddtrace – distributed tracing framework
SYNOPSIS
ddtrace [NODE]... [QUERY]
DESCRIPTION
ddtrace distributes event query expressions over
many hosts to track inter-node information flows
and temporal sequences, implementing post-hoc
trace aggregation, or as needed, tagging of TCP/IP
packets, filesystem RPCs, and application-layer
protocols with temporal and information-flow
labels.
AUTHOR
Written by Graeme Jenkinson
SEE ALSO
Pivot tracing, Dapper, X-Trace, Magpie

Roadmap for distributed dtrace
Capture distributed tracing use cases
Design space exploration
Prototype and refine designs
Trial on real world problems
Focus to date

Use Cases
Security Event and Incident Management
Observed provenance
Monitoring client/server protocols
Scheduling for warehouse-scale computing
Performance monitoring/debugging for computational finance
Transparent computing
$$$
OPUS

Monitor client/server protocols
# dtrace -n 'fbt::tcp_state_change:entry {...}'

Key requirements
Production safe
Performance proportionality
Track causal relationships between nodes
Simple to package and deploy
Zero probe effect when inactive
Which causal relationships?
How to track causal relationships?

Design principles
Log - append only, totally ordered sequence of records
[Diagram: log running from first record to next record]
Record what happened and when
Update global log/other data structures
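
The log described on this slide can be sketched in a few lines. This is a minimal in-memory illustration, not the ddtrace implementation: the names `Record`, `AppendOnlyLog`, `append` and `read_from`, the timestamp units, and the example event strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Record:
    offset: int     # position in the log; this defines the total order
    timestamp: int  # when it happened (units are an assumption)
    event: Any      # what happened


class AppendOnlyLog:
    """Hypothetical append-only log: records are only ever added at the end."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, timestamp: int, event: Any) -> int:
        # The next offset is always the current length, so offsets are
        # dense and strictly increasing.
        offset = len(self._records)
        self._records.append(Record(offset, timestamp, event))
        return offset

    def read_from(self, offset: int) -> List[Record]:
        # Consumers read forward from an offset; nothing is modified.
        return self._records[offset:]


log = AppendOnlyLog()
first = log.append(100, "tcp_state_change: SYN_SENT")
second = log.append(105, "tcp_state_change: ESTABLISHED")
```

Other data structures can then be derived by replaying `read_from(0)` in offset order, which is one way to read "update global log/other data structures".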

Prototype
ddtrace
Machine readable dtrace output

Separate stream processing from packaging and deployment
Minimise number of moving parts

Prototype
ddtrace
Analyst tools
D script compiled here for arch independence

Tracking causal relationships
Within per-cpu buffers: A precedes B in the buffer, so A happens-before B
Between per-cpu buffers: tTSC(A) < tTSC(B), so A happens-before B
Between nodes: A happens-before B, ordered via the distributed commit log
[Diagram: events A, C, B]
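
The within-buffer and between-buffer cases can be illustrated with a small sketch. It assumes each per-CPU buffer is already in append order and that TSC values are comparable across CPUs on the same node; the `Event` type and the function names are hypothetical, not part of dtrace or ddtrace.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    tsc: int   # timestamp counter value when the event was recorded
    cpu: int   # CPU whose buffer holds the event
    name: str


def merge_per_cpu_buffers(buffers: Iterable[List[Event]]) -> List[Event]:
    # Each buffer is assumed to be sorted by tsc already (append order),
    # so a k-way merge yields a single sequence ordered by tsc.
    return list(heapq.merge(*buffers, key=lambda e: e.tsc))


def happens_before(a: Event, b: Event) -> bool:
    # Between per-CPU buffers: tTSC(A) < tTSC(B) means A happens-before B.
    # Assumption: TSCs are synchronised across the CPUs being compared.
    return a.tsc < b.tsc


cpu0 = [Event(10, 0, "A"), Event(40, 0, "C")]
cpu1 = [Event(25, 1, "B")]
merged = merge_per_cpu_buffers([cpu0, cpu1])
```

TSC comparison does not extend between nodes, since separate machines do not share a timestamp counter; that is the case the slide assigns to the distributed commit log.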

TCP sequence numbers: snd_nxt/rcv_nxt
IPsec AH sequence number
[Diagram: sequence counters exchanged between nodes A and B]
Packet layout: IP header | AH header | TCP header | Data
int ipsec_checkreplay(u_int32_t seq, …);
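
One way to read this slide is that a sequence number already carried in the packet can serve as the label that links a send event on one node to the matching receive event on another. The sketch below shows that join under simplifying assumptions: the `PacketEvent` type and `link_send_to_recv` are hypothetical, the addresses are made up, and retransmissions and sequence-number wrap-around are ignored.

```python
from typing import Dict, List, NamedTuple, Tuple

Flow = Tuple[str, int, str, int]  # (src addr, src port, dst addr, dst port)


class PacketEvent(NamedTuple):
    node: str   # node on which the event was traced
    flow: Flow  # identifies the connection
    seq: int    # sequence number seen in the packet
    kind: str   # "send" or "recv"


def link_send_to_recv(
    events: List[PacketEvent],
) -> List[Tuple[PacketEvent, PacketEvent]]:
    # Index send events by (flow, seq), then look up each receive event.
    sends: Dict[Tuple[Flow, int], PacketEvent] = {}
    for e in events:
        if e.kind == "send":
            sends[(e.flow, e.seq)] = e
    edges = []
    for e in events:
        if e.kind == "recv":
            sender = sends.get((e.flow, e.seq))
            if sender is not None:
                # The send happens-before the matching receive.
                edges.append((sender, e))
    return edges


flow = ("10.0.0.1", 5000, "10.0.0.2", 80)
events = [
    PacketEvent("A", flow, 1, "send"),
    PacketEvent("B", flow, 1, "recv"),
    PacketEvent("B", flow, 7, "recv"),  # no matching send was traced
]
edges = link_send_to_recv(events)
```

Each returned pair is a cross-node happens-before edge; receive events with no traced send are simply left unlinked.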

Open questions
Is a distributed commit log the right abstraction? What are the semantics and performance required (how do they compare to what Kafka gives)?
Is a framework the right approach to solve a range of problems?
What infrastructure should we expect that people will stand up? Is software running on a JVM OK (sometimes, always)?
How do we get people interested and using our approach on real world problems?
How do we deal with reliability? How best to get event records out of the kernel?
