SlideShare a Scribd company logo
1 of 26
Download to read offline
Instrumenting the
real-time web:
Node.js, DTrace and the
Robinson Projection
Bryan Cantrill
VP, Engineering

bryan@joyent.com
@bcantrill
Node.js

   • node.js is a JavaScript-based framework for building
     event-oriented servers:
      var http = require(‘http’);

      http.createServer(function (req, res) {
             res.writeHead(200, {'Content-Type': 'text/plain'});
             res.end('Hello Worldn');
      }).listen(8124, "127.0.0.1");

      console.log(‘Server running at http://127.0.0.1:8124!’);
The energy behind Node.js

   • node.js is a confluence of three ideas:
      • JavaScriptʼs rich support for asynchrony (i.e. closures)
      • High-performance JavaScript VMs (e.g. V8)
      • The system abstractions that God intended (i.e. UNIX)
   • Because everything is asynchronous, node.js is ideal for
     delivering scale in the presence of long-latency events
Node Knockout

   • In August of last year, Joyent hosted “Node Knockout”, a
    programming competition for the nascent node.js
    environment
   • Weekend-long competition in which teams of one to four
    endeavored to build something complete with node
   • For Joyent, this presented an opportunity to understand
    and observe the new environment in the wild
   • What could we learn about these systems and what
    could we convey to the contestants in real-time?
A Node Knockout leaderboard?

   • Even though the contest was judged, could we provide a
    real-time leaderboard?
   • @ryahʼs idea: instrument incoming connections, trace
    the remote IP address and then geo-locate in real-time
   • Would allow a leaderboard to reflect number of unique
    IPs per contestant -- and where theyʼre coming from
   • Would need instrumentation to be entirely transparent;
    log analysis and other offline techniques are both
    suboptimal and overly invasive
   • These constraints are a natural fit for DTrace...
DTrace

   • Facility for dynamic instrumentation of production
     systems originally developed circa 2003 for Solaris 10
   • Open sourced (along with the rest of Solaris) in 2005;
     subsequently ported to many other systems
   • Support for arbitrary actions, arbitrary predicates, in
     situ data aggregation, statically-defined instrumentation
   • Designed for safe, ad hoc use in production: concise
     answers to arbitrary questions
   • But how to use DTrace to instrument contestants?
Node + DTrace

   • DTrace instruments the system holistically, which is to
    say, from the kernel, which poses a challenge for
    interpreted environments
   • User-level statically defined tracing (USDT) providers
    describe semantically relevant points of instrumentation
   • Some interpreted environments e.g., Ruby, Python,
    PHP) have added USDT providers that instrument the
    interpreter itself
   • This approach is very fine-grained (e.g., every function
    call) and doesnʼt work in JITʼd environments
   • We decided to take a different tack for Node
Node + DTrace

   • Given the nature of the paths that we wanted to
    instrument, we introduced a function into JavaScript that
    Node can call to get into USDT-instrumented C++
   • Introduces disabled probe effect: calling from JavaScript
    into C++ costs even when probes are not enabled
   • Use USDT is-enabled probes to minimize disabled
    probe effect once in C++
   • If (and only if) the probe is enabled, prepare a structure
    for the kernel that allows for translation into a structure
    that is familiar to node programmers
Node USDT Provider

   • Example one-liners:
     dtrace -n ‘node*:::http-server-request{
        printf(“%s of %s from %sn”, args[0]->method,
            args[0]->url, args[1]->remoteAddress)}‘

     dtrace -n http-server-request’{@[args[1]->remoteAddress] = count()}‘

     dtrace -n gc-start’{self->ts = timestamp}’ 
        -n gc-done’/self->ts/{@ = quantize(timestamp - self->ts)}’


   • A more interesting script:
     http-server-request
     {
            self->ts[args[1]->fd] = timestamp;
     }

     http-server-response
     /self->ts[args[0]->fd]/
     {
            @[zonename] = quantize(timestamp - self->ts[args[0]->fd]);
     }
Instrumenting Node Knockout

   • With a USDT provider in place for Node, we could
    instrument contestants in a meaningful way
   • But how can contestants be instrumented given that
    each is executing in their own virtualized environment?
OS Virtualization

                   • The Joyent cloud uses OS virtualization to achieve high
                    levels of tenancy without sacrificing performance:

                          AMQP agents
AMQP message bus




                          (global zone)
                                          Virtual OS                  Virtual OS                  Virtual OS                  ...   Virtual OS
                           Provisioner

                           Heartbeater    Virtual NIC                                                                                                           Compute node

                                                        Virtual NIC


                                                                      Virtual NIC


                                                                                    Virtual NIC


                                                                                                  Virtual NIC


                                                                                                                Virtual NIC




                                                                                                                                    Virtual NIC


                                                                                                                                                  Virtual NIC
                                               ...                                                                                                              Tens/hundreds per




                                                                           ...




                                                                                                       ...




                                                                                                                                         ...
                              ...                                                                                                                                  datacenter




                                                             ZFS-based multi-tenant filesystem
                        SmartOS kernel




                   • Allows for transparent instrumentation of all virtual OS
                    instances from the global zone via DTrace
Leaderboard architecture

   • Define connection establishment/teardown to be “ticks”
   • Have a daemon instrument all virtual OS instances from
     each compute nodeʼs global zone, recording remote IP
     address and collecting ticks in a ring buffer
   • Poll the data periodically from a centralized server,
     pulling together a merged stream of ticks and geo-
     locating IPs
   • Have HTTP clients periodically poll the server, and
     rendering new connections on a world map
Leaderboard architecture

                                          Virtual     Virtual   Virtual   Virtual
                             tickerd       OS          OS        OS        OS


                                          Virtual     Virtual   Virtual   Virtual
                            .d     data
                                           OS          OS        OS        OS


                                                    DTrace
               leaderd

                                          Virtual     Virtual   Virtual   Virtual
HTTP                         tickerd       OS          OS        OS        OS
       LB
                                          Virtual     Virtual   Virtual   Virtual
                            .d     data
                                           OS          OS        OS        OS

               leaderd
                                                    DTrace




                                          Virtual     Virtual   Virtual   Virtual
                             tickerd       OS          OS        OS        OS


                                          Virtual     Virtual   Virtual   Virtual
                            .d     data
                                           OS          OS        OS        OS


                                                    DTrace
Leaderboard architecture
                 1,000 tick ring buffer

            10,000 tick ring buffer                               Virtual     Virtual   Virtual   Virtual
                                                     tickerd       OS          OS        OS        OS


                                                                  Virtual     Virtual   Virtual   Virtual
                                                    .d     data
                                                                   OS          OS        OS        OS


                                                                            DTrace
                           leaderd

                                                                  Virtual     Virtual   Virtual   Virtual
HTTP                                                 tickerd       OS          OS        OS        OS
       LB                                    HTTP

                                                                  Virtual     Virtual   Virtual   Virtual
                                                    .d     data
                                                                   OS          OS        OS        OS

                           leaderd
                                                                            DTrace




                                                                  Virtual     Virtual   Virtual   Virtual
                                                     tickerd       OS          OS        OS        OS
                     every 500 ms
                     every 100 ms                                 Virtual     Virtual   Virtual   Virtual
                                                    .d     data
                                                                   OS          OS        OS        OS
                     every 100 ms

                            700 ms latency                                  DTrace
Building it

    • Necessitated a libdtrace add-on for node for tickerd:
         https://github.com/bcantrill/node-libdtrace

    • Used existing node-geoip add-on for leaderd, but
     ultimately wrote a (much) simpler add-on:
         https://github.com/bcantrill/node-libgeoip

    • Used HTTP + Keep-alive for leaderd/tickerd
    • Simple architecture; very quick to build: ~400 lines of
     node for leaderd, ~500 lines of node for tickerd
    • Surprisingly, most time-consuming and brittle part was
     adding git statistics to tickerd!
Front-end challenges

   • How to present the geo-located IP connection
     information (latitude and longitude) visually?
   • When a sphere is projected onto a flat surface,
     something has to give: distance, shape, size, bearing
   • The two projections most commonly used to visualize
     location are both undesirable...
Equirectangular Projection
Mercator Projection
Robinson Projection FTW!
Robinson “Projection”

   • Youʼd be forgiven for assuming that the Robinson is
     actually a projection; quite the contrary:
     “I started with a kind of artistic approach. I visualized the best-looking
     shapes and sizes. I worked with the variables until it got to the point
     where, if I changed one of them, it didn't get any better. Then I figured
     out the mathematical formula to produce that effect. Most mapmakers
     start with the mathematics.”
                                                      - Arthur H. Robinson

   • Not surprisingly, implementing this is a mess...
   • ...and if you get it only slightly wrong, itʼs obvious
   • But Joyentʼs @rob_ellis stepped up and pulled it off:
        http://github.com/silentrob/Robinson-Projection
Robinson-based Leaderboard!
Experiences

   • Leaderboard very quickly got 1,000+ active users
   • CPU utilization remained negligible (< 6% of one CPU),
    but network utilization became “interesting”
   • Over the 48 hours of the contest (and for the week
    afterward), no tickerd failed; leaderd died twice due to
    memory leaks in Node (since fixed)
   • Most significant issue was a per-contestant graph
    updating in real-time that caused the browser to crash
    after ~15 minutes (graph was removed Sat. AM)
   • Interesting (mesmerizing?) to watch real-time
    geo-located connection data as contestantsʼ entries
    went globally viral
Epitome of a broader shift?

    • As the competition unfolded, it became clear that the
     leaderboard typified the entrants: many were data-
     intensive real-time systems
    • Also typified many of the early adopters of node.js:
     many came from environments that had unacceptable
     outliers when used in data-intensive real-time systems
    • Acronym clearly called for; CRUD, ACID, BASE, CAP:
     meet DIRT!
    • That node.js is such a fit for DIRT highlights that long
     latency events (and not CPU time) are the impediment
     to web-facing real-time systems
The primacy of latency

   • As a reminder, a real-time system is one in which the
     correctness of the system is relative to its timeliness
   • In such a system, it does not make sense to measure
     operations per second!
   • The only metric that matters is latency
   • This is dangerous to distill to a single number; the
     distribution of latency over time is essential
   • This poses both instrumentation and visualization
     challenges!
Instrumenting the real-time web

   • Weʼve taken a swing at this with the new cloud analytics
     facility in our no.de environment, a public node.js PaaS:




   • ...but thereʼs much more to be done to understand the
     coming breed of DIRTy applications!
Thank you!

   • Node Knockout Leaderboard shout-outs: @rob_ellis,
    @jahoni, @yoheis and @brianleroux
   • Node Knockout guys: @visnup and @gerad
   • Node DTrace USDT integration: @ryah and @rmustacc
   • no.de cloud analytics: @dapsays, @rmustacc,
    @rob_ellis and @notmatt

More Related Content

What's hot

The printing processes
The printing processesThe printing processes
The printing processesJutka Czirok
 
Delamination in sheetfed offset printing
Delamination in sheetfed offset printingDelamination in sheetfed offset printing
Delamination in sheetfed offset printingHeidelberg India
 
4. essential elements for inkjet printing
4. essential elements for  inkjet printing4. essential elements for  inkjet printing
4. essential elements for inkjet printingAdane Nega
 
Adhesive Coating methods part 1 copy
Adhesive Coating methods  part 1   copyAdhesive Coating methods  part 1   copy
Adhesive Coating methods part 1 copySHRIKANT ATHAVALE
 
BOPP Presentation for Customers Meet .
BOPP Presentation for Customers Meet .BOPP Presentation for Customers Meet .
BOPP Presentation for Customers Meet .Ashish Kothavade
 
Printing techniques
Printing techniquesPrinting techniques
Printing techniquesJill Allden
 
Effects of Friction in Papermaking
Effects of Friction in PapermakingEffects of Friction in Papermaking
Effects of Friction in PapermakingPekka Komulainen
 
Evonik - Performance amaciantes Household 2016
Evonik - Performance amaciantes Household 2016Evonik - Performance amaciantes Household 2016
Evonik - Performance amaciantes Household 2016Neiri Bozzolo
 
Problem cases in press rooms
Problem cases in press roomsProblem cases in press rooms
Problem cases in press roomsHeidelberg India
 
Strength technologies for tissue makers
Strength technologies for tissue makersStrength technologies for tissue makers
Strength technologies for tissue makersKemira Oyj
 
What is offset printing
What is offset printing What is offset printing
What is offset printing Usman Ali
 
Designing Online Learning, Web 2.0 and Online Learning Resources
Designing Online Learning, Web 2.0 and Online Learning ResourcesDesigning Online Learning, Web 2.0 and Online Learning Resources
Designing Online Learning, Web 2.0 and Online Learning ResourcesSanjaya Mishra
 
Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...
Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...
Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...Stahl Holdings
 

What's hot (16)

The printing processes
The printing processesThe printing processes
The printing processes
 
Delamination in sheetfed offset printing
Delamination in sheetfed offset printingDelamination in sheetfed offset printing
Delamination in sheetfed offset printing
 
4. essential elements for inkjet printing
4. essential elements for  inkjet printing4. essential elements for  inkjet printing
4. essential elements for inkjet printing
 
Adhesive Coating methods part 1 copy
Adhesive Coating methods  part 1   copyAdhesive Coating methods  part 1   copy
Adhesive Coating methods part 1 copy
 
BOPP Presentation for Customers Meet .
BOPP Presentation for Customers Meet .BOPP Presentation for Customers Meet .
BOPP Presentation for Customers Meet .
 
Piant Film formation
Piant Film formationPiant Film formation
Piant Film formation
 
Printing techniques
Printing techniquesPrinting techniques
Printing techniques
 
Effects of Friction in Papermaking
Effects of Friction in PapermakingEffects of Friction in Papermaking
Effects of Friction in Papermaking
 
Evonik - Performance amaciantes Household 2016
Evonik - Performance amaciantes Household 2016Evonik - Performance amaciantes Household 2016
Evonik - Performance amaciantes Household 2016
 
Problem cases in press rooms
Problem cases in press roomsProblem cases in press rooms
Problem cases in press rooms
 
Web offset presses
Web offset pressesWeb offset presses
Web offset presses
 
Strength technologies for tissue makers
Strength technologies for tissue makersStrength technologies for tissue makers
Strength technologies for tissue makers
 
What is offset printing
What is offset printing What is offset printing
What is offset printing
 
Designing Online Learning, Web 2.0 and Online Learning Resources
Designing Online Learning, Web 2.0 and Online Learning ResourcesDesigning Online Learning, Web 2.0 and Online Learning Resources
Designing Online Learning, Web 2.0 and Online Learning Resources
 
Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...
Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...
Polycarbodiimides – Easy to use Crosslinkers in Aqueous Automotive Interior C...
 
Ppi intro to inks
Ppi   intro to inksPpi   intro to inks
Ppi intro to inks
 

Similar to Instrumenting the real-time web

OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosBrent Salisbury
 
Learn OpenStack from trystack.cn ——Folsom in practice
Learn OpenStack from trystack.cn  ——Folsom in practiceLearn OpenStack from trystack.cn  ——Folsom in practice
Learn OpenStack from trystack.cn ——Folsom in practiceOpenCity Community
 
Hardware assisted Virtualization in Embedded
Hardware assisted Virtualization in EmbeddedHardware assisted Virtualization in Embedded
Hardware assisted Virtualization in EmbeddedThe Linux Foundation
 
Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Brent Salisbury
 
Windows Azure and the Hybrid Cloud
Windows Azure and the Hybrid CloudWindows Azure and the Hybrid Cloud
Windows Azure and the Hybrid CloudWindows Azure
 
Overview Of Parallel Development - Ericnel
Overview Of Parallel Development -  EricnelOverview Of Parallel Development -  Ericnel
Overview Of Parallel Development - Ericnelukdpe
 
Project ACRN hypervisor introduction
Project ACRN hypervisor introduction Project ACRN hypervisor introduction
Project ACRN hypervisor introduction Project ACRN
 
Cvc2009 Moscow Xd3 Fabian Kienle Final
Cvc2009 Moscow Xd3  Fabian Kienle FinalCvc2009 Moscow Xd3  Fabian Kienle Final
Cvc2009 Moscow Xd3 Fabian Kienle FinalLiudmila Li
 
Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)Eric Van Hensbergen
 
Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009
Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009
Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009Vincenzo Iozzo
 
Windows server 8 and hyper v
Windows server 8 and hyper vWindows server 8 and hyper v
Windows server 8 and hyper vSusantha Silva
 
Windows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing PlatformWindows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing PlatformEduardo Castro
 
Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)
Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)
Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)Nate Lawson
 
Quantum PTL Update - Grizzly Summit.pptx
Quantum PTL Update - Grizzly Summit.pptxQuantum PTL Update - Grizzly Summit.pptx
Quantum PTL Update - Grizzly Summit.pptxOpenStack Foundation
 
Shape12 6
Shape12 6Shape12 6
Shape12 6pslulli
 
Quantum grizzly summit
Quantum   grizzly summitQuantum   grizzly summit
Quantum grizzly summitDan Wendlandt
 
Implementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migrationImplementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migrationYan Vugenfirer
 
Catan world and Churchill
Catan world and ChurchillCatan world and Churchill
Catan world and ChurchillGrant Goodale
 
NIAR_VRC_2010
NIAR_VRC_2010NIAR_VRC_2010
NIAR_VRC_2010fftoledo
 

Similar to Instrumenting the real-time web (20)

OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow Demos
 
Learn OpenStack from trystack.cn ——Folsom in practice
Learn OpenStack from trystack.cn  ——Folsom in practiceLearn OpenStack from trystack.cn  ——Folsom in practice
Learn OpenStack from trystack.cn ——Folsom in practice
 
Hardware assisted Virtualization in Embedded
Hardware assisted Virtualization in EmbeddedHardware assisted Virtualization in Embedded
Hardware assisted Virtualization in Embedded
 
Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012
 
Windows Azure and the Hybrid Cloud
Windows Azure and the Hybrid CloudWindows Azure and the Hybrid Cloud
Windows Azure and the Hybrid Cloud
 
Overview Of Parallel Development - Ericnel
Overview Of Parallel Development -  EricnelOverview Of Parallel Development -  Ericnel
Overview Of Parallel Development - Ericnel
 
Windows Server 2012 Hyper-V Networking Evolved
Windows Server 2012 Hyper-V Networking Evolved Windows Server 2012 Hyper-V Networking Evolved
Windows Server 2012 Hyper-V Networking Evolved
 
Project ACRN hypervisor introduction
Project ACRN hypervisor introduction Project ACRN hypervisor introduction
Project ACRN hypervisor introduction
 
Cvc2009 Moscow Xd3 Fabian Kienle Final
Cvc2009 Moscow Xd3  Fabian Kienle FinalCvc2009 Moscow Xd3  Fabian Kienle Final
Cvc2009 Moscow Xd3 Fabian Kienle Final
 
Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)Scalable Elastic Systems Architecture (SESA)
Scalable Elastic Systems Architecture (SESA)
 
Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009
Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009
Fun and Games with Mac OS X and iPhone Payloads, Black Hat Europe 2009
 
Windows server 8 and hyper v
Windows server 8 and hyper vWindows server 8 and hyper v
Windows server 8 and hyper v
 
Windows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing PlatformWindows Sql Azure Cloud Computing Platform
Windows Sql Azure Cloud Computing Platform
 
Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)
Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)
Don't Tell Joanna the Virtualized Rootkit is Dead (Blackhat 2007)
 
Quantum PTL Update - Grizzly Summit.pptx
Quantum PTL Update - Grizzly Summit.pptxQuantum PTL Update - Grizzly Summit.pptx
Quantum PTL Update - Grizzly Summit.pptx
 
Shape12 6
Shape12 6Shape12 6
Shape12 6
 
Quantum grizzly summit
Quantum   grizzly summitQuantum   grizzly summit
Quantum grizzly summit
 
Implementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migrationImplementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migration
 
Catan world and Churchill
Catan world and ChurchillCatan world and Churchill
Catan world and Churchill
 
NIAR_VRC_2010
NIAR_VRC_2010NIAR_VRC_2010
NIAR_VRC_2010
 

More from bcantrill

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Presentbcantrill
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmakingbcantrill
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...bcantrill
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsbcantrill
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systemsbcantrill
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolutionbcantrill
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Agebcantrill
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesbcantrill
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Lawbcantrill
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineeringbcantrill
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemapsbcantrill
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarebcantrill
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?bcantrill
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the unionbcantrill
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsbcantrill
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after darkbcantrill
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadershipbcantrill
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathbcantrill
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondbcantrill
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindbcantrill
 

More from bcantrill (20)

Predicting the Present
Predicting the PresentPredicting the Present
Predicting the Present
 
Sharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of ToolmakingSharpening the Axe: The Primacy of Toolmaking
Sharpening the Axe: The Primacy of Toolmaking
 
Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...Coming of Age: Developing young technologists without robbing them of their y...
Coming of Age: Developing young technologists without robbing them of their y...
 
I have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systemsI have come to bury the BIOS, not to open it: The need for holistic systems
I have come to bury the BIOS, not to open it: The need for holistic systems
 
Towards Holistic Systems
Towards Holistic SystemsTowards Holistic Systems
Towards Holistic Systems
 
The Coming Firmware Revolution
The Coming Firmware RevolutionThe Coming Firmware Revolution
The Coming Firmware Revolution
 
Hardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden AgeHardware/software Co-design: The Coming Golden Age
Hardware/software Co-design: The Coming Golden Age
 
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator tracesTockilator: Deducing Tock execution flows from Ibex Verilator traces
Tockilator: Deducing Tock execution flows from Ibex Verilator traces
 
No Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's LawNo Moore Left to Give: Enterprise Computing After Moore's Law
No Moore Left to Give: Enterprise Computing After Moore's Law
 
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software EngineeringAndreessen's Corollary: Ethical Dilemmas in Software Engineering
Andreessen's Corollary: Ethical Dilemmas in Software Engineering
 
Visualizing Systems with Statemaps
Visualizing Systems with StatemapsVisualizing Systems with Statemaps
Visualizing Systems with Statemaps
 
Platform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system softwarePlatform values, Rust, and the implications for system software
Platform values, Rust, and the implications for system software
 
Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?Is it time to rewrite the operating system in Rust?
Is it time to rewrite the operating system in Rust?
 
dtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the uniondtrace.conf(16): DTrace state of the union
dtrace.conf(16): DTrace state of the union
 
The Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systemsThe Hurricane's Butterfly: Debugging pathologically performing systems
The Hurricane's Butterfly: Debugging pathologically performing systems
 
Papers We Love: ARC after dark
Papers We Love: ARC after darkPapers We Love: ARC after dark
Papers We Love: ARC after dark
 
Principles of Technology Leadership
Principles of Technology LeadershipPrinciples of Technology Leadership
Principles of Technology Leadership
 
Zebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data pathZebras all the way down: The engineering challenges of the data path
Zebras all the way down: The engineering challenges of the data path
 
Platform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyondPlatform as reflection of values: Joyent, node.js, and beyond
Platform as reflection of values: Joyent, node.js, and beyond
 
Debugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mindDebugging under fire: Keeping your head when systems have lost their mind
Debugging under fire: Keeping your head when systems have lost their mind
 

Instrumenting the real-time web

  • 1. Instrumenting the real-time web: Node.js, DTrace and the Robinson Projection Bryan Cantrill VP, Engineering bryan@joyent.com @bcantrill
  • 2. Node.js • node.js is a JavaScript-based framework for building event-oriented servers: var http = require(‘http’); http.createServer(function (req, res) { res.writeHead(200, {'Content-Type': 'text/plain'}); res.end('Hello Worldn'); }).listen(8124, "127.0.0.1"); console.log(‘Server running at http://127.0.0.1:8124!’);
  • 3. The energy behind Node.js • node.js is a confluence of three ideas: • JavaScriptʼs rich support for asynchrony (i.e. closures) • High-performance JavaScript VMs (e.g. V8) • The system abstractions that God intended (i.e. UNIX) • Because everything is asynchronous, node.js is ideal for delivering scale in the presence of long-latency events
  • 4. Node Knockout • In August of last year, Joyent hosted “Node Knockout”, a programming competition for the nascent node.js environment • Weekend-long competition in which teams of one to four endeavored to build something complete with node • For Joyent, this presented an opportunity to understand and observe the new environment in the wild • What could we learn about these systems and what could we convey to the contestants in real-time?
  • 5. A Node Knockout leaderboard? • Even though the contest was judged, could we provide a real-time leaderboard? • @ryahʼs idea: instrument incoming connections, trace the remote IP address and then geo-locate in real-time • Would allow a leaderboard to reflect number of unique IPs per contestant -- and where theyʼre coming from • Would need instrumentation to be entirely transparent; log analysis and other offline techniques are both suboptimal and overly invasive • These constraints are a natural fit for DTrace...
  • 6. DTrace • Facility for dynamic instrumentation of production systems originally developed circa 2003 for Solaris 10 • Open sourced (along with the rest of Solaris) in 2005; subsequently ported to many other systems • Support for arbitrary actions, arbitrary predicates, in situ data aggregation, statically-defined instrumentation • Designed for safe, ad hoc use in production: concise answers to arbitrary questions • But how to use DTrace to instrument contestants?
  • 7. Node + DTrace • DTrace instruments the system holistically, which is to say, from the kernel, which poses a challenge for interpreted environments • User-level statically defined tracing (USDT) providers describe semantically relevant points of instrumentation • Some interpreted environments e.g., Ruby, Python, PHP) have added USDT providers that instrument the interpreter itself • This approach is very fine-grained (e.g., every function call) and doesnʼt work in JITʼd environments • We decided to take a different tack for Node
  • 8. Node + DTrace • Given the nature of the paths that we wanted to instrument, we introduced a function into JavaScript that Node can call to get into USDT-instrumented C++ • Introduces disabled probe effect: calling from JavaScript into C++ costs even when probes are not enabled • Use USDT is-enabled probes to minimize disabled probe effect once in C++ • If (and only if) the probe is enabled, prepare a structure for the kernel that allows for translation into a structure that is familiar to node programmers
  • 9. Node USDT Provider • Example one-liners: dtrace -n ‘node*:::http-server-request{ printf(“%s of %s from %sn”, args[0]->method, args[0]->url, args[1]->remoteAddress)}‘ dtrace -n http-server-request’{@[args[1]->remoteAddress] = count()}‘ dtrace -n gc-start’{self->ts = timestamp}’ -n gc-done’/self->ts/{@ = quantize(timestamp - self->ts)}’ • A more interesting script: http-server-request { self->ts[args[1]->fd] = timestamp; } http-server-response /self->ts[args[0]->fd]/ { @[zonename] = quantize(timestamp - self->ts[args[0]->fd]); }
  • 10. Instrumenting Node Knockout • With a USDT provider in place for Node, we could instrument contestants in a meaningful way • But how can contestants be instrumented given that each is executing in their own virtualized environment?
  • 11. OS Virtualization • The Joyent cloud uses OS virtualization to achieve high levels of tenancy without sacrificing performance: AMQP agents AMQP message bus (global zone) Virtual OS Virtual OS Virtual OS ... Virtual OS Provisioner Heartbeater Virtual NIC Compute node Virtual NIC Virtual NIC Virtual NIC Virtual NIC Virtual NIC Virtual NIC Virtual NIC ... Tens/hundreds per ... ... ... ... datacenter ZFS-based multi-tenant filesystem SmartOS kernel • Allows for transparent instrumentation of all virtual OS instances from the global zone via DTrace
  • 12. Leaderboard architecture • Define connection establishment/teardown to be “ticks” • Have a daemon instrument all virtual OS instances from each compute nodeʼs global zone, recording remote IP address and collecting ticks in a ring buffer • Poll the data periodically from a centralized server, pulling together a merged stream of ticks and geo- locating IPs • Have HTTP clients periodically poll the server, and rendering new connections on a world map
  • 13. Leaderboard architecture Virtual Virtual Virtual Virtual tickerd OS OS OS OS Virtual Virtual Virtual Virtual .d data OS OS OS OS DTrace leaderd Virtual Virtual Virtual Virtual HTTP tickerd OS OS OS OS LB Virtual Virtual Virtual Virtual .d data OS OS OS OS leaderd DTrace Virtual Virtual Virtual Virtual tickerd OS OS OS OS Virtual Virtual Virtual Virtual .d data OS OS OS OS DTrace
  • 14. Leaderboard architecture 1,000 tick ring buffer 10,000 tick ring buffer Virtual Virtual Virtual Virtual tickerd OS OS OS OS Virtual Virtual Virtual Virtual .d data OS OS OS OS DTrace leaderd Virtual Virtual Virtual Virtual HTTP tickerd OS OS OS OS LB HTTP Virtual Virtual Virtual Virtual .d data OS OS OS OS leaderd DTrace Virtual Virtual Virtual Virtual tickerd OS OS OS OS every 500 ms every 100 ms Virtual Virtual Virtual Virtual .d data OS OS OS OS every 100 ms 700 ms latency DTrace
  • 15. Building it • Necessitated a libdtrace add-on for node for tickerd: https://github.com/bcantrill/node-libdtrace • Used existing node-geoip add-on for leaderd, but ultimately wrote a (much) simpler add-on: https://github.com/bcantrill/node-libgeoip • Used HTTP + Keep-alive for leaderd/tickerd • Simple architecture; very quick to build: ~400 lines of node for leaderd, ~500 lines of node for tickerd • Surprisingly, most time-consuming and brittle part was adding git statistics to tickerd!
  • 16. Front-end challenges • How to present the geo-located IP connection information (latitude and longitude) visually? • When a sphere is projected onto a flat surface, something has to give: distance, shape, size, bearing • The two projections most commonly used to visualize location are both undesirable...
  • 20. Robinson “Projection” • Youʼd be forgiven for assuming that the Robinson is actually a projection; quite the contrary: “I started with a kind of artistic approach. I visualized the best-looking shapes and sizes. I worked with the variables until it got to the point where, if I changed one of them, it didn't get any better. Then I figured out the mathematical formula to produce that effect. Most mapmakers start with the mathematics.” - Arthur H. Robinson • Not surprisingly, implementing this is a mess... • ...and if you get it only slightly wrong, itʼs obvious • But Joyentʼs @rob_ellis stepped up and pulled it off: http://github.com/silentrob/Robinson-Projection
  • 22. Experiences • Leaderboard very quickly got 1,000+ active users • CPU utilization remained negligible (< 6% of one CPU), but network utilization became “interesting” • Over the 48 hours of the contest (and for the week afterward), no tickerd failed; leaderd died twice due to memory leaks in Node (since fixed) • Most significant issue was a per-contestant graph updating in real-time that caused the browser to crash after ~15 minutes (graph was removed Sat. AM) • Interesting (mesmerizing?) to watch real-time geo-located connection data as contestantsʼ entries went globally viral
  • 23. Epitome of a broader shift? • As the competition unfolded, it became clear that the leaderboard typified the entrants: many were data- intensive real-time systems • Also typified many of the early adopters of node.js: many came from environments that had unacceptable outliers when used in data-intensive real-time systems • Acronym clearly called for; CRUD, ACID, BASE, CAP: meet DIRT! • That node.js is such a fit for DIRT highlights that long latency events (and not CPU time) are the impediment to web-facing real-time systems
  • 24. The primacy of latency • As a reminder, a real-time system is one in which the correctness of the system is relative to its timeliness • In such a system, it does not make sense to measure operations per second! • The only metric that matters is latency • This is dangerous to distill to a single number; the distribution of latency over time is essential • This poses both instrumentation and visualization challenges!
  • 25. Instrumenting the real-time web • Weʼve taken a swing at this with the new cloud analytics facility in our no.de environment, a public node.js PaaS: • ...but thereʼs much more to be done to understand the coming breed of DIRTy applications!
  • 26. Thank you! • Node Knockout Leaderboard shout-outs: @rob_ellis, @jahoni, @yoheis and @brianleroux • Node Knockout guys: @visnup and @gerad • Node DTrace USDT integration: @ryah and @rmustacc • no.de cloud analytics: @dapsays, @rmustacc, @rob_ellis and @notmatt