0-60 with Goliath: High performance web services
1. 0-60 with Goliath
Building high performance (Ruby) web-services
Ilya Grigorik
@igrigorik
0-60 with Goliath @igrigorik
2. Brief History
Goliath @ PostRank
PostRank
- “Social Analytics”
- Rails frontend
- Ruby backend
- 95%+ of traffic via APIs
Goliath
- Goliath == v3 API stack
- Open sourced in 2011
- Growing community
3. [Diagram: Rails on top of multiple HTTP API services, each backed by SQL stores; one backed by Solr]
PRO
• Separate logical & physical services
• Easy to tune, easy to maintain, easy to “scale”
• Stable code, fault-tolerance
CON
• Higher upfront ops cost
• Lots of cross-service communication
5. www.goliath.io
• Single responsibility web-services
• Async HTTP response streaming + progressive notifications
• Async HTTP request streaming + progressive notifications
• Multiple requests within the same VM … lower ops costs
• Keep-alive support
• Pipelining support
… full HTTP 1.1 support
• Ruby API & “X-Ruby friendly” … Ruby polyglot!
• Easy to maintain & test
6. HTTP Pipelining + Keep-Alive 101
perhaps not as simple as it may seem…
7. HTTP Quiz
this is not a trick question…
conn = EM::HttpRequest.new('http://oredev.org')
r1 = conn.get :path => "2011/speakers", :keepalive => true # 250 ms
r2 = conn.get :path => "2011/faq" # 300 ms
# wait until done …
Total execution time is:
(a) 250 ms ~ 65% truthiness
(b) 300 ms ~ 25% truthiness *
(c) 550 ms ~ 10% truthiness **
Answer: All of the above!
8. HTTP 1.0
RFC 1945 (1996)
[Diagram: Client/Server timeline. Labels: 20 ms (network hop), TCP handshake, HTTP Request (headers, body), 40 ms processing, Multi-part body (*), Terminate connection]
+ 40ms TCP setup (network)
+ 20ms request (network)
+ 40ms processing
+ 20ms response (network)
66% of time in network overhead
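The 66% figure follows from the four timings above. A minimal sketch of that arithmetic (the hash keys and variable names are illustrative, not from the talk):

```ruby
# Timings from the slide, in milliseconds. The :network flag marks the
# steps that are network overhead rather than server-side work.
steps = [
  { name: 'TCP setup',  ms: 40, network: true  },
  { name: 'request',    ms: 20, network: true  },
  { name: 'processing', ms: 40, network: false },
  { name: 'response',   ms: 20, network: true  }
]

total_ms   = steps.sum { |s| s[:ms] }
network_ms = steps.select { |s| s[:network] }.sum { |s| s[:ms] }
overhead   = network_ms * 100 / total_ms # integer division, rounds down

puts "total: #{total_ms} ms, network: #{network_ms} ms, overhead: #{overhead}%"
```

Only 40 of the 120 ms is spent doing work on the server; the rest is connection setup and transfer.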
9. Research done at Google shows that an increase
from 5Mbps to 10Mbps results in a disappointing
5% improvement in page load times.
Or put slightly differently, a 10Mbps connection,
on average uses only 16% of its capacity.
http://bit.ly/oemX0I
10. HTTP 1.1
RFC 2616 (1999)
Keep-alive
• Re-use open connection
• No multiplexing, serial
• Default to “on”
Pipelining
• No multiplexing
• Parallel requests
11. Keep-alive
RFC 2616 (1999)
Request 1:
+ 40ms TCP setup (network)
+ 20ms request (network)
+ 40ms processing
+ 20ms response (network)
Request 2:
x 40ms TCP setup (network) … skipped, the connection is re-used
+ 20ms request (network)
+ 40ms processing
+ 20ms response (network)
200ms for two requests
Small win over HTTP 1.0
12. Keep-alive
RFC 2616 (1999)
Net::HTTP
Connection: close < ugh!
13. Pipelining
RFC 2616 (1999)
+ 40ms TCP setup (network)
+ 20ms request (network)
+ 40ms processing
+ 20ms response (network)
60% of time in network overhead
120ms for two requests: 50% improvement!
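To put the three connection models side by side, here is a rough sketch that totals the time for two requests under each one. It uses the per-step timings from the slides; the pipelining model assumes, as the slide does, that both exchanges overlap completely, and the lambda names are made up for illustration.

```ruby
# Timings from the slides, in milliseconds.
t = { setup: 40, request: 20, processing: 40, response: 20 }
exchange = t[:request] + t[:processing] + t[:response] # one request/response: 80 ms

# HTTP 1.0: a fresh TCP connection for every request.
http10 = ->(n) { n * (t[:setup] + exchange) }
# Keep-alive: one TCP setup, then the exchanges run one after another.
keep_alive = ->(n) { t[:setup] + n * exchange }
# Pipelining (simplified, as on the slide): all exchanges overlap completely.
pipelined = ->(_n) { t[:setup] + exchange }

totals = {
  'HTTP 1.0'   => http10.call(2),
  'keep-alive' => keep_alive.call(2),
  'pipelining' => pipelined.call(2)
}
totals.each { |mode, ms| puts "#{mode}: #{ms} ms" }

saving = (totals['HTTP 1.0'] - totals['pipelining']) * 100 / totals['HTTP 1.0']
puts "pipelining vs HTTP 1.0: #{saving}% less time"
```

Under these assumptions the totals are 240 ms, 200 ms and 120 ms, which matches the 50% improvement if the baseline is two HTTP 1.0 requests.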
14. Pipelining Quiz
RFC 2616 (1999)
Connection setup: 50ms
Request 1: 300ms
Request 2: 250ms
Total time:
(a) ~250 ms
(b) ~300 ms
(c) ~350 ms
(d) ~600 ms
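The slide leaves the answer open. One way to reason about it, assuming both requests are sent back-to-back on a single connection, the server works on them concurrently, and responses come back in request order, is sketched below (variable names are illustrative):

```ruby
setup_ms   = 50
request_ms = [300, 250] # request 1, request 2

# Without pipelining, the second request waits for the first to finish.
serial_ms = setup_ms + request_ms.sum

# With pipelining, the last response arrives once the slowest request is done.
pipelined_ms = setup_ms + request_ms.max

puts "serial: ~#{serial_ms} ms, pipelined: ~#{pipelined_ms} ms"
```

Under those assumptions the pipelined total lands on option (c), while the serial total corresponds to option (d).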
15. Benchmark client round-trip time (RTT),
not just the server processing time
* a public service announcement
16. There is just one small gotcha…
Making HTTP Pipelining Usable on the Open Web
http://tools.ietf.org/html/draft-nottingham-http-pipeline-01
17. HTTP in the wild
it’s a sad state of affairs
conn = EM::HttpRequest.new('http://gogaruco.com')
r1 = conn.get :path => "speakers.html", :keepalive => true # 250 ms
r2 = conn.get :path => "schedule.html" # 300 ms
Total execution time is:
(a) 250 ms ~ 65% truthiness … Keep-alive what? HTTP 1.0!
(b) 300 ms ~ 25% truthiness * … Good: Keep-alive + Pipelining
(c) 550 ms ~ 10% truthiness ** … Bad: Keep-alive + Garbage
“I’m confused”
Keep-alive: mostly works, yay!
Pipelining: disabled (except in Opera)
18. HTTP can be a high-performance transport
Goliath is our attempt to make it work
19. Goliath
Optimize bottom up + minimal client API
[Stack diagram, top to bottom; side annotations after “…”]
Client API
(optional) Fibers … “Sync API”, optional async
Middleware, Routing … async-rack
(streaming) HTTP Parser … HTTP 1.1; Ruby, JRuby, Rubinius …
EventMachine
Network
Diagram label: 0.3 ms
20. EventMachine Reactor
concurrency without threads
Reactor loop (pseudocode):
while true do
  timers
  network_io
  other_io
end
Usage:
p "Starting"
EM.run do
  p "Running in EM reactor"
end
p "won't get here"
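To make the reactor loop concrete, here is a toy, standard-library-only sketch of the same shape: a single loop that fires due timers and then waits briefly. It is not how EventMachine is implemented; `ToyReactor`, `add_timer` and `stop` are hypothetical names, and the short `sleep` stands in for waiting on network and other IO.

```ruby
# Toy reactor: one loop, no threads. Illustrative only.
class ToyReactor
  Timer = Struct.new(:fire_at, :block)

  def initialize
    @timers = []
    @running = false
  end

  # Schedule a block to run once, roughly `delay` seconds from now.
  def add_timer(delay, &block)
    @timers << Timer.new(now + delay, block)
  end

  def stop
    @running = false
  end

  # Run the setup block, then loop until stop is called.
  def run
    @running = true
    yield self if block_given?
    while @running
      due, @timers = @timers.partition { |t| t.fire_at <= now }
      due.sort_by(&:fire_at).each { |t| t.block.call }
      sleep 0.001 # stand-in for waiting on network_io / other_io
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

log = ['Starting']
ToyReactor.new.run do |reactor|
  log << 'Running in reactor'
  reactor.add_timer(0.02) { log << 'second timer' }
  reactor.add_timer(0.01) { log << 'first timer' }
  reactor.add_timer(0.03) { reactor.stop }
end
log << 'reactor stopped'
p log
```

Unlike the slide's example, this sketch stops the loop from a timer, so execution does continue past `run`. Both timers fire in deadline order even though only one thread ever runs.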
21. EventMachine
Non-blocking IO requires non-blocking drivers:
AMQP http://github.com/tmm1/amqp
MySQLPlus http://github.com/igrigorik/em-mysqlplus
Memcached http://github.com/astro/remcached
DNS http://github.com/astro/em-dns
Redis http://github.com/madsimian/em-redis
MongoDB http://github.com/tmm1/rmongo
HTTPRequest http://github.com/igrigorik/em-http-request
WebSocket http://github.com/igrigorik/em-websocket
Amazon S3 http://github.com/peritor/happening
And many others:
http://wiki.github.com/eventmachine/eventmachine/protocol-implementations
23. Async Request Processing
don’t need to wait for the full request…
Stack layer: (streaming) HTTP Parser
class AsyncUpload < Goliath::API
  def on_headers(env, headers)
    # headers is a Hash, so inspect it before appending to the String
    env.logger.info 'received headers: ' + headers.inspect
  end
  def on_body(env, data)
    env.logger.info 'received data chunk: ' + data
  end
  def on_close(env)
    env.logger.info 'closing connection'
  end
  def response(env)
    # called when request processing is complete
  end
end
24. Async/Streaming Response
don’t need to render full response…
Stack layer: (streaming) HTTP Parser
class Stream < Goliath::API
  def response(env)
    pt = EM.add_periodic_timer(1) { env.stream_send("hello") }
    EM.add_timer(10) do
      pt.cancel
      env.stream_send("goodbye!")
      env.stream_close
    end
    streaming_response 202, {'X-Stream' => 'Goliath'}
  end
end
25. Web-Sockets
simple backend extension
Stack layer: (streaming) HTTP Parser
class Websocket < Goliath::WebSocket
  def on_open(env)
    env.logger.info "WebSocket opened"
  end
  def on_message(env, msg)
    env.logger.info "WebSocket message: #{msg}"
  end
  def on_close(env)
    env.logger.info "WebSocket closed"
  end
  def on_error(env, error)
    env.logger.error error
  end
end
26. Middleware
No rackup file
Stack layer: Middleware, Routing
class Hello < Goliath::API
  use Goliath::Rack::Params
  use Goliath::Rack::JSONP
  use Goliath::Rack::Validation::RequestMethod, %w(GET)
  use Goliath::Rack::Validation::RequiredParam, {:key => 'echo'}
  def response(env)
    [200, {}, {pong: params['echo']}]
  end
end
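For readers new to Rack-style middleware, the idea behind the `use` lines can be sketched in plain Ruby: each middleware wraps the next app and either short-circuits or passes the request on. The classes below (`OnlyMethods`, `RequireParam`) and the `'params'` env key are hypothetical stand-ins for illustration, not Goliath's own middleware.

```ruby
# Rejects requests whose HTTP method is not in the allowed list.
class OnlyMethods
  def initialize(app, methods)
    @app = app
    @methods = methods
  end

  def call(env)
    unless @methods.include?(env['REQUEST_METHOD'])
      return [405, {}, { error: 'method not allowed' }]
    end

    @app.call(env)
  end
end

# Rejects requests that lack a required parameter.
class RequireParam
  def initialize(app, key)
    @app = app
    @key = key
  end

  def call(env)
    params = env.fetch('params', {})
    return [400, {}, { error: "#{@key} is required" }] unless params.key?(@key)

    @app.call(env)
  end
end

# Innermost app: echoes the parameter back, like Hello#response above.
hello = ->(env) { [200, {}, { pong: env['params']['echo'] }] }

# Outermost middleware first, mirroring the order of successive `use` lines.
stack = OnlyMethods.new(RequireParam.new(hello, 'echo'), %w[GET])

ok      = stack.call({ 'REQUEST_METHOD' => 'GET',  'params' => { 'echo' => 'hi' } })
missing = stack.call({ 'REQUEST_METHOD' => 'GET',  'params' => {} })
posted  = stack.call({ 'REQUEST_METHOD' => 'POST', 'params' => { 'echo' => 'hi' } })

p ok, missing, posted
```

The endpoint itself stays free of validation code; each concern lives in its own small wrapper.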
27. Routing
simple and powerful
Stack layer: Middleware, Routing
class Bonjour < Goliath::API
  def response(env)
    [200, {}, "bonjour!"]
  end
end
class RackRoutes < Goliath::API
  map '/version' do
    run Proc.new { |env| [200, {}, ["Version 0.1"]] }
  end
  get "/bonjour", Bonjour
  not_found('/') do
    # run Proc.new { ... }
  end
end
28. Ruby Fibers + Goliath
synchronous API for asynchronous processing
[Stack diagram, top to bottom]
Client API
(optional) Fibers
Middleware, Routing
(streaming) HTTP Parser
EventMachine
Network
29. Ruby 1.9 Fibers
and cooperative scheduling
Ruby 1.9 Fibers are a means of creating code blocks which can be paused and resumed by our application (think lightweight threads, minus the thread scheduler and less overhead).
f = Fiber.new {
  while true do
    Fiber.yield "Hi"
  end
}
p f.resume # => "Hi"
p f.resume # => "Hi"
p f.resume # => "Hi"
Manual / cooperative scheduling!
http://bit.ly/d2hYw0
30. Ruby 1.9 Fibers
and cooperative scheduling
Fibers vs Threads: creation time much lower
Fibers vs Threads: memory usage is much lower
http://bit.ly/aesXy5
31. Untangling Evented Code with Fibers
http://bit.ly/d2hYw0
Exception, async!
def query(sql)
  f = Fiber.current
  conn = EventMachine::MySQL.new(:host => 'localhost')
  q = conn.query(sql)
  # hand the outcome back to the paused fiber once the driver calls back
  q.callback { |res| f.resume(res) }
  q.errback { |err| f.resume(err) }
  return Fiber.yield
end
EventMachine.run do
  Fiber.new {
    res = query('select sleep(1)')
    puts "Results: #{res.fetch_row.first}"
  }.resume
end
32. Untangling Evented Code with Fibers
http://bit.ly/d2hYw0
def query(sql)
  f = Fiber.current
  conn = EventMachine::MySQL.new(:host => 'localhost')
  q = conn.query(sql)
  q.callback { |res| f.resume(res) }
  q.errback { |err| f.resume(err) }
  return Fiber.yield
end
EventMachine.run do
  Fiber.new { # 1. Wrap into a continuation
    res = query('select sleep(1)')
    puts "Results: #{res.fetch_row.first}"
  }.resume
end
33. Untangling Evented Code with Fibers
http://bit.ly/d2hYw0
def query(sql)
  f = Fiber.current
  conn = EventMachine::MySQL.new(:host => 'localhost')
  q = conn.query(sql)
  q.callback { |res| f.resume(res) }
  q.errback { |err| f.resume(err) }
  return Fiber.yield # 2. Pause the continuation
end
EventMachine.run do
  Fiber.new {
    res = query('select sleep(1)')
    puts "Results: #{res.fetch_row.first}"
  }.resume
end
34. Untangling Evented Code with Fibers
http://bit.ly/d2hYw0
def query(sql)
  f = Fiber.current
  conn = EventMachine::MySQL.new(:host => 'localhost')
  q = conn.query(sql)
  q.callback { |res| f.resume(res) } # 3. Resume the continuation
  q.errback { |err| f.resume(err) }
  return Fiber.yield
end
EventMachine.run do
  Fiber.new {
    res = query('select sleep(1)') # Fixed!
    puts "Results: #{res.fetch_row.first}"
  }.resume
end
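The three steps (wrap, pause, resume) can be tried without EventMachine or MySQL. The sketch below swaps in a hypothetical callback-style `async_query` and a queue of pending callbacks as a stand-in reactor; only the `Fiber` calls are the same as in the slides.

```ruby
require 'fiber' # needed for Fiber.current on older Rubies; harmless on newer ones

pending = [] # callbacks the stand-in "reactor" will run later

# Hypothetical async driver: returns immediately, calls back later.
async_query = lambda do |sql, &callback|
  pending << -> { callback.call("result of #{sql}") }
end

# Synchronous-looking wrapper around the callback API.
query = lambda do |sql|
  fiber = Fiber.current
  async_query.call(sql) { |res| fiber.resume(res) } # 3. resume with the result
  Fiber.yield                                       # 2. pause until resumed
end

log = []
Fiber.new do                                        # 1. wrap the request code
  log << 'before query'
  res = query.call('select 1')
  log << "Results: #{res}"
end.resume
log << 'fiber paused, reactor is free'

# Stand-in reactor tick: run whatever callbacks are waiting.
pending.shift.call until pending.empty?

puts log
```

The code inside the fiber reads top to bottom like blocking code, yet control returns to the caller while the query is outstanding, which is the effect the slides describe.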
35. www.goliath.io
• Single responsibility web-services
• Async HTTP response streaming + progressive notifications
• Async HTTP request streaming + progressive notifications
• Multiple requests within the same VM
• Keep-alive support
• Pipelining support
• Ruby API & “X-Ruby friendly”
• Easy to maintain & test
Goliath automatically wraps each incoming request into a Ruby fiber, allowing us to hide the async complexity from the developer.
36. Simple Goliath server
require 'goliath'
class Hello < Goliath::API
  def response(env)
    [200, {}, "Hello World"]
  end
end
$> ruby hello.rb -sv -p 8000 -e production
Hello World
40. Integration Testing
simple end-to-end testing
describe HttpLog do
  it 'forwards to our API server' do
    with_api(HttpLog, api_options) do |api|
      get_request({}, err) do |c|
        c.response_header.status.should == 200
        c.response_header['X-Header'].should == 'Header'
        c.response.should == 'Hello from Responder'
      end
    end
  end
end
43. gem install goliath
Goliath
https://goliath.io/
https://github.com/postrank-labs/goliath/
Peepcode
http://peepcode.com/products/eventmachine-ii
http://peepcode.com/products/eventmachine-i
Phew, time for questions?
hope this convinced you to explore the area further…