Presented at PyCon UK 2018 (18 September 2018, Cardiff).
The slides are incomplete.
Recording available at:
https://www.youtube.com/watch?v=-weU0Zy4Yd8
PyConUK 2018 - Journey from HTTP to gRPC
1. A journey from HTTP
to gRPC
Tati Al-Chueyr
@tati_alchueyr
PyCon UK 2018
15 September 2018, Cardiff
2. tati_alchueyr.__doc__
● computer engineer from Unicamp
● senior data engineer at BBC
(previously engineer at EF, globo.com &
Ministry of Science and Technology of Brazil)
● open source enthusiast
● pythonista since 2003
● Amanda’s mummy
3. bbc.datalab.mission
“Bring the BBC’s data together accessible
through a common platform, along with flexible
and scalable tools to support machine learning
to enable content enrichment and deeper
personalization”
16. the recommender platform started being built
[diagram: view, orchestrate and recommend microservices on top of a content database (version 1)]
the first generation of microservices was developed using ...
17. the microservices started to grow up
[diagram: view, orchestrate, recommend and the content database exchanging JSON over HTTP (version 1)]
they happily learned how to talk to each other using JSON over HTTP
18. and their contracts were defined using swagger
[diagram: view, orchestrate, recommend and the content database (version 1)]
19. however...
[diagram: view, orchestrate, recommend and the content database (version 1)]
although for some tasks they performed well, for others they would have huge latencies if there were concurrent users
20. flask_api.py (Flask==1.0.2)

import time

from flask import Flask

app = Flask(__name__)


@app.route("/io")
def io():
    # I/O bound task: just wait for two seconds
    time.sleep(2)
    return "IO bound task completed"


@app.route("/cpu")
def cpu():
    # CPU bound task: a pure-Python arithmetic loop
    total = 0
    for i in range(19840511):
        total += i * i
    return "CPU bound task completed"

$ FLASK_APP=flask_api.py flask run
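The next slides benchmark these two endpoints with boom, hitting them with 10 concurrent requests. As a rough stand-in for that tool, here is a minimal sketch (not from the talk; it assumes the app above is running locally and that the requests library is installed) that fires ten concurrent GETs at /io and times them:

import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://127.0.0.1:5000/io"  # the I/O bound endpoint defined above

start = time.time()
with ThreadPoolExecutor(max_workers=10) as pool:
    # ten concurrent GET requests, roughly what `boom -c 10 -n 10` does
    list(pool.map(lambda _: requests.get(URL), range(10)))
print(f"10 concurrent calls took {time.time() - start:.2f}s")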
21. $ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 2.0495 s
Average 2.0302 s
Fastest 2.0227 s
Slowest 2.0377 s
Amplitude 0.0150 s
Standard deviation 0.004716
RPS 4
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
22. $ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 19.2738 s
Average 18.8230 s
Fastest 18.1451 s
Slowest 19.2623 s
Amplitude 1.1172 s
Standard deviation 0.379198
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
23. why?
Flask uses Werkzeug as its WSGI server
● by default Werkzeug uses threads
Python has a GIL (Global Interpreter Lock)
● only one thread can execute Python bytecode at a time
● the threads fight for the CPU, blocking one another all the way and returning their responses to the users at almost the same time (see the sketch below)
● this does not affect I/O, but it does affect computations
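A minimal sketch (not from the slides) of that GIL effect on the CPU bound loop used above: two threads running the same pure-Python arithmetic take roughly as long as running it twice sequentially, because only one of them can execute bytecode at a time.

import threading
import time

def cpu_task():
    total = 0
    for i in range(19840511):
        total += i * i

start = time.time()
cpu_task()
cpu_task()
print(f"sequential: {time.time() - start:.2f}s")

start = time.time()
threads = [threading.Thread(target=cpu_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# roughly the same total time as the sequential run, because of the GIL
print(f"threaded:   {time.time() - start:.2f}s")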
26. a gunicorn microservice
[diagram: a microservice master process with n synchronous workers (version 2)]
and with synchronous gunicorn workers, each microservice could now respond to n concurrent requests
27. flask_api.py (unchanged), Flask==1.0.2, gunicorn==19.9.0

import time

from flask import Flask

app = Flask(__name__)


@app.route("/io")
def io():
    time.sleep(2)
    return "IO bound task completed"


@app.route("/cpu")
def cpu():
    total = 0
    for i in range(19840511):
        total += i * i
    return "CPU bound task completed"

changed: the app is now served by gunicorn instead of the Flask development server
$ gunicorn --bind 0.0.0.0:5000 --workers 1 flask_api:app
28. $ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 20.0589 s
Average 11.0316 s
Fastest 2.0171 s
Slowest 20.0579 s
Amplitude 18.0407 s
Standard deviation 5.754321
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
(1 synchronous gunicorn worker)
29. $ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 10.0679 s
Average 6.0398 s
Fastest 2.0494 s
Slowest 10.0412 s
Amplitude 7.9918 s
Standard deviation 2.823781
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
(2 synchronous gunicorn workers)
30. $ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 2.0541 s
Average 2.0300 s
Fastest 2.0233 s
Slowest 2.0432 s
Amplitude 0.0199 s
Standard deviation 0.007658
RPS 4
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
(10 synchronous gunicorn workers)
31. $ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 18.3119 s
Average 10.0470 s
Fastest 1.8415 s
Slowest 18.3008 s
Amplitude 16.4593 s
Standard deviation 5.244141
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
(1 synchronous gunicorn worker)
32. $ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 10.5875 s
Average 6.2530 s
Fastest 2.8555 s
Slowest 10.5788 s
Amplitude 7.7232 s
Standard deviation 2.670508
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
(3 synchronous gunicorn workers)
33. $ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 10.1825 s
Average 7.6599 s
Fastest 5.9517 s
Slowest 10.1776 s
Amplitude 4.2260 s
Standard deviation 2.038286
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
(5 synchronous gunicorn workers)
34. why?
Gunicorn, by default, spawns a synchronous process for each worker
● the number of workers limits the number of concurrent requests; for this reason I/O was affected negatively (compared to the previous "no limit" threaded setup)
● for CPU there is a significant improvement over pure Flask/Werkzeug, since the service can now handle requests concurrently without one request having to wait for all the others to finish
● there is a limit to the number of workers (suggested: (2 x $num_cores) + 1); beyond that point they start thrashing system resources, decreasing the throughput of the entire system (see the sketch below)
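A minimal sketch of that suggestion, deriving the worker count from the core count in a gunicorn configuration file (the file name gunicorn_conf.py is an assumption, not from the slides):

# gunicorn_conf.py (hypothetical file name)
import multiprocessing

# suggested upper bound: (2 x $num_cores) + 1
workers = (2 * multiprocessing.cpu_count()) + 1
bind = "0.0.0.0:5000"

It could then be used with: $ gunicorn -c gunicorn_conf.py flask_api:app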
36. stronger forces decided to replace them by ...
[diagram: view, orchestrate, recommend and the content database (version 1)]
with the promise of higher performance and free type checking
37. along came the new generation...
[diagram: view, orchestrate, recommend and the content database (version 3)]
there was a learning curve towards gRPC
38. the microservices started to grow up
[diagram: view, orchestrate, recommend and the content database; the external-facing links still use JSON over HTTP, while the internal links use protocol buffers (pb3) over TCP (version 3)]
and in two months' time most of the microservices started talking protocol buffers over TCP
39. grpc.proto

syntax = "proto3";

message Empty {
}

service Sample {
    rpc IntenseProcess(Empty) returns (Empty) {}
    rpc IntenseIO(Empty) returns (Empty) {}
}

$ pip install grpcio==1.15.0 grpcio-tools==1.15.0
$ python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. grpc.proto
$ ls
grpc_pb2.py
grpc_pb2_grpc.py

The 2 in pb2 indicates that the generated code follows Protocol Buffers Python API version 2. It has no relation to the Protocol Buffers language version, which is the one indicated by the syntax statement in the .proto file.
40. grpc_server.py, grpcio==1.15.0, grpcio-tools==1.15.0

import time
from concurrent import futures

import grpc

import grpc_pb2
import grpc_pb2_grpc


class SampleServicer(grpc_pb2_grpc.SampleServicer):

    def IntenseProcess(self, request, context):
        # CPU bound: the same arithmetic loop as the Flask /cpu endpoint
        total = 0
        for i in range(19840511):
            total += i * i
        return grpc_pb2.Empty()

    def IntenseIO(self, request, context):
        # I/O bound: the same 2 second sleep as the Flask /io endpoint
        time.sleep(2)
        return grpc_pb2.Empty()


server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
grpc_pb2_grpc.add_SampleServicer_to_server(SampleServicer(), server)
print('Starting server. Listening on port 50051.')
server.add_insecure_port('[::]:50051')
server.start()

try:
    while True:
        time.sleep(86400)
except KeyboardInterrupt:
    server.stop(0)

$ python grpc_server.py
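The matching client is not shown in the slides. A minimal sketch, assuming the generated grpc_pb2 / grpc_pb2_grpc modules from the previous slide are on the path and the server above is running:

# grpc_client.py (hypothetical file name, not from the slides)
import grpc

import grpc_pb2
import grpc_pb2_grpc

channel = grpc.insecure_channel('localhost:50051')
stub = grpc_pb2_grpc.SampleStub(channel)

stub.IntenseIO(grpc_pb2.Empty())       # blocks for ~2 s while the server sleeps
stub.IntenseProcess(grpc_pb2.Empty())  # waits for the server's CPU bound loop
print('both RPCs completed')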
48. flask_api.py (unchanged), Flask==1.0.2, gunicorn==19.9.0

import time

from flask import Flask

app = Flask(__name__)


@app.route("/io")
def io():
    time.sleep(2)
    return "IO bound task completed"


@app.route("/cpu")
def cpu():
    total = 0
    for i in range(19840511):
        total += i * i
    return "CPU bound task completed"

changed: the gunicorn worker class is switched to gevent
$ gunicorn --bind 0.0.0.0:5000 --workers 1 -k gevent flask_api:app
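Why the single gevent worker behaves differently on the next slide: gunicorn's gevent worker monkey-patches blocking standard-library calls, so a greenlet sleeping in time.sleep yields to other greenlets instead of blocking the whole worker. A minimal sketch of that idea (not from the slides; it assumes gevent is installed, which the -k gevent worker class requires anyway):

from gevent import monkey
monkey.patch_all()  # replace blocking stdlib calls with cooperative versions

import time

import gevent

def io_task():
    time.sleep(2)  # now yields to other greenlets instead of blocking

start = time.time()
jobs = [gevent.spawn(io_task) for _ in range(10)]
gevent.joinall(jobs)
# ten 2-second sleeps finish in roughly 2 seconds of wall-clock time
print(f"10 greenlets took {time.time() - start:.2f}s")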
49. $ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 2.0366 s
Average 2.0219 s
Fastest 2.0185 s
Slowest 2.0252 s
Amplitude 0.0066 s
Standard deviation 0.001698
RPS 4
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index