This one-of-a-kind presentation compares Linux host, Oracle WebLogic Server 12c, and Oracle Database 18c performance across the leading compute cloud providers: Oracle Cloud, Amazon Web Services, Microsoft Azure, Google Cloud, and IBM Cloud. Join us to see actual results and findings as they pertain to IaaS performance.
This is practically the only presentation of its kind with actual published results for numerous performance metrics across the five leading compute cloud providers. Attendees will also learn about provisioning challenges and other non-performance factors in cloud provider selection.
Compute Cloud Performance Showdown: Amazon Web Services, Oracle Cloud, IBM Cloud, Google Cloud, Microsoft Azure
1. Session ID: 176
Compute Cloud Performance Showdown
Amazon Web Services, Oracle Cloud, IBM Cloud, Google Cloud, Microsoft Azure
Wednesday, April 10, 2019 @ 8:00am
CC 2ND FL 210B
Prepared by: Ahmed Aboulnaga (@Ahmed_Aboulnaga)
4. About Me
Ahmed Aboulnaga
• Master's degree in Computer Science from George Mason University
• Recent emphasis on cloud, DevOps, middleware, and security in current projects
• Oracle ACE, OCE, OCA
• Author, Blogger, Presenter
• @Ahmed_Aboulnaga
6. Disclaimer
• The test results documented in this presentation should not be considered definitive
• Several details surrounding setup, configuration, and assumptions regarding the test cases are not documented in this presentation for space considerations
• Results can vary with repeated testing:
– Ongoing/unknown backend and hardware changes at each provider
– Varying load on backend hardware due to multitenancy
– Lack of backend access makes analyzing results difficult at times
• Testing limitations exist (see next slides)
7. Objective
• Conduct tests to compare compute cloud performance across 5 cloud providers:
– Amazon Web Services, Oracle Cloud, IBM Cloud, Google Cloud, Microsoft Azure
– Against comparable medium-sized instance types
• Compare performance of the following:
– Linux Host
– Oracle WebLogic Server 12c
– Oracle Database 18c
8. Testing Limitations
• Tested on a single virtual machine (i.e., did not recreate instances and retest)
• Did not test different instance types
• Did not test across different data centers
• All virtual machines are multitenant (i.e., no dedicated hardware)
• All instances configured identically, but no tuning or provider-specific optimizations performed
9. Performance Results
• Difficult to perform an apples-to-apples comparison of cloud providers on both performance and cost-value
• Nothing alarming in the performance results; more powerful CPUs yielded
better performance
• Host performance:
– AWS has a slight processing edge due to a newer, higher-end CPU model
• Oracle WebLogic Server performance:
– Azure slightly underperforms compared to the other providers
• Oracle Database performance:
– Azure consistently has the poorest throughput and performance
10. Conclusion
• Performance for medium-sized compute cloud footprints is not a driver in cloud provider selection; however:
– Consider alternatives to Microsoft Azure, as it is the lowest performing of all
• Other non-performance related factors can affect the overall experience
– Consider Amazon Web Services to experience the least amount of issues
– Consider Oracle Cloud for cost reasons
– Consider alternatives to IBM Cloud and Google Cloud to avoid instance loss
– Consider alternatives to Google Cloud for support reasons
18. Billing Estimates
• Billing remains confusing for pay-as-you-go plans and is difficult to predict or estimate
• Costs are easy to interpret on the AWS and Oracle billing dashboards
• Costs are difficult to interpret on the IBM, Google, and Azure billing dashboards
19. Compute Cost Comparison
• Costs below are based on official pricing sheets, not actuals
• Though instance specs are nearly identical, some variances exist (see Oracle Cloud's memory and IBM Cloud's core count below)

Cloud Provider       Instance Type   vCPUs  RAM     Cores  Cost (per hour)  Cost (per month)
Amazon Web Services  m5.4xlarge      16     64 GB   8      $0.8980          $646.56
Oracle Cloud         VM.Standard2.8  16     120 GB  8      $0.5104          $367.49
IBM Cloud            B1.16x64        16     64 GB   16     $0.7722          $555.98
Google Cloud         (Custom)        16     64 GB   8      $0.9183          $661.18
Microsoft Azure      D16s_v3         16     64 GB   8      $0.8980          $646.56
* CPU/memory cost only (excludes disk, firewall, static IPs, load balancers, etc.)
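(For reference, the per-month column appears to be the hourly rate multiplied by a 720-hour, i.e., 30-day, month: $0.8980/hour × 720 hours = $646.56/month for AWS, and the same multiplier reproduces all five monthly figures.)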
20. Monthly Compute Cost Comparison
• Cost per month (CPU/memory only, based on official pricing sheets, not actuals):
AWS $647, Oracle $367, IBM $556, Google $661, Azure $647
21. Official Pricing Sheets Online
(Screenshots of the providers' online pricing sheets)
22. Amusing Experiences
• Google Cloud:
– Does not allow the use of a Google Voice number during registration
– Requires your date of birth and gender during signup
• IBM Cloud:
– You must be 16 years or older to use IBM Cloud
– Support has direct access to your VM to help with problem resolution
• Microsoft Azure:
– Credit card page only works in Internet Explorer
– Microsoft Azure wants you to use RDP when connecting to Linux GUI
– “You can't sign up here with a work or school email address. Use a personal
email, such as Gmail or Yahoo!, or get a new Outlook email.”
23. Amazon Web Services: Account Upgrade Delay
• Took 2 calls to Support to get upgraded
– Normally 15 minutes, took 3 hours
24. Oracle Cloud: Major Account Upgrade Delay
• Upgrading from free account to paid account took 8 days
– Account upgrade from all other providers completed the same day
– Despite escalation via 4 email threads and a Sev1 SR
• Oracle Sev1 SR:
– Oracle Support: “Please, if you need additional help or if you want to upgrade for
an Enterprise one, please contact Oracle Sales Representative and
I am sure they will provide all information you need.”
– Me: “I have 4 email threads about this issue going on with various
people at Oracle, including Oracle Sales, Technical Solutions
Engineer, Account Manager, and Cloud Trial Coordinator. Two of
them said to create an Oracle SR for it because they don't know
why I'm unable to create the shape that I want.”
25. IBM Cloud: Poor Firewall Management
• IBM Cloud only has 4 non-customizable firewall rules to choose from
– HTTP 80, HTTPS 443, SSH 22, or all ports
– Or upgrade to a $1,000 to $2,000 firewall that can only be paid for via PayPal
26. IBM Cloud: Limited Network Speed Options
• Max network speed is 1 Gbps
– Vs. 10 Gbps for AWS, 5.6-8.2 Gbps for Oracle Cloud
27. IBM Cloud: OS Reload Issue
• OS Reload reloads and restores the OS to its original working order, or reconfigures a device with different software
• Estimated to take 49 minutes, it ran for 26+ hours and never finished
• It destroyed the VM, rendering it inaccessible, and the VM was eventually auto-deleted
28. IBM Cloud: Account Disabled
• Account was disabled after a few days with no indication; clicking links simply logs you out
• IBM Cloud Support first stated it was disabled due to DDoS, but later re-enabled
it without question
29. IBM Cloud: Invalid Estimated Cost
• Dashboard showed estimated charge of $726 for 15 minutes of use
30. IBM Cloud: Registration Email not from cloud.ibm.com
• IBM uses SendGrid for email registration: https://u2042770.ct.sendgrid.net/
31. Google Cloud: Poor Support
• Confusing: Billing Account, Billing Profile, and Payment Profile are different things
• After upgrading from a free to a paid account, Google Cloud revoked all access to all services after 7 days and required "verification"
• Google Cloud then deleted my billing ID (due to a bug?); I lost everything, and support then refused to help because I had no billing ID
• Every support call and ticket gets a response in "24-48 hours" regardless of severity
32. Google Cloud: Complete Instance Loss
• Performed a Red Hat OS update, which rendered the VM inaccessible and unavailable after reboot (i.e., full instance loss)
– Same operation worked fine with Oracle and Azure
33. Microsoft Azure: Delayed Firewall Rules
• On several occasions, firewall rules did not take effect immediately
– Often up to 10 minutes
– Forced to use "IP flow verify" multiple times for troubleshooting and verification
34. Microsoft Azure: Frequent Console Errors
• Two instances of console errors:
– Services and/or data unavailable (see screenshot)
– Support console error, lost all access to existing ticket (still accessible via email though)
35. Free Technical Support
• All cloud providers offer multiple paid support plans
• Only Oracle Cloud and IBM Cloud provide technical support at no extra cost
37. Virtual Machine Specifications
• Instances configured identically, with some variances (noted below)
Amazon Web Services: Region N. Virginia; m5.4xlarge; 16 vCPUs; 64 GB; kernel 3.10.0-957.el7.x86_64; RHEL 7.6; 8 cores / 16 threads; Intel Xeon Platinum 8175M @ 2.50 GHz
Oracle Cloud: Region US-ASHBURN-AD-1; VM.Standard2.8; 16 vCPUs; 120 GB; kernel 4.14.35-1818.5.4.el7uek.x86_64; OL 7.6; 8 cores / 16 threads; Intel Xeon Platinum 8167M @ 2.00 GHz
IBM Cloud: Region NA East (WDC01); B1.16x64; 16 vCPUs; 64 GB; kernel 3.10.0-957.1.3.el7.x86_64; RHEL 7.6; 16 cores / 16 threads; Intel Xeon E5-2683 v3 @ 2.00 GHz
Google Cloud: Region us-east4 (Northern Virginia); (custom); 16 vCPUs; 64 GB; kernel 3.10.0-957.1.3.el7.x86_64; RHEL 7.6; 8 cores / 16 threads; Intel Xeon (model unknown) @ 2.20 GHz
Microsoft Azure: Region East US; D16s_v3; 16 vCPUs; 64 GB; kernel 3.10.0-957.1.3.el7.x86_64; RHEL 7.6; 8 cores / 16 threads; Intel Xeon E5-2673 v3 @ 2.40 GHz
38. Virtual Machine Variances
• Slight variations in VMs:
– Oracle Cloud VM has 120 GB memory (vs. 64 GB for all others)
– IBM Cloud VM has 16 cores (vs. 8 cores for all others)
– All Linux kernels identical except for Oracle Cloud (because of Oracle Linux)
• CPU model variance:
– None of the CPU models are identical, which explains variance in performance
– Google Cloud VM has unknown CPU model
• Tried lshw, dmidecode, cpuid, inxi, /proc/cpuinfo
• All data centers in Northern Virginia (go Ashburn!)
• Results are generally reproducible (except for I/O)
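For reference, a minimal sketch of querying the CPU model on a Linux VM with the standard tools listed above (the deck names the tools tried; these exact invocations are assumptions):

# Print the CPU model string reported by the kernel
grep -m1 'model name' /proc/cpuinfo
# Same information via lscpu and the DMI tables
lscpu | grep 'Model name'
sudo dmidecode -t processor | grep Version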
40. Testing Tool
• stress-ng
yum install stress-ng
• Simple workload generator that will stress test a server for the following features:
– CPU compute
– Cache thrashing
– Drive stress
– I/O syncs
– VM stress
– Socket stressing
– Context Switching
– Process creation and termination
– Much more
stress-ng: info: [12157] successful run completed in 322.04s (5 mins, 22.04 secs)
stress-ng: info: [12157] stressor  bogo ops  real time  usr time  sys time  bogo ops/s   bogo ops/s
stress-ng: info: [12157]                     (secs)     (secs)    (secs)    (real time)  (usr+sys time)
stress-ng: info: [12157] cpu       637597    309.01     5134.20   0.00      2063.32      124.19
bogo ops: Bogus operations; not comparable between different stressors.
bogo ops/s (real time): Total bogo operations per second based on wall clock run time. The wall clock time reflects the apparent run time. The more processors a system has, the more the workload can be distributed across them, and hence the wall clock time will drop and the bogo ops rate will increase. This is essentially the "apparent" bogo ops rate of the system.
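For reference, the individual stress-ng commands shown on the following slides could be scripted back-to-back; a minimal sketch (log file names are illustrative assumptions):

# Run the four 15-minute stress tests in sequence, logging each run
stress-ng --cpu 2000 --timeout 15m --verbose --metrics-brief 2>&1 | tee cpu.log
stress-ng --vm 8 --vm-bytes 6G --timeout 15m --metrics-brief 2>&1 | tee mem.log
stress-ng --io 16 --timeout 15m --verbose --metrics-brief 2>&1 | tee io.log
stress-ng --hdd 8 --hdd-bytes 2G --timeout 15m --metrics-brief 2>&1 | tee hdd.log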
41. Test
• Load Test:
– Number of Tests: 2
– Types: CPU, Memory, I/O, Large File Copy
– Duration: 15 minutes
42. CPU Stress Test
• All CPU tests were run when the 15-minute system load average was under 0.05
• Findings:
– Nothing alarming in the results; more powerful CPUs yielded better performance
– Difficult to perform a one-to-one comparison due to CPU model differences
• Command (other variations tested too):
stress-ng --cpu 2000 --timeout 15m --verbose --metrics-brief
(Bar chart: CPU Stress Test bogo ops per provider, higher is better, Test 1 and Test 2)
43. Memory Stress Test
• Findings:
– Nothing alarming; comparable results
• Command (other variations tested too):
stress-ng --vm 8 --vm-bytes 6G --timeout 15m --metrics-brief
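(Note: 8 workers × 6 GB = 48 GB of memory under stress, presumably sized to fit within the 64 GB instances; the 120 GB Oracle Cloud VM therefore exercises a smaller fraction of its memory.)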
(Bar chart: Memory Stress Test bogo ops per provider, higher is better)
44. I/O Stress Test
• Findings:
– Inconsistent results (see graph); impossible to determine the source of variance
– Fluctuations of 85% (AWS), 35% (Oracle), 70% (IBM), 90% (Google), 89% (Azure)
– IBM consistently much worse than the rest
• Command (other variations tested too):
stress-ng --io 16 --timeout 15m --verbose --metrics-brief
stress-ng --io 8 --timeout 5m --verbose --metrics-brief
(Bar chart: I/O Stress Test bogo ops per provider, higher is better, Tests 1-3)
45. Large File Copy Stress Test
• Findings:
– Oracle Cloud and Microsoft Azure considerably better than the others
• Command (other variations tested too):
stress-ng --hdd 8 --hdd-bytes 2G --timeout 15m --metrics-brief
(Bar chart: Large File Copy Stress Test bogo ops per provider, higher is better, Test 1 and Test 2)
46. Conclusion – Linux Host Performance
• CPU performance:
– AWS outperformed the others (but had a more recent CPU model)
– Oracle, IBM, Google performed identically
– Azure slightly slower
• Memory performance:
– All providers performed identically
• I/O performance:
– Inconsistent results (see graph); impossible to determine the source of variance
– IBM consistently much worse than the rest
• Large file copy performance:
– Oracle and Azure considerably better than the other providers
48. Software Version
• Oracle WebLogic Server 12.2.1.3
• JDK 8u191
• Single node (no cluster, no load balancer)
49. Testing Tool
• Apache JMeter 5.0
– https://jmeter.apache.org
• “The Apache JMeter application is open source software, a 100% pure Java
application designed to load test functional behavior and measure performance.”
50. Test
• Type of Test:
– Minimalistic ADF application (2 pages, RESTful services)
– Uses standard HR schema in the Oracle Database 18c
• Load Test:
– Transactions: 100,000
– Parameters: ./load-run-aws.sh -Jusers=500 -Jloops=100 -Jrampup=120
./load-run-oci.sh -Jusers=500 -Jloops=100 -Jrampup=120
./load-run-ibm.sh -Jusers=500 -Jloops=100 -Jrampup=120
./load-run-gc.sh -Jusers=500 -Jloops=100 -Jrampup=120
./load-run-ms.sh -Jusers=500 -Jloops=100 -Jrampup=120
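The load-run-*.sh wrappers themselves are not shown in the deck; a minimal sketch of what one might look like around the JMeter CLI (test plan and result file names are illustrative assumptions):

#!/bin/bash
# Hypothetical load-run-aws.sh: runs JMeter headless (-n) against a saved
# test plan (-t), writes results to a log (-l), and forwards the -Jusers,
# -Jloops, and -Jrampup properties passed on the command line
jmeter -n -t adf-load-test.jmx -l results-aws.jtl "$@"

Invoked exactly as on the slide: ./load-run-aws.sh -Jusers=500 -Jloops=100 -Jrampup=120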
51. Results – Transaction Counts
• Oracle Cloud, IBM Cloud, and Google Cloud completed in exactly the same length of time
• All errors were:
Non HTTP response code: javax.net.ssl.SSLHandshakeException/Non HTTP response message: Remote host closed connection during handshake
Provider  Number of Transactions  Number of Errors  Duration (min:sec)
AWS       100,000                 2                 11:42
Oracle    100,000                 2                 11:23
IBM       100,000                 3                 11:23
Google    100,000                 3                 11:23
Azure     100,000                 6                 12:23
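(As a rough sanity check, 100,000 transactions in 11:42, i.e., 702 seconds, works out to about 142 transactions per second for AWS, which lines up with the throughput figures on the following slides.)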
52. Results – Average Response Times & Throughput
• Microsoft Azure is the poorest performer
(Bar charts: Average Response Time in ms per provider, lower is better; Throughput per provider, higher is better)
53. Results – Transactions Per Second
• No major findings, but Microsoft Azure approximately 7% lower transactions per second
(Bar chart: Transactions Per Second per provider, higher is better)
54. Results – Managed Server CPU Usage (%)
• No findings on managed server CPU usage
(Screenshots: managed server CPU usage per provider)
55. Results – Managed Server Requests (per minute)
• No findings on requests per minute
(Screenshots: managed server requests per minute per provider)
56. Results – Managed Server Heap Usage (MB)
• No major findings on heap usage
• AWS had double the heap usage of the rest
(Screenshots: managed server heap usage per provider)
57. Results – Managed Server Data Sources
• No major findings on data source statistics
• IBM Cloud and Google Cloud had more open connections than the rest
(Screenshots: managed server data source statistics per provider)
58. Conclusion – Oracle WebLogic Server 12c Performance
• Generally speaking, AWS, Oracle Cloud, IBM Cloud, and Google Cloud had
comparable throughput
• Microsoft Azure underperformed compared to the other providers:
– Completed all transactions in the longest length of time
– Had the largest response times
– Had the lowest throughput
61. Testing Tool
• SwingBench 2.6
– http://www.dominicgiles.com/swingbench.html
• “Swingbench is a free load generator (and benchmarks) designed to stress test
an Oracle database (11g,12c)”
62. Test
• Stress Test:
– Number of Tests: 2
– Users: 100
– Duration: 48 minutes
– Load Ratio: Select (40%), Insert (15%), Update (30%), Delete (10%)
– Time: Test 1 (2:45-3:30pm EST, peak); Test 2 (7:45-8:30pm EST, non-peak)
– Database Setup: single node (no RAC); file system datafiles (no ASM); all testing against CDB$ROOT; default DBCA configuration
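The exact SwingBench invocation is not shown in the deck; a minimal sketch using SwingBench's charbench command-line client (config file, connect string, and credentials are illustrative assumptions; the 40/15/30/10 load ratio would be defined in the config file):

# Hypothetical charbench run: 100 users (-uc) for 48 minutes (-rt hh:mm),
# printing user count and transactions per minute/second (-v)
./charbench -c soe_config.xml \
  -cs //dbhost:1521/ORCLCDB -u soe -p soe \
  -uc 100 -rt 0:48 -v users,tpm,tps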
63. Results – Throughput
• AWS, Oracle, IBM, and Google appear to have comparable throughput
– AWS had a slight edge; IBM slightly less
• Azure consistently had the lowest throughput
(Bar chart: Total Completed Transactions per provider, higher is better, Test 1 and Test 2)
64. Results – Throughput Breakdown
(Bar charts: Total SELECT, INSERT, UPDATE, and DELETE transactions per provider, higher is better, Test 1 and Test 2)
65. Results – Average Response Time
(Bar charts: Average SELECT, INSERT, UPDATE, and DELETE response times in seconds per provider, lower is better, Test 1 and Test 2)
66. Transactions Per Hour
• Screenshots depict TPH (transactions per hour) for the last 30 minutes
• Some observations:
– TPH at 3:30pm and 8:30pm relatively similar
– Azure consistently has much lower throughput than all others
(Screenshots per provider, Test 1 and Test 2)
67. Input/Output Operations Per Second (IOPs)
• Screenshots depict IOPs for the last 30 minutes
• Some observations:
– Oracle consistently has the highest IOPs
– Azure appears to have little to no I/O activity
– Google disks are the only encrypted ones
(Screenshots per provider, Test 1 and Test 2)
68. Wait Times
• Screenshots depict wait times for the last 30 minutes
• Unable to interpret results
(Screenshots per provider, Test 1 and Test 2)
69. OEM Metrics
• Oracle throughput starts off slow but peaks highest
• Azure has lowest CPU and I/O usage
– Is there any throttling going on?
– Does this explain its low throughput?
(Screenshots: OEM metrics, Test 2)
70. Conclusion – Oracle Database 18c Performance
• Zero errors or rollbacks in all tests on all providers
• Impossible to conclusively determine a leader in performance:
– CPU models are not identical (see earlier slides)
– Throughput and performance varied based on several factors, such as time of day,
shared hardware due to multitenancy, etc.
– AWS, Oracle, IBM, Google generally performed comparably
• AWS slightly higher
• IBM slightly lower
• Possible to conclusively determine a loser:
– Azure consistently performed much worse than all others