RightScale Conference Santa Clara 2011: We’ve all heard the stories of sites crashing and performing poorly, from major retailers to iconic technology brands to multinational airlines. It’s only a matter of time before another story hits the headlines. Apica CEO Sven Hammar will review the importance of a strategic approach to load testing and performance monitoring to ensure that your web application doesn’t become another statistic. While outlining the actionable benefits of performance testing and analysis, Sven will touch on common mistakes, discuss recent outages that hit the headlines, and share best practices to maintain optimal web performance and avoid system crashes.
6. Tips & Suggestions
#1 For peak and high load
Small is Fast
Have a backup plan: “minimalistic start/landing pages”
#2 Extensive use of Front End Cache systems
Optimize the cache solution, consider Varnish
Less traffic means fewer problems; no direct DB access
#3 Implement Scaling & Queuing System
Redirect excess traffic with a load balancer
Create informative “Wait” pages
Be prepared: test the solution before launch
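A minimal sketch of tip #3, assuming a single process and a fixed capacity: requests above the capacity are routed to a wait page instead of the application. The class name `AdmissionGate` and the route labels are hypothetical, and a real setup would enforce this in the load balancer rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class AdmissionGate:
    """Admit up to max_active requests; send the rest to a wait page."""
    max_active: int   # assumed capacity, ideally taken from a load test
    active: int = 0   # requests currently being served

    def enter(self) -> str:
        # Serve the real page while below capacity, otherwise the wait page.
        if self.active < self.max_active:
            self.active += 1
            return "app"
        return "wait"

    def leave(self) -> None:
        # Call when an admitted request finishes, freeing one slot.
        if self.active > 0:
            self.active -= 1


gate = AdmissionGate(max_active=2)
routes = [gate.enter() for _ in range(3)]  # third visitor exceeds capacity
gate.leave()                               # one admitted request completes
routes.append(gate.enter())                # the freed slot is reused
print(routes)
```

The capacity value is the number a load test should supply, which is why the last tip on this slide matters.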
7. Why Run a Performance Load Test?
Is the site stable?
When does it crash?
How can I make it faster?
Can my application scale?
10. Load test 1 to 1
Load Maximum
– How many users can we handle?
– What is a good result?
Behaviour in the “Danger Zone”
– Does the application become unstable above load maximum?
Problem Analysis
– Where are the bottlenecks?
– How to fix them?
[Charts over Nr of users: throughput, point of collapse, complete failure, response time → ∞]
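One way to read ramp-up results against these questions, as a sketch only: treat the step with the highest throughput as the load maximum, and every step above it as the danger zone. The numbers below are invented for illustration, and the function names are hypothetical.

```python
# Hypothetical ramp-up results:
# (concurrent users, pages per second, average response time in seconds)
steps = [
    (50, 40.0, 0.8),
    (100, 78.0, 0.9),
    (200, 140.0, 1.4),
    (400, 150.0, 3.9),
    (800, 90.0, 12.5),
]


def load_maximum(steps):
    """Return the step with the highest throughput."""
    return max(steps, key=lambda step: step[1])


def danger_zone(steps):
    """Return the steps that ran with more users than the load maximum."""
    peak_users = load_maximum(steps)[0]
    return [step for step in steps if step[0] > peak_users]


peak = load_maximum(steps)
beyond = danger_zone(steps)
print("load maximum:", peak)
print("danger zone:", beyond)
```

Falling throughput together with rising response time in the danger-zone steps suggests the application does not degrade gracefully above its load maximum.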
11. Do you have Performance Targets?
We shall never crash due to load
We shall be comparable with the best-in-class sites for ...
Our peak-time response time shall be better than site www.YYY
Level: We shall handle 100,000 page views per hour with:
Better than 4 sec average response time
95% of our users shall make a selection for purchase of a (ticket, service, etc.) in less than 30 seconds
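The level targets above can be turned into a pass/fail check. This is a sketch under assumptions: the measurements are invented, the function names are hypothetical, and the 95% rule is evaluated with a nearest-rank percentile, which is one of several common definitions.

```python
import math


def pages_per_second(page_views_per_hour: int) -> float:
    """Convert an hourly page-view target into a per-second rate."""
    return page_views_per_hour / 3600


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_targets(response_times, purchase_times):
    """Average response under 4 s, and 95% of purchase selections under 30 s."""
    average = sum(response_times) / len(response_times)
    return average < 4.0 and percentile(purchase_times, 95) < 30.0


# Invented measurements, in seconds.
response_times = [1.2, 2.5, 3.1, 6.0]
purchase_times = [12, 18, 22, 25, 28, 29, 15, 20, 24, 27]

rate = pages_per_second(100_000)
ok = meets_targets(response_times, purchase_times)
print(f"required rate: {rate:.1f} pages/s, targets met: {ok}")
```

Expressing the hourly target as a rate gives the load generator a concrete number to sustain during the test.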
12. Load Test Findings
#1 Identify the Backend Calls
Database calls don’t kill your application
Lack of caching does!
#2 Check Static Content Delivery
Optimize the cache solution, consider Varnish
Consider using a CDN, if needed
#3 Web Infrastructure
Load Balancer
Server model
Bandwidth
Scaling & Failover
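Finding #1 can be illustrated with a small time-to-live cache in front of a backend call. This is a sketch, not a replacement for a front-end cache such as Varnish: `TTLCache` and `load_from_database` are hypothetical names, and the 60-second lifetime is an arbitrary assumption.

```python
import time


class TTLCache:
    """Keep loaded values for ttl_seconds so repeat requests skip the backend."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (expiry time, value)

    def get(self, key, loader):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                 # fresh entry: no backend call
        value = loader(key)                 # missing or expired: load once
        self.store[key] = (now + self.ttl, value)
        return value


backend_calls = []


def load_from_database(key):
    # Hypothetical stand-in for an expensive database query.
    backend_calls.append(key)
    return f"page for {key}"


cache = TTLCache(ttl_seconds=60)
pages = [cache.get("/start", load_from_database) for _ in range(1000)]
print(len(pages), "requests served with", len(backend_calls), "backend call(s)")
```

With the cache in place the backend sees one call per key per lifetime, regardless of how many visitors request the page.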
14. By the numbers
The need for a baseline
My startpage, Login, Book a flight
Response Time – Average 3.2 sec
Typical Values – Median 2.5 sec
Standard Deviation 2.8 sec
SLA % 99.9
95% is better than 11.8 sec
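A baseline like the one above can be computed with Python's standard `statistics` module. The sample values are invented, and the 95% figure uses the module's inclusive interpolation method, which is an assumption about how such a number would be derived.

```python
import statistics

# Invented response times in seconds for one scenario, with one slow outlier.
samples = [1.9, 2.1, 2.3, 2.5, 2.5, 2.6, 2.8, 3.0, 3.4, 11.9]

average = statistics.mean(samples)
median = statistics.median(samples)
deviation = statistics.stdev(samples)

# quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
p95 = statistics.quantiles(samples, n=100, method="inclusive")[94]

print(f"average {average:.2f} s, median {median:.2f} s")
print(f"standard deviation {deviation:.2f} s, 95% better than {p95:.2f} s")
```

A single outlier pulls the average well above the median, which is why the slide lists both alongside the 95% value.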
15. WebPerformance Monitoring for the Cloud
1 Up/Down: Basic Monitor, Alerting, Up–down, Basic SLA
2 Browser: Browser Scenarios monitoring, Analytics, Response time, SLA on applications
3 Application: Analytics Inside, Vital Signs, Application Drill Down
4 Correlation: Trend, Complex Root cause, Consolidation with other system
17. Tips & Suggestions
#1 Set Goals
Uptime
Performance
#2 Hate the average
Work with the exceptions
Remove the 10 worst transactions every month
#3 Fire drill
Helps identify problems
Correlation of data
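Tip #2 can be sketched as a ranking by slowest single response rather than by average. The transaction names echo the baseline slide, the timings are invented, and `worst_transactions` is a hypothetical helper.

```python
import heapq
import statistics

# Invented response times in seconds per transaction.
samples = {
    "My startpage": [0.9, 1.1, 1.0],
    "Login": [1.0, 1.0, 9.8, 1.0, 1.0],
    "Book a flight": [2.9, 3.3, 3.1],
}


def worst_transactions(samples, n=10):
    """Return up to n transaction names, ordered by slowest single response."""
    return heapq.nlargest(n, samples, key=lambda name: max(samples[name]))


by_exception = worst_transactions(samples, n=2)
by_average = max(samples, key=lambda name: statistics.mean(samples[name]))

print("worst by exception:", by_exception)
print("worst by average:", by_average)
```

Here the average points at "Book a flight", while the exception ranking surfaces the 9.8-second Login that some user actually experienced.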
18. To sum it up
1 A load test before release ...
2 Know your numbers ...
3 Plan for the unexpected ...
4 Fire Drill - Be prepared
All systems have a weak spot – what is yours?