With over 3 billion internet users worldwide, 38 million of them in the UK alone, building applications that scale really matters. The application we created for Vote for Policies needs to sustain the traffic behind a potential 5 million completed surveys by the general election on May 7th. Load testing gives us confidence, ahead of time, that we can support these heavier-than-average loads.
Locust.io allows us to write Python scripts that each emulate a single web user; we can then swarm a website with thousands of those users while being shown any errors that occur, the average response time, and the total number of requests sent to each endpoint.
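For a sense of what these scripts look like, here is a minimal sketch (the behaviour and timings are placeholders, not our actual script; recent Locust releases use the HttpUser API shown here, while older ones spelled it HttpLocust/TaskSet):

```python
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    # Each simulated user pauses 1-5 seconds between tasks.
    wait_time = between(1, 5)

    @task
    def index(self):
        # The simplest possible behaviour: repeatedly request the index page.
        self.client.get("/")
```

Running `locust -f locustfile.py` and dialling the user count up in the web UI produces the swarm described above.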
Our script uses sleeps and probabilities to dictate what a user is most likely to do on the Vote for Policies website; a sketch of how this maps onto Locust tasks follows the lists below.
We have assumed the following:
Initial entry
- 53% visit the index page
- 21% take the survey
- 20% view someone’s results
When taking the survey
- It takes 20 seconds to choose policy issues
- It takes 10 seconds to choose a country
- It takes 7 to 10 minutes to complete a survey
- It takes 30 seconds to view a result
- There is a 98% chance that a valid postcode is entered
- There is a 40% chance that an email for that result is requested
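As a sketch of how those assumptions translate into a Locust script (the task weights approximate the entry probabilities, and the paths, form fields, and sample values are hypothetical rather than taken from our production script):

```python
import random
import time

from locust import HttpUser, task, between

class SurveyVisitor(HttpUser):
    wait_time = between(1, 5)

    # Task weights approximate the entry probabilities (53 : 21 : 20);
    # the remaining ~6% of behaviour isn't modelled here.
    @task(53)
    def index(self):
        self.client.get("/")

    @task(20)
    def view_results(self):
        self.client.get("/results/example")   # hypothetical results path
        time.sleep(30)                        # 30 seconds viewing a result

    @task(21)
    def take_survey(self):
        self.client.get("/survey")            # hypothetical survey path
        time.sleep(20)                        # choosing policy issues
        time.sleep(10)                        # choosing a country
        time.sleep(random.uniform(420, 600))  # 7 to 10 minutes on the survey
        # 98% of users enter a valid postcode.
        postcode = "SW1A 1AA" if random.random() < 0.98 else "not-a-postcode"
        self.client.post("/survey/submit", {"postcode": postcode})
        # 40% of users request their result by email.
        if random.random() < 0.40:
            self.client.post("/survey/email", {"email": "user@example.com"})
```

Locust monkey-patches the standard library with gevent, so these sleeps pause only the simulated user, not the whole test process.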
Setup
Web Application
We’re hosting on Amazon Web Services (AWS) using their Elastic Load Balancing service to distribute traffic across multiple auto-scaled Amazon EC2 instances.
Our initial budget allowed us to scale between 1 and 10 t2.small instances. Our scaling policy added an instance when average CPU usage stayed above 60% for 3 minutes and removed one when it stayed below 50% for 5 minutes. This was our starting point, as we'd used a similar policy on another project.
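For illustration, here is roughly how a policy like that can be wired up with boto3 and CloudWatch alarms (the group and policy names are hypothetical; this sketches the idea rather than reproducing our exact configuration):

```python
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")
GROUP = "vfp-web-asg"  # hypothetical auto-scaling group name

def step_policy(name, adjustment):
    # A change-in-capacity policy: +1 adds an instance, -1 removes one.
    resp = autoscaling.put_scaling_policy(
        AutoScalingGroupName=GROUP, PolicyName=name,
        AdjustmentType="ChangeInCapacity", ScalingAdjustment=adjustment,
        Cooldown=300)
    return resp["PolicyARN"]

def alarm(name, metric, stat, period, evals, threshold, op, policy_arn):
    # A CloudWatch alarm on the group's EC2 metrics that fires the policy.
    cloudwatch.put_metric_alarm(
        AlarmName=name, Namespace="AWS/EC2", MetricName=metric,
        Statistic=stat, Period=period, EvaluationPeriods=evals,
        Threshold=threshold, ComparisonOperator=op,
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": GROUP}],
        AlarmActions=[policy_arn])

scale_up = step_policy("scale-up", 1)
scale_down = step_policy("scale-down", -1)

# Average CPU above 60% for 3 minutes: add an instance.
alarm("cpu-high", "CPUUtilization", "Average", 180, 1, 60.0,
      "GreaterThanThreshold", scale_up)
# Average CPU below 50% for 5 minutes: remove an instance.
alarm("cpu-low", "CPUUtilization", "Average", 300, 1, 50.0,
      "LessThanThreshold", scale_down)
```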
Locust.io
As we wanted to test 5000 concurrent users, we would be generating over 300 requests per second, more traffic than our own connection could comfortably sustain, so we decided to run distributed tests from multiple t2.micro instances to take the load off our own network.
We had 1 master and 8 slaves, which we could start on the fly with a simple Linux service definition (sketched below) that located the master server and connected to it.
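The definition itself is nothing exotic. As a sketch, a systemd unit along these lines would do the job (the paths and master hostname are hypothetical; older Locust releases spell the flags --slave/--master-host, newer ones use --worker):

```
# /etc/systemd/system/locust-slave.service (hypothetical paths and hostname)
[Unit]
Description=Locust slave for Vote for Policies load testing
After=network.target

[Service]
ExecStart=/usr/local/bin/locust -f /opt/loadtest/locustfile.py --slave --master-host=locust-master.internal
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

The master itself just runs `locust -f locustfile.py --master`, and each slave registers with it as soon as its service starts.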
Testing
We ran our first test with 5000 concurrent users and received these results:
From the above we deduced that, although failures were rare compared to successes, we ideally wanted as few as possible. Looking at the actual failures, we noticed that the majority were HTTP 503 errors, which occurred when an instance's network load went above 70Mbps. Based on this, we adjusted our scaling policy to scale up when network load exceeded 50Mbps and re-ran our test with 5000 users.
These results show that the website was much more reliable, with only 4 failed requests.
Conclusion
Acknowledging that our test is more intensive than real-world usage is likely to be, we created our final scaling policy (sketched in code after the list):
- When the average CPU usage is > 60% for 60 seconds, add an instance
- When the highest incoming network traffic is > 50Mbps for 60 seconds, add an instance
- When the highest outgoing network traffic is > 50Mbps for 60 seconds, add an instance
- When the highest outgoing network traffic is < 30Mbps for 15 minutes, remove an instance
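Continuing the earlier boto3 sketch (reusing the hypothetical alarm() helper and policy ARNs), the final policy looks roughly like this. One wrinkle worth noting: CloudWatch reports NetworkIn/NetworkOut in bytes per period, so a 50Mbps threshold over a 60-second period works out to about 50e6 / 8 * 60 = 375e6 bytes, and 30Mbps to about 225e6 bytes.

```python
# Reusing alarm(), scale_up and scale_down from the earlier sketch.
# "Maximum" matches "highest traffic"; thresholds are bytes per period.
alarm("cpu-high-1m", "CPUUtilization", "Average", 60, 1, 60.0,
      "GreaterThanThreshold", scale_up)
alarm("net-in-high", "NetworkIn", "Maximum", 60, 1, 375e6,
      "GreaterThanThreshold", scale_up)
alarm("net-out-high", "NetworkOut", "Maximum", 60, 1, 375e6,
      "GreaterThanThreshold", scale_up)
# 15 minutes of quiet is 15 consecutive one-minute evaluation periods.
alarm("net-out-low", "NetworkOut", "Maximum", 60, 15, 225e6,
      "LessThanThreshold", scale_down)
```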
We use 1-minute evaluation intervals in our scaling policies so that we can react more quickly to sudden bursts of traffic, if, say, a celebrity tweets about Vote for Policies or it's mentioned on television. So far, over 166,000 surveys have been completed, putting us well on the way to the target of 5 million by the general election.