Performance Stress Testing - Exit Criteria and what % Contingency is allowed between Peak Load and Stress numbers

Hi all,
I am currently looking to tighten up our Exit Criteria for Load and Stress Testing. Currently our vendors run Load Testing up to the Peak Load defined in the system requirements, and the Exit Criteria are clear on this element. But in our Stress Testing numbers, some systems only show a contingency of 20-30% above Peak Load, which is a concern if there is ever a surge.

e.g. Peak Load would be 50 concurrent users. Stress Test numbers are showing errors (the system does recover gracefully) at 70 users.
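
For reference, the contingency in that example can be worked out as below. This is only a minimal sketch of the arithmetic; the function name headroom_pct is made up for illustration, and because 70 users is where errors first appear, the result is an upper bound on the usable contingency rather than the contingency itself.

```python
def headroom_pct(peak_users: int, first_failing_users: int) -> float:
    """Margin between peak load and the first failing load, as a % of peak.

    Hypothetical helper: the name and signature are assumptions,
    not taken from any test tool.
    """
    return (first_failing_users - peak_users) * 100 / peak_users


# Figures from the example: peak of 50 users, errors starting at 70 users.
print(headroom_pct(50, 70))  # 40.0
```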

What is the norm when it comes to Stress Test Numbers in terms of that contingency % of Peak Load?

What do you use as the Exit Criteria of a Stress Test?

I imagine this is different for every system. As a company we have protective measures in front of public-facing systems, such as Queue-It from Cloudflare.

What you have here is the difference between stress and capacity tests:
Stress Tests tend to have an open-ended load profile that keeps increasing until the system breaks or the response times reach a pre-defined fail time.
Capacity tests tend to look at the maximum limits of the system that meet the requirements.
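
To make the distinction concrete, here is a rough sketch that reads the same set of step-load results in both ways. Everything in it is an assumption for illustration: the Step record, the thresholds, and the sample numbers are invented and are not taken from any real system or tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    users: int         # concurrent users held during this step
    p95_ms: float      # 95th percentile response time for the step
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def capacity(steps: List[Step], max_p95_ms: float, max_error_rate: float) -> Optional[int]:
    """Capacity view: highest load that still meets the requirements."""
    best = None
    for s in sorted(steps, key=lambda s: s.users):
        if s.p95_ms > max_p95_ms or s.error_rate > max_error_rate:
            break  # requirements no longer met, stop here
        best = s.users
    return best


def breaking_point(steps: List[Step], fail_p95_ms: float, fail_error_rate: float) -> Optional[int]:
    """Stress view: first load at which the pre-defined fail condition is hit."""
    for s in sorted(steps, key=lambda s: s.users):
        if s.p95_ms >= fail_p95_ms or s.error_rate >= fail_error_rate:
            return s.users
    return None  # the test never reached the fail condition


# Invented sample results, one entry per load step.
results = [
    Step(40, 800, 0.00),
    Step(50, 950, 0.00),
    Step(60, 1400, 0.00),
    Step(70, 4200, 0.06),
    Step(80, 9000, 0.30),
]

print(capacity(results, max_p95_ms=2000, max_error_rate=0.01))          # 60
print(breaking_point(results, fail_p95_ms=4000, fail_error_rate=0.05))  # 70
```

With these made-up numbers the system meets its requirements up to 60 users (capacity) and hits the fail condition at 70 users (stress), so the two tests answer different questions from the same run.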

This looks more like a capacity test definition issue. A question you should ask is ‘what are the requirements we have to meet from a capacity point of view?’

If you don’t have a defined set of exit criteria / requirements, you could run a stress test to find the limits of the system; after analysis, those results may well define the requirements for the capacity test.

Without knowing the system setup it’s hard to say what these would be, e.g. an auto-scaling cloud-based solution would be different from a fixed-resource data centre solution.

I’ve worked at places where the system needed to take 3 times peak load and then recover, as long as the response times were still acceptable.
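
An exit criterion of that shape could be written down as a simple check. This is a sketch only: the function name, its parameters and the 65-user figure are hypothetical, and the 3x multiplier is just the example above, not a recommended norm.

```python
def stress_exit_met(peak_users: int,
                    max_ok_users: int,
                    recovered: bool,
                    multiplier: float = 3.0) -> bool:
    """True if the system held `multiplier` x peak load with acceptable
    response times and recovered afterwards.

    max_ok_users is the highest load at which response times were still
    acceptable; all names here are assumptions for illustration.
    """
    return recovered and max_ok_users >= multiplier * peak_users


# Peak of 50 users: holding 150 users and recovering would pass a 3x criterion.
print(stress_exit_met(50, 150, recovered=True))  # True
# A system that only stays healthy up to a hypothetical 65 users would not.
print(stress_exit_met(50, 65, recovered=True))   # False
```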

As with everything, until you get the requirements sorted out, you’ll never know what to test.

Hope this helps