Better Production Testing Can Save Millions
Good production testing is an integral part of today's highly automated computer systems. We've recently seen how poor testing can adversely affect computer systems worldwide. Yes, we're talking about the CrowdStrike incident of July 19, 2024, which shut down airports, financial institutions, hospitals, banks, and media networks, to name a few.
What is Pre-Production Testing?
Pre-Production Testing refers to testing on test systems that closely resemble your production systems. This may involve dozens to hundreds of test systems. Because there may be a variety of systems in production, it's important to ensure your test systems represent the whole spectrum of systems in production, and not just a small sample of them.
Pre-production testing is equivalent (or related) to:
Staging Testing
Non-production Testing
Integration Testing
System Testing
Quality Assurance (QA) Testing
Sandbox Testing
An example testing environment may have a dozen of each major type of server or system you have in production. It may also have a variety of system states or prior updates. Having the full spectrum of differences in place ensures that your tests can expose any potential defects of an update.
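One way to check that a test environment covers the full spectrum is to compare the configurations found in production against those present in the test fleet. The following is a minimal sketch, not a definitive tool. The function name `missing_configurations` and the (system type, OS version, patch level) tuples are hypothetical and used only for illustration.

```python
from collections import Counter


def missing_configurations(production, test):
    """Return production configurations that no test system represents.

    Each system is described by a hashable tuple. The result maps every
    uncovered configuration to the number of production systems using it.
    """
    production_counts = Counter(production)
    test_configs = set(test)
    return {
        config: count
        for config, count in production_counts.items()
        if config not in test_configs
    }


# Hypothetical inventories: (system type, OS version, patch level).
production = [
    ("web", "os-11", "patch-3"),
    ("web", "os-11", "patch-4"),
    ("db", "os-10", "patch-3"),
    ("db", "os-10", "patch-3"),
]
test = [
    ("web", "os-11", "patch-4"),
    ("db", "os-10", "patch-3"),
]

gaps = missing_configurations(production, test)
print(gaps)  # {('web', 'os-11', 'patch-3'): 1}
```

Any configuration reported here is one that an update would first encounter in production, so it is a candidate for adding to the test environment.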
Once your pre-production tests have passed, you can go on to production testing, which we discuss next.
What is Production Testing?
Production Testing, also known as Live Testing, refers to testing live systems in a way that exposes only a small percentage of systems to the updates being applied. In this way you can see if an update has any adverse effects before applying it to a larger population.
An example production testing and deployment scenario could be:
Try the update on a few thousand systems
Try the update on 1% of the available systems
Try the update on 5% of the available systems
Try the update on 10% of the available systems
Proceed to update the systems group-by-group
If any of these steps or substeps fails, the team can stop, analyze the problem, and fix it before going to the next step, ensuring that the update does not cause significant problems. If no problems are seen, it may be safe to roll out the update to the entire population, but still in a systematic way that operates on controlled groups of systems. In this way you minimize any damage to running servers.
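The staged scenario above can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not a real deployment tool: the percentages are treated as cumulative shares of the fleet, the initial fixed-size batch is omitted for brevity, and `plan_stages`, `staged_rollout`, `apply_update`, and `is_healthy` are hypothetical names standing in for whatever your deployment tooling provides.

```python
def plan_stages(systems, percents=(1, 5, 10), group_size=100):
    """Split systems into growing canary stages, then fixed-size groups.

    Assumption: each percentage is the cumulative share of the fleet
    that has been updated once that stage finishes.
    """
    systems = list(systems)
    total = len(systems)
    stages, start = [], 0
    for percent in percents:
        end = max(start, total * percent // 100)
        stages.append(systems[start:end])
        start = end
    # Remaining systems are updated group by group.
    while start < total:
        stages.append(systems[start:start + group_size])
        start += group_size
    return stages


def staged_rollout(systems, apply_update, is_healthy, **plan_options):
    """Update stage by stage, halting as soon as a stage has unhealthy systems."""
    updated = []
    stages = plan_stages(systems, **plan_options)
    for number, stage in enumerate(stages, start=1):
        for system in stage:
            apply_update(system)
            updated.append(system)
        failed = [system for system in stage if not is_healthy(system)]
        if failed:
            return {"completed": False, "halted_at_stage": number,
                    "updated": updated, "failed": failed}
    return {"completed": True, "halted_at_stage": None,
            "updated": updated, "failed": []}


# Hypothetical fleet of 1000 systems where system 42 breaks after the update.
sizes = [len(stage) for stage in plan_stages(range(1000), group_size=300)]
print(sizes)  # [10, 40, 50, 300, 300, 300]

result = staged_rollout(
    range(1000),
    apply_update=lambda system: None,      # stand-in for the real update
    is_healthy=lambda system: system != 42,  # stand-in for a health check
    group_size=300,
)
print(result["halted_at_stage"], len(result["updated"]), result["failed"])
# 2 50 [42]
```

In this example the rollout halts at the second stage, so only 50 of the 1000 systems ever receive the faulty update.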
How Would Production or Pre-Production Testing Have Helped?
Having pre-production and production testing in place can catch many of the issues that arise during the upgrade process. At worst, engineers would see problems manifest during the production testing phase and be able to stop the update.
Having production testing in place can catch problems and stop major damage before updates reach systems worldwide.


