Test code is an important part of the software development process – it can drastically improve the likelihood that the system that gets deployed actually does what it’s supposed to do. All of the test code, from unit tests through to acceptance and performance tests, plays an important part in the validation of the system.
So, having gone to the trouble of creating all of these tests and watching as they help detect bugs in the system, it doesn’t seem to make sense to just dump them as soon as your code leaves the test environment. If the test code was useful in the development environments, it should also be useful if you need to find bugs in production. So obviously your test code should be included in the deployment packages and should be deployed alongside your software.
Or should it?
Going back to basic principles, any software you deploy to production should be both necessary and secure; what does that mean?
Necessary
What makes code “necessary”? If your system won’t work properly without it, it’s necessary. If your system works without it, it isn’t necessary and you shouldn’t be installing it. Installing unnecessary code just wastes space and memory – yes they’re cheap but that’s not a good reason to waste them – and creates a broader attack surface if anyone gains unauthorized access to your system.
You might feel that the usefulness of the tests outweighs these disadvantages, so let’s look at this a bit more.
Unit tests don’t interact with the environment outside the code – if they do then they’re not unit tests. If your build system is set to run them properly and they worked there, they will work in production. So, logically there’s no point in installing unit tests in production.
Functional and acceptance tests can be affected by the environment they run in, but they have a nasty habit of interfering with databases, file systems, networks and other parts of the system. Running them on a production system is usually a bad idea for those reasons – I’ve worked on systems where acceptance tests were accidentally run in the wrong environment and as a result the environment had to be recreated. If that had been the production environment, it would have been even less fun than it was.
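One cheap defence against that kind of accident is to make the destructive tests refuse to start unless the environment explicitly says it is disposable. The following is only a minimal sketch: the `APP_ENV` variable, the `dev` and `test` values and the function name are all assumptions made up for illustration, not something any particular framework provides.

```python
import os


class WrongEnvironmentError(RuntimeError):
    """Raised when destructive tests are about to run somewhere they shouldn't."""


def require_disposable_environment(environ=None, allowed=("dev", "test")):
    """Return the environment name, or raise if it is not explicitly disposable."""
    environ = os.environ if environ is None else environ
    # APP_ENV is a hypothetical variable name. A missing or empty value is
    # treated as unsafe, so forgetting to set it fails closed.
    name = environ.get("APP_ENV", "").strip().lower()
    if name not in allowed:
        raise WrongEnvironmentError(
            f"refusing to run acceptance tests: APP_ENV={name!r} is not one of {allowed}"
        )
    return name
```

Calling something like this at the very start of an acceptance test run means a misdirected run stops before it touches a database. It doesn’t make the tests safe to install in production; it just limits the damage when someone points them at the wrong place.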
However much you tell yourself to the contrary, your test code really is not necessary in a production environment.
Secure
Any code deployed to production should be secure, or at least as secure as you can make it. Ideally you should design and implement your code using secure methods and practices, and then use tools as part of your build process to scan for any remaining vulnerabilities. You may be doing this already, and that’s great, but are you running security scans against your tests? You’re probably not, and that’s fine too: tests are meant to do just that – test your code. They’re not particularly meant to be secure themselves. Installing insecure test code could make it possible for attackers to use it as a vector to access even more of your system.
Even if the test code were to be secure, imagine how much easier it would be for intruders to understand your system if all of the test code is on the same machine.
But Lots Of Downloadable Libraries And Packages Include Test Code!
Looking around at downloadable libraries, many of them include tests when you download them, so maybe that’s the right thing to do. But these libraries are usually intended for you to include in your own code, and to make changes and updates. By including the tests in the distribution the author makes it easier for you to use and make changes to the code. If you change the original code then you only need to write or change the tests that cover the updates you’ve made. Then you can either use the code, submit the changes back to the source repository in a pull request, or both.
So, if you’re creating a library for other developers to download and use, and possibly update, maybe you should include the tests.
On the other hand, if the system you’re creating isn’t intended to be distributed externally but is solely for your own production environment, you don’t need to distribute the tests with it.
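How you leave the tests out depends entirely on your build tooling, but the idea is simple enough to sketch. The example below assumes a Python project packaged as a zip file, and assumes tests live in directories called `test` or `tests` or in files named like `test_*.py`; those naming conventions and the function names are illustrative assumptions, so adjust them to whatever your project actually uses.

```python
import fnmatch
import os
import zipfile

# Assumed naming conventions for test code; change these to match your project.
TEST_DIR_NAMES = {"test", "tests"}
TEST_FILE_PATTERNS = ("test_*.py", "*_test.py", "conftest.py")


def is_test_file(filename):
    """Return True if the file name matches one of the assumed test patterns."""
    return any(fnmatch.fnmatch(filename, pattern) for pattern in TEST_FILE_PATTERNS)


def build_deployment_archive(source_dir, archive_path):
    """Zip source_dir into archive_path without test code; return included paths.

    archive_path should live outside source_dir, otherwise the archive
    would try to include itself.
    """
    included = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(source_dir):
            # Prune test directories in place so os.walk never descends into them.
            dirs[:] = sorted(d for d in dirs if d not in TEST_DIR_NAMES)
            for filename in sorted(files):
                if is_test_file(filename):
                    continue
                full_path = os.path.join(root, filename)
                relative = os.path.relpath(full_path, source_dir)
                archive.write(full_path, relative)
                included.append(relative.replace(os.sep, "/"))
    return included
```

Returning the list of included paths makes it easy for the build to log, or assert on, exactly what is about to be shipped.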
But I Need To Test The System In Production!
Yes, you do. You should already have a good idea of how well the system works through the use of unit, functional, acceptance and user testing. You will have automated everything and practised your deployments to duplicate systems to make sure they all work. There shouldn’t be any big surprises when you deploy to production. You have read “Eliminating Failed Deployments – Part 1 – Replication! Complication? Automation!” and “Eliminating Failed Deployments – Part 2 – Automate Your Obsession”, haven’t you? If not, I’d recommend them!
Testing in production is different from testing in any other environment though. Most of the time the production system is being used by your users; you can’t just switch things on and off at random to see that they work – anything like that needs to be scheduled. Even worse, the data itself is often subject to any number of privacy and financial laws depending upon where you are, and where it is. You can’t just throw test data into regulated systems at will, and if test data is sent there accidentally it takes a lot of signatures from various company officers before you can delete it. Don’t ask me how I know this!
Each system will need a custom approach to testing in production. It might be very involved, it might not. Trust me though, this testing won’t be anything like most of the testing you did during development. Your unit and functional testing won’t be any use for this. Their work is done before this stage.
Something Strange Is Happening In Production! I Need My Tests!
Calm down, stop panicking and let’s go through this logically:
- All the way through your test environments the automated tests worked
- Just before you deployed your code to the test environments, the automated tests worked
- When you first tested the system in production, it presumably worked. Probably because your test environments are a replication of production. Aren’t they?
So, your automated tests worked until just before the system was deployed to the production environment. Why do you think they’ll suddenly fail in just the right place in production and show you where the problem is? If they did, it would imply your tests and/or code are pretty awful, so let’s skip over that.
I’m certainly not saying you don’t need tests in production, and automating them would be really good too, but they need to be different from the ones you used during development.
But Production Keeps Falling Over, And I’m Sick Of Restarting It!
Alright, you definitely need something, apart from better soak testing! Again though, any tests you used in development aren’t going to help much. You want to keep an eye on the system while it’s running, and for that you don’t need tests, you need monitoring systems. You probably want to check the memory usage, CPU usage, network and disk I/O rates of the production system, and maybe configure a watchdog to restart the system when it crashes. You’ll need extra logging too. You might also want to run a system that checks web pages on your server, and maybe one that occasionally adds data to a separate dummy account, although you need to be sure that you won’t cause any legal or regulatory problems if you do that. You probably want to try to replicate a smaller version of production to do extra testing. These are all useful things, but none of them are anything like the tests you run in development.
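To give a flavour of how different this is from a development test, here is a rough sketch of the “checks web pages on your server” idea using only the Python standard library. The function names, the five-second timeout and the rule that only an HTTP 200 counts as healthy are assumptions chosen for illustration; a real monitoring product will do far more than this.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar problems.
        return None


def check_pages(urls, fetch=fetch_status):
    """Map each URL to True if it looks healthy (HTTP 200), otherwise False."""
    return {url: fetch(url) == 200 for url in urls}


def failing_pages(results):
    """Return the URLs that failed their check, sorted for stable reporting."""
    return sorted(url for url, healthy in results.items() if not healthy)
```

The `fetch` parameter is there so the checking logic can be exercised without touching a real server. In production you would run something like this on a schedule, from a separate machine, and send the failing list to whoever is on call. Note that it only reads pages; it never writes anything to the system it is watching.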
Summary
The various kinds of tests you run in development and test environments are valuable resources in your development skill set. These environments are generally disposable: their corruption or destruction is just “something that happens” and shouldn’t be much of an inconvenience because you will have mostly automated their recreation.
The production environment is rather more “special” than these others. Crashing, corrupting or destroying it means you’re almost certainly losing money immediately. You might have configured it with all kinds of redundancy, replication and scripted creation, but if you can’t fix it quickly then you might need to restore the latest back-up. You did check that your back-ups exist and work, didn’t you? Even then you’ve probably lost data, which brings its own problems.
Faced with the possibility of all of this, deploying code that really isn’t any use outside the development and test environments is a really bad idea. Make life easier for yourself, and your colleagues and users, and leave the test code back in the development and test environments where it belongs.