
I have a system ("an engine", which is a piece of code) that takes in several parameters and returns some output. The input parameters come from several distinct and constrained sets. I can enumerate all of them (currently 1064 sets of inputs) and run them through the engine to produce the same number of outputs.

The engine code will be replaced by a new system, the code for which is coming from a different department. The purpose of the test is to record and store the original mapping of inputs to outputs, and once the engine has been replaced, re-run the test with the new engine in place.

I am having some trouble figuring out how to approach this. Is this a "unit test"? Should it be treated as one? (That is what I am trying to cast it as, but I am having trouble.) Should it be a separate "module" of the code? Should it be a one-off script that I run?

Since there are a lot of input->output mappings, I will likely have to create a database table for the test and store the results in it.

Question:

I am looking for a testing framework, or an approach to this sort of "test", that will first help me store "what is expected" and then later, at a much later time, test whether the new system abides by those expectations.

Alternatively, I could run both systems at the same time and compare the results directly; in that case I likely would not need to store the results at all. However, the new engine is not ready yet, so I cannot run both side by side; only the old engine is available.

asked Dec 20, 2017 at 16:45
  • What prevents you from simply writing a function or method that hands your new engine the parameters, retrieves the results, compares them with the results from the original engine, and tells you whether it succeeded or not? Commented Dec 20, 2017 at 16:58
  • There are no expected results at the moment... at least not in the form of automated tests. Arguably the original engine holds the results "the way they were", or they could be defined by other means as well. I want to use the original engine as a fail-safe, to catch any "changes" that result from the engine switch. Commented Dec 20, 2017 at 17:01
  • The expected results are the output from the original engine, given a certain set of parameter inputs. The test you write would spin up the new engine, hand it a set of parameters, and compare the resulting output to the output from the original engine (given the same set of parameters). Commented Dec 20, 2017 at 17:03
  • The Samba team has an aggressive approach to this: they have a test machine with two network cards. One card is connected to a Windows Server, the other to a Linux server running Samba. A fuzz script generates SMB interactions and sends them identically across both interfaces, then compares the responses. Any time it detects a difference, it starts a backtracking search to generate the simplest sequence of interactions that leads to the same result, and when it has found it, it logs the command sequence in a bug tracker and starts from scratch. Commented Dec 20, 2017 at 17:42

1 Answer


You are trying to prepare a regression test, in order to ensure that the new implementation exhibits the existing behaviour.

As a testing method, an end-to-end test of your component is what you are describing. It is possible to store all test cases in a database, but in my experience plain text files are better, because they can be easily shared, emailed around, and version controlled. These files describe the inputs and record the expected outputs, e.g. as JSON, YAML, or XML documents.
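
For example, the recording step could look roughly like the sketch below. This is only a minimal sketch in Python (the question does not name a language), and old_engine.run as well as the dict-shaped input sets are placeholders for whatever your actual engine exposes:

    import json

    from old_engine import run as run_old_engine  # placeholder import

    def record_expectations(input_sets, path="expected_outputs.json"):
        """Run every enumerated input set through the old engine and store
        the input/output pairs in a version-controllable JSON file."""
        cases = [
            {"inputs": inputs, "expected": run_old_engine(**inputs)}
            for inputs in input_sets
        ]
        with open(path, "w") as f:
            json.dump(cases, f, indent=2, sort_keys=True)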

You would then have a test driver, which is a small program that reads the test case file, runs the engine with the inputs, and compares the outputs. You can later reimplement the same test driver for the new engine, or make the test driver configurable.
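
A minimal driver for the new engine, reusing the same JSON file as above, might then look like this (new_engine.run is again a placeholder, and strict equality may need to be replaced by a tolerance if the outputs are floating-point):

    import json

    from new_engine import run as run_new_engine  # placeholder import

    def run_regression(path="expected_outputs.json"):
        """Compare the new engine's output against the recorded expectations."""
        with open(path) as f:
            cases = json.load(f)
        failures = []
        for case in cases:
            actual = run_new_engine(**case["inputs"])
            if actual != case["expected"]:
                failures.append((case["inputs"], case["expected"], actual))
        return failures

    if __name__ == "__main__":
        for inputs, expected, actual in run_regression():
            print(f"MISMATCH for {inputs}: expected {expected!r}, got {actual!r}")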

Once you have both engines, run them in parallel for a while, feeding both all real-world production inputs. If there is a discrepancy, log the inputs and the two outputs and review them manually. This phase tends to discover many bugs in the old implementation, so the new implementation may very well produce "more correct" output. Initially you will prefer to keep using the outputs of the old engine; after a while you will be confident enough to switch to the new one.
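
A shadow-run wrapper could be as simple as the following sketch (both run calls are placeholders; the point is to keep returning the old engine's result while logging discrepancies for later review):

    import logging

    from old_engine import run as run_old_engine  # placeholder imports
    from new_engine import run as run_new_engine

    log = logging.getLogger("engine-shadow")

    def handle_request(inputs):
        old_result = run_old_engine(**inputs)
        try:
            new_result = run_new_engine(**inputs)
            if new_result != old_result:
                log.warning("discrepancy for %r: old=%r new=%r",
                            inputs, old_result, new_result)
        except Exception:
            # a crash in the new engine must never affect production traffic
            log.exception("new engine failed for %r", inputs)
        return old_result  # keep trusting the old engine for now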

Your input space is probably far too large to test exhaustively (assuming those are 1064 independent variables, not just 1064 input values). Therefore, you will need to generate examples. One great strategy is to log inputs from production, sanitize/anonymize them, and use those as test cases. However, this will miss interesting edge cases. Based on the logged examples, any existing specifications or test cases, and the input of domain experts, manually construct test cases for important scenarios that were not observed during logging.
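
If you do sanitize logged inputs, something along these lines is usually enough; the field names here are purely illustrative, so substitute whatever is actually sensitive in your data:

    import hashlib

    def sanitize(inputs, sensitive_fields=("customer_id", "name")):
        """Replace sensitive values with stable, non-reversible tokens so that
        identical production inputs still map to identical test cases."""
        cleaned = dict(inputs)
        for field in sensitive_fields:
            if field in cleaned:
                digest = hashlib.sha256(str(cleaned[field]).encode()).hexdigest()
                cleaned[field] = digest[:12]
        return cleaned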

answered Dec 20, 2017 at 17:17
