Adequate testing is a critical success factor for your project. Often, good teams produce inferior products due to inadequate testing.
A frequent question from students: "How many test cases should we write?" Answer: "As many as you need." There is no need to write a single test case if you are sure that your system works perfectly. In practice, it is not unusual to have more testing code (i.e. test cases) than functional code. On the other hand, the number of test cases by itself does not guarantee a good product. It is the quality of the test cases that matters. In fact, it is the quality of the product that matters; good testing is just one means of getting there. However, evaluators rarely have enough time to measure the quality of your product rigorously and might use the quality of test cases as a proxy for the quality of the product. That is one more reason to write good test cases.
You should adjust your level of testing based on the following:
Most student teams underestimate the testing effort. Testing coupled with subsequent debugging and bug fixing will take the biggest bite out of your project time. However, most students start by allocating to testing a much smaller share of resources than they allocate for development.
When correctness is essential, at least ~25-35% of the schedule should go to system-level testing, not counting developer testing. Another good rule of thumb (mentioned in the book UML Distilled by Martin Fowler) is that unit tests should be at least the same size as the production code.
"We test everything" is not a test plan. "We test from 13th to 19th" is not detailed enough. A test plan is not a list of test cases either. A test plan includes what will be tested in what order by whom, when will you start/finish testing, and what techniques and tools will be used.
Testing is so important that you should put someone in charge of it even if you do not have anyone in charge of other aspects of the project. However, this does not mean only that person will do testing. Doing so would create a single point of failure for your project; if the so-called 'tester' does not do his job well, all the good work done by others will be in vain.
If you have to move 100 bricks from point A to B within 10 hours, which method do you prefer: carry all 100 bricks and run from A to B during the last 10 minutes, or walk 10 times from A to B carrying 10 bricks at a time? If you prefer the latter, insist that the team follows an iterative development process; it will make the testing feel like walking with 10 bricks rather than running with 100 bricks.
Code Complete (the book by Steve McConnell) says "... test-first programming is one of the most beneficial software practices to emerge during the past decade and is a good general approach". Test-Driven Development (TDD), referred to as 'test-first programming' in this quote, advocates writing test cases before writing the code. While TDD has its share of detractors, it is considered an exciting way to develop code, and its benefits outweigh the drawbacks (if any). It is certainly suitable for student projects. It might feel a bit counter-intuitive at first, but it feels quite natural once you have got used to it.
Define testing policies for your project (e.g. how to name test classes, the level of testing expected from each member, etc.). This can be done by the testing guru [see tip 5.5]. Here are some reasonable test policies you could adopt (examples only):
Students often ask, "Isn't it enough just to do system testing? If the whole system works, the parts must surely work, right?"
Cross-testing means you let a teammate test a component you developed. This does not mean you do not test it yourself; cross-testing is done in addition to your own testing. Cross-testing is additional work, delays the project, and is against the spirit of "being responsible for the quality of your own work". You should use it only when there is a question of 'low quality work' by a team member or when the component in question is a critical component. Any bug found during cross-testing should go on the record, and should be counted against the author of the code.
Everyone must unit-test their own code, and do a share of the integration/system testing as well. If your course allows choosing dedicated testers, choose competent testers. While you may or may not choose your strongest team members as testers, testing is too important to entrust to the weakest ones either.
While the instructor might not insist on fully automated testing, note the following:
Furthermore, it is natural to automate unit and integration testing as lower level components cannot be tested manually anyway because they do not have a user interface.
A 'smoke test' is a set of very fundamental test cases that you frequently run to make sure at least the very basic functionality of the product remains solid. It does not prove software as release-worthy. But failing a smoke test proves the software is definitely NOT release worthy (as in, 'seeing smoke when you power up an appliance is a sure sign it is busted'). Since you want to do the smoke test frequently, it makes sense to automate it, if possible.
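Since a smoke test is meant to be run frequently, even a small script is enough to automate it. Below is a minimal sketch, assuming a hypothetical Payroll class standing in for your system's core (the class and its methods are illustrative assumptions, not part of any real project):

```python
# Hypothetical stand-in for the system under test; your real project
# would exercise its own core classes here instead.
class Payroll:
    def __init__(self):
        self.employees = []

    def add_employee(self, name):
        self.employees.append(name)

    def employee_count(self):
        return len(self.employees)


def smoke_test():
    """Only the most fundamental checks; failing any of these means
    the build is definitely NOT release-worthy."""
    p = Payroll()              # 1. the system can be created at all
    p.add_employee("Alice")    # 2. the most basic operation completes
    assert p.employee_count() == 1, "smoke test failed: basic operation broken"
    return True
```

A script like this can be wired into your build so it runs automatically before every release candidate.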
Testability depends on factors such as controllability, observability, availability (of executables, information), simplicity, stability, separation of components, and availability of oracles. This section gives some tips on increasing testability.
Simpler designs are often easier to test. That is one more reason to choose simplicity over complexity. Besides, we are more likely to mess up when we try to be clever.
Use exceptions where appropriate. Use assertions liberally. They are not the same (find out the difference), and the two are not interchangeable.
It has been claimed that MS Office 2007 has about 250,000 assertions (about 1% of the code). Various past studies have shown up to 6% of code as assertions in various software programs. Microsoft found that code with assertions has a lower defect density (imagine the situation if they did not use assertions!).
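To illustrate the difference mentioned above: exceptions handle anticipated runtime problems that can occur even in correct code, while assertions flag conditions that should never be false if the code is bug-free. A sketch (the function names and the salary logic are illustrative assumptions):

```python
def read_config(path):
    # Exception: a missing config file is an anticipated runtime situation
    # that even correct code must cope with, so we handle it gracefully.
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return ""  # fall back to defaults


def monthly_salary(annual_salary):
    # Assertion: a negative salary here can only mean a bug in the caller;
    # it should never happen in a correct program, so we fail fast.
    assert annual_salary >= 0, "caller bug: annual_salary must be non-negative"
    return annual_salary / 12
```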
Testability does not come for free. You may have to write more code to increase testability. Some examples:
For correctness-critical parts, you can develop the same function using multiple algorithms and include them all in the system. During runtime, the system will use an internal voting mechanism to make sure all algorithms give the same result. If results differ, it should fire an assertion.
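The voting idea above can be sketched as follows, assuming a correctness-critical summing function implemented twice with different algorithms (the function names are illustrative):

```python
def sum_iterative(values):
    total = 0
    for v in values:
        total += v
    return total


def sum_recursive(values):
    return 0 if not values else values[0] + sum_recursive(values[1:])


def safe_sum(values):
    # Voting: compute the result with both algorithms; if they disagree,
    # at least one of them is buggy, so fire an assertion instead of
    # silently returning a possibly wrong answer.
    a, b = sum_iterative(values), sum_recursive(values)
    assert a == b, "voting failed: algorithms disagree"
    return a
```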
User interface testing is harder to automate. Decoupling the UI from the system allows us to test the rest of the system without UI getting in the way. Here are some hints:
If your system periodically writes to a log file, this file can be a valuable resource for testing and debugging. Sometimes, you can simply use the log file to verify the system behaviour during a test case.
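A sketch of this idea, assuming a hypothetical process_order operation that logs what it does; the test then verifies system behaviour by reading the log file instead of probing internals:

```python
import logging
import os
import tempfile


def process_order(logger, order_id):
    # Hypothetical operation; the real system would do actual work here
    # and log the significant events as it goes.
    logger.info("order processed: id=%d", order_id)


def run_and_check_log():
    log_path = os.path.join(tempfile.gettempdir(), "app_under_test.log")
    logger = logging.getLogger("app_under_test")
    handler = logging.FileHandler(log_path, mode="w")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    process_order(logger, 42)      # exercise the system
    handler.close()
    logger.removeHandler(handler)

    # Verify the behaviour via the log file.
    with open(log_path) as f:
        return "order processed: id=42" in f.read()
```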
Make sure you do enough developer testing (unit and integration testing) before you release the code to others. Note that debugging is not developer testing.
In design by contract [http://tinyurl.com/wikipedia-dbc] style coding, users of your component are responsible for the validity of the input values passed to your component. Your component does not guarantee anything if input values are wrong. That means you do not have to test the component for invalid input. If the language does not have in-built support for DbC, you can use assertions to enforce validity of input values.
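Such a contract can be sketched with an assertion as below (total_salary and its flat daily rate are hypothetical, for illustration only):

```python
def total_salary(start_day, end_day):
    # Contract (precondition): the caller guarantees 0 <= start_day <= end_day.
    # The component promises nothing for invalid input, so there is no need
    # to test it with invalid input; the assertion just makes contract
    # violations fail fast during development.
    assert 0 <= start_day <= end_day, "contract violated: invalid day range"
    return (end_day - start_day) * 100  # illustrative flat daily rate
```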
In defensive coding [http://tinyurl.com/wikipedia-dp], you do not assume that others will use your code the way it is supposed to be used; you actively prevent others from misusing it. Testing should follow the same philosophy. Test the component to see how it handles invalid inputs.
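Following that philosophy, the component below actively rejects bad input, and the test deliberately feeds it invalid values (a sketch; parse_day is a hypothetical helper, not from any real project):

```python
def parse_day(text):
    # Defensive: do not assume callers pass a valid day-of-month string.
    if not text.isdigit():
        raise ValueError("day must be numeric, got %r" % text)
    day = int(text)
    if not 1 <= day <= 31:
        raise ValueError("day out of range: %d" % day)
    return day


def rejects(bad_input):
    # Defensive testing: the component should raise for invalid input.
    try:
        parse_day(bad_input)
        return False  # it accepted bad input: defect found
    except ValueError:
        return True
```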
System testing is when you test the system as a whole against the system specification. System testing requires a different mindset than integration testing. That is why system tests are usually done by a separate QA team. Student projects often do not have a QA team. However, try to change your mindset when you transition from integration testing to system testing. Another trick that could help here is to plan system testing so that each team member tests functionalities implemented by someone else in the team.
Here are some tactics to mitigate the risk of critical bugs sneaking through:
After a developer says he is done writing test cases for a component, you can purposely insert subtle bugs into the component (this is called error seeding) and rerun the test cases to see how the test cases respond. If no test case fails, you do not have enough test cases. You can make it a fun game in which you get points for sneaking in a bug that is not caught by the test cases.
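Error seeding can be sketched as follows: a component, a copy with a deliberately seeded subtle bug, and a test suite that is adequate only if at least one of its cases fails on the seeded copy (all names and cases here are illustrative assumptions):

```python
def max_of(values):
    # Component under test.
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


def max_of_seeded(values):
    # Seeded bug: initialising 'best' to 0 instead of the first element,
    # which silently gives a wrong answer when all values are negative.
    best = 0
    for v in values:
        if v > best:
            best = v
    return best


def suite_is_adequate_against(f):
    # Returns True if every case passes; an adequate suite must
    # return False when given the seeded copy.
    cases = [([3, 1, 2], 3), ([-5, -2, -9], -2)]
    return all(f(values) == expected for values, expected in cases)
```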
While testing needs to be meticulous and thorough, there is no point being fanatical about it. If you get bogged down by trying to achieve high coverage over one part of the system, you might end up not testing some other parts at all.
Test broadly before you go deep. Check all parts of the program quickly before focussing. Start with core/primitive functions. You may have to fix those before you test the rest.
Start with obvious and simple tests. If any such case fails, developers will want to know it sooner rather than later. While you need to check invalid and unexpected input, check valid and expected input first.
Before you implement test cases in code, you may want to define them at a conceptual level. These are easier to cross-check against the specifications. Doing this early will help you in test planning and estimating the testing effort.
Here are some example test cases specified at a conceptual level:
Check that getTotalSalary(startDate, endDate) returns 0 when the end date is earlier than the start date.
Check that getTotalSalary(startDate, endDate) shows an error message when an invalid date is given as the startDate or the endDate.
As you can see from the above examples, these can later go into the documentation of test cases.
Hunt where the birds are, fish where the fish are, test where bugs are likely to be [adapted from the book Pragmatic Software Testing]. Be smart about where you focus your testing energy. Do not test areas randomly. Do not test operations in the order they appear in the API. Examples of places where bugs are likely to be:
Before you can report the bug or start debugging yourself, you should be able to reproduce the error. If you are doing scripted testing (i.e. test cases are pre-determined), it is easier to reproduce the error. If you are doing exploratory testing (i.e. making up test cases as you go), the audit log can tell you what you were doing when the error was noticed. If the system does not produce an audit log (it should), you have to either memorise or record down everything you do. Alternatively, you can keep a screen recorder running to automatically record all steps performed on the UI. Having a screen recording is immensely useful if the bug falls in the dreaded category of 'intermittent faults' (i.e. happens only now and then) as it gives you concrete proof of the bug.
If your project team uses a proper issue-tracking system to track bugs, you will have to write bug reports for bugs you find in others' code. Writing proper bug reports helps to get them fixed quicker, gives you a chance to practise the important skill of writing good bug reports, and shows evaluators how professionally you have done your project.
Your bug report should not stop at saying, "hey, I found a bug"; it should help your colleague as much as it can to locate the bug. When a test case fails, investigate further. See if similar failures occur elsewhere in the system. Figure out the minimum required steps that can reproduce the failure. Try to characterise the failure in a more general manner. Make the report self-explanatory to the developer who will eventually handle it or he will pester you for more details after you have forgotten all about it. It is good to give at least one example situation. Take care to word the report so that it will not offend the developer.
Here are some examples:
"Test case 1103 failed", "Registration feature doesn't work" [not very helpful]
"System fails when I enter the text '100% coverage expected <line break> terms fixed' into the description field" [better than the above]
"System fails when I enter symbols such as '%' into any text field in the 'new policy' UI. This does not happen with other UIs" [much more helpful]
"Validation mechanism is lousy" [likely to offend the developer]
Some useful policies for bug reporting:
The system may need to be checked for performance, usability, scalability, installability, uninstallability, portability, etc.
It is not enough to test the software by running it from the IDE. You should also test it by running the executable produced.
It is not enough to test the 'debug' version of the executable; you should also test the 'release' version of the software.
It is not enough to test on machines used for developing the software; before your release, you should also test it on a 'clean' machine (i.e. a machine that did not have your software before).
Writing random test cases that just 'feel right' is not good enough if you are serious about the quality of your software.
Do not expect a piece of code to be perfect just because you wrote it yourself. There is no such thing as perfect code.
Very small and simple code segments can introduce errors. Even if that code segment is too trivial to have bugs in it, it can still break some other part of the system. Such bugs are so easy to overlook precisely because you do not expect such code to have bugs.
Being the author of the code, you tend to treat it gingerly and test it only using test cases that you (subconsciously) know to work. Instead, good testing requires you to try to break the code by doing all sorts of nasty things to it, not try to prove that it works. The latter is impossible in any case. (Program testing can be used to show the presence of bugs, but never to show their absence! --Edsger Dijkstra.)
It is unlikely that you will write a document that systematically describes every test case you have. But you could still make your test code self-documenting. Add on comments (or some other form of external documentation) for information that is not already apparent from the code.
For example, the following two test cases (written in xUnit fashion) execute the same test, but the second one is more self-documenting than the first because it contains more information about the test case.
Test case 1
assertEquals(p.getTotalSalary("1/1/2007", "1/1/2006"), 0);
Test case 2
print("testing getTotalSalary(startDate, endDate) of Payroll class");
assertEquals(p.getTotalSalary("1/1/2007", "1/1/2006"), 0,
    "Case 347 failed: system does not return 0 when end date is earlier than start date");
When you are testing how a system responds to invalid inputs, each test case should have no more than one invalid input. For example, the following test case uses two invalid values at the same time, one for startDate and one for endDate.
If we wrote two test cases like this instead, we get to learn about the error handling for startDate as well as endDate.
Note that if we want to test the error handling of each component of the date (i.e. day, month, and year), we have to write more test cases (yes, that is right; that is why testing is hard work :-)
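The idea above can be sketched as follows, assuming a hypothetical get_total_salary that validates 'd/m/yyyy' date strings and raises an error for dates that do not exist (the computation itself is elided, since the error handling is what we are testing):

```python
import datetime


def get_total_salary(start_date, end_date):
    # Hypothetical component: validates both dates before computing.
    for s in (start_date, end_date):
        day, month, year = (int(part) for part in s.split("/"))
        datetime.date(year, month, day)  # raises ValueError for e.g. 40/40/1000
    return 0  # salary computation elided


def raises_error(start_date, end_date):
    try:
        get_total_salary(start_date, end_date)
        return False
    except ValueError:
        return True


# Weaker: two invalid values at once; a pass does not tell us whether
# BOTH validation paths work, only that at least one fired.
both_invalid = raises_error("40/40/1000", "50/50/2000")

# Better: one invalid input per test case isolates each validation path.
start_invalid = raises_error("40/40/1000", "1/1/2007")
end_invalid = raises_error("1/1/2007", "40/40/1000")
```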
Testing must be both efficient (i.e. do not use more test cases than necessary) and effective (i.e. maximise the number of bugs found). You can use techniques such as equivalence partitioning and boundary value analysis to increase the efficiency and effectiveness of testing.
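A sketch of both techniques applied to a simple day-of-month validator (the validator and the chosen cases are illustrative assumptions):

```python
def is_valid_day(day):
    # Component under test: a day of the month is valid if it is in 1..31.
    return 1 <= day <= 31


# Equivalence partitioning (efficiency): one representative value per
# partition is enough, since all values in a partition behave the same.
# Partitions here: below the range, inside it, above it.
partition_cases = {-5: False, 15: True, 99: False}

# Boundary value analysis (effectiveness): bugs cluster at partition edges,
# so test just outside and just inside each boundary.
boundary_cases = {0: False, 1: True, 31: True, 32: False}


def run_cases(cases):
    return all(is_valid_day(day) == expected for day, expected in cases.items())
```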
Every test case you write must strive to find something about the system that the rest of the existing test cases do not tell you. That is the 'objective' of the test case. Document it somewhere. This can also deter students trying to boost their LOC count by duplicating existing test cases with minor modifications. For example, if one test case tests a component for a typical input value, there is no point having another test case that tests the same component for another typical input value. That is because the objectives of the two cases are the same. Our time is better spent writing a test case with a different objective, for example, a test case that tests for a boundary of the input range.
Consider these two test cases for getTotalSalary(startDate, endDate). Note how they repeat the same input values in multiple test cases.
| id   | startDate  | endDate    | Objective and expected result                              |
| 345a | 1/1/2007   | 40/40/1000 | Tests error handling for endDate, error message expected   |
| 345b | 40/40/1000 | 1/1/2007   | Tests error handling for startDate, error message expected |
Now, note how we can increase the variety of our test cases by not repeating the same input value. This increases the likelihood of discovering something new without increasing the test case count.
| id   | startDate  | endDate    | Objective and expected result                              |
| 345a | 1/1/2007   | 40/40/1000 | Tests error handling for endDate, error message expected   |
| 345b | -1/0/20000 | 8/18/2007  | Tests error handling for startDate, error message expected |
A test case has an expected output. When writing a test case, the proper way is to have the expected output calculated manually, or by some means other than using the system itself to do it. But this is hard work. What if you use the system being tested to generate the expected output? Those test cases - let us call them unverified test cases - are not as useful as verified test cases because they pass trivially, as the expected output is exactly the same as the actual output (duh!). However, they are not entirely useless either. Keep running them after each refactoring you do to the code. If a certain refactoring broke one of those unverified test cases, you know immediately that the behaviour of the system changed when you did not intend it to! That is how you can still make use of unverified test cases to keep the behaviour of the system unchanged. But make sure that you have plenty of verified test cases as well.
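A sketch of how an unverified test case still guards behaviour during refactoring (the formatting function and its refactored version are illustrative assumptions):

```python
def format_name(first, last):
    # Original version; its exact output was never verified by hand.
    return last.upper() + ", " + first


# Unverified expected output: captured from the system itself, so a test
# against it passes trivially today...
RECORDED_OUTPUT = format_name("Ada", "Lovelace")


def format_name_refactored(first, last):
    # A refactoring that was intended to preserve behaviour exactly.
    return "%s, %s" % (last.upper(), first)


def behaviour_unchanged():
    # ...but it still catches refactorings that accidentally change behaviour.
    return format_name_refactored("Ada", "Lovelace") == RECORDED_OUTPUT
```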
Submit your bank of test cases as part of the deliverable. Add a couple of sample test cases to your report. Be sure to pick those that show your commitment to each type of testing. It is best to showcase those interesting and challenging test cases you managed to conquer, not those obvious and trivial cases. You can use a format similar to the following:
Test Purpose: Explain what you intend to test in this test case.
Required Test Inputs: What input must be fed to this test case.
Expected Test Results: Specify the results expected when you run this test case.
Any Other Requirements: Describe any other requirements for running this test case; for example, to run a test case for a program component in isolation from the rest of the system, you may need to implement stubs/drivers.
Sample Code: Illustrate how you implemented the test case.
Comments: Why do you think this case is noteworthy?
Do not forget to allocate enough time for debugging. It is often the most unpredictable project task.
When you notice erroneous behaviour in your software, it is OK to venture one or two guesses as to what is causing the problem. If a quick check invalidates your guesses, it is time to start a proper debugging session.
Debugging is not a random series of 'poke here, tweak there' speculative experiments. It is a systematic and disciplined process of starting from the 'effect' (i.e. the erroneous behaviour noticed during testing) and tracking down the 'cause' (i.e. the first erroneous behaviour that set off the chain of events that caused the 'effect') by following possible execution paths that could cause the effect.
Note that the cause and the effect can be separated by location as well as time. For example, the 'cause' could be an error in saving the data in the first shutdown of the software while the 'effect' could be a system freeze during the start up of the 11th run (It could happen. For example, if the software keeps history data for the last 10 runs of the software and tries to overwrite the history data for the first run when starting up for the 11th run).
Use the debugger of your IDE when trying to figure out where things go wrong. This is a far superior technique than inserting print statements all over the code.
When a system/integration test failure is traced to your component, it usually means one thing: you have not done enough developer testing! There is at least one unit/integration test case you should have written, but did not.
Debugging should not involve extensive modifications to the code. Using a proper debugging tool can minimise the need for such modifications. In any case, do not change code like a bull in a china shop. If you experiment with your code to find a bug (or, to find the best way to fix a bug), use your SCM tool [see tip 9.4] to rollback those experimental modifications that did not work.
When the system is not behaving as expected, do not patch the symptoms. For example, if the output string seems to have an extra space at the end of it, do not simply trim it away; fix the problem that caused the extra space in the first place.
Sometimes, we see only what we want to see. If you cannot figure out the bug after a reasonable effort, try getting someone else to have a look. The reason is, you may be too involved in the code in question (maybe because you wrote it yourself) and be 'blind' to the flaw that someone else with a more detached perspective can detect easily.
Bugs do not go away by themselves. If you did not fix it, it is still there even if you cannot seem to reproduce it. If you cannot fix it, at least report it as a known bug.
If there is no time to fix all bugs, choose some (but choose wisely) to be simply reported as 'known bugs'. All non-trivial software has a list of known bugs.
Any suggestions to improve this book? Any tips you would like to add? Any aspect of your project not covered by the book? Anything in the book that you don't agree with? Noticed any errors/omissions? Please use the link below to provide feedback, or send an email to damith[at]comp.nus.edu.sg
---| This page is from the free online book Practical Tips for Software-Intensive Student Projects V3.0, Jul 2010, Author: Damith C. Rajapakse |---