
If it's not good testing, it's not good regression testing either.

Pick a coin from your pocket, and hold it at arm's length. Take a good look. Now take another one of the same denomination, and hold it out at arm's length as before. Based on your observations alone - can you say they are identical?

Let's go a step further. If someone had given you one coin to look at, then exchanged it for another, could you have determined whether they were the same coin or different coins? Maybe, yes? If the differences had been large enough, e.g. one coin was heavily tarnished or scratched, then the different coins would be identifiable. Or if you'd been given the opportunity to examine the coin using magnifying equipment, you probably could have found differences.

But let's assume our only test was a standard set of checks, i.e. viewing at arm's length and comparing what we see with our notes/records. It's better than nothing: I would see some differences, and some might be important ones. For example, if my next coin was blank, I might have suspected an issue with my coin supply and investigated.

What about my next coin... it is blank on one side. Unfortunately it's not the side I check when I hold it at arm's length. So as far as my checks are concerned, there has been no regression in the quality of the coins being produced by my pocket. Until I go 'live' and try to spend my coins out in the real world of shopkeepers, I'm none the wiser.

Do you see the flaw in our logic here? If the checks notice a degradation in coin quality, we conclude the testing is good. If the checks find nothing, we still conclude the testing is good, because those same checks found a different issue previously. Because I was only performing one test, or one set of tests, I was blind to any issue that test can't see.
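Here is a minimal sketch of that blind spot in Python. The `Coin` class and the `arms_length_check` function are hypothetical names I've made up for illustration, standing in for the coin and my one standard check:

```python
from dataclasses import dataclass


@dataclass
class Coin:
    heads: str  # what is stamped on the face we look at
    tails: str  # what is stamped on the face we never look at


def arms_length_check(coin: Coin, expected_heads: str) -> bool:
    """The fixed regression check: it only ever inspects one face."""
    return coin.heads == expected_heads


good_coin = Coin(heads="portrait", tails="crest")
blank_coin = Coin(heads="", tails="")
half_blank_coin = Coin(heads="portrait", tails="")  # blank on the unchecked side

results = {
    "good": arms_length_check(good_coin, "portrait"),
    "blank": arms_length_check(blank_coin, "portrait"),
    "half_blank": arms_length_check(half_blank_coin, "portrait"),
}
print(results)
```

The fully blank coin fails the check, which is what earned the check its good reputation. The half-blank coin passes, and a green result tells us nothing about the face the check never looks at.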

If we'd been testing the coins independently, we probably would have been more critical. We might have thought: sure, it looks good in the arm's length test, but what about the weight? Maybe that's wrong. We'd try a number of different tests, trying to find an issue. We'd ask other people about coins, learn about their two-sided nature and perform tests for it.
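As a rough sketch of what that wider set of tests might look like, assuming a hypothetical `Coin` with a weight and made-up expected values (8.0 g, "portrait", "crest") chosen purely for illustration:

```python
from dataclasses import dataclass


@dataclass
class Coin:
    heads: str
    tails: str
    weight_g: float


def check_heads(coin: Coin) -> bool:
    return coin.heads == "portrait"


def check_tails(coin: Coin) -> bool:
    return coin.tails == "crest"


def check_weight(coin: Coin, expected: float = 8.0, tolerance: float = 0.2) -> bool:
    # Illustrative numbers only; a real coin spec would supply these.
    return abs(coin.weight_g - expected) <= tolerance


def failed_checks(coin: Coin) -> list[str]:
    """Run every check and report the names of the ones that fail."""
    checks = {"heads": check_heads, "tails": check_tails, "weight": check_weight}
    return [name for name, check in checks.items() if not check(coin)]


half_blank_coin = Coin(heads="portrait", tails="", weight_g=8.0)
light_coin = Coin(heads="portrait", tails="crest", weight_g=6.5)

print(failed_checks(half_blank_coin))
print(failed_checks(light_coin))
```

Each extra check only covers the one question it asks. The list is a starting point for finding issues, and it still says nothing about problems nobody has thought to look for yet.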

But as soon as we enter 'regression testing' mode, we often start to disregard this behaviour and mindlessly run the same tests. We avoid exploration, sometimes without noticing. Sometimes people actively avoid exploration during regression testing, thinking it's inappropriate. This approach assumes that the test you have been running is some kind of super-observer, capable of helping you to see all problems.

If the system has changed significantly, with the addition or removal of complex behaviours, surely the tests might also need to adapt? The assumption that the same test will somehow catch a change in functionality, reliability etc. is based on the premise that our super-test was testing everything -before- and still is. As testers we know it wasn't, isn't and never will be that super-test. We need to adapt to each new release in an attempt to find new issues. If our tests aren't finding an issue, it's just as possible that the tests are ineffective as it is that the system isn't defective.

