The arrogance of regression testing

Let’s assume we know that our software is not perfect. How can it be? It’s complex, mortals created it, and we don’t have enough time to test every execution path and environment, so we could never be sure anyway. This is OK. This is normal; testers deal with this situation every day.

This tends to be a typical scenario... Our team has been working on some new features. They’re looking good, initial teething issues have been fixed and the new features are considered worthwhile enough and bug-sparse enough to be released into the wild.

This is where things can get a little awkward. Team members’ opinions are often split across a wide spectrum. The relatively minor perceived impact of the work leads some to conclude that it is ready for release as is.

Other team members, possibly twice shy after previous ‘minor change’ induced problems, argue for a comprehensive ‘regression test’ of the software. There is usually a range of views in between, suggesting, for example, ‘regression testing’ only the directly affected systems or those associated with the changes.

The oft-stated concern is: we may have broken something. Our new code might be good, the existing code might be even better, but what about emergent issues caused by the new ‘system’ we’ve created?

A common compromise proposed by teams and customers is generally translated as ‘test that it [the core features etc] still works’. This may sound reasonable, intelligent and practical. But it just doesn’t sit well with me. My unease stems from more than the assumption that testers can prove it works…again. The concept of regression testing seems arrogant and, even worse, it seems wasteful.

Arrogant? Regression testing assumes that we tested the application so well before that we found all the important areas of ambiguity and bugs. It’s like saying that when Microsoft released a patch to Vista, they only needed to ensure it was up to the high standard of the original Vista release.

The root of the problem here is a bias. We start out with the perception ‘Our system is great’, and then let’s check (sic) it still is. The congruence bias is a powerful motivator not to upset the perceived status quo. We stop looking for problems that we don’t think are caused by the new changes. Whereas we might investigate these ‘issues’ in other circumstances, we don’t even think to look deeper unless they are practically labeled “BROKEN BY THE NEW CODE”. How many issues are overlooked?

Wasteful? By misdirecting ourselves from the system as a whole, we are missing an opportunity to test again: a second chance to find those bugs we didn’t find last time. We could be capitalizing on another chance to learn new and old areas of the application.

When I’ve been in this situation I’ve also fallen foul of the biases at play; I’m human. But as a tester, I’ve had to come up with a few strategies to help remedy the problem.
  • Firstly, just be aware of the bias - don't let it lead you blindly.
  • If you’re lucky enough to have another tester in your team, split the problem with them. One of you can test the system as a whole, the other the ‘affected areas’.
  • Deliberate opposition. A technique I use when I’ve got a definite checklist of the affected areas: deliberately pick things not on the list, or that are the direct opposite of what is on the list. E.g. if the change affects logging in as user X, what does logging out as user Y do? Or can you avoid being logged in altogether?
  • Randomness. Choose a path or some data at random. For good randomness is a useful source.

I find the above a useful means of breaking out of the tunnel vision of regression testing. They give you new paths to follow which, if explored, can often yield new and old bugs.
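
As a rough sketch of how the ‘deliberate opposition’ and ‘randomness’ strategies might be combined, assuming you can list the application’s features by name: the feature names and the `pick_off_checklist` helper below are hypothetical, made up purely for illustration. The idea is to pick, at random, a few areas that are deliberately not on the ‘affected areas’ checklist, and to record the seed so an interesting session can be repeated.

```python
import random


def pick_off_checklist(all_features, affected, k, seed=None):
    """Randomly pick up to k features that are NOT on the 'affected areas' checklist.

    Hypothetical helper: it only chooses where to look, the testing is still up to you.
    """
    rng = random.Random(seed)  # a seeded generator makes the picks repeatable
    # Sort so the same seed gives the same picks regardless of set ordering.
    untouched = sorted(set(all_features) - set(affected))
    return rng.sample(untouched, min(k, len(untouched)))


# Hypothetical feature names, for illustration only.
ALL_FEATURES = ["login", "logout", "password reset", "profile edit", "search", "export report"]
AFFECTED = ["login"]  # the checklist handed over with the change

seed = random.randrange(2**32)
print(f"seed={seed}")  # note the seed so the same picks can be reproduced later
picks = pick_off_checklist(ALL_FEATURES, AFFECTED, k=3, seed=seed)
print(picks)
```

Passing the printed seed back in gives the same picks again, which helps when a randomly chosen path turns up something worth a second look.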


  1. Good thoughts!

    I was recently asked how I could "guarantee" that the system still works after a sw change (by use of an automated regression suite). To which I replied that I couldn't guarantee anything unless the guarantee stated that the narrow "paths" traversed by any automated suite would be the exact same that every possible customer would use (same starting states, exact same behaviour wrt timing between actions, data transmitted (being exactly the same), user provisioned data in the databases, etc, etc.).

    A very confused and disappointed face greeted me.

    The upshot was that I asked the person to be careful with language - try and be precise even - and definitely don't promise anything that you don't understand the implications of... I could tell the person lots about the sw and the testing of it, but I'd probably never use the word "guarantee".

    Talking about testing, what it means and its implications is a big effort! And it's something that testers need to devote just as much time, effort and understanding to as any other part of their repertoire.

  2. Thanks Simon. Yes, I've been in your situation also. It can be a challenge to manage expectations. Using the 'right' language can help a lot. I also find keeping a mental list of 'similar issues' can be useful. Real failures often speak louder than words.

    For example: "You're right, this looks like a simple change. But remember when we released 'The XYZ', and that was a change to the same sub-system. It turned out to affect 'The MNO' - badly."

