A Fair Witness

About 10 years ago, I was working with a client who was in the process of developing a new ecommerce website. The new website and servers were designed to replace an entire existing suite of systems and provide a platform for the company's future expansion. As you might imagine, the project was sprawling: new front-end servers, new middleware and a host of back-end business-to-business systems. The project had expanded during its course, as new areas of old functionality were identified and the business requested new systems to support new ventures.

This isn't an unusual scenario for software development projects; it is for exactly this type of situation that many companies now look to agile methodologies for help. But don't worry, this isn't a rant about the benefits of one methodology over another. What interested me was how the project team members performed and viewed their testing efforts.

Each build of the code would include new functionality, [hopefully] ready for testing. As the release date approached, the rate of delivery would increase. More builds were delivered to the testers, in an effort to get more feedback, more quickly. As is normal as we approach a deadline, compromises were made. Programmers and testers would have their allotted time reduced: programmers were given less time to program and unit test, and the testers less time for their own system testing.

The change in how the software and systems were developed was gradual. For a while things seemed to continue on as before, maybe with a few more deadlines looking achievable. But once in a while there would be slip-ups. For example, a bug might slip into the system, not be uncovered during testing, and escape into live. This situation would shock the team, but luckily no serious damage was done. After all, bugs like this had occurred even when programmers and testers had been given more time, so who knows whether we 'would have' caught it before timelines were trimmed.

As features were 'completed' and the team moved on to new features, an interesting cognitive dissonance took place. Although a particular system was well known to have been coded in a hurry and pushed live with only minimal testing, I noticed our views changed over time. The dissonance appeared to cause us to assume we 'must have tested it properly', otherwise we wouldn't have released it. This was despite every team member being quite uneasy at the time of release, due to its limited testing.

Another effect was the normalisation of risk, an accommodation we gained to the ever-increasing levels of risk we were taking. As bugs were discovered in Production, or systems failed on the production network, we gradually came to see these as 'everyday' occurrences. In a way this manner of thinking was 'correct': the bugs, outages and similar incidents were occurring more and more frequently, if not quite every day.

This had a powerful effect on the testers. Initially they, like everyone involved, were alarmed at the bugs and outages. They gradually adapted to the new norm; only the more spectacular bugs surprised them and stood out from the noise of the everyday failures. This environment had a detrimental effect on the ability of the testers to test effectively. While still highly skilled, the heuristics they were working with were getting more vague. For example, at the start of the project, merely not having had a feature tested was grounds for heated discussion and a probable release delay. It gradually became a question of whether any testing was needed for a new or modified feature.

Or, for example, an error message in the logs was originally always worthy of investigation. It was either a 'real bug' or at best 'correct code' mistakenly reporting errors. Either way it was worth reporting. But when several releases have gone out the door, and unfixed error messages are commonplace in the logs, do you report each one? How many were present in the last release? We accepted those errors, so are these important? Surely a failure is a failure, then and now?

What we can do as testers is question the risk taking, the same as we question everything else. The change in behaviour is information in itself. Risk taking is part of a software tester's job. We gamble time against coverage, in an effort to win bugs and new knowledge of the systems we test. When asked to reduce the testing time, maybe highlight how the risk of you missing bugs might change. You can suggest mitigations or, conversely, inform people of just what won't be tested. Are they comfortable with that?

When stakeholders are unhappy that you won't be testing something, that you missed a bug, or that you found a serious bug at the last minute, you could suggest that the team's risk taking is probably higher than everyone [including them] is comfortable with. If the testing were organised differently, run for longer, or better resourced, maybe the risk of those near misses could be reduced. The bug might be found halfway through testing, given more time, new skills or a helping pair of hands.

As a fair witness of the events, software problems and risks affecting the project, we are a valuable resource for gaining a view of a project's status, and less an unwanted burden on project plans and deadlines.

