A Good Run!

“We got a good run from the tests,” the tester stated.

“So what’s the story?” the scrum master asked.

“85% pass,” came the reply, meekly.

“OK, just need to fix that 5% then,” the scrum master announced, before striding off to tell everyone that the team was only a couple of percent away from success.

Our tester took a moment to try to process the exchange…

Firstly, their own words:

“We got a good run”

Why had they said that? Well - in a sense - it was true. They had executed the same tests before, and that earlier run had returned a much higher failure rate. But the code being checked was the same...

OK, so there were at least 3 obvious ways to interpret the data.

(1) The app code meets the criteria checked by the tests (based on test run 2).

(2) The app code does not meet the criteria checked by the tests (based on test run 1).

(3) The tests are as reliable as the toss of a coin (based on both test runs).

It's surprising how unlikely people are to choose (3).
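One way to make interpretation (3) harder to ignore is to compare the two runs test by test, rather than by headline percentage. Here is a minimal sketch in Python, assuming each run can be exported as a mapping of test name to pass or fail. The test names and results are made up for illustration, and `compare_runs` is a hypothetical helper, not part of any test framework.

```python
def compare_runs(run_1, run_2):
    """Group test names by how they behaved across two runs of the same code."""
    shared = run_1.keys() & run_2.keys()
    return {
        "passed_both": sorted(t for t in shared if run_1[t] and run_2[t]),
        "failed_both": sorted(t for t in shared if not run_1[t] and not run_2[t]),
        # Same code, different outcome: the test itself is suspect.
        "flaky": sorted(t for t in shared if run_1[t] != run_2[t]),
    }


# Hypothetical results: test name -> True if the test passed.
run_1 = {"login": True, "transfer": False, "statement": False, "logout": True}
run_2 = {"login": True, "transfer": True, "statement": False, "logout": True}

report = compare_runs(run_1, run_2)
print(report["flaky"])        # ['transfer']
print(report["failed_both"])  # ['statement']
```

Anything in the "flaky" group says more about the tests than about the app, so it probably should not count towards a "good run" either way.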

Secondly, the scrum master’s words:

“just need to fix that 5%”

Our tester assumes this relates to the de facto “threshold” that is usually considered good enough to release, as if the results sat on a linear scale such as height or weight. If your code gets over 90%, then it gets to pass the gate and get on the release roller-coaster.

The threshold tends to be arbitrary. I worked with a client that thought 86% was good but 83% was just not fit for purpose! The use of such thresholds tends to indicate a problem. Why are we caring about a number rather than a possibly broken feature? What features or risks do the failing 10% represent? Why do we have so many routine failures?
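To illustrate how a percentage can hide a broken feature, here is a small Python sketch. It assumes each test result carries a feature label. The feature names, test names and the 90% figure are illustrative only, and `pass_rate` and `failures_by_feature` are hypothetical helpers.

```python
from collections import defaultdict


def pass_rate(results):
    """Percentage of results that passed."""
    return 100 * sum(1 for _, _, passed in results if passed) / len(results)


def failures_by_feature(results):
    """Map each feature to the names of its failing tests."""
    grouped = defaultdict(list)
    for test_name, feature, passed in results:
        if not passed:
            grouped[feature].append(test_name)
    return dict(grouped)


# Hypothetical results: (test name, feature, passed).
results = [(f"check_{i}", "login", True) for i in range(9)]
results.append(("send_payment", "payments", False))

print(pass_rate(results))            # 90.0
print(failures_by_feature(results))  # {'payments': ['send_payment']}
```

In this made-up run the suite reaches 90%, yet the only test covering payments is failing. The grouped view puts the feature at risk in front of the team, where the single number would have waved it through.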

Do you hear this sort of conversation in your team? If so, then your team might need some coaching.

