
Being a square keeps you from going around in circles.

After a weary few hours sorting through, re-running and manually double-checking the "automated test" results, the team decide they need to "run the tests again!" That's a problem for the team. Why? Because the tests are too slow. The test runs take too long, and they won't have the results until tomorrow.

How does our team intend to fix the problem? Make the tests run faster. Maybe use a new framework, get better hardware or try some other cool trick.
The team get busy, update the test tools and soon find themselves in a similar position. Now of course they need to rewrite the tests in language X or adopt a new [A-Z]+DD methodology. "I can't believe you are still using technology Z, Luddites!"

Updating your tooling and using a methodology appropriate to your context makes sense, and should be factored into your workflow and estimates. But the approach above starts with the wrong problem. As such, it's not likely to find the right answers.
The team are spending hours unpicking the test results. The results can't be trusted and need to be rerun or manually reviewed. Those are the problems. Until you address the reliability, accuracy and precision of the automated checks, they will always be a major source of failure demand.

That dream of freeing up the team to move quicker, or letting the testers do more exploratory or security-focused testing, will remain a dream while the team spend excessive time picking through the bones of the test results.

Your "automated tests" are a measuring tool. They help you measure the quality of your app. Imagine if your ruler reported a different length every 3rd time you used it! You'd blame the ruler and build or buy a better ruler, rather than bemoan the time it takes to get an accurate measurement while re-measuring objects to get "best of three!".
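One way to check the ruler itself is to take the same measurement repeatedly while nothing changes and see whether the readings agree. The sketch below is a minimal illustration of that idea, not a replacement for your test runner: `measure_repeatability`, `is_repeatable`, `flaky_check` and `stable_check` are all hypothetical names, and the flaky check is rigged to fail on every third call, like the ruler above.

```python
from collections import Counter
from itertools import count


def measure_repeatability(check, runs=21):
    """Run the same check repeatedly against unchanged code and tally the verdicts."""
    outcomes = Counter()
    for _ in range(runs):
        try:
            check()
        except AssertionError:
            outcomes["fail"] += 1
        else:
            outcomes["pass"] += 1
    return outcomes


def is_repeatable(outcomes):
    """A trustworthy check gives one verdict, every time, when nothing has changed."""
    return len(outcomes) == 1


_calls = count(1)


def flaky_check():
    # Hypothetical stand-in for a flaky test: fails on every third call.
    assert next(_calls) % 3 != 0, "spurious failure"


def stable_check():
    # Hypothetical stand-in for a reliable test.
    assert 1 + 1 == 2


if __name__ == "__main__":
    print("flaky: ", dict(measure_repeatability(flaky_check)))
    print("stable:", dict(measure_repeatability(stable_check)))
```

A check that returns mixed verdicts against unchanged code is telling you about itself, not about your app.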

Try fixing or just disabling the flaky tests. Test your automated tests. Don't just "create a failing test then see it pass": investigate whether it was failing for the right reasons and then passing for the right reasons. Speak to your teammates, e.g. "How can I create Problem X realistically to check that my tests pick it up reliably?"
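As a rough sketch of "creating Problem X" on purpose: seed a known fault, then confirm the check passes on the good code and fails on the faulty code for the reason you expect. Everything here is hypothetical and for illustration only, including the `apply_discount` example, the seeded bug and the `detects_fault` helper.

```python
def apply_discount(price, percent):
    """Hypothetical code under test."""
    return round(price * (1 - percent / 100), 2)


def apply_discount_broken(price, percent):
    """Deliberately seeded 'Problem X': the discount is added instead of subtracted."""
    return round(price * (1 + percent / 100), 2)


def check_discount(impl):
    """The automated check we want to validate."""
    result = impl(200.0, 10)
    assert result == 180.0, f"expected 180.0, got {result}"


def weak_check(impl):
    """A check that can never fail: it only confirms the call returns something."""
    assert impl(200.0, 10) is not None


def detects_fault(check, good_impl, faulty_impl, expected_fragment):
    """True only if the check passes on good code and fails on the
    seeded fault with the expected message."""
    check(good_impl)  # an AssertionError here means the check itself is wrong
    try:
        check(faulty_impl)
    except AssertionError as err:
        return expected_fragment in str(err)
    return False  # the check passed on broken code, so it missed the fault


if __name__ == "__main__":
    print(detects_fault(check_discount, apply_discount, apply_discount_broken, "got 220.0"))
    print(detects_fault(weak_check, apply_discount, apply_discount_broken, "got 220.0"))
```

Here `weak_check` passes happily against the broken code, which is exactly the kind of check that looks green and tells you nothing.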

Do you hear these sorts of conversations in your team? If so, then your team might need some coaching.
