
How to avoid testing in circles.

I once had an interesting conversation with a colleague who worked at a company selling hotel room bookings. The problem was an intriguing one. Their profits depended on many factors: fluctuating demand (holidays, weekends, local events etc.), varying types of demand (business customers, tourists, single-night bookings or 11-day holidays), and multiple types of contract on the rooms. For some rooms they might have the exclusive right to sell (as they had pre-paid); for others they merely had an option to sell (at a lower profit), and so on.

My naive view had been that they priced the room bookings at a suitable mark-up, raising that mark-up for known busy times. For example, a tourist hotel near the Olympics would carry a high mark-up, while the same room in winter would carry a lower one (better to get some money than none at all).

He smiled and said some places do that, but he didn't. He had realised his team had a bias towards making a healthy profit. "That seems, errh, good..." I replied, not sure what I was missing. He explained that the problem wasn't making a profit. They made a profit; they could do that. There is enough demand, and limited enough supply, in a business and tourist centre like London to make a profit. The problem was maximising profit. What he had done was present the profit to the team as a proportion of the theoretical maximum profit: that is, the profit given the perfect combination of bookings at peak rates.

It was understood that this was an unlikely, but achievable, goal. The benefit was that the team could more easily see whether they were making as much money as they could. For example, that business hotel room-stock might be making £1 million profit (more than the others), but they should be able to get £10 million, whereas the other room-stocks are already at 90% of their theoretical maximum profit.
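The metric itself is trivial: profit divided by theoretical maximum profit. As a minimal sketch (only the £1 million versus £10 million example comes from the conversation; the tourist-hotel figures are invented for illustration):

```python
# Profit expressed as a proportion of the theoretical maximum, so that a
# healthy-looking absolute profit can still stand out as under-performing.
# The tourist_hotel figures are invented for illustration.
room_stocks = {
    "business_hotel": {"profit": 1_000_000, "theoretical_max": 10_000_000},
    "tourist_hotel": {"profit": 450_000, "theoretical_max": 500_000},
}

for name, figures in room_stocks.items():
    ratio = figures["profit"] / figures["theoretical_max"]
    print(f"{name}: £{figures['profit']:,} profit, {ratio:.0%} of theoretical max")

# business_hotel: £1,000,000 profit, 10% of theoretical max  <- biggest profit, biggest gap
# tourist_hotel: £450,000 profit, 90% of theoretical max
```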

This struck me as a useful way of looking at the world. Maybe it was just in tune with my tester mind-set. In software development, we often view what we have achieved: we see the stories we have completed. When we come to a release at the end of the week, sprint, month or quarter, we test those stories. We also fix and test the bugs considered important, and then we set out to regression test the system as a whole. At this point, most teams I've worked with soon end up retesting the same areas, going in circles around the same code changes.

The problem is often that our view of the system has been primed. We fall foul of the anchoring bias and cannot easily see that 90% of the system has not been examined. Our testing returns again and again to the recent changes and their surroundings, even though these are probably the best and most recently tested parts of the system. Much like the person in the audience called to the stage in a magic show, I've been primed to "pick a number, any number, could be 5, could be 11... any number you like". I'm unlikely to suggest a negative or very large number by the time I reach the stage.

What I've found useful is applying the same concept the hotel sales team used. To help reduce that bias, I invert the game. Instead of looking at the Scrum/Kanban board or bug-tracking system, which lists the known stories and known defects, I look at a different list, or even several lists. These are usually either a checklist of system areas or something deliberately outside the affected areas. I then pick an item and investigate its behaviour as if it were new. The very fact that I'm not repeating the same ground as everyone else increases the chances that I will find an issue. What's better, these are just the sort of not-quite-related but indirectly-broken bugs that regression testing is aimed at.

So rather than a board of defects and stories that you are itching to remove, have a board with a card for each section of your software. Divide up your time before the release and start working through the items. You can prioritise your testing however you like, but remember you have already focused a lot of time on certain areas in release-change-related testing. What's left are the unexplored areas, the undiscovered bugs.
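To make that concrete, here is a minimal sketch of such a board. The area names, the 45-minute time-box and the ordering rule are my own assumptions, not from the post; the point is simply that the list you work from is a list of system areas, not a list of recent changes.

```python
# Sketch of a "coverage board": one card per system area, each with a time-box.
# Areas already covered by story/bug-fix testing go to the back of the queue
# instead of being retested in circles.
# Area names and the 45-minute time-box are assumptions for illustration.
from dataclasses import dataclass, field

@dataclass
class AreaCard:
    name: str
    timebox_minutes: int = 45
    notes: list[str] = field(default_factory=list)  # observations / bugs to flag for follow-up

system_areas = ["Login", "Search", "Checkout", "Reporting", "Admin", "Exports"]
changed_this_release = {"Checkout"}  # already heavily tested via the release stories

# Untouched areas come first; recently changed (and already tested) areas go last.
board = sorted((AreaCard(area) for area in system_areas),
               key=lambda card: card.name in changed_this_release)

for card in board:
    covered = "  (already covered by change testing)" if card.name in changed_this_release else ""
    print(f"[{card.timebox_minutes} min] {card.name}{covered}")
```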

Comments

  1. Awesome Pete. I like the analogy.

    Just wondering... what type of stopping heuristics do you most commonly use when testing that 'unexplored' area?

    Also, do you ever cop any grief for testing an 'un-affected' area? By those that may not be in the know when it comes to the benefits of doing so.

    Replies
    1. Thanks. For regression testing my stopping heuristic tends to be time. There's a lot of system to cover, so I try to time-box each area. If I uncover a rich seam of bugs, though, I usually flag/report it and recommend further investigation before moving on to the next area. I guess it's a 'recon'.

      Yes, sometimes people question "Why are you bothering with that? We didn't change that." It's a sensible question, and I can usually give examples of seemingly 'unrelated' areas that were found to be broken in previous releases. Also, people soon notice that what I do "may seem back-to-front but you know what? It works".


