
Don't be a Vogon, make it easy to access your test data!

The Hitchhiker's Guide to the Galaxy opens with an alien ship about to destroy the Earth, and the aliens telling us (mankind) we should have been more prepared – after all, the demolition notice had been on display quite clearly, in the nearby star system of Alpha Centauri, for fifty years. Seriously, people – what are you moaning about? Get with the program.

The book then continues the theme of bureaucratic rigidity and shallow interpretations of limited data. For example, the titular guide's description of the entire Earth is one word: "Harmless" – though after extensive review, the new edition will state: "Mostly harmless".

Arthur Dent argues with the Vogons about poor data access

This rings true for a lot of software testing work, especially work involving externally developed software – be that external to the team or external to the company. The approaches teams use to test their locally developed software usually don't work well here. The result is a large suite of shallow tests that are hard to maintain and often not trusted. The tests are an awkward, brittle addition, like a cheap ill-fitting smartphone case: they half look the part, but let's be honest – your phone might work better smashed than with that "indestructible" (and virtually unusable) case.

Let's take a quick step back and ask: what's wrong? What's the first problem teams hit? Usually, the first hurdle is getting at the information – anything you need to find out whether there is a bug. On discovering they don't have easy access to a data source (a database, a log, an API, etc.), teams will usually start rationing access ("let's not bother querying that…") or avoid whole swathes of relevant data altogether. For example, they'll reduce the project to accessing the app only via the UI, or via whatever limited APIs or message queues are already provided.

This tendency to avoid back-end data access is usually exacerbated by the monolithic nature of many externally bought software systems (or of the many in-house microservice-based systems that have ossified into a monolith).

Even when the team does get that data access, they often assume they are the only consumers – and in fact feel they should probably gate-keep that data. On the contrary, it's often better to make those data sources accessible via APIs or a convenient, fast data store. Why? So everyone, including you, can build more tests more easily.

It's not just about the greater good: making the data accessible makes your life easier, and has the added benefit that other teams get to test their integrations with your system more easily thanks to that handy API. The common anti-pattern is to build, for example, a series of database queries directly into your test code. It's much better to place them in a reusable service that you can use in other tests, and that other teams can use, whatever their programming language (yes, your team may use JavaScript or TypeScript to test the front end but something else for the back end – and they all want to access the data!). Your test infrastructure is then loosely coupled and easily reusable.
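As a minimal sketch of that idea (not a prescription – the database, schema and endpoint names here are all made up for illustration), the query lives once behind a small HTTP service rather than being copy-pasted into every test suite:

# A minimal sketch: wrap a test-data query in a tiny HTTP service so any
# test suite, in any language, can fetch the same data over HTTP instead of
# embedding SQL in its own test code.
# Assumptions: FastAPI is available, and there is a local SQLite database
# "payments.db" with a "payments" table – swap in whatever store and schema
# your system actually uses.
import sqlite3

from fastapi import FastAPI, HTTPException

app = FastAPI()
DB_PATH = "payments.db"  # hypothetical test database


@app.get("/test-data/payments/{payment_id}")
def get_payment(payment_id: str) -> dict:
    """Return one payment row as JSON for use in tests."""
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row  # rows come back as name -> value mappings
    try:
        row = conn.execute(
            "SELECT * FROM payments WHERE id = ?", (payment_id,)
        ).fetchone()
    finally:
        conn.close()
    if row is None:
        raise HTTPException(status_code=404, detail="payment not found")
    return dict(row)

A TypeScript front-end test can now simply fetch /test-data/payments/<id> over HTTP, and a back-end suite in another language can call the same endpoint – nobody has to re-implement (or even know) the SQL.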

These investments in data access and data generation usually pay off better than the diminishing returns from adding more slow, flaky tests via the user interface, or from hand-rolling large quantities of XML or JSON test data in each test.
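The data-generation side can be just as simple. Here is a minimal sketch of a reusable builder (the field names are invented for illustration): defaults live in one place, and each test overrides only the fields it actually cares about, instead of hand-rolling a full JSON payload every time.

# A minimal sketch of a reusable test-data builder for a JSON message.
# The field names here are hypothetical; the point is shared defaults
# plus per-test overrides.
import json
import uuid
from datetime import datetime, timezone


def make_payment_message(**overrides) -> dict:
    """Build a payment payload with sensible defaults for tests."""
    message = {
        "payment_id": str(uuid.uuid4()),
        "amount": "100.00",
        "currency": "GBP",
        "debtor": "Test Debtor Ltd",
        "creditor": "Test Creditor Ltd",
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    message.update(overrides)  # each test states only what matters to it
    return message


# Example: a zero-amount edge case, everything else defaulted.
zero_amount = make_payment_message(amount="0.00")
print(json.dumps(zero_amount, indent=2))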

It's not uncommon for teams to build large, impenetrable test frameworks and huge test suites. When you find yourself in this situation, the first step is not to build more, or to rebuild again in the latest tools. It's to find out what data you really need to test, and to build around that guiding principle of open and easy access.

