Skip to main content

Programmers & Testers, two roles divided by a common skill-set.

When we switch people from programming to testing and vice versa may reduce the quality of our software.

I’ll get some quick objections out of the way:
  1. But, A person can be a great tester and programmer.  - Yes I agree.
  2. But, Programmers do a lot of good testing. - Yes I agree.
None of the above are in conflict with my conjecture.

Programming or writing software automates things that would be expensive or hard to do otherwise. Our software might also be faster or less error prone at doing whatever it replaces. It may achieve or enable something that couldn't be done before the software was employed.

Writing software involves overcoming and working around constraints to achieve improvement. That is, looking for ways to bypass limitations imposed by external factors or limitations in the tools we use. For example, coping with a high latency internet connection, legacy code or poor quality inputs. A programmer might say they were taking advantage of the technologies’ features to create a faster/more-stable system. A skilled and experienced programmer has learnt to deal with complexity and produce reliable and effective software. That's why we hire them.

A good example might be WhatsApp. Similar messaging systems existed before WhatsApp. But WhatsApp brought together the platform (mobile iOS & Android), cost (free at point of use), security (e2e encryption) and ease of use that people wanted. These features were tied together and the complexities and niggles were smoothed over. For example, skilled programmers automated address book integration and secure messaging instead of a user having to try and use multiple apps to achieve the same.

But the complexities or constraints are often leads to bugs. Leads that are easy to not fully appreciate. The builder's approach: It does that so I need to do this - can override the more investigative approach of - Puzzling over what does a systems apparent behaviour means about its underlying algorithm or supporting libraries? A good tester hypothesizes about what behaviours might be possible from the software. For example: Could we get the app to do an unwanted thing or the right thing at wrong time?

Alternatively a tester may observe the software in action, but bear in mind that certain symptoms may be caused by the constraints within the software or its construction. The psychological ‘priming’ can make them more likely to spot such issues during the course of their examination of the app.

A common response at this point in the debate is, “The person writing the app/automated tests/etc would be able to read the code and see directly what the algorithm is!” But, that argument is flawed for 2 main reasons:

  1. Many testers can read and write good code. This sort of investigation is and always has been an option - whether we are currently implementing the app or an ‘automated test’ or neither. The argument is often a straw man suggesting that all testers can not write (and therefore can’t read ) code.
  2. In a system of any reasonable complexity, there are situations where it’s easier to ascertain a system’s actual behaviour empirically. An attempt to judge its behaviour by purely examining the code is likely to miss a myriad of bugs caused by 3rd party libraries, environmental issues  and a mish-mash of different programmers work.

For example...


Programmers can and do - test their own and colleagues code. They do so as programmers, focused on those very difficult constraints. One of the key constraints is time. Not just, can the code handle time zones etc. But, how long do I have to implement this? What can I achieve in that time? Can I refactor that dodgy library code? Or do I just ‘treat it as good’? Time and the other constraints guide the programmer down a different testing road. Their rationed time leads them to focus on what can be delivered. Their focus is on whether their code met the criteria in the ticket given the complexities that they had to overcome.

A classic symptom of this is the congruence bias, programmers implement code and tests that ascertain whether the system is functioning as they expect. That’s good. That can tell us the app can achieve what was intended. A good example might be random number generation. A team might be assigned to produce an API that provides a randomiser for other parts of the app. The team had been told the output needed to be securely random. That is very random.

The team, Knowing about such things use their operating system’s built in features to generate the numbers. (For example on Linux that might be /dev/random ). Being belt and braces kind of people, they would implement some unit tests that would perform a statistical analysis of their function. This would likely pass with every build once they had fixed the usual minor bugs and all would be good.


Luckily for the above team, they had a tester. That tester loved a challenge. Of course she checked the randomness of the system, and yes that looked OK. She also checked the code in conjunction with other systems, and again the system worked OK. The tester also checked if the code fast enough, and once again the system was fine. The tester then set up a test system with a high load. Boom. The log was full of timeout errors, performance was now atrocious and she knew she had struck gold. A little investigation would show that some operating system random number generators are ‘blocking’.

A blocking algorithm will cause subsequent requests to be queued (‘blocked’) until its predecessor has finished. Even if the algorithm is fast, there will be a tipping point when suddenly more requests are coming in than can be serviced. At that point the number of successful requests (per second, for example) will cease to keep up with demand. Typically we might expect a graph of the requests being effectively handled by our system to show a plateau at this point.

Our tester, had double checked the code could do the job. But also questioned the code in ways the team had not thought to look for. Given that there are tools and techniques to aid the measurement of randomness, this confirmatory step would likely be relatively short. A greater period of time would likely have been spent investigating [areas that at the time were ] unknowns. Therefore, The question is less can the tester read the code or validated that it performs how we predicted. The question is more can they see what the software might have been? How might it fall short? What could or should we have built?

Our tester had a different mindset, she stepped beyond what was specified. We can all do this, but we get better the more we do it. We get better if we train at it. Being a good systems programmer, and training at that - comes at the cost of training in the tester mindset. Furthermore, the two mindsets are poles, each at opposite end of our cognitive abilities. Striving at one skillset might not help with the other. A tester that writes code has great advantages. They can and do create test tools - to tease out the actual behaviour of the system. These programming skills have a investigative focus. They may even have a exploratory or exploitative (think security testing) focus, but not a construction focus.

For those that are screaming BUT THEY COULD HAVE HAD A BDD SCENARIO FOR HIGH LOAD or similar, I’ll remind you of the Hindsight bias ( tl;dr:  “the inclination, after an event has occurred, to see the event as having been predictable”)

While the programmer and the tester often share the same headline skills, e.g. they can program in language X, understand and utilise patterns Y & Z appropriately. They apply these skills differently, to different ends.

The change from tester to programmer is more than a context switch. It's a change in your whole approach. People can do this, but it has a cost. That cost might be paid in slower delivery, bugs missed, or features not implemented.


Popular posts from this blog

The gamification of Software Testing

A while back, I sat in on a planning meeting. Many planning meetings slide awkwardly into a sort of ad-hoc technical analysis discussion, and this was no exception. With a little prompting, the team started to draw up what they wanted to build on a whiteboard.

The picture spoke its thousand words, and I could feel that the team now understood what needed to be done. The right questions were being asked, and initial development guesstimates were approaching common sense levels.

The discussion came around to testing, skipping over how they might test the feature, the team focused immediately on how long testing would take.

When probed as to how the testing would be performed? How we might find out what the team did wrong? Confused faces stared back at me. During our ensuing chat, I realised that they had been using BDD scenarios [only] as a metric of what testing needs to be done and when they are ready to ship. (Now I knew why I was hired to help)

There is nothing wrong with checking t…

Manumation, the worst best practice.

There is a pattern I see with many clients, often enough that I sought out a word to describe it: Manumation, A sort of well-meaning automation that usually requires frequent, extensive and expensive intervention to keep it 'working'.

You have probably seen it, the build server that needs a prod and a restart 'when things get a bit busy'. Or a deployment tool that, 'gets confused' and a 'test suite' that just needs another run or three.

The cause can be any number of the usual suspects - a corporate standard tool warped 5 ways to make it fit what your team needs. A one-off script 'that manager' decided was an investment and needed to be re-used... A well-intended attempt to 'automate all the things' that achieved the opposite.

They result in a manually intensive - automated process, where your team is like a character in the movie Metropolis, fighting with levers all day, just to keep the lights on upstairs. Manual-automation, manumatio…

Scatter guns and muskets.

Many, Many years ago I worked at a startup called (a European online travel company, back when a travel company didn't have to be online). For a while, I worked in what would now be described as a 'DevOps' team. A group of technical people with both programming and operational skills.

I was in a hybrid development/operations role, where I spent my time investigating and remedying production issues using my development, investigative and still nascent testing skills. It was a hectic job working long hours away from home. Finding myself overloaded with work, I quickly learned to be a little ruthless with my time when trying to figure out what was broken and what needed to be fixed.
One skill I picked up, was being able to distinguish whether I was researching a bug or trying to find a new bug. When researching, I would be changing one thing or removing something (etc) and seeing if that made the issue better or worse. When looking for bugs, I'd be casting…