When it comes to rolling out updates to large, complex banking systems, things can get messy quickly. Of course, the holy grail is to have each subsystem work well independently and to do some form of Pact or contract testing – reducing the complex and painful integration work. But nonetheless, at some point you are going to need to see if the dog and the pony can do their show together – and it's generally better to do that in a way that doesn’t make millions of pounds of transactions fail in a highly public manner, in production. (This post is based on my recent lightning talk at PyData London.) For the last few years, I’ve worked in the world of high-value, real-time and cross-border payments, and one of the sticking points in bank [software] integration is message generation. A lot of time is spent dreaming up and creating those messages, then maintaining what you have just built. The world of payments runs on messages; these days they are often XML messages – and they ...
It's an excellent point and a wonderful way of showing it, Pete. A few refinements:
1) We don't break software; the software was broken when we got it.
2) We don't create tests designed to cause failure; we create tests designed to expose the failures that are lurking.
3) The illusion that the software wasn't broken and the illusion that we're creating failure are among the most important illusions we testers need to dispel.
I'm delighted at the steady stream of excellent posts, and especially chuffed that it started to flow just after the Rapid Software Testing course in London. That was a rare group!
---Michael B.
Thanks, Michael. Thank you for your support, and yes, the RST course definitely helped motivate me! I recommend the course to testers, programmers and project managers!
I agree with your main point - that the software is essentially broken before it reaches the tester. The tester finds out that these problems are present in the system, and reports them.
1) In the blurb when I refer to breaking the software, I'm describing how the process appears to others. i.e.: "to a non-tester why you appear to be intent on breaking their..."
I tend not to use the phrase myself, except lightheartedly.
2 & 3) I'm not so sure about these. For example, a judgement, coding or configuration mistake may have been made before the system is examined by the tester, but the system may not 'fail' until we perform certain actions. By 'fail' I mean: displeases or confuses the user, performs slowly, crashes, loses data, etc.
The incident on the Silver Bridge springs to mind (http://en.wikipedia.org/wiki/Silver_Bridge#Wreckage_analysis). A contributing factor in the bridge's 'failure' was a problem in the manufacture of a constituent part. Although this problem was in the system for many years, along with others such as a lack of redundancy, the bridge did not 'fail' until December 15, 1967.
If we were testing such a system, might we not add a higher-than-expected load in an attempt to 'cause a failure'?
I can see, though, that this engineering-style language is far from a perfect fit in a software setting. Issues such as corrosion and decay don't apply, though unplanned-for user load and changes in usage do. I'm going to think about this...