
Wasting your time with Test Automation

Software Testing is essentially an infinite task being attempted in a finite time. The 'test space' is almost always vast, while your time to test is usually counted in hours. There's an obvious mismatch there. We're testers; we are hired to help marry the two. We need to find as many issues, and the most important issues, in that vast test space in just a few hours. That's fine, that's testing, that's what testers work (live?) for.

Roll on Test Automation, our saviour: it can check vast areas of the test space rapidly and efficiently. It can use its 'Data' ( http://en.wikipedia.org/wiki/Data_%28Star_Trek%29 )-like abilities to test the application tirelessly. No? Well, OK, it could 'check' the application tirelessly against a set of expected results. That in itself is potentially valuable, and could cover a range of combinations and test data that we could never reach alone.

So why do many of the test automation efforts I've witnessed on customer sites not deliver this? Many customers put in the hard work; they've made a serious investment of time and effort. So why do they end up spending more and more time fixing the tests, slowly lowering their expected 'pass rate', or re-running the tests until they 'give the right result'?

One major issue is reliability: the 'tests' are just not reliable enough for what they were intended to do. The growing array of checks should steadily help the testers by reaching those hard-to-reach parts of the application. But the unreliable nature of the tests means that each newly implemented check actually just adds to the noise and the maintenance burden.

Let's look at an example. Say we have 300 'automated tests'.

Let's assume these tests are 95% accurate, and only give a false positive 5% of the time.
(This 5% could be down to a plethora of causes such as flakiness in the test-tool itself, problems with the wider system/support systems, network issues, out-of-date 'expected results' - or even poorly written test code.)

Also, let's say the tests are applied in the correct areas and would highlight 15 bugs in the system.
I'll be even more generous and say that if a check flags a bug while testing a genuinely buggy area, then there definitely is a bug.
(no false negatives)

That means that the checks will correctly flag the 15 real bugs.
They will also falsely flag (300 - 15) x 5% ≈ 14 non-existent bugs.

When the checks finish, 14 of the 29 flagged failures, roughly 48%, will be false positives: incorrect indications of a bug.
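If you prefer to see the arithmetic spelled out, here is a minimal sketch of the same calculation in Python, using the figures from the example above (300 checks, a 5% false positive rate, 15 real bugs):

```python
# Rough model of the example above: how much of the failure list is noise?
total_checks = 300
false_positive_rate = 0.05   # 5% of checks in bug-free areas wrongly flag a failure
real_bugs = 15               # checks in buggy areas always flag correctly (no false negatives)

false_alarms = round((total_checks - real_bugs) * false_positive_rate)  # (300 - 15) x 5% ~= 14
total_failures = real_bugs + false_alarms                               # 15 + 14 = 29

print(f"False alarms: {false_alarms}")
print(f"Share of failures that are noise: {false_alarms / total_failures:.0%}")  # ~48%
```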

So in summary, even if the tests are 95% reliable (that's high in my experience), approximately half of the tester's work reviewing the results will potentially be a waste of time. Time that could be spent examining the system under test is instead spent looking at flaky test results and bad test code. Those precious few hours of testing are misspent.

The solutions are not as simple as just 'making the tests more reliable'. While well-written code and good infrastructure can help immensely, the problems tend to be more fundamental. Some of the problems and solutions I've seen are:


  1. The checks are asking binary questions [of complicated systems]. Try giving reports back instead, rather than hard pass/fail results. For example: is an HTTP 302 response a FAIL when you expected an HTTP 200? It might just be that the application has changed. A report covering the actual findings, plus all the other information you get for free, might be more useful. For example: How long did that response take? What was the size of the response? You could view those results directly, or analyse and graph them as you see fit, looking for patterns and issues (see the first sketch after this list). PASS/FAIL checks often turn out to be 'change detection' systems rather than 'bug detection' systems.
  2. Keep the checks simple, really simple. It's difficult to write the complicated code needed to handle the various inputs, outputs and state changes a real system undergoes. Complex code is the very reason we find work as testers, and test code is not immune to the same problem. 
  3. Be aware that these are just 'checks' and all they can do is report. They can never find the 'human stuff'. They can't question or investigate the system. Leave time and resources for using your own testing skill to tackle the system. For example, trying to get your checks to perform a visual 'layout' check can lead to time-consuming problems (see point 4). You could perform such a test yourself in seconds, and probably provide better feedback.
  4. Beware of using test automation with GUIs. GUIs are designed for human use, and they:
  •  Update in human time-frames, not at machine speed, so you may 'check' at the wrong point in time. 
  •  Report information visually, and so use visual effects that can make test automation messier. 
  •  Are often out of sync with back-end server systems (their code runs in a browser or in a separate thread, etc.). 
  •  Require the test tool to emulate user behaviour when 'automating' a check, scripting keyboard events, mouse clicks and so on, which makes the test code more complicated. Look into accessing the system through other, more programmatic interfaces, for example the XML, JSON or RMI API 'behind' the GUI, if one is available (see the second sketch below). You may be able to check much of the system's logic through this route. And if you can't, you might have found an issue, i.e. lots of the application logic lives in the GUI when it might be better off on the server. 
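To illustrate point 1, here is a minimal sketch of a 'reporting' check in Python. It assumes the 'requests' library is available; the URLs and the CSV output file are hypothetical, stand-ins for whatever you are actually checking:

```python
# A check that reports what it observed, rather than collapsing everything to PASS/FAIL.
# Assumes the 'requests' library; the URLs and output file are hypothetical examples.
import csv
import requests

URLS = [
    "https://example.com/login",
    "https://example.com/account",
]

with open("check_report.csv", "w", newline="") as report:
    writer = csv.writer(report)
    writer.writerow(["url", "status", "elapsed_secs", "size_bytes", "was_redirected"])
    for url in URLS:
        response = requests.get(url, timeout=10)
        writer.writerow([
            url,
            response.status_code,                        # e.g. 200, 302, 500...
            round(response.elapsed.total_seconds(), 3),  # how long did the response take?
            len(response.content),                       # how big was it?
            bool(response.history),                      # did we follow a redirect?
        ])
```

A tester, or a simple graph of those columns over time, can then notice that a page has started redirecting, doubled in size, or slowed down; information a hard pass/fail assertion would have discarded or mislabelled.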
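And for point 4, a sketch of checking the system's logic through a JSON API 'behind' the GUI rather than by scripting clicks and keystrokes. The endpoint and fields here are hypothetical, stand-ins for whatever programmatic interface your system exposes:

```python
# Checking business logic via the JSON API 'behind' the GUI, instead of
# emulating keyboard and mouse events. The endpoint and fields are hypothetical.
import requests

BASE_URL = "https://example.com/api"

def price_basket(items):
    """Ask the server-side logic to price a basket; no browser or GUI automation involved."""
    response = requests.post(f"{BASE_URL}/basket/price", json={"items": items}, timeout=10)
    response.raise_for_status()
    return response.json()["total"]

# Exercise combinations that would be slow and flaky to drive through a GUI.
for quantity in (1, 10, 100, 1000):
    total = price_basket([{"sku": "A-1", "quantity": quantity}])
    print(f"quantity={quantity:5d} -> total={total}")
```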


In summary, use the test code for what it's good at, and don't be afraid to report information back to a human who can look for issues. The computer can do the heavy lifting and grunt work, and we can do the smart work of interpreting the feedback.

Comments

  1. Good stuff, Pete. I now have another link to point people to who tell me I should "automate everything" to fix my testing problems. Nice.

  2. The #1 reason I've seen for poor return on investment from automated tests (e.g. they take way too much time to maintain, they don't find regression failures) is poor design. Too often, testers without adequate design skills and without good automation frameworks are told to automate tests. They end up with tests that do way too much per script, so it's hard to pinpoint problems when a test fails. If something changes, they have to change it in 100 places instead of 1 because the test code isn't DRY.

    Rather than give people an excuse to avoid test automation, we should educate the people automating the tests on how to design them well, how to decide what to automate, how to continually refactor the test code for maintainability. Having a programmer and tester pair on automation tasks is ideal.

    My teams have been getting super ROI from tests for more than a decade. On my current team, 7 years after having zero automation, we have several regression suites running at all levels from unit to API to GUI, many times per day, alerting us immediately when something breaks. The tests also provide living documentation - they have to pass, so we have to keep them up to date.

    Please encourage people to learn good ways to automate tests, and don't give the impression that test automation is somehow a bad thing. That's probably not what you mean to say, but people looking for a reason not to have to learn something that's hard for them may interpret it that way.
