Testing a maybe with machine learning.

“I figured it was just a jumbo jet.”

My son and I shake our heads and then adopt blank stares, as if a non-body-snatcher has been exposed in our midst.

“Twin engine,” I utter, as I glance skyward again.

“Single decker,” my son adds by way of explanation.

“It’s a plane,” she retorts, rolling her eyes.

My wife (who is far smarter than I am) lacks the ability my son and I have to recognise aircraft. She has the typical person’s ability to recognise aeroplanes. I grew up around air force bases, and my father was an aircraft engineer. That meant years of exposure to aeroplanes, and explanations of their mechanics and features.

We took this to the next level...
My son is an avid flight sim player and has consumed many hours of relevant YouTube material on the subject. He has also had the luck, or misfortune, of hearing me discuss the planes that frequent the skies above us here, near London.

Given our combined experience and expertise, we probably have a reasonable ability to recognise the make and model of the planes we see in the sky. I would have given us, combined, an accuracy rate of, say, 90%.

A while back I worked on a Deep Learning classification model that could recognise types of aircraft from pictures. It was fairly crude and managed an 80% accuracy rate against a test set of aircraft images. 

Is that good? That would depend on a lot of things:
  • The purpose of the tool
  • Alternative systems’ performance
  • The existing recognition system in place
  • Risks the new system might expose a company to
  • The cost
  • The time/context when the device was used
  • Ability to update the ‘model’ as new planes are released
  • Etc
Those questions are probably similar to those you would be asking regarding any software you find yourself testing. 

One subtlety that many software systems lack is an explicit measure of accuracy. Often, when a software system performs a calculation or logical process, we assume 100% accuracy and test for it. That often makes sense, but sometimes those figures and calculations are inherently inaccurate from the user’s point of view: they are just simplified models of real-world systems.

Contrary to much of the negative publicity machine learning receives, accuracy can often be measured directly with the model/tool. A machine learning approach could make the assumed inaccuracy of the tool more explicit.

For example, we could get an overall idea of how accurate the model is (e.g. 80% for our aircraft model), but also how sure it is (i.e. the probability it assigns) of each answer it outputs (e.g. it's 95% sure it's an Airbus A380, and 50% sure it's a Boeing 747).
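To make those two kinds of number concrete, here is a minimal sketch in Python. It is not the original aircraft model: the label list, the raw scores and the softmax step are all assumptions made for illustration. It shows overall accuracy against a small labelled test set alongside the confidence attached to each individual answer.

```python
import math

# Hypothetical class labels; the original model's label set is not given.
LABELS = ["Airbus A380", "Boeing 747", "Boeing 777"]

def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(scores):
    """Return (label, confidence) for one image's raw scores."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Made-up raw scores for three test images, plus their true labels.
raw_scores = [[4.0, 1.0, 0.5], [1.0, 1.2, 0.9], [0.2, 0.1, 3.0]]
truth = ["Airbus A380", "Boeing 777", "Boeing 777"]

results = [predict(s) for s in raw_scores]
for (label, confidence), actual in zip(results, truth):
    print(f"predicted {label} ({confidence:.0%} sure), actually {actual}")
print(f"overall accuracy: {accuracy([r[0] for r in results], truth):.0%}")
```

The second image is the interesting one for a tester: the scores are nearly level, so the model's answer comes with low confidence, and it happens to be wrong.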

Furthermore, the data upon which it is trained can be defined and recorded for later review and analysis. This could be by programmers, testers, product owners, lawyers, prospective customers, etc. That's not always easy to do if the existing system is a person.

As a tester, you might also locate or create your own data to test the system more thoroughly, checking for edge cases and real-world situations that may not already have been modelled. E.g. what does the model classify a flock of Canada geese as? Given realistic data, is the model biased towards giving Airbus planes a higher score? (We could test that...)
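One rough way to probe the Airbus question is to compare how often each manufacturer is predicted with how often it actually appears in the test data. The sketch below assumes labels start with the manufacturer's name, and the observations (including the geese) are invented for illustration.

```python
from collections import Counter

def manufacturer(label):
    """First word of a label such as 'Airbus A380' (a naming assumption)."""
    return label.split()[0]

def prediction_skew(pairs):
    """Compare how often each manufacturer is predicted vs. actually present.

    pairs: iterable of (true_label, predicted_label).
    Returns {manufacturer: predicted_count - true_count}; a large positive
    number suggests the model over-predicts that manufacturer.
    """
    pairs = list(pairs)
    true_counts = Counter(manufacturer(t) for t, _ in pairs)
    pred_counts = Counter(manufacturer(p) for _, p in pairs)
    makers = set(true_counts) | set(pred_counts)
    return {m: pred_counts[m] - true_counts[m] for m in makers}

# Made-up results, including an edge case the model was never trained on.
observed = [
    ("Airbus A380", "Airbus A380"),
    ("Boeing 747", "Airbus A380"),
    ("Boeing 777", "Airbus A350"),
    ("Boeing 747", "Boeing 747"),
    ("Canada geese", "Airbus A320"),
]
skew = prediction_skew(observed)
print(skew)
```

A skew like this on a handful of rows proves nothing, but on a realistic data set it gives a tester a concrete question to take back to the team.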

We can approach these systems as a form of mechanised heuristics. While they work slightly differently to our human-ware heuristics, they behave in a similar way. They are fallible, they are useful shortcuts that can really help in many situations.

For example, they can be replicated and deployed at will in a manner that existing people or systems cannot. Will the product work better or more efficiently than the existing approach? (The answer could be, for example, that it's less accurate than an existing person, but the unit cost is much lower, and so overall efficiency wins on a larger deployment.)

While the roles being automated will still exist, there will be fewer people doing them. E.g. why have ten aircraft spotters when two who are skilled in both identifying the UFOs and updating the machine learning models might fit the business needs more efficiently?

As we continue to mechanise more business functions using machine learning, software testers will need to start thinking a little more in terms of comparing accuracy, rather than taking a brittle approach of binary pass/fail correctness. Given our industry’s struggles with the pass/fail mentality, I suspect this will be one of our greatest challenges in the coming years.
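As a sketch of what comparing accuracy might look like in a test, the Python below puts a Wilson score interval around a measured accuracy and reports where a baseline falls relative to it. The 80% and 90% figures are the ones from this post; the test-set size of 200 images is an assumption, and the Wilson interval is one reasonable choice among several.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for an observed accuracy."""
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half

def compare_to_baseline(correct, total, baseline):
    """Report where a baseline accuracy sits relative to the interval."""
    low, high = wilson_interval(correct, total)
    if low > baseline:
        return "likely better than baseline"
    if high < baseline:
        return "likely worse than baseline"
    return "too close to call on this data"

# 80% accuracy is from the post; the test-set size of 200 is an assumption.
print(wilson_interval(160, 200))
print(compare_to_baseline(160, 200, baseline=0.90))
```

The useful part is the third outcome: with only a few test images the honest answer is often "too close to call", which a plain pass/fail check cannot express.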


  1. Well, here's a test. How does your system cope with something it cannot possibly be seeing as a real-world input - say, a Handley Page HP.42? And how would it deal with something that it might theoretically see, but the likelihood of seeing it is quite remote - say, a Dakota, or a Junkers Ju.52? Would it attempt a close match, or just reject input sightings that did not match the tool's parameters?

    Context is important here. Being 100% accurate isn't that important if the app is only going to be used by hobbyists. But if you were marketing it to the military, 100% accurate identification becomes much more important. IFF - Identification Friend or Foe - has been around for a long time, as I'm sure I don't need to tell you!


