
AI Muggins


I play a card game called cribbage, often with my son. One interesting part of the game is the muggins rule: if another player miscounts their score, you can claim the points they missed.

The scoring is slightly nerve-racking, with each of us double- and triple-checking our scores to avoid falling foul of ‘muggins’. That’s part of the fun.

But my son and I also find ourselves discussing other hands of cards, in a sort of alternate-history version of the game: “So if I had a 7 instead of a 2 of hearts, then I’d get a double run and score at least 8 more points.”

“Yes Dad, if you had different cards then you would likely have a different score, but you don’t,” he says, rolling his eyes.

This sort of bittersweet rewriting of history is a convenient way for us to swallow the awkward truth of the real world. We often invent alternative versions of things, just so we can object to them.

Take ChatGPT (GPT-4) and tools like Copilot X. These are powerful tools, capable of doing useful tasks more quickly and easily than other tools.

But no, people say: they are dangerously sentient, or not sentient, or fake, or poor at this one task, or too good at some other thing that people get paid for. To paraphrase my son: “Yes, if they were X then they would not be Y.”

These tools are, well, tools. They have limits that we are still discovering, and great abilities that we are only just realising. Unlike our existing tools, we haven’t yet had a chance to evaluate them and find their place, a task made harder by the fact that AI technology is improving extremely fast at the moment.

In software test automation alone, there are many opportunities where GPT-4 and Copilot could help. For example, summarising test results and presenting them in a human-readable form.
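To make the summarising idea concrete, here is a minimal sketch in Python. It assumes the results arrive as a JUnit-style XML report, and it only prepares the prompt text; the model call itself is left out, since that depends on which API and client you use. The sample report, the test names and both function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical JUnit-style report; suite and test names are made up.
SAMPLE_REPORT = """<testsuite name="cribbage" tests="3" failures="1">
  <testcase classname="test_scorer" name="test_fifteens" time="0.01"/>
  <testcase classname="test_scorer" name="test_pairs" time="0.01"/>
  <testcase classname="test_scorer" name="test_double_run" time="0.02">
    <failure message="expected 8, got 6"/>
  </testcase>
</testsuite>"""


def collect_results(junit_xml):
    """Reduce a JUnit-style report to passed names and (name, message) failures.

    Only <failure> children are treated as failing; <error> and <skipped>
    are ignored to keep the sketch short.
    """
    root = ET.fromstring(junit_xml)
    passed, failed = [], []
    for case in root.iter("testcase"):
        failure = case.find("failure")
        if failure is None:
            passed.append(case.get("name"))
        else:
            failed.append((case.get("name"), failure.get("message", "")))
    return {"passed": passed, "failed": failed}


def build_prompt(results):
    """Build the text you might hand to a language model to summarise."""
    lines = [
        "Summarise these test results for a non-technical reader.",
        f"Passed: {len(results['passed'])}",
        f"Failed: {len(results['failed'])}",
    ]
    for name, message in results["failed"]:
        lines.append(f"- {name}: {message}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(collect_results(SAMPLE_REPORT)))
```

Trimming the report down to counts and failure messages first keeps the prompt short, and means the model is asked to explain the results rather than to count them.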


Or explaining test code without the need for cumbersome abstraction layers like Cucumber. E.g.:

This is a GPT-4 API interpretation of the tests for my cribbage scorer.

Or creating basic unit tests for existing code to enable easy refactoring or, when combined with ‘function calling’, checking the results contained in a body of text. E.g.:

This is my Cribbage Scoring Plugin, available in ChatGPT4.
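A minimal sketch of the idea in Python, not the plugin's actual code: it scores only fifteens and pairs (no runs, flushes or nobs), describes that function in the JSON Schema style that function-calling APIs generally expect, and dispatches a model's function call to it. The function names, the schema contents and the shape of the `call` dictionary are all assumptions for illustration.

```python
import json
from itertools import combinations

# Pip values for counting fifteens: ace is 1, face cards are 10.
VALUES = {"A": 1, "J": 10, "Q": 10, "K": 10}
VALUES.update({str(n): n for n in range(2, 11)})


def score_fifteens_and_pairs(ranks):
    """Score fifteens and pairs only, at 2 points each."""
    points = 0
    # Every distinct combination of cards that adds up to 15 scores 2.
    for size in range(2, len(ranks) + 1):
        for combo in combinations(ranks, size):
            if sum(VALUES[rank] for rank in combo) == 15:
                points += 2
    # Every pair of cards sharing a rank scores 2.
    for first, second in combinations(ranks, 2):
        if first == second:
            points += 2
    return points


# Hypothetical description of the function, for a model to choose from.
SCORE_HAND_TOOL = {
    "name": "score_fifteens_and_pairs",
    "description": "Score the fifteens and pairs in a cribbage hand.",
    "parameters": {
        "type": "object",
        "properties": {
            "ranks": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Card ranks, e.g. ['5', '5', 'J'].",
            }
        },
        "required": ["ranks"],
    },
}


def handle_function_call(call):
    """Run the function a model asked for; arguments are assumed to be a JSON string."""
    if call["name"] != SCORE_HAND_TOOL["name"]:
        raise ValueError("Unknown function: " + call["name"])
    arguments = json.loads(call["arguments"])
    return score_fifteens_and_pairs(arguments["ranks"])


if __name__ == "__main__":
    # A model that read "four fives and a jack" in some text might emit this.
    call = {
        "name": "score_fifteens_and_pairs",
        "arguments": json.dumps({"ranks": ["5", "5", "5", "5", "J"]}),
    }
    print(handle_function_call(call))
```

For four fives and a jack this prints 28: eight fifteens and six pairs. The scoring stays in ordinary, unit-testable code; the model's only job is to pull the cards out of the text and hand them over.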

Testers and test engineers often fall into the “Ha, it can’t do this” school of thought with new tools, rather than thinking: “I’ve been given access to a particularly useful text and code analysis and generation tool, for a price that approaches free.”

It is a tool that is improving month by month, a tool that extends my reach and increases my performance (compared to those shunning it because it couldn’t do some party trick, or match a skill they’ve spent their career honing).
