AI Testing: How to Ensure Quality in Non-Deterministic Systems

How do you ensure software quality when the system you’re testing doesn’t give the same output twice?

That’s the core challenge facing every QA team building or testing AI-powered applications today, and it’s breaking all the rules we’ve relied on for decades.

In this episode of the TestGuild Automation Podcast, I sit down with Adam Sandman, co-founder of Inflectra, to get into what non-deterministic AI testing actually means in practice, why traditional pass/fail testing no longer cuts it, and what quality professionals need to do differently right now.

We cover:

Why AI-generated code is raising the stakes for QA teams while budgets stay flat
The fundamental difference between deterministic and non-deterministic systems — and why it changes everything about how you test
How to set acceptable risk thresholds for AI systems (hint: it depends on whether you’re building an e-commerce chatbot or an air traffic control system)
Why testers who embrace AI as a tool — not a threat — will be the ones leading their organizations forward
How a live demo failure at a conference inspired Inflectra’s new non-deterministic testing tool, SureWire

If you’re a tester, QA manager, or automation engineer trying to figure out how to keep up with AI-driven development without losing your mind — or your job — this one’s for you.

Exclusive Sponsor Inflectra

This episode is brought to you by Inflectra: makers of SpiraTest, Rapise, and the full Spira platform for test management and software development. Real talk: if you’re still piecing together your test management with spreadsheets and hope, it’s time to check these guys out.

They’ve been at this since 2006, they’ve got 80,000+ daily users across 82 countries, and they back every subscription with free support.

Go to https://links.testguild.com/inflectra and start your free 30-day trial, no credit card, no contract required.

About Adam Sandman

Connect with Adam Sandman

Company: Inflectra Corporation
LinkedIn: https://www.linkedin.com/in/adamsandman/
Twitter: https://twitter.com/adammarksandman

Key Questions Answered

1. How has AI changed software quality and development?

AI has significantly lowered the barrier to entry for software development, allowing teams to produce up to 6x more functionality. For QA professionals, this creates a dual challenge: you must now test a massive volume of traditional (deterministic) code generated by AI, while simultaneously figuring out how to test new, unpredictable (non-deterministic) AI features, like agentic workflows and chatbots.

2. What is the difference between testing deterministic vs. non-deterministic systems?

In a traditional deterministic system, providing the exact same environment and test data will yield the exact same repeatable answer every time. In a non-deterministic system, the exact same input can produce a different answer on each run. Because 100% testability is impossible in these environments, QA must shift from traditional pass/fail metrics to identifying the “acceptable level of risk” for each specific use case.
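To make the shift from pass/fail to “acceptable level of risk” concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method described in the episode: `flaky_assistant` is a made-up stand-in for an AI feature, the 2% error rate and the thresholds are invented numbers, and the Wilson score interval is just one common way to put an upper bound on an observed failure rate.

```python
import math
import random


def wilson_upper_bound(failures: int, trials: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for the true failure rate."""
    if trials == 0:
        return 1.0  # no evidence yet, so assume the worst
    p = failures / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre + margin) / denom


def flaky_assistant(prompt: str, rng: random.Random) -> str:
    """Hypothetical AI feature: gives a wrong answer about 2% of the time."""
    return "WRONG" if rng.random() < 0.02 else "OK"


def within_risk(trials: int, max_failure_rate: float, seed: int = 0) -> bool:
    """Run the same prompt many times and compare the failure bound to the threshold."""
    rng = random.Random(seed)
    failures = sum(
        flaky_assistant("where is my order?", rng) == "WRONG" for _ in range(trials)
    )
    return wilson_upper_bound(failures, trials) <= max_failure_rate
```

With these made-up numbers, `within_risk(5000, 0.05)` would accept the system for a use case that tolerates a 5% failure rate, while `within_risk(5000, 0.001)` would reject the same system for a stricter use case. The threshold itself is a business decision, not something the code can choose.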

3. How do you test non-deterministic AI applications?

Testing non-deterministic systems is closer to performance or load testing than traditional functional testing. It requires deploying automated input agents at scale to simulate millions of permutations, and output agents to analyze the results against specific parameters (e.g., helpfulness, safety, hate speech). Inflectra is actively developing a tool called SureWire, designed specifically to test these AI systems at scale and provide statistical outputs on performance criteria.
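The input-agent and output-agent idea can be sketched in a few lines of Python. This is a toy illustration only and says nothing about how SureWire works: the prompt fragments, the `system_under_test` stand-in, and the keyword-based checks are all hypothetical. A real setup would likely use far richer input generation and model-based evaluators.

```python
import itertools
import random
from collections import Counter

# Hypothetical prompt fragments; real input agents would generate far more variety.
GREETINGS = ["Hi,", "Hello,", "Hey,"]
REQUESTS = ["where is my order?", "can I get a refund?", "cancel my subscription."]
TONES = ["", " Please hurry.", " This is ridiculous!"]


def input_agent():
    """Yield every combination of the prompt fragments above."""
    for greeting, request, tone in itertools.product(GREETINGS, REQUESTS, TONES):
        yield f"{greeting} {request}{tone}"


def system_under_test(prompt: str, rng: random.Random) -> str:
    """Stand-in for a chatbot: occasionally unhelpful, occasionally rude."""
    roll = rng.random()
    if roll < 0.05:
        return "I don't know."
    if roll < 0.07:
        return "That is a stupid question."
    return "Happy to help with that. Here is what to do next."


# Output-side checks: each returns True when the reply is acceptable on that criterion.
CRITERIA = {
    "helpfulness": lambda reply: "help" in reply.lower(),
    "safety": lambda reply: "stupid" not in reply.lower(),
}


def run_campaign(runs_per_prompt: int, seed: int = 0) -> dict[str, float]:
    """Send every prompt several times and report the pass rate per criterion."""
    rng = random.Random(seed)
    passed = Counter()
    total = 0
    for prompt in input_agent():
        for _ in range(runs_per_prompt):
            reply = system_under_test(prompt, rng)
            total += 1
            for name, check in CRITERIA.items():
                passed[name] += check(reply)
    return {name: passed[name] / total for name in CRITERIA}
```

Calling `run_campaign(200)` returns one pass rate per criterion rather than a single pass or fail. That is the kind of statistical output a team could then compare against its own risk thresholds.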

4. Can AI be used to modernize legacy (brownfield) applications?

Yes. AI acts as “institutional memory” by ingesting entire codebases, old test requirements, and version histories. It serves as a dispassionate judge that can explain why old code was written a certain way, assist in rewriting unit tests, and safely refactor outdated systems (like moving from IE11 requirements to modern web tech) without the fear of breaking unknown dependencies.

5. How can QA leaders get their teams to adopt AI?

Start by targeting the pain points. Use AI to automate repetitive, boring tasks, like fixing brittle Selenium scripts, to give testers hours back in their day. Once the team sees that AI saves them time rather than adding to their workload, they will be much more receptive to learning advanced skills, such as testing non-deterministic systems or writing requirements.

Key Takeaways & Episode Highlights

The End of Pass/Fail: For AI systems, traditional pass/fail testing is obsolete. Quality assurance is now fundamentally about risk management and determining acceptable failure rates based on the business context (e.g., an e-commerce chatbot vs. an air traffic control system).
AI for Compliance and Requirements: AI can ingest policy documents, Medicare rules, or HIPAA regulations to ensure that system requirements and code actually match legal standards, preventing massive compliance fines.
Self-Healing Test Automation is Here: Self-healing test maintenance was long considered a “pipe dream” that was perpetually “coming soon.” AI vision technologies are finally making it a feasible and realistic way to help testers keep up with rapid development.
Decompose the App: When testing an AI application, separate it into testable units. You still need traditional deterministic testing for the web UI and data layers, while reserving new statistical/agent-based testing for the non-deterministic AI components.
Shift from Cost Center to Value Creator: By connecting QA metrics to business outcomes (like linking software bugs to a 400% rise in support tickets), testing teams can prove their value to executives and shed the stigma of being a disposable “cost center.”
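As a rough illustration of the “Decompose the App” takeaway, the Python sketch below tests two parts of an imaginary shopping app in different ways. Everything here is hypothetical: `format_price` stands in for a deterministic layer that can be checked with exact assertions, and `recommend` stands in for an AI component that is checked by measuring how often it stays within the catalog.

```python
import random


def format_price(cents: int) -> str:
    """Deterministic layer: same input, same output, so exact assertions work."""
    return f"${cents // 100}.{cents % 100:02d}"


def recommend(products: list[str], rng: random.Random) -> str:
    """Hypothetical AI component: usually picks a real item, rarely invents one."""
    if rng.random() < 0.01:
        return "Imaginary Widget"
    return rng.choice(products)


def test_deterministic_layer() -> None:
    """Classic pass/fail checks still apply to the deterministic parts."""
    assert format_price(1999) == "$19.99"
    assert format_price(5) == "$0.05"


def in_catalog_rate(trials: int, seed: int = 0) -> float:
    """Statistical check: fraction of recommendations that are real catalog items."""
    catalog = ["Kettle", "Toaster", "Blender"]
    rng = random.Random(seed)
    hits = sum(recommend(catalog, rng) in catalog for _ in range(trials))
    return hits / trials
```

The first check either passes or fails. The second yields a rate, for example from `in_catalog_rate(2000)`, which the team would compare against whatever failure rate the business considers acceptable.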

Actionable Advice for Testers

Adam’s final advice to testers is to be courageous and start tinkering. Install tools like Visual Studio Code or Cursor, play around with open-source options, and don’t be afraid to break things. Embracing failure during this experimental phase is exactly how new innovations, and tools like SureWire, are born.

Rate and Review TestGuild

Thanks again for listening to the show. If it has helped you in any way, shape, or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.
