SHARED INTEL Q&A: AI retrieval systems can still hallucinate; deterministic logic offers a fix

By Byron V. Acohido
AI hallucination is still the deal-breaker.
Related: Retrieval Augmented Generation (RAG) strategies
As companies rush AI into production, executives face a basic constraint: you cannot automate a workflow if you cannot trust the output. A model that fabricates facts becomes a risk exposure. CISOs now have to explain this clearly to boards, who expect assurances that fabrication risk will be controlled, not hoped away.
Earlier this year, retrieval-augmented generation, or RAG, gained attention as a practical check on hallucination. The idea was straightforward: before answering, the model retrieves grounding material from a trusted source and uses that to shape its response. This improved reliability in many early use cases.
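In code, that first-generation loop is roughly the shape below. This is a minimal sketch, not any vendor's implementation: the toy letter-frequency embedding and the generate() stub stand in for a real embedding model and LLM call.
```python
# Minimal first-generation RAG loop (illustrative sketch only).

def embed(text: str) -> list[float]:
    # Toy embedding: a letter-frequency vector. A real system would call
    # an embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0

def retrieve(question: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    # Rank documents by embedding similarity and return the top-k IDs.
    q = embed(question)
    return sorted(corpus, key=lambda d: cosine(q, embed(corpus[d])), reverse=True)[:k]

def generate(prompt: str) -> str:
    # Placeholder for the LLM call. Grounding is requested in the prompt,
    # but nothing in this loop verifies the model restated it faithfully.
    return "[model response conditioned on]\n" + prompt

def answer(question: str, corpus: dict[str, str]) -> str:
    context = "\n\n".join(corpus[d] for d in retrieve(question, corpus))
    return generate("Answer using only this context:\n" + context + "\n\nQ: " + question)
```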
But first-generation RAG had a hidden weakness. A major academic study (“RAGTruth”) showed that even when RAG retrieves accurate source material, AI systems can still misstate it or draw the wrong conclusion. The study is published in the ACL Anthology, the standard library of peer-reviewed research on AI language processing.
More broadly, today’s RAG systems rely on probabilistic similarity. Small changes in how a question is asked can push the model toward different source material, meaning two users may receive different answers with no clear audit trail. That instability limits trust in regulated environments.
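Continuing the sketch above (the two policy snippets and the queries are invented), the instability is easy to see: paraphrasing a question can reorder the ranked sources, and nothing records why.
```python
# Two paraphrases of the same question may rank the corpus differently,
# so each user can end up grounded in different source material.
corpus = {
    "policy_a": "Encryption keys must be rotated every 90 days.",
    "policy_b": "Cryptographic key rotation is required quarterly.",
}

print(retrieve("How often must encryption keys be rotated?", corpus, k=1))
print(retrieve("What is the required key rotation cadence?", corpus, k=1))
# Similarity-based retrieval offers no guarantee these two calls return the
# same document, and no audit trail explains any difference.
```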
A second wave of RAG innovation argues for something more deterministic. Instead of inferring relationships among documents, the system traverses only the links defined in authoritative frameworks, such as regulations or internal controls. Same question. Same source path. Verifiable answer.
If this approach holds, regulated enterprises may gain a new way to trust AI in production. The Q&A that follows examines this emerging direction through the work of DivideGraph founder Tyler Messa.
LW: For leaders new to the topic, what problem was first-generation RAG trying to solve?
Messa: Think of early RAG as turning AI into an “open-book test.” Before it, models hallucinated because they were pulling answers from memory. RAG let them reference source material before responding.
For many low-risk business tasks, that was good enough. But in regulated environments, “good enough” isn’t a standard. Boards and regulators expect accuracy that can be demonstrated, not hoped for.
LW: Where did first-generation RAG fall short?
Messa: The weakness showed up when I tried using it with the Cyber Risk Institute Profile — a framework that harmonizes more than 2,500 cybersecurity requirements. I didn’t need creativity. I needed accuracy.
Instead, the AI treated the framework like searchable text rather than structured logic. Worse, it often invented relationships between requirements that didn’t exist. It could take the right source material and hallucinate itself into the wrong conclusion.
The other problem was instability. I couldn’t reliably get the same result twice, and I couldn’t get a complete audit trail. In compliance, that’s fatal. Regulators don’t accept “the AI thinks so.” They expect systems anchored to authoritative frameworks with verifiable reasoning.
LW: How is your approach different?
Messa: The analogy I use is Autocorrect vs. Google Maps.
Traditional RAG behaves like Autocorrect — it predicts what’s likely based on probability. That’s dangerous when the cost of being wrong can be billions of dollars.
DivideGraph works like Google Maps for compliance. We decomposed regulations into precise components and rebuilt the intended logic as a navigable system. When the system answers, it follows that map with a turn-by-turn audit trail.
The AI isn’t “thinking.” It’s the voice reading directions. The graph calculates the path. That means every answer is repeatable, verifiable, and anchored to frameworks regulators already recognize.
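A rough sketch of that turn-by-turn idea, not DivideGraph's actual implementation and with invented node IDs rather than real CRI Profile content, looks like a graph walk instead of a similarity search:
```python
# Deterministic lookup: answers come from walking explicit edges in an
# authoritative framework, not from probabilistic ranking. Node IDs are
# invented placeholders, not real CRI Profile content.
from collections import deque

EDGES = {
    "REQ-1.1": ["CTRL-7", "CTRL-9"],  # requirement -> controls that satisfy it
    "CTRL-7": ["EVID-42"],            # control -> evidence that demonstrates it
    "CTRL-9": ["EVID-17"],
}

def trace(start: str) -> list[str]:
    # Walk the framework graph from a requirement and record every hop.
    # The same start node always yields the same path: repeatable and
    # auditable, with no ranking model in the loop.
    path, queue, seen = [], deque([start]), set()
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        path.append(node)
        queue.extend(EDGES.get(node, []))
    return path

print(trace("REQ-1.1"))  # ['REQ-1.1', 'CTRL-7', 'CTRL-9', 'EVID-42', 'EVID-17']
```
The path itself is the audit trail: every hop names the requirement, control, or evidence it passed through.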
LW: Where does deterministic RAG make the biggest impact?
Messa: Anywhere a wrong answer creates real risk: fines, legal exposure, outages, breaches. More broadly, it closes the gap between policy and operations.
Compliance can become continuous instead of episodic. Change management becomes safer because the system understands dependencies. And leadership finally gets an accurate, real-time understanding of risk posture.
LW: Is anyone else doing this?
Messa: To my knowledge, no. Most platforms are still trying to predict compliance. Banks are uniquely positioned as early adopters because the industry already did foundational work: the CRI Profile provides a harmonized framework to compute against.
To adopt this model, two conditions matter: the cost of being wrong has to be high, and there has to be a standardized framework to anchor to.
LW: If this gains traction, how do you see it spreading?
Messa: Deterministic systems will become the trust layer for AI. You can’t responsibly build financial decisioning or fraud systems on probabilistic guesswork. We have decades of regulatory intelligence trapped in PDFs. Deterministic RAG operationalizes that intelligence.
This isn’t about replacing human oversight. It’s about making oversight computational.
LW: What would this change for auditors?
Messa: Everything. Today, AI compliance claims are hard to prove. You can show prompts and documents, but you can’t show reasoning because probabilistic systems don’t have explicit reasoning.
With a graph, every answer has a chain of logic. Auditors can see exactly which regulation required which control and how it maps to evidence. That levels the playing field for smaller banks that can't afford armies of consultants. And it gives regulators the ability to examine systemic risk at a sector level.
LW: What proof will enterprises need before trusting deterministic RAG?
Messa: The most important signal is the ability to say “no.” A trustworthy system refuses requests that violate law, logic, or safe operation. It understands time, so it doesn’t reference rescinded rules. It understands concepts rather than just matching words. And it produces complete, verifiable traceability.
Confidence comes from precision.
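One of those signals, time-awareness, can be illustrated in miniature. The rule records below are hypothetical; the point is only that a rescinded rule gets a refusal, not a citation.
```python
# A time-aware guard: rescinded or not-yet-effective rules are refused
# rather than cited. Rule IDs and dates are hypothetical.
from datetime import date

RULES = {
    "RULE-2019-03": {"effective": date(2019, 1, 1), "rescinded": date(2023, 6, 30)},
    "RULE-2024-11": {"effective": date(2024, 7, 1), "rescinded": None},
}

def cite(rule_id: str, as_of: date) -> str:
    rule = RULES.get(rule_id)
    if rule is None:
        return f"REFUSED: {rule_id} is not in the authoritative framework."
    if as_of < rule["effective"]:
        return f"REFUSED: {rule_id} is not yet effective on {as_of}."
    if rule["rescinded"] and as_of > rule["rescinded"]:
        return f"REFUSED: {rule_id} was rescinded on {rule['rescinded']}."
    return f"OK: {rule_id} may be cited as of {as_of}."

print(cite("RULE-2019-03", date(2026, 1, 21)))  # REFUSED: rescinded on 2023-06-30
print(cite("RULE-2024-11", date(2026, 1, 21)))  # OK: may be cited as of 2026-01-21
```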

Pulitzer Prize-winning business journalist Byron V. Acohido is dedicated to fostering public awareness about how to make the Internet as private and secure as it ought to be.

January 21st, 2026

*** This is a Security Bloggers Network syndicated blog from The Last Watchdog authored by bacohido. Read the original post at: https://www.lastwatchdog.com/shared-intel-q-deterministic-logic-offers-a-fix/
