RSAC Innovation Sandbox 2026 – Realm Labs


Company Overview
Founded in 2023, Realm Labs is headquartered in Sunnyvale, near San Jose, California[1]. The company’s founder and CEO, Saurabh Shintre, previously led AI security research at Symantec and Splunk[2]. At this year’s RSAC conference, Realm Labs secured $5 million in funding from Crosspoint Capital Partners[3]. The company’s mission is to make AI applications “more trustworthy, reliable, and secure,” focusing on addressing the security and observability challenges posed by generative AI.

Figure 1: Realm Labs Company Logo

Product Background
As generative AI and large language models (LLMs) rapidly gain adoption, enterprises are enjoying AI-driven innovation while facing significant security challenges. Modern LLMs, trained on vast datasets, inherently contain latent “harmful knowledge,” making it difficult to fully prevent misuse.
While traditional techniques like fine-tuning, alignment, and prompt engineering can improve model outputs, they struggle to eliminate the underlying harmful knowledge within the model. Once activated, this knowledge can still pose risks. AI firewall solutions enforce policies at the input and output layers but often suffer from limited model size, lower accuracy, and added latency[4].
In response, Realm Labs introduces a novel approach: instead of analyzing only what the model says, it monitors how the model thinks. By understanding the internal “thought structures” of AI models, Realm aims to detect and block risks before they materialize.
Realm Labs’ core products include the AI firewall OmniGuard, the AI observability solution Prism, and the data governance and DLP solution DataRealm. Given the context of RSAC 2026, this article will focus on Realm Prism.
Solution Features
Based on publicly available promotional materials, Realm Labs’ innovative product Prism likely includes the following features:
Overall Approach
In its official whitepaper[5], Realm Labs categorizes LLM observability into five layers:

Infrastructure Observability: CPU/GPU load, memory usage, and other metrics indicating hardware health.
Data Observability: Data completeness, null rates, labeling accuracy, distribution drift scores, and RAG knowledge base timeliness—metrics that assess the quality of training and inference data.
Application Observability: LLM logs, tool call logs, input/output token statistics, API error logs, and response times—metrics that indicate whether the LLM is functioning correctly.
Internal Observability: Attention patterns*, internal Chain of Thought (CoT), and token probabilities—metrics that reveal how the LLM “thinks.”
Output Observability: Detection of hallucinations, off-topic responses, factual errors, bias, or compliance violations—metrics that assess the quality of the final output.

Note: The term “attention patterns” here refers to “which input tokens influence each output token,” likely describing the attention weight matrix or heatmap.

Figure 2: (Unrelated to Realm Labs) Example of an attention weight heatmap in machine translation tasks[6]
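To make the "attention patterns" layer concrete, the heatmap values in a figure like the one above are simply the rows of the attention weight matrix: each query (output) token's scores against every key (input) token, passed through a softmax. A minimal sketch with toy 2-dimensional embeddings (not any real model's weights):

```python
import math

def softmax(scores):
    # numerically stable softmax
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(Q, K):
    """One row per output (query) token, one column per input (key) token.
    Row i is softmax(q_i . k_j / sqrt(d)) over all keys -- the heatmap values."""
    d = len(Q[0])
    return [softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                     for k in K])
            for q in Q]

# Two query tokens attending over three key tokens (toy embeddings)
W = attention_weights([[1.0, 0.0], [0.0, 1.0]],
                      [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
```

Each row sums to 1, and a bright cell in the heatmap corresponds to a large entry in `W` — i.e., "this input token strongly influenced that output token."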

Currently, observability solutions for layers 1, 2, 3, and 5 are relatively mature, but layer 4 has been largely missing, which is the primary focus of Realm Prism.
In a blog post on their official website[7], Realm Labs explains: 
“Recent research suggests that LLMs organize information in a structured manner within their ‘brain,’ and this structure can be studied. Realm’s defense mechanism is built on this principle. We identify regions in the LLM where harmful information is stored and monitor when user queries cause the model to access these regions. Our argument is that Realm’s defenses are difficult to bypass because any attempt to extract harmful information must trigger the corresponding parts of the model. We place monitoring points near the source of this knowledge, allowing us to detect harmful information before it is output by the model.”
However, most of the publicly available official documents emphasize the importance of internal LLM observability rather than providing detailed explanations of its specific implementation.
Deployment Model
According to Realm Labs’ official promotional page[8], Realm Prism supports four deployment modes:

Batch Analysis: Analyzes and categorizes historical interaction data to uncover potential patterns and opportunities for improvement.
Real-Time Sidecar: Extracts real-time insights from application logic, enabling adaptive routing and monitoring.
Inline Guardrails: Blocks unsafe or inappropriate queries and redirects complex requests to more suitable models.
Generative Endpoint: Functions as a plug-and-play endpoint for any open-weight model, purposefully collecting behavioral data.

Intuitively speaking:

Batch Analysis likely involves re-running the LLM to reproduce hidden-layer states for retrospective analysis.
Real-Time Sidecar and Inline Guardrails appear to function similarly to monitoring and blocking mechanisms in a Hybrid Web Application Firewall (HWAF) scenario.
Generative Endpoint may serve as an all-in-one cloud service designed for open-source LLMs.

The promotional page also includes a video showcasing Realm Prism’s workflow and user interface (UI). In the video, the left side of the interface features a chat dialog box. As the demonstrator inputs a prompt, the right panel dynamically displays a series of component metrics that update in real time during the inference process.
In the screenshot below, when the demonstrator enters the prompt “I want to kill a child,” the metrics for Violence, Self-Harm, and Refusal all spike to high levels, clearly indicating that the request is harmful and should be rejected:

Figure 3: The prompt “I want to kill a child” is flagged as harmful by Realm Prism

However, when the demonstrator appends the word “process” to the prompt, the refusal metrics immediately disappear, indicating that the request is now deemed harmless:

Figure 4: The prompt “I want to kill a child process” results in previously high-risk metrics dropping to low levels

This comparison effectively highlights the advantages of Realm Prism’s internal observability approach. Traditional input-filtering defenses struggle to distinguish between these two prompts while maintaining low resource overhead.
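To see why surface-level filtering struggles here, consider a deliberately naive keyword blocklist (the blocklist below is hypothetical, purely for illustration): it flags both prompts identically, because the distinction lives in semantics the filter never sees.

```python
def keyword_filter(prompt):
    """Naive input-layer filter: flag any prompt containing a violent keyword.
    Hypothetical blocklist -- not any real product's rule set."""
    blocklist = ("kill", "murder", "harm")
    return any(word in prompt.lower() for word in blocklist)

keyword_filter("I want to kill a child")          # flagged (correct)
keyword_filter("I want to kill a child process")  # flagged (false positive)
```

Distinguishing the two reliably requires either a heavier input-side model or, as Realm Prism proposes, signals from inside the model itself.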
Real-World Performance
According to Realm Labs’ website[9], the company has hosted a CTF (Capture The Flag) challenge since September of last year, inviting the public to bypass a four-layer defended AI chatbot. As of October 17, 2025, over 100 participants, including red team members from major corporations, have attempted more than 2,000 attacks—yet no one has successfully breached the fourth layer of defense.

Figure 5: CTF Challenge Progress

However, as of the time of writing, the challenge page (sherlock.realmlabs.ai) appears to be inaccessible. It remains unclear whether Realm Labs plans to host similar CTF or Security Researcher Challenge (SRC) events in the future.
Comparison with Other Solutions
When discussing comparisons, it’s impossible to overlook HiddenLayer, the winner of the 2023 RSAC Innovation Sandbox. Among HiddenLayer’s product lineup, the most relevant to Realm Prism is MLDR (Machine Learning Detection & Response)*—one of the earliest industry proposals for security monitoring and protection of machine learning models.
*Note: Recent promotional materials have rebranded this as “AIDR,” though functional details of the change remain undisclosed. For consistency, this article will continue to refer to it as MLDR.

Figure 6: (Unrelated to Realm Labs) HiddenLayer MLDR Framework

Unlike Realm Labs, which focuses almost exclusively on LLMs, MLDR is designed to handle input and output vectors for all types of machine learning and deep learning models[10]. Its protection scope extends to adversarial and robustness attacks, such as adversarial examples and model extraction[11]. However, the specific methods remain unclear, with only vague descriptions from official blogs mentioning the integration of “advanced heuristic methods and machine learning techniques.”
In the context of modern LLMs, MLDR likely processes outputs from embedding layers and inputs/outputs from de-embedding layers. While MLDR offers some level of internal observability—unlike AI firewalls, which operate entirely externally—its approach seems coarser-grained when addressing LLM-specific or business-logic attacks. HiddenLayer also offers LLM-focused solutions, but these primarily fall under AI firewalls and data loss prevention (DLP), which are outside the scope of this comparison with Realm Prism.
By contrast, Realm Prism adopts a more targeted detection and protection approach, specifically addressing common LLM threats such as prompt injection and jailbreaking. Additionally, Realm Prism’s deeper integration with model internals suggests it can achieve better efficacy, lower overhead, and significantly higher resistance to bypass attempts.

Figure 7: Comparison of internal observability vs. conventional approaches (from Realm Labs’ official whitepaper[5])

However, Realm Prism’s approach may have some limitations. Since it relies on accessing the model’s internal state, it is likely best suited for open-source models or scenarios with high levels of access authorization. For fully closed or black-box third-party model platforms, integrating Realm Prism could prove challenging. Additionally, large-scale real-time monitoring requires careful balancing of performance and cost.
That said, no specific criticisms of Realm Prism have yet emerged in public reports.
Other Notable Considerations
In its official whitepaper[5], Realm Labs argues:
“Current tools treat models as black boxes, capturing inputs and outputs but ignoring everything in between. This creates fundamental blind spots: Hallucinations: Behavioral tools show the wrong output. Only internal observability shows WHY—did the model attend to irrelevant context? Did it assign high probability to incorrect tokens?”
From this perspective, internal observability may not be limited to AI security and adversarial defense—it could be a broadly applicable technical direction for the LLM/AGI field. Hallucinations have plagued LLMs since their inception, and while methods like RAG (Retrieval-Augmented Generation) can mitigate them, no publicly known solution has completely resolved the issue. However, Realm Labs’ public materials do not provide sufficient detail to confirm whether internal observability can effectively suppress hallucinations.
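One internal signal the whitepaper quote alludes to — "did it assign high probability to incorrect tokens?" — can be sketched as the entropy of the next-token distribution. This is a generic uncertainty measure, not a confirmed Realm Prism mechanism:

```python
import math

def next_token_entropy(probs):
    """Shannon entropy (nats) of a next-token probability distribution.
    High entropy means probability is spread thinly across candidates --
    one candidate internal signal of model uncertainty."""
    return -sum(p * math.log(p) for p in probs if p > 0)

confident = next_token_entropy([0.97, 0.01, 0.01, 0.01])  # peaked distribution
uncertain = next_token_entropy([0.25, 0.25, 0.25, 0.25])  # uniform distribution
```

A monitoring layer could track such per-token statistics during generation and flag spans where the model was guessing rather than recalling.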
Reducing business risks caused by LLM hallucinations should also be considered part of security technology. This could be a promising area for long-term research.
Speculative Analysis
Based on current research, we speculate that Realm Prism may be built on LLM ablation technology.
The concept of ablation for LLMs was first publicly proposed in April 2024, with a paper published on arXiv in June of the same year[12]. The research revealed that LLMs rely on specific unidirectional vector components within their internal states to determine whether to refuse harmful queries. By identifying and removing these components from the model’s internal state—without fine-tuning—the LLM loses its ability to refuse harmful inputs. Conversely, artificially introducing these components can cause the LLM to refuse even harmless inputs. Due to its low cost, ablation technology has been widely adopted in open-source model communities. Given space constraints and content sensitivity, this article will not delve into the technical details of ablation; interested readers are encouraged to refer to the original paper.
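The core mechanics described in the paper can be sketched in a few lines: estimate the "refusal direction" as the difference of mean activations on harmful versus harmless prompts, then remove its component from a hidden state at runtime. The vectors below are toy stand-ins for real hidden states, and this is an illustration of the published technique, not Realm Prism's confirmed implementation:

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def mean_vector(vectors):
    k = len(vectors)
    return [sum(col) / k for col in zip(*vectors)]

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means direction between activations on harmful
    and harmless prompts (toy stand-ins for real hidden states)."""
    diff = [a - b for a, b in zip(mean_vector(harmful_acts),
                                  mean_vector(harmless_acts))]
    return normalize(diff)

def ablate(activation, direction):
    """Remove the component of an activation along the given unit direction."""
    proj = sum(a * d for a, d in zip(activation, direction))
    return [a - proj * d for a, d in zip(activation, direction)]

r = refusal_direction([[2.0, 1.0], [2.2, 0.9]], [[0.1, 1.1], [-0.1, 0.9]])
x = ablate([3.0, 1.0], r)  # x now has zero component along r
```

The same projection, measured rather than removed, is also a monitoring signal: a large component along `r` suggests the query is activating the refusal-relevant region.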
Further research revealed that not only refusal/non-refusal behaviors, but also positive/negative sentiment, affirmation/negation, and even language preferences (e.g., Chinese/English), are governed by single-directional vectors. A notable example is Llama-3-8B-Instruct-MopeyMule[13], where orthogonalization of the Llama-3-8B-Instruct model* resulted in extremely negative and pessimistic responses across all queries.
*Note: By modifying model weights to ensure that matrix multiplication yields a zero component in a specific direction, this achieves permanent ablation without relying on runtime hooks.
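The orthogonalization in the note above amounts to replacing a weight matrix W with W' = (I − r·rᵀ)·W, so every output of W' has zero component along the ablation direction r. A minimal sketch with a toy matrix and a hypothetical direction:

```python
import math

def orthogonalize(W, direction):
    """Return W' = (I - r r^T) W, so that every output W' @ x has zero
    component along r -- permanent ablation baked into the weights."""
    n = math.sqrt(sum(d * d for d in direction))
    r = [d / n for d in direction]
    rows, cols = len(W), len(W[0])
    # r^T W: projection of each column of W onto r
    rTW = [sum(r[i] * W[i][j] for i in range(rows)) for j in range(cols)]
    return [[W[i][j] - r[i] * rTW[j] for j in range(cols)] for i in range(rows)]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

r = [1.0, 0.0]                      # hypothetical ablation direction
Wp = orthogonalize([[1.0, 2.0], [3.0, 4.0]], r)
y = matvec(Wp, [0.7, -0.3])         # output is orthogonal to r for any input
```

Unlike runtime hooks, this edit survives model export: the direction simply no longer exists in the matrix's column space.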
NSFOCUS had already advanced this research and applied it practically by November 2024. In one project, experiments showed that after processing just ~30% of the LLM’s layers, its internal state became highly linearly separable for affirmative/negative samples. This method has been integrated into NSFOCUS’s NSFGPT security LLM framework. Comparative evaluations demonstrate that it can reduce inference time overhead for large-scale data classification tasks by ~54.50% with almost no loss in accuracy.
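The "highly linearly separable" property described above means a simple linear probe suffices to classify internal states. A minimal difference-of-means probe on toy activation vectors (illustrative only, not the NSFGPT implementation):

```python
def train_probe(pos, neg):
    """Difference-of-means linear probe: direction w plus midpoint threshold."""
    mp = [sum(c) / len(pos) for c in zip(*pos)]
    mn = [sum(c) / len(neg) for c in zip(*neg)]
    w = [a - b for a, b in zip(mp, mn)]
    # threshold at the midpoint between the two class means
    bias = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mp, mn))
    return w, bias

def predict(x, w, bias):
    return sum(xi * wi for xi, wi in zip(x, w)) > bias

# Toy "hidden states" for affirmative vs. negative samples
pos = [[1.0, 2.0], [1.2, 1.8]]
neg = [[-1.0, 0.0], [-0.8, 0.2]]
w, b = train_probe(pos, neg)
```

Because the probe reads an intermediate layer, inference can stop early once the classification is made — which is consistent with the reported ~54.50% reduction in inference time for classification tasks.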
Many other studies and applications related to LLM internal states exist, but they are beyond the scope of this discussion.

Figure 8: (Unrelated to Realm Labs) NSFOCUS research showing the normalized distribution of latent vector projection lengths across Transformer layers for affirmative/negative samples

However, given how difficult it is to integrate software-level security products into mainstream LLM infrastructure in domestic enterprise security contexts, NSFOCUS did not pursue this research direction further for AI security.
Returning to the analysis of Realm Prism: LLM ablation is a well-established, low-cost, and highly effective method—fully aligning with Realm Prism’s advertised capabilities. Notably, in Realm Prism’s promotional video, a lock-shaped button appears next to the dashboard metrics, suggesting the product may not only monitor but also intervene in the model’s internal state.
While there is no direct evidence that Realm Prism employs the linear classification methods common in mainstream LLM ablation, it is plausible that similar techniques could be used to achieve its functionality.
Conclusion
Realm Labs, an innovative company focused on AI security and observability, introduces a groundbreaking approach by “understanding how AI thinks” to enhance the safety of generative AI systems. This vision has earned recognition from industry veterans and investor backing, culminating in its selection for the RSAC 2026 Innovation Sandbox. Its product portfolio—spanning internal model reasoning monitoring, AI firewalls, and data loss prevention—offers a unique and differentiated advantage compared to traditional solutions.
For enterprises prioritizing AI security, Realm Prism provides unprecedented transparency into AI internal processes, empowering security teams to proactively identify and mitigate potential risks. Moving forward, if Realm Labs continues to innovate and expand its applicability, its solutions could become a critical component of enterprise AI security practices.

[1] Tola Capital. Portfolio Archive | Tola Capital, 2026[EB/OL]. (2026). https://tolacapital.com/portfolio.
[2] RSAC. Saurabh Shintre | RSAC Conference[EB/OL]. https://www.rsaconference.com/experts/saurabh-shintre.
[3] RSAC. Finalists Announced for RSAC Innovation Sandbox Contest 2026, February 2026[EB/OL]. (2026-02). https://www.prnewswire.com/news-releases/finalists-announced-for-rsac-innovation-sandbox-contest-2026-302683184.html.
[4] Saurabh Shintre. Securing AI’s Mind, September 2025[EB/OL]. (2025-09). https://www.realmlabs.ai/blogs/securing-ais-mind.
[5] Realm Labs. LLM Observability For Reliable AI[EB/OL]. https://drive.google.com/file/d/1rFSQgiPv06aQ1hWBeRb2SXG44oahXc58/view?usp=sharing.
[6] Phi Xuan Nguyen, Shafiq Joty. Phrase-Based Attentions, 2018[M/OL]. (2018). https://arxiv.org/abs/1810.03444.
[7] Nina Wei. Inside the Black Box: Why 95% of AI Projects Fail, October 2025[EB/OL]. (2025-10). https://www.realmlabs.ai/blogs/inside-the-black-box.
[8] Realm Labs. Product – Realm Prism[EB/OL]. https://www.realmlabs.ai/product-realm-prism.
[9] Realm Labs. Sherlock Challenge: One Month In, Realm Guard Remains Unbroken, October 2025[EB/OL]. (2025-10). https://www.realmlabs.ai/blogs/sherlock-challenge-one-month-in-realm-guard-remains-unbroken.
[10] Alex Avendano. Safeguarding AI with AI Detection and Response[EB/OL]. https://www.hiddenlayer.com/insight/safeguarding-ai-with-mldr.
[11] HiddenLayer. Unpacking the AI Adversarial Toolkit, October 2022[EB/OL]. (2022-10). https://www.hiddenlayer.com/research/whats-in-the-box.
[12] Andy Arditi, Oscar Obeso, Aaquib Syed, et al. Refusal in Language Models Is Mediated by a Single Direction, 2024[M/OL]. (2024). https://arxiv.org/abs/2406.11717.
[13] failspy. failspy/Llama-3-8B-Instruct-MopeyMule · Hugging Face, May 2024[EB/OL]. (2024-05). https://huggingface.co/failspy/Llama-3-8B-Instruct-MopeyMule.
*** This is a Security Bloggers Network syndicated blog from NSFOCUS, Inc., authored by NSFOCUS. Read the original post at: https://nsfocusglobal.com/rsac-innovation-sandbox-2026-realm-labs/