The State of Secrets Sprawl 2026: AI-Service Leaks Surge 81% and 29M Secrets Hit Public GitHub

In less than a year, AI-assisted coding went from novelty to habit.
What used to be a specialized workflow for experienced engineers is now accessible to almost anyone with an idea, a prompt, and a few minutes.
In 2025, that shift became impossible to ignore. Software creation sped up, public GitHub activity surged, and a new generation of services, agents, integrations, and configuration patterns entered the stack all at once. That speed came with a cost.
According to our latest “State of Secrets Sprawl” report, 28.65 million new hardcoded secrets were added to public GitHub commits in 2025 alone, a 34% increase year over year and the largest single-year jump we’ve recorded.

The year software changed forever
2025 was a banner year for software production. Public GitHub commits climbed to about 1.94 billion, up 43% year over year, and the developer base increased by 33%.
[Figure: Public GitHub commits grew by 43% and the active developer base grew by 33% since 2024]
AI is creating a new generation of leaks
AI makes it easier and faster to build, integrate, and ship. But every new tool, API, workflow, agent, and service account also creates new credentials to manage and more surface area for attackers to target. When organizations scale creation faster than governance, secrets begin to spread everywhere.
One of the clearest signals in the data is that the composition of leaked secrets is changing. In 2025, AI service secrets reached 1,275,105, up 81% year over year. The report points to 113,000 leaked DeepSeek API keys as one example of how a newly adopted service can open a window of exposure.
The report also found that eight of the ten fastest-growing detectors were tied to AI services. LLM infrastructure, such as orchestration, RAG, and vector storage, leaked 5× faster than core model providers.
New AI providers, wrappers, gateways, registries, and integration layers are entering production workflows quickly, often before developer protections have caught up.

Claude Code-assisted commits showed a 3.2% secret-leak rate, versus a 1.5% baseline across all public GitHub commits. That gap is meaningful, but it should not be read as a simple tool failure. Developers remain in control of what gets accepted, edited, ignored, or pushed. Even as coding assistants improve their guardrails, people can still override warnings or ask the model to behave insecurely.
The leak still happens through a human workflow. This is an important nuance.
AI is changing the pace and shape of software development, but the underlying failure mode is still familiar: people under time pressure making local decisions in complex systems.
[Figure: Number of secrets per 1,000 commits]
Hardcoding secrets into MCP configs
Taking a closer look at AI infrastructure, we identified 24,008 unique secrets exposed in MCP-related configuration files across public GitHub, including 2,117 unique valid credentials. Those valid credentials account for 8.8% of all MCP-related findings.

The problem is often driven by documentation that itself encourages unsafe patterns. The report notes that popular MCP setup guides often recommend putting API keys directly into configuration files, command-line arguments, or embedded connection strings. When insecure credential handling is normalized in official quickstarts, it is no surprise that sprawl follows. This is the kind of pattern security teams should pay attention to early. New standards often arrive with convenience-first examples. If those examples assume hardcoded credentials, the problem can spread at ecosystem speed.
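One way to avoid the hardcoded pattern is to commit only placeholders and inject the real values when the server is launched. The sketch below is a minimal illustration, not a feature of any particular MCP client: the `mcpServers` layout is modeled on common MCP client configs, and the server name, command, `${EXAMPLE_API_KEY}` placeholder syntax, and `resolve_placeholders` helper are all hypothetical.

```python
import os
import re

# Matches placeholders such as ${EXAMPLE_API_KEY} (assumed syntax for this sketch).
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value, environ=None):
    """Recursively replace ${NAME} placeholders with values from the environment."""
    environ = os.environ if environ is None else environ
    if isinstance(value, dict):
        return {key: resolve_placeholders(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(item, environ) for item in value]
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in environ:
                # Fail loudly instead of launching with an empty credential.
                raise KeyError(f"environment variable {name} is not set")
            return environ[name]
        return PLACEHOLDER.sub(substitute, value)
    return value


# Hypothetical config as it would be committed: no secret value appears in it.
committed_config = {
    "mcpServers": {
        "example-server": {
            "command": "example-mcp-server",
            "args": ["--port", "8080"],
            "env": {"EXAMPLE_API_KEY": "${EXAMPLE_API_KEY}"},
        }
    }
}

# At launch time the value comes from the environment or a secrets manager.
runtime_config = resolve_placeholders(
    committed_config, environ={"EXAMPLE_API_KEY": "value-injected-at-runtime"}
)
```

The committed file stays safe to share, and a missing variable fails at startup rather than tempting someone to paste the key back into the config.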
Public leaks are only half the story
One lesson security teams should take away from this report is that public GitHub is only the visible edge of the problem. Internal repositories remain a much larger reservoir of secrets sprawl. Internal repos are roughly 6× more likely than public ones to contain hardcoded secrets. This is the security debt created by private-by-default thinking. Teams are often less disciplined inside internal codebases because the exposure feels less immediate, but that private buildup becomes the material that attackers exploit once internal systems are reached.

The story gets broader from there. About 28% of incidents originate entirely outside repositories, in places like Slack, Jira, and Confluence. Those leaks outside of code are also 13 percentage points more likely to be categorized as critical than secrets found only in code.

Secrets shared in collaboration tools are often passed around during urgent troubleshooting, incident response, or operational debugging. They are copied into messages and tickets precisely because someone needs fast access. The context is urgent, and urgent contexts tend to produce high-impact exposures.
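To make this concrete, here is a minimal sketch of how a message or ticket body could be checked for credential-like strings before it is posted. The three patterns follow commonly documented token shapes and are illustrative only; the `find_candidate_secrets` helper is hypothetical, and real detectors cover far more secret types and validate what they find.

```python
import re

# Illustrative patterns only; production detectors cover far more secret types.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_personal_access_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_candidate_secrets(text):
    """Return (detector_name, redacted_match) pairs for credential-like strings."""
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            secret = match.group(0)
            # Keep only a short prefix so the finding itself does not leak the secret.
            findings.append((name, secret[:4] + "..."))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
message = "prod is down, use AKIAIOSFODNN7EXAMPLE for now and rotate it later"
print(find_candidate_secrets(message))
```

Pattern matching like this only surfaces candidates. It says nothing about whether a credential is valid or how much access it grants.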
Developer workstations are now a prime target for secrets theft
As AI agents gain deeper local access to terminals, files, editors, environment variables, and credential stores, the laptop itself becomes a more meaningful attack surface. The report connects this to prompt injection and supply-chain attacks such as Shai-Hulud, which turn local secrets into organizational risk. GitGuardian’s analysis of the Shai-Hulud 2 dataset offers a rare empirical window into what actually lives on developer machines. Across 6,943 compromised machines, the team found 294,842 secret occurrences, corresponding to 33,185 unique secrets.
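The gap between occurrences and unique secrets comes from the same credential being copied into many files and machines. Dividing the two figures above gives roughly 8.9 occurrences per unique secret. The sketch below shows one way such a count could be produced without holding raw values, assuming a hypothetical list of `(machine_id, secret)` pairs; it does not describe the report's actual methodology.

```python
import hashlib
from collections import Counter


def summarize(occurrences):
    """Count total occurrences and unique secrets from (machine_id, secret) pairs."""
    # Hash each value so the summary never holds raw secrets.
    fingerprints = Counter(
        hashlib.sha256(secret.encode("utf-8")).hexdigest()
        for _machine_id, secret in occurrences
    )
    return {
        "occurrences": sum(fingerprints.values()),
        "unique_secrets": len(fingerprints),
    }


# Hypothetical sample: one token reused on two machines, plus one other token.
sample = [
    ("runner-1", "token-a"),
    ("runner-1", "token-a"),
    ("laptop-7", "token-a"),
    ("laptop-7", "token-b"),
]
print(summarize(sample))

# Ratio derived from the figures quoted in the text.
occurrences_per_unique = round(294_842 / 33_185, 1)
```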

The report also notes that 59% of the compromised machines were CI/CD runners rather than personal workstations, which expands the problem well beyond the individual endpoint.
For years, secrets management was framed mostly around shared code repositories and cloud platforms. But agentic workflows are redrawing the perimeter. When local environments hold credentials that connect across systems, the machine itself becomes part of the non-human identity (NHI) problem.
64% of valid secrets from 2022 are still active and exploitable
In the 2025 report, we found that nearly 70% of credentials confirmed as valid in 2022 were still valid in January 2025, which means they had not been remediated. When we retested that same dataset in January 2026, the validity rate was still above 64%.

This gap is even more concerning because 46% of critical secrets are missed by validation-only prioritization, meaning many high-risk exposures remain underprioritized simply because they cannot be automatically verified.
The full State of Secrets Sprawl 2026 report dives deeper into all of these trends, from AI-assisted commits and MCP configuration leaks to internal repositories, collaboration tools, self-hosted infrastructure, and the long-term remediation gap.
From secrets sprawl to NHI governance
That phrase, “NHI governance,” can sound abstract until you reduce it to the questions security teams actually need to answer:

What non-human identities exist in the environment?
Who owns them?
What can they access?
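As a rough illustration, the three questions map onto a very small inventory record. The sketch below assumes a hypothetical schema (`NonHumanIdentity`, with `owner` and `permissions` fields) and a hypothetical `governance_gaps` helper; a real inventory would be populated from identity providers, cloud platforms, and secrets managers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NonHumanIdentity:
    """One service account, API key, bot, or agent credential (hypothetical schema)."""
    name: str                                              # What exists?
    owner: Optional[str] = None                            # Who owns it?
    permissions: List[str] = field(default_factory=list)   # What can it access?


def governance_gaps(inventory):
    """Map each identity name to the questions that cannot be answered for it."""
    gaps = {}
    for identity in inventory:
        missing = []
        if not identity.owner:
            missing.append("owner")
        if not identity.permissions:
            missing.append("permissions")
        if missing:
            gaps[identity.name] = missing
    return gaps


# Hypothetical inventory: one well-described identity and one orphaned key.
inventory = [
    NonHumanIdentity("ci-deploy-bot", owner="platform-team", permissions=["deploy:staging"]),
    NonHumanIdentity("legacy-report-key"),
]
print(governance_gaps(inventory))
```

Any identity that shows up in the result is one the team cannot fully account for, which makes it a reasonable place to start remediation.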

If a team cannot answer those questions, AI adoption is likely outpacing identity maturity. That is the deeper lesson of the 2026 report. AI did not invent secrets sprawl. It accelerated the conditions that make it worse: faster shipping, broader participation in software creation, more integrations, more service accounts, more local tooling, and more configuration surfaces where credentials can end up by mistake.
The future will contain many more non-human identities. That means the path forward cannot be just “scan harder.” It has to include prevention, ownership, context, lifecycle control, and remediation workflows that are built for speed.

*** This is a Security Bloggers Network syndicated blog from GitGuardian Blog – Take Control of Your Secrets Security authored by Anna Nabiullina. Read the original post at: https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/

About Author

AndyC

Andy Curtis is an award-winning security consultant, researcher and public speaker. He has been working in the computer security industry since the early 1990s, having been employed by state and federal government, leading healthcare and banking providers across three continents. He has given talks about computer security for some of the world’s largest companies, worked with law enforcement agencies on investigations into hacking groups, and is a regular voice on TV and radio explaining IT security threats.
