New Gaslight macOS Malware Uses Prompt Injection to Disrupt AI-Assisted Analysis
A previously undocumented Rust-based macOS implant and information stealer has been found to embed a prompt injection payload designed to trick a malware analyst’s artificial intelligence (AI) tools into aborting or refusing an analysis of the artifact.
The malware has been codenamed Gaslight owing to this deceptive behavior. It’s been assessed with high confidence that the tool is the work of North Korea-aligned threat actors.
“Its most notable feature is an embedded cascade of fabricated system-failure messages, designed to make an LLM-assisted triage agent doubt its own session,” SentinelOne researcher Phil Stokes said in a technical report. “It attacks the agent’s perception, rather than the sandbox it runs in.”
Central to the malware’s architecture is a Telegram Bot API-based command-and-control (C2) channel that runs a polling loop, allowing the operator to issue instructions over an interactive shell and receive the results of the execution. If two instances using the same bot token poll simultaneously, a “Conflict” response is returned, causing the second copy to terminate.
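The single-instance behavior described above can be modeled with a small check on a polling response. This is a minimal sketch for analysts, not the implant's code: it assumes the response is a JSON object with the `ok`, `error_code`, and `description` fields used by the Telegram Bot API, and the 409 status code is an assumption, since the report only mentions a “Conflict” response. The function name `is_polling_conflict` is hypothetical.

```python
from typing import Any, Mapping


def is_polling_conflict(response: Mapping[str, Any]) -> bool:
    """Return True when a polling response signals a competing poller.

    Assumption: the error is reported as error_code 409 and/or a
    description that starts with "Conflict".
    """
    if response.get("ok", False):
        return False
    description = str(response.get("description", ""))
    return response.get("error_code") == 409 or description.startswith("Conflict")


# Illustrative responses (hand-written, not captured traffic).
first_poller = {"ok": True, "result": []}
second_poller = {
    "ok": False,
    "error_code": 409,
    "description": "Conflict: terminated by other getUpdates request",
}
```

In a sandbox, a sample that exits right after a response of this kind may simply be losing the race against another live instance that shares the same token, rather than detecting the analysis environment.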
The shell supports six main commands, granting a persistent foothold over the infected host –
- help, to show command help
- id, to identify the implant to the operator
- shell, to execute a shell command via execvp
- kill, to terminate a target process by PID
- upload, to exfiltrate a file via Telegram’s “attach://” mechanism
- stop, to halt the execution of the implant
SentinelOne said it identified signs suggesting the presence of a seventh command named “focus,” although its functionality remains undetermined at this stage. To achieve persistence, Gaslight relies on a LaunchAgent whose .plist file carries the label “com.apple.system.services.activity.”
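Defenders could hunt for the reported LaunchAgent label with a short script along these lines. It is a sketch under stated assumptions: the article names only the label, so the directory to scan (for example, the per-user `~/Library/LaunchAgents` folder) and the helper name `find_suspect_agents` are illustrative choices, not details from the report.

```python
import plistlib
from pathlib import Path
from typing import List

# Label reported for Gaslight's LaunchAgent.
SUSPECT_LABEL = "com.apple.system.services.activity"


def find_suspect_agents(directory: Path, label: str = SUSPECT_LABEL) -> List[Path]:
    """Return .plist files in `directory` whose Label key equals `label`."""
    hits = []
    for path in sorted(directory.glob("*.plist")):
        try:
            with path.open("rb") as fh:
                data = plistlib.load(fh)
        except Exception:
            # Unreadable or malformed plist: skip it in this sketch.
            continue
        if isinstance(data, dict) and data.get("Label") == label:
            hits.append(path)
    return hits
```

A call such as `find_suspect_agents(Path.home() / "Library" / "LaunchAgents")` would cover the per-user location. A match on the label alone is an indicator to investigate, not proof of infection.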
Also embedded within the malware is a 6.6 KB Base64-encoded Python script that functions as an information gathering suite responsible for harvesting Terminal command histories, installed application listings, snapshots of running processes, system hardware and software profiles, the macOS Keychain database, and data from the Chrome, Brave, Firefox, and Safari web browsers. The collected data is subsequently compressed into a ZIP archive (“temp/collected_data.zip”) and uploaded via Telegram.
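During triage, embedded Base64 payloads like this one can often be carved out of a binary by looking for long runs of Base64 characters. The following is a rough sketch of that idea, not SentinelOne's method: the `min_len` threshold and the function name `carve_base64_blobs` are assumptions, and runs that are misaligned because they touch neighboring alphanumeric bytes are simply skipped.

```python
import base64
import binascii
import re
from typing import List


def carve_base64_blobs(data: bytes, min_len: int = 1024) -> List[bytes]:
    """Decode long, well-formed Base64 runs found in raw bytes."""
    pattern = re.compile(
        rb"[A-Za-z0-9+/]{" + str(min_len).encode() + rb",}={0,2}"
    )
    decoded = []
    for match in pattern.finditer(data):
        candidate = match.group(0)
        if len(candidate) % 4 != 0:
            # Misaligned run; a fuller tool would try trimming it.
            continue
        try:
            decoded.append(base64.b64decode(candidate, validate=True))
        except binascii.Error:
            continue
    return decoded
```

Decoded blobs should be written to disk and inspected as untrusted data, never executed.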
The Python stealer, for its part, is deployed by means of a separate 2 KB Base64-encoded bash installer that drops a cpython-3.10.18 interpreter from the “astral-sh/python-build-standalone” project. The presence of emojis and extensive comment headers indicates that it was likely generated using a large language model (LLM).
What’s notable about Gaslight is that details related to the bot token, the chat ID (tg_room_id), and the rest of the operator configuration are not hard-coded into the sample, but rather supplied at runtime. “The implant self-redacts its Telegram bot token in its own runtime output, denying it to anyone who captures logs or crash artifacts,” Stokes added.
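The report does not describe how the redaction is implemented. As a general illustration of the technique, scrubbing a token from output before it is logged can be as simple as a regular-expression substitution. Everything in this sketch is an assumption: the token pattern (a numeric ID, a colon, then a long URL-safe secret) reflects the commonly seen Telegram bot token layout rather than anything in the sample, and `redact_bot_tokens` is a hypothetical name.

```python
import re

# Assumed token shape: <numeric bot id>:<long URL-safe secret>.
TOKEN_RE = re.compile(r"(?<!\d)\d{6,12}:[A-Za-z0-9_-]{30,}")


def redact_bot_tokens(text: str) -> str:
    """Replace anything that looks like a bot token with a placeholder."""
    return TOKEN_RE.sub("[REDACTED]", text)
```

The same pattern is useful on the defensive side for scrubbing logs before they are shared outside an incident response team.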
On top of that, the malware attempts to evade AI-based detection by incorporating a Markdown-fenced block containing 38 fabricated “system” messages designed to trick a security agent into aborting, truncating, or refusing analysis.
“The scaffold contains fake system messages about token expiry, out-of-memory kills, disk exhaustion, and repeated operation failures. It also plants bogus warnings about injection vulnerabilities and static-analysis flags,” SentinelOne said, calling it an “attempt to weaponize the LLM-assisted triage pipelines that increasingly sit in the reverse-engineering loop.”
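One way a triage pipeline might guard against this kind of lure is to flag suspicious fenced blocks in a sample's extracted strings before any of that text reaches an LLM. The sketch below is illustrative only: the keyword list is loosely based on the themes named in the quote above (token expiry, out-of-memory kills, disk exhaustion, operation failures), not on the actual strings in the sample, and the names `LURES` and `flag_injection_blocks` are hypothetical.

```python
import re
from typing import List

FENCE = "`" * 3  # Markdown code fence delimiter
FENCE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)

# Assumed lure phrases; tune against real samples.
LURES = (
    "token expir",
    "out of memory",
    "out-of-memory",
    "disk",
    "operation failed",
    "system",
)


def flag_injection_blocks(strings_dump: str, min_hits: int = 2) -> List[str]:
    """Return fenced blocks that contain several lure phrases."""
    flagged = []
    for block in FENCE_RE.findall(strings_dump):
        lowered = block.lower()
        hits = sum(1 for phrase in LURES if phrase in lowered)
        if hits >= min_hits:
            flagged.append(block)
    return flagged
```

Keyword matching of this sort is easy to evade, so the more robust design choice is to treat every string recovered from a sample as untrusted data and never as instructions to the agent.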
