AI-Automated Threat Hunting Brings GhostPenguin Out of the Shadows
Threat hunting approach
Our approach focused on collecting, processing, and analyzing a large number of malware samples from known and reported attacks. The goal was to extract useful artifacts that help hunt for new, undetected threats.
Hunting workflow
1. Collect and extract artifacts
We gather many malware samples from known and reported attacks and extract key information from them such as strings, API calls, behaviors, function names, variable names, and constants. All collected data is stored in a structured database. Afterwards, we tag and categorize the samples so they are easier to search and compare.
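As a rough illustration of the structured storage described above, the following sketch keeps samples, artifacts, and tags in SQLite. The table layout, column names, and helper functions are assumptions made for this example and are not the schema used in the research.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create a (hypothetical) artifact store with samples, artifacts, and tags."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS samples (
            sha256 TEXT PRIMARY KEY,
            family TEXT
        );
        CREATE TABLE IF NOT EXISTS artifacts (
            sha256 TEXT NOT NULL REFERENCES samples(sha256),
            kind   TEXT NOT NULL,   -- e.g. 'string', 'api', 'function', 'constant'
            value  TEXT NOT NULL,
            UNIQUE (sha256, kind, value)
        );
        CREATE TABLE IF NOT EXISTS tags (
            sha256 TEXT NOT NULL REFERENCES samples(sha256),
            tag    TEXT NOT NULL,
            UNIQUE (sha256, tag)
        );
        """
    )
    return conn


def add_sample(conn, sha256, family, artifacts, tags=()):
    """Store one sample with its (kind, value) artifacts and optional tags."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO samples VALUES (?, ?)", (sha256, family)
        )
        conn.executemany(
            "INSERT OR IGNORE INTO artifacts VALUES (?, ?, ?)",
            [(sha256, kind, value) for kind, value in artifacts],
        )
        conn.executemany(
            "INSERT OR IGNORE INTO tags VALUES (?, ?)",
            [(sha256, tag) for tag in tags],
        )


def shared_artifacts(conn, min_samples=2):
    """Return artifacts seen in at least `min_samples` distinct samples."""
    rows = conn.execute(
        """
        SELECT kind, value, COUNT(DISTINCT sha256) AS n
        FROM artifacts
        GROUP BY kind, value
        HAVING COUNT(DISTINCT sha256) >= ?
        ORDER BY n DESC, kind, value
        """,
        (min_samples,),
    )
    return rows.fetchall()
```

Artifacts that recur across several samples are natural candidates for hunting rules, which is why the example includes a query for them.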
2. Build VirusTotal hunting queries
Using the extracted artifacts, we create VirusTotal hunting rules and run them against samples with zero detections. When we find potential candidates, we pass the samples to the profiling stage.
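One way to turn extracted strings into a hunting rule is to generate YARA source text from them. The sketch below is a minimal example under stated assumptions: the function name, the rule name, and the "N of the strings" condition are illustrative choices, not the rules used in this research. The zero-detection filter is left as a caller-supplied condition string, because its exact form depends on the VirusTotal hunting syntax in use.

```python
def build_yara_rule(name, strings, min_matches=2, extra_condition=None):
    """Build YARA rule text that fires when enough artifact strings match.

    `extra_condition` is an optional raw YARA expression appended with `and`
    (for example, a detection-count filter supplied by the caller).
    """
    # Keep only non-empty, printable strings so the rule text stays valid.
    usable = [s for s in strings if s and s.isprintable()]
    if not usable:
        raise ValueError("no usable strings to build a rule from")

    lines = [f"rule {name}", "{", "    strings:"]
    for i, s in enumerate(usable):
        # Escape backslashes first, then double quotes, for a YARA text string.
        escaped = s.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}" ascii wide')

    # Never require more matches than there are strings.
    needed = min(min_matches, len(usable))
    condition = f"{needed} of ($s*)"
    if extra_condition:
        condition = f"({condition}) and {extra_condition}"

    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)
```

Requiring several strings to match, rather than any single one, is a common way to reduce false positives when individual artifacts are generic.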
3. Profiling and analysis
Binary files are sent to IDA Pro (Hex-Rays) for decompilation and further artifact extraction. CAPA is also used to identify specific capabilities, with a custom rule generated from the artifacts collected during Stage 1. Non-binary files such as scripts or source code are passed directly to the profiler for feature extraction. The profiler then generates a unified profile in JSON format for each file and forwards it to the next stage of analysis.
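The routing and profile generation in this stage could look roughly like the sketch below. It assumes a simplified binary check (ELF and PE magic bytes only) and a hypothetical set of profile fields; the real profiler's schema and its IDA Pro and CAPA integration are not shown in the source and are not reproduced here.

```python
import hashlib
import json

# Simplified assumption: treat only ELF and PE files as binaries.
BINARY_MAGICS = (b"\x7fELF", b"MZ")


def needs_decompilation(data: bytes) -> bool:
    """Return True if the file starts with a known binary magic value."""
    return data.startswith(BINARY_MAGICS)


def build_profile(data: bytes, strings=(), capabilities=(), decompiled_code=None):
    """Build a unified JSON profile for one file (field names are hypothetical)."""
    profile = {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        # Binaries go through the decompiler; scripts go straight to the profiler.
        "route": "decompiler" if needs_decompilation(data) else "direct",
        "strings": sorted(set(strings)),
        "capabilities": sorted(set(capabilities)),
        "decompiled_code": decompiled_code,
    }
    return json.dumps(profile, indent=2, sort_keys=True)
```

Using one profile shape for both binaries and scripts means the later agents only need to understand a single input format.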
The AI agent Quick Inspect reviews the JSON profile created during the profiling stage. It analyzes the artifact, scores it, and determines whether the file is malicious. Files below the threshold go onto a monitoring list for later review, while files above the threshold are tagged as malicious and move to the next stage.
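The threshold-based triage described above can be sketched as follows. The score range, the default threshold of 0.7, and the choice to treat a score equal to the threshold as malicious are all assumptions for illustration; the source does not state the actual values.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    sample_id: str
    score: float  # assumed to be a maliciousness score between 0.0 and 1.0


def triage(verdicts, threshold=0.7):
    """Split scored files into (malicious, monitoring) lists.

    Files at or above the threshold are treated as malicious and would move
    to deep inspection; the rest go to the monitoring list for later review.
    """
    malicious, monitoring = [], []
    for verdict in verdicts:
        if verdict.score >= threshold:
            malicious.append(verdict)
        else:
            monitoring.append(verdict)
    return malicious, monitoring
```

Keeping below-threshold files on a monitoring list, rather than discarding them, allows them to be rescored later as new artifacts are collected.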
The Deep Inspector agent performs a deeper analysis on files that pass the threshold and are tagged as malicious. It generates a detailed analysis report for the file based on the decompiled code and the metadata created during the profiling stage. The agent reviews the file profile and produces a code-analysis report that includes:
- A short summary
- Identified capabilities
- Code execution flow
- Technical analysis
- MITRE ATT&CK framework mapping
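The report sections listed above map naturally onto a small data structure. The sketch below is one possible shape, with hypothetical field names, plus a helper that flags sections the agent left empty; it is not the report format used by the Deep Inspector agent.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class AnalysisReport:
    summary: str
    capabilities: list = field(default_factory=list)
    execution_flow: list = field(default_factory=list)    # ordered steps
    technical_analysis: str = ""
    attack_techniques: list = field(default_factory=list)  # MITRE ATT&CK IDs

    def missing_sections(self):
        """Return the names of sections that are empty, in field order."""
        return [name for name, value in asdict(self).items() if not value]
```

A check like `missing_sections()` gives a simple way to reject or retry incomplete reports before they reach an analyst.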
We used this pipeline to hunt for a VirusTotal zero-detection sample that we named GhostPenguin. The sample was submitted on July 7, 2025, and remained undetected in VirusTotal for more than four months.
If a file is packed or obfuscated, the YARA scanner and AI model usually detect this and tag it. If you have automated unpacking scripts, you can set up an MCP server that routes these files to your unpacking pipeline for dynamic, static, or manual unpacking. Simple obfuscation and packing can often be handled directly by AI (by an AI resolver or an AI-generated deobfuscation or unpacking script), but heavy or complex obfuscation should be handled by external automation, custom scripts, or manual effort.
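As a complement to YARA and model-based tagging, byte entropy is a widely used heuristic for spotting packed or encrypted content. The sketch below is not the detection method described in this article: it computes Shannon entropy and uses a hypothetical threshold of 7.2 bits per byte to decide whether a file already tagged as packed goes to an external unpacking pipeline or stays with AI-assisted deobfuscation. The route names are placeholders.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def route_file(data: bytes, tagged_packed: bool, threshold: float = 7.2) -> str:
    """Pick a (hypothetical) handling route for a file.

    Untagged files continue to the profiler. Tagged files with very high
    entropy are assumed to need external unpacking; the rest are left to
    AI-assisted deobfuscation.
    """
    if not tagged_packed:
        return "profiler"
    if shannon_entropy(data) >= threshold:
        return "external_unpacker"
    return "ai_deobfuscation"
```

High entropy alone does not prove packing, since compressed or encrypted resources look similar, so a threshold like this is best treated as a routing hint rather than a verdict.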
