GULoader Campaigns: A Deep Dive Analysis of a highly evasive Shellcode based loader

Authored by: Anandeshwar Unnikrishnan

Stage 1: GULoader Shellcode Deployment

In recent GULoader campaigns, we are seeing a rise in NSIS-based installers delivered via email as malspam that use plugin libraries to execute the GULoader shellcode on the victim system. The NSIS scriptable installer is a highly efficient software packaging utility. The installer’s behavior is dictated by an NSIS script, and users can extend the functionality of the packager by adding custom libraries (DLLs) known as NSIS plugins. Since its inception, adversaries have abused the utility to deliver malware.

NSIS stands for Nullsoft Scriptable Install System. NSIS installer files are self-contained archives, enabling malware authors to include malicious assets along with junk data. The junk data is used as an anti-AV (AV evasion) technique. The image below shows the structure of an NSIS GULoader staging executable archive.

The NSIS script, which is a file found in the archive, has the file extension “.nsi”, as shown in the image above. The deployment strategy employed by the threat actor can be studied by analyzing the NSIS script commands provided in the script file.

The image shown below is an oversimplified view of the whole shellcode staging process.

The file that holds the encoded GULoader shellcode is dropped onto the victim’s disk based on the script configuration, along with other data. Junk is prepended to the encoded shellcode. The encoding style varies from sample to sample, but in almost all cases it is a simple XOR encoding.

As mentioned before, the shellcode is appended to junk data; because of this, an offset is used to retrieve the encoded GULoader shellcode. In the image, the FileSeek NSIS command is used to seek to the proper offset. Some samples have unprotected GULoader shellcode appended to junk data.
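For analysts who want to carve the shellcode out of a dropped file, the idea above can be sketched as follows. This is a minimal illustration, not the installer's actual logic: the function name, the repeating-key assumption, and the demo bytes are all hypothetical, and the real offset and key must be read from the NSIS script of the sample at hand.

```python
def extract_shellcode(blob: bytes, offset: int, key: bytes = b"") -> bytes:
    """Skip `offset` bytes of junk, then XOR-decode the rest with a repeating key.

    An empty key models the samples whose shellcode is stored unprotected.
    The repeating-key scheme is an assumption; the encoding varies by sample.
    """
    encoded = blob[offset:]
    if not key:
        return encoded
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(encoded))


# Hypothetical demo data: 4 junk bytes followed by b"\x90\xc3" XORed with 0x41.
demo = b"JUNK" + bytes([0x90 ^ 0x41, 0xC3 ^ 0x41])
decoded = extract_shellcode(demo, offset=4, key=b"\x41")
```

The offset plays the role of the FileSeek argument in the script, so the same helper covers both the encoded and the unprotected variants.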

A plugin used by the NSIS installer is nothing but a DLL that is loaded by the installer program at runtime, which then invokes functions exported by the library. Two DLL files are dropped in the user’s TEMP directory. In all analyzed samples, one DLL has the consistent name system.dll, while the name of the other varies.

The system.dll is responsible for allocating memory for the shellcode and executing it. The following image shows how the NSIS script calls functions in plugin libraries.

The system.dll has the exports shown in the image below. The function named “Call” is used to deploy the shellcode on the victim’s system.

The Call function exported by system.dll resolves the following functions dynamically and executes them to deploy the shellcode:

CreateFile – To read the shellcode dumped onto disk by the installer. As part of installer setup, all the files seen in the installer archive earlier are dumped onto disk in a new directory created on the C: drive.
VirtualAlloc – To hold the shellcode in RWX memory.
SetFilePointer – To seek to the exact position of the shellcode in the dumped file.
ReadFile – To read the shellcode.
EnumResourceTypesA – Execution via callback mechanism. The second parameter is of the type ENUMRESTYPEPROCA, which is simply a pointer to a callback routine. The address where the shellcode is allocated in memory is passed as the second argument to this API, leading to execution of the shellcode. Callback function parameters are good resources for indirect execution of code.

Vectored Exception Handling in GULoader

The implementation of exception handling by the operating system provides an opportunity for the adversary to take over execution flow. Vectored Exception Handling on Windows lets the user register a custom exception handler, which is simply code that gets executed when an exception occurs. The interesting thing about exception handling is the way in which the system resumes the normal execution flow of the program after the exception. Adversaries exploit this mechanism to take ownership of the execution flow: malware can divert the flow to code under its control when the exception occurs. Normally it is employed by malware to achieve the following goals:

Hooking
Covert code execution and anti-analysis

GULoader employs VEH mainly to obfuscate the execution flow and to slow down analysis. This section covers the internals of Vectored Exception Handling on Windows and investigates how GULoader abuses the VEH mechanism to thwart analysis efforts.

Vectored Exception Handling (VEH) is an extension of Structured Exception Handling (SEH) with which we can add a vectored exception handler that will be called regardless of our position in a call frame; simply put, VEH is not frame-based. VEH is abused by malware either to manipulate the control flow or to covertly execute user functions. Windows provides the AddVectoredExceptionHandler Win32 API to add custom exception handlers. The function signature is shown below.

The handler routine is of the type PVECTORED_EXCEPTION_HANDLER. Checking the documentation further, we can see that the handler function takes a pointer to the _EXCEPTION_POINTERS type as its input, as shown in the image below.

The _EXCEPTION_POINTERS type holds two important members: PEXCEPTION_RECORD and PCONTEXT. PEXCEPTION_RECORD points to all the information related to the exception raised by the system, such as the exception code, and PCONTEXT points to the CPU register values (such as RIP/EIP and the debug registers), that is, the state of the thread captured when the exception occurred.

This means the exception handler can access both ExceptionRecord and ContextRecord. From within the handler, one can tamper with the data stored in the ContextRecord, manipulating EIP/RIP to control the execution flow when the user application resumes from exception handling. There is one interesting thing about exception handling: execution is handed back to the application via the NtContinue native routine. The exception dispatch routines call the handler, and when the handler returns to the dispatcher, the dispatcher passes the ContextRecord to NtContinue and execution resumes from the EIP/RIP in the record. Note that this is an oversimplified explanation of the whole exception handling process.

Vectored Handler in GULoader

GULoader registers a vectored exception handler via the RtlAddVectoredExceptionHandler native routine. The image below shows the control flow of the handler code. Interestingly, most of the code blocks present here are junk added to thwart analysis efforts.

GULoader’s handler implementation is as follows (disregarding the junk code):

Reads the ExceptionInfo passed to the handler by the system.
Reads the ExceptionCode from the ExceptionRecord structure.
Checks the value of the ExceptionCode field against the computed exception codes for STATUS_ACCESS_VIOLATION, STATUS_BREAKPOINT and STATUS_SINGLESTEP.
Based on the exception code, the malware takes a branch and executes code that modifies the EIP.

GULoader intentionally sets the trap flag to trigger single stepping and detect analysis. The handler code gets executed as discussed before, and a block of code is executed based on the exception code. If the exception is a single step (status code 0x80000004), the following actions take place:

GULoader reads the ContextRecord and retrieves the EIP value of the thread.
Increments the current EIP by 2 and reads one byte from there.
Performs an XOR on the byte fetched in the previous step and a static value. The static value changes between samples; in our sample the value is 0x1A.
The XORed value is then added to the EIP fetched from the ContextRecord.
Finally, the modified EIP value from the prior step is saved in the ContextRecord, and control is returned to the system (dispatcher).

The malware has the same logic for the access violation exception.

When the shellcode is executed without a debugger, the INT3 instruction invokes the vectored exception handler routine with an exception of EXCEPTION_BREAKPOINT. The handler computes the new EIP by incrementing the EIP by 1 and fetching the byte at the incremented location, then XORing the fetched byte with a constant (0x1A in our case). The result is added to the current EIP value. The logic implemented for handling INT3 exceptions also scans the program code for 0xCC instructions placed by researchers. If any researcher-placed 0xCC bytes are found, the EIP is not calculated properly.

EIP Calculation Logic Summary

Trigger	EIP calculation
Trigger via interrupt instruction (INT3)	eip=((ReadByte(eip+1)^0x1A)+eip)
Trigger via single stepping (PUSHFD/POPFD)	eip=((ReadByte(eip+2)^0x1A)+eip)

*The value 0x1A changes with samples
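When following the shellcode statically, the two formulas in the summary can be turned into a small helper that predicts where execution resumes after each exception. The sketch below is an analyst-side model only: the function name, the `memory`/`base` representation of a dump, and the treatment of the fetched byte as unsigned are assumptions, and the key 0x1A is the value from the analyzed sample.

```python
def next_eip(memory: bytes, base: int, eip: int, trigger: str, key: int = 0x1A) -> int:
    """Return the EIP the handler would write back into the ContextRecord.

    memory  -- raw bytes of the shellcode dump (hypothetical input)
    base    -- virtual address at which memory[0] was loaded
    trigger -- "int3" reads the byte at eip+1, "single_step" reads eip+2
    """
    operand_offset = {"int3": 1, "single_step": 2}[trigger]
    encoded = memory[eip + operand_offset - base]
    # Assumption: the decoded byte is treated as an unsigned forward displacement.
    return eip + (encoded ^ key)


# Hypothetical bytes: INT3 followed by an encoded displacement of 5 (0x05 ^ 0x1A).
resume_at = next_eip(bytes([0xCC, 0x1F]), base=0x1000, eip=0x1000, trigger="int3")
```

Walking a dump with such a helper lets an analyst recover the real control flow without executing the sample under a debugger.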

Detecting Abnormal Execution Flow via VEH

The shellcode is structured in such a way that the malware can detect abnormal execution flow from the order in which exceptions occur at runtime. The pushfd/popfd instructions are followed by code that throws STATUS_ACCESS_VIOLATION when executed. When the program is executed normally, execution never reaches the code that follows the pushfd/popfd instruction block, so only STATUS_SINGLESTEP is raised. When the pushfd/popfd block is accidentally stepped over in a debugger, STATUS_SINGLESTEP is not passed on to the handler: the debugger suppresses it because it is already single stepping through the code. The handler logic detects this when execution reaches the code that follows the pushfd/popfd instruction block, which throws a STATUS_ACCESS_VIOLATION. The malware now runs into a nested exception situation (the access violation following the suppressed single-step exception raised via the trap flag). Because of this, whenever an access violation occurs, the handler routine checks for nested exception information in the _EXCEPTION_POINTERS structure discussed at the beginning.

The image below shows the carefully laid out code used to detect analysis.

The Egg Hunting: VEH Assisted Runtime Padding

One interesting feature seen in GULoader shellcode in the wild is runtime padding. Runtime padding is an evasive behavior intended to beat automated scanners and other security checks employed at runtime. It delays the malicious activities performed by the malware on the target system.

The egg value in the analyzed sample is 0xAE74B61. The loader initiates a search for this value in the data segment of its own shellcode. Keep in mind that this search is implemented via the VEH handler. The search itself adds 0.3 million VEH iterations on top of the regular VEH control flow manipulation employed in the code. The loader ends the search when it retrieves the address of the egg value. To make sure the value has not been manipulated in any way by a researcher, it performs two additional checks to validate the egg location. If a check fails, the search continues. The process of retrieving the location of the egg is shown in the image below.

As mentioned above, the validity of the egg location is checked by retrieving byte values from two offsets: one is 4 bytes away from the egg location, where the value must be 0xB8; the other is 9 bytes from the egg location, where the value must be 0xC3. This check needs to pass for the loader to proceed to the next stage of infection. Core malware activities are performed after this runtime padding loop.

The following images show the egg location validity checks performed by GULoader. The values 0xB8 and 0xC3 are checked using the proper offsets from the egg location.
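The search and its two validity checks can be reproduced offline against a shellcode dump. The following is a rough sketch rather than the loader's own routine: the function name and demo buffer are hypothetical, the egg is assumed to be stored as a little-endian DWORD, and both offsets are assumed to be measured from the first byte of the egg.

```python
import struct

EGG = 0x0AE74B61  # egg value reported for the analyzed sample


def find_egg(data: bytes, egg: int = EGG) -> int:
    """Return the offset of the first validated egg, or -1 if none is found."""
    needle = struct.pack("<I", egg)  # assumption: little-endian DWORD in memory
    pos = data.find(needle)
    while pos != -1:
        # Validity checks from the text: 0xB8 at +4 and 0xC3 at +9.
        if pos + 9 < len(data) and data[pos + 4] == 0xB8 and data[pos + 9] == 0xC3:
            return pos
        pos = data.find(needle, pos + 1)  # check failed, keep searching
    return -1


# Hypothetical buffer: a decoy egg without markers, then a valid egg at offset 13.
needle = struct.pack("<I", EGG)
sample = bytes(3) + needle + bytes(6) + needle + b"\xB8\x11\x22\x33\x44\xC3"
egg_offset = find_egg(sample)
```

A plain byte search finishes instantly, which shows that the cost in the real shellcode comes from routing every step through the exception handler rather than from the search itself.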

Stage 2: Environment Check and Code Injection

In the second stage of the infection chain, GULoader performs anti-analysis and code injection. The major anti-analysis vectors are listed below. After making sure that the shellcode is not running in a sandbox, it proceeds to inject code into a newly spawned process, where stage 3 is initiated to download and deploy the actual payload. This payload can be either a commodity stealer or a RAT.

Anti-analysis Techniques

Employs runtime padding as discussed before.
Scans whole process memory for analysis tool specific strings.
Uses DJB2 hashing for string checks and dynamic API address resolution.
Strings are decoded at runtime.
Checks if qemu is installed on the system by checking the installation path: C:\Program Files\qqa\qqa.exe
Patches the following APIs:
    DbgUIRemoteBreakIn: The function’s prologue is patched with an ExitProcess call.
    LdrLoadDll: The initial bytes are patched with the instruction “mov edi, edi”.
    DbgBreakPoint: Patched with a nop instruction.
Clears hooks placed in ntdll.dll by security products or researchers for analysis.
Window enumeration via EnumWindows.
Hides the shellcode thread from the debugger via ZwSetInformationThread by passing 0x11 (ThreadHideFromDebugger).
Device driver enumeration via EnumDeviceDrivers and GetDeviceDriverBaseNameA.
Installed software enumeration via MsiEnumProductsA and MsiGetProductInfoA.
System service enumeration via OpenSCManagerA and EnumServiceStatusA.
Checks use of debugging ports by passing the ProcessDebugPort (0x7) class to NtQueryInformationProcess.
Use of CPUID and RDTSC instructions to detect virtual environments and instrumentation.
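Since the loader compares hashes instead of plain strings, an analyst usually needs to map observed hash constants back to names. The sketch below uses the textbook DJB2 parameters (seed 5381, multiplier 33, 32-bit wraparound); whether a given sample uses exactly these parameters is an assumption to verify, and `resolve_by_hash` and its candidate list are hypothetical helpers.

```python
from typing import Iterable, Optional


def djb2(data: bytes) -> int:
    """Classic DJB2 hash, truncated to 32 bits."""
    h = 5381
    for byte in data:
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(target: int, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate name whose DJB2 hash equals `target`."""
    for name in candidates:
        if djb2(name.encode("ascii")) == target:
            return name
    return None


# Hypothetical lookup against a short list of export names.
exports = ["NtClose", "LdrLoadDll", "DbgBreakPoint"]
match = resolve_by_hash(djb2(b"LdrLoadDll"), exports)
```

In practice the candidate list would be the export names of the DLLs the shellcode walks, such as ntdll.dll.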

Anti-dump Protection

Whenever GULoader invokes a Win32 API, the call is sandwiched between two XOR loops, as shown in the image below. The loop prior to the call encodes the active shellcode region where the call is taking place, to prevent the memory from being dumped by security products that trigger on event monitoring or API calls. Following the call, the shellcode region is decoded back to normal and execution resumes. The XOR key used is a word present in the shellcode itself.
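The effect of this sandwich on a memory dump can be modeled in a few lines. This is a conceptual sketch only: the context manager, the two-byte key, and the sample bytes are hypothetical stand-ins, and the real loops operate on the shellcode's own memory rather than on a Python buffer.

```python
from contextlib import contextmanager


def xor_region(buf: bytearray, start: int, length: int, key: bytes) -> None:
    """XOR `length` bytes of `buf` in place with a repeating key."""
    for i in range(length):
        buf[start + i] ^= key[i % len(key)]


@contextmanager
def masked(buf: bytearray, start: int, length: int, key: bytes):
    xor_region(buf, start, length, key)      # first loop: encode before the call
    try:
        yield
    finally:
        xor_region(buf, start, length, key)  # second loop: decode after the call


region = bytearray(b"\x55\x8b\xec\x90\x90\xc3")  # hypothetical code bytes
original = bytes(region)
with masked(region, 0, len(region), b"\x37\x13"):
    snapshot = bytes(region)  # what a dump taken during the API call would capture
```

Because XOR is its own inverse, the same routine serves as both loops, and a dump taken while the call is in flight captures only the encoded bytes.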

String Decoding

This section covers the process GULoader uses to decode strings at runtime.

NtAllocateVirtualMemory is called to allocate a buffer to hold the encoded bytes. The encoded bytes are computed by performing various arithmetic and logical operations on static values embedded as operands of assembly instructions. The image below shows the recovery of encoded bytes via various mathematical and logical operations. EAX points to the memory buffer where the computed encoded values are stored.

The first byte/word is reserved to hold the size of the encoded bytes. The image below shows 12 bytes of encoded data being written to memory.

Later, the first word is replaced by the first word of the actual encoded data. The image below shows the buffer after replacing the first word.

The encoded data is now fully recovered, and the malware proceeds to decode it. A simple XOR is employed for decoding, and the key is present in the shellcode. The assembly routine that does the decoding is shown in the image below. Each byte in the buffer is XORed with the key. The result of the XOR operation is written to the same memory buffer that holds the encoded data.

A final view of the memory buffer with the decoded data is shown below.

The image shows the decoding of the string “psapi.dll”. This string is later used to fetch the addresses of various functions used for anti-analysis.

Stage 2 culminates in code injection. To be specific, GULoader employs a variation of the process hollowing technique, in which a benign process is spawned in a suspended state by the malware stager process, which then overwrites the original content of the suspended process with malicious content. Later, the state of the thread in the suspended process is changed by modifying processor register values such as EIP, and finally the process resumes execution. By controlling EIP, the malware can direct the control flow in the spawned process to a desired code location. After successful hollowing, the malware code runs under the cover of a legitimate application.

The variation of the hollowing technique employed by GULoader doesn’t replace the file contents, but instead injects the same shellcode and maps the memory in the suspended process. Interestingly, GULoader employs an additional technique if the hollowing attempt fails. More details are covered in the following section.

The native APIs listed below are dynamically resolved at runtime to perform the code injection:

NtCreateSection
ZwMapViewOfSection
NtWriteVirtualMemory
ZwGetContextThread
NtSetContextThread
NtResumeThread

Overview of Code Injection

Initially, the image “%windir%\Microsoft.NET\Framework\<version>\CasPol.exe” (the path on 32-bit systems) is spawned in suspended mode via the CreateProcessInternalW API. GULoader retrieves a handle to the file “C:\Windows\SysWOW64\iertutil.dll”, which is used in section creation. The section object created via NtCreateSection is backed by iertutil.dll. This is done mainly to avoid suspicion: a section object that is not backed by any file may draw unwanted attention from security systems.

The next phase in the code injection is the mapping of the view created on the section backed by iertutil.dll into the spawned CasPol.exe process. Once the view is successfully mapped into the process, the malware can inject the shellcode into the mapped memory and resume the process, thus initiating stage 3. The native API ZwMapViewOfSection is used to perform this task. Following the execution of this API, the malware checks the result of the function call against the error statuses listed below:

C0000018 (STATUS_CONFLICTING_ADDRESSES)
C0000220 (STATUS_MAPPED_ALIGNMENT)
40000003 (STATUS_IMAGE_NOT_AT_BASE)

If the mapping is unsuccessful and the status code returned by ZwMapViewOfSection matches any of the codes mentioned above, the malware has a backup plan. GULoader calls NtAllocateVirtualMemory by directly invoking the system call stub normally found in the ntdll.dll library, bypassing EDR/AV hooks. The memory is allocated in the remote CasPol.exe process with RWX memory protection. The following image shows the direct use of the NtAllocateVirtualMemory system call.
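When reading an API trace of a sample, it helps to know which injection path a given ZwMapViewOfSection result leads to. The helper below is a small classification sketch based only on the three status codes named above; the function name and the returned labels are hypothetical, and statuses the text does not mention are reported as unspecified rather than guessed.

```python
STATUS_SUCCESS = 0x00000000
STATUS_CONFLICTING_ADDRESSES = 0xC0000018
STATUS_MAPPED_ALIGNMENT = 0xC0000220
STATUS_IMAGE_NOT_AT_BASE = 0x40000003

FALLBACK_STATUSES = frozenset({
    STATUS_CONFLICTING_ADDRESSES,
    STATUS_MAPPED_ALIGNMENT,
    STATUS_IMAGE_NOT_AT_BASE,
})


def injection_path(map_status: int) -> str:
    """Classify the path described in the text for a ZwMapViewOfSection result."""
    status = map_status & 0xFFFFFFFF  # normalise values logged as signed 32-bit
    if status in FALLBACK_STATUSES:
        # Backup plan: direct NtAllocateVirtualMemory syscall, then a remote write.
        return "allocate-and-write"
    if status == STATUS_SUCCESS:
        return "mapped-section"
    return "unspecified"
```

For example, a trace showing 0xC0000018 from the mapping call should be followed by the direct NtAllocateVirtualMemory system call described above.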

After memory allocation, it writes itself into the remote process via NtWriteVirtualMemory, as discussed above. GULoader shellcode samples taken from the field are large; the samples taken for this analysis are all greater than 20 MB. In the samples analyzed, the buffer allocated to hold the shellcode is 2950000 bytes.

The image below shows the GULoader shellcode in memory.

Misleading Entry Point

GULoader is highly evasive in nature. If abnormal execution flow is detected with the help of the employed anti-analysis vectors, the EIP and EBX fields of the thread context structure (of the CasPol.exe process), which are required for stage 3 of the malware’s execution, will be overwritten with a decoy address. The location ebp+4 is used to hold the entry point regardless of whether the program is being debugged or not. GULoader uses the ZwGetContextThread and NtSetContextThread routines to modify the thread state. The CONTEXT structure is retrieved via ZwGetContextThread, and the value at [ebp+14C] is used as the entry point address. The current EIP value held in the EIP field of the thread’s context structure will be changed to a recalculated address based on the value at ebp+4. The image below shows the RVA calculation. The base address of the executing shellcode (stage 2) is subtracted from the virtual address at [ebp+4] to obtain the RVA.

The RVA is added to the base address of the newly allocated memory in the CasPol.exe process to obtain a new VA that can be used in the remote process.
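The address arithmetic in these two steps is a plain rebase. A minimal sketch, with a hypothetical function name and made-up addresses chosen only to make the arithmetic easy to follow:

```python
def rebase_entry_point(local_va: int, local_base: int, remote_base: int) -> int:
    """Translate an address inside the stage 2 shellcode to the remote copy.

    local_va    -- entry point address in the current process (value at [ebp+4])
    local_base  -- base address of the executing stage 2 shellcode
    remote_base -- base address of the memory allocated in CasPol.exe
    """
    rva = local_va - local_base
    if rva < 0:
        raise ValueError("entry point lies below the shellcode base")
    return remote_base + rva


# Hypothetical addresses for illustration only.
new_va = rebase_entry_point(0x02A1C3F0, 0x02A10000, 0x00400000)
```

The same calculation is useful in a debugger session for translating breakpoints from the stage 2 process to the copy running inside CasPol.exe.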

The new VA is written into the EIP and EBX fields of the thread context structure of the CasPol.exe process retrieved via ZwGetContextThread. The image below shows the modified context structure and the value of EIP.

Finally, by calling ZwSetContextThread, the changes made to the CONTEXT structure are committed to the target thread of the CasPol.exe process. The thread is resumed by calling NtResumeThread.

CasPol.exe resumes execution and performs stage 3 of the infection chain.

Stage 3: Payload Deployment

The GULoader shellcode resumes execution from within a new host process. In the samples analyzed for this report, the shellcode is injected either into the same image spawned as a child process or into CasPol.exe. Stage 3 performs all the anti-analysis checks once again to make sure this stage is not being analyzed. After all checks, GULoader proceeds with its stage 3 activities by decoding the encoded C2 string in memory, as shown in the image below. The decoding method is the same as discussed before.

Later, the addresses of the following functions are resolved dynamically by loading wininet.dll:

InternetOpenA
InternetSetOptionA
InternetOpenUrlA
InternetReadFile
InternetCloseHandle

The image below shows the response from the content delivery network (CDN) server where the final payload is stored. In this analysis, a payload of size 0x2E640 bytes is sent to the loader. Interestingly, the first 40 bytes are ignored by the loader. The actual payload starts at offset 40, which is highlighted in the image.

The CDN server is well protected: it only serves clients that send the proper headers and cookies. If these are not present in the HTTP request, the following message is shown to the user.

Final Payload

Quasi Key Generation

The first step in decoding the downloaded final payload is generating a quasi key, which is later used to decode the actual key embedded in the GULoader shellcode. The encoded embedded key is 371 bytes long in the analyzed sample. The process of quasi key generation is as follows:

The 40th and 41st bytes (a word) are retrieved from the download buffer in memory.
This word is XORed with the first word of the encoded embedded key along with a counter value.
The process is repeated until the word taken from the downloaded data fully decodes and has a value of 0x4D5A (“MZ”).
The value present in the counter when 0x4D5A gets decoded is taken as the quasi key. This key is shown as “key-1” in the image below. In the analyzed sample, the value of this key is “0x5448”.
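The counter search described in these steps can be sketched as a 16-bit brute force. This is an illustration under assumptions: the function name is hypothetical, the embedded key word used in the demo is made up, and words are treated as plain 16-bit integers in the byte order the text uses for 0x4D5A.

```python
MZ = 0x4D5A  # target word, "MZ" as written in the text


def find_quasi_key(payload_word: int, embedded_key_word: int, target: int = MZ) -> int:
    """Increment a counter until payload ^ key ^ counter yields the target word."""
    for counter in range(0x10000):
        if payload_word ^ embedded_key_word ^ counter == target:
            return counter
    raise ValueError("inputs are not 16-bit words")


# Hypothetical first word of the encoded embedded key; 0x5448 is the quasi key
# reported for the analyzed sample, used here to build a consistent demo input.
demo_key_word = 0x9F31
demo_payload_word = MZ ^ demo_key_word ^ 0x5448
quasi_key = find_quasi_key(demo_payload_word, demo_key_word)
```

Because XOR is invertible, the loop is equivalent to computing `payload_word ^ embedded_key_word ^ MZ` directly; the brute force simply mirrors the counter-based description.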

Decoding Actual Key

The embedded key in the GULoader shellcode is 371 bytes in size, as discussed before. The quasi key is used to decode the embedded key, as shown in the image below.

Each word in the embedded key is XORed with the quasi key (key-1). When the iteration counter exceeds the size value of 371 bytes, the loop stops and the loader proceeds to decode the downloaded payload with this new key.

The decoded 371 bytes of the embedded key are shown in the image below.

Decoding File

Byte-level decoding happens after the embedded key is decoded in memory. Each byte of the downloaded data is XORed with the key to obtain the actual data, which is a PE file. The decoded data is written back to the same buffer that was used to hold the downloaded data.
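Taken together, the key decoding and the file decoding can be sketched offline as below. Several details are assumptions not stated in the text: the byte order of the quasi key within each word, the handling of the odd final byte of the 371-byte key, and the decoded key repeating cyclically over the payload. The function names and demo data are hypothetical.

```python
import struct


def decode_embedded_key(encoded_key: bytes, quasi_key: int) -> bytes:
    """XOR each 16-bit word of the embedded key with the quasi key."""
    qk = struct.pack(">H", quasi_key)  # assumption: same byte order as "MZ" = 4D 5A
    # Word-wise XOR with a 2-byte key equals byte-wise XOR with alternating bytes.
    return bytes(b ^ qk[i % 2] for i, b in enumerate(encoded_key))


def decode_payload(download: bytes, key: bytes, skip: int = 40) -> bytes:
    """Drop the ignored header bytes, then XOR the rest with the repeating key."""
    body = download[skip:]
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(body))


# Round-trip demo with hypothetical data (a 7-byte key instead of 371 bytes).
QUASI_KEY = 0x5448
plain_key = bytes(range(1, 8))
encoded_key = decode_embedded_key(plain_key, QUASI_KEY)  # XOR is its own inverse
pe_stub = b"MZ\x90\x00\x03\x00\x00\x00\x04"
download = bytes(40) + decode_payload(pe_stub, plain_key, skip=0)
recovered = decode_payload(download, decode_embedded_key(encoded_key, QUASI_KEY))
```

If the assumptions hold for a given sample, the first two recovered bytes are "MZ", which is a quick sanity check before carving the PE file out of a captured response.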

The final decoded PE file residing in memory is shown in the image below:

Finally, the loader loads the PE file by allocating memory with RWX permission in the stage 3 process. Based on analysis of multiple samples, this is either the same image as stage 2 running as a child process, or CasPol.exe. The loading involves code relocation and IAT correction, as expected in such a scenario. The final payload resumes execution from within the hollowed stage 3 process. The malware families below are usually seen deployed by GULoader:

Vidar (Stealer)
Raccoon (Stealer)
Remcos RAT

The image below shows the injected memory regions in the stage 3 process, which is CasPol.exe in this report.

Conclusion

The role played by malware loaders, popularly known as “crypters”, is significant in the deployment of Remote Administration Tools and stealer malware that target consumer data. The Personally Identifiable Information (PII) exfiltrated from compromised endpoints is largely collected and funneled to various underground data-selling marketplaces. This also impacts businesses, as critical information used for authentication is leaked from users’ personal systems, leading to initial access on company networks. GULoader is heavily used in mass malware campaigns to infect users with popular stealer malware like Raccoon, Vidar, and Redline. Commodity RATs like Remcos are also seen delivered in such campaigns. On the bright side, it is not difficult to fingerprint malware specimens used in mass campaigns because of their volume and relevance; detection rules and systems can be built around this very fact.

The following table summarizes all the dynamically resolved Win32 APIs:

Win32 API

RtlAddVectoredExceptionHandler

NtAllocateVirtualMemory

DbgUIRemoteBreakIn

LdrLoadDll

DbgBreakPoint

EnumWindows

Nt/ZwSetInformationThread

EnumDeviceDrivers

GetDeviceDriverBaseNameA

MsiEnumProductsA

MsiGetProductInfoA

TerminateProcess

ExitProcess

NtSetContextThread

NtWriteVirtualMemory

NtCreateSection

NtMapViewOfSection

NtOpenFile

NtSetInformationProcess

NtClose

NtResumeThread

NtProtectVirtualMemory

CreateProcessInternal

GetLongPathNameW

Sleep

NtCreateThreadEx

WaitForSingleObject

TerminateThread

CreateFileW

WriteFile

CloseHandle

GetFileSize

ReadFile

ShellExecuteW

SHCreateDirectoryExW

RegCreateKeyExA

RegSetValueExA

OpenSCManagerA

EnumServiceStatusA

CloseServiceHandle

NtQueryInformationProcess

InternetOpenA

InternetSetOptionA

InternetOpenUrlA

InternetReadFile

InternetCloseHandle

IOC

889fddcb57ed66c63b0b16f2be2dbd7ec0252031cad3b15dfea5411ac245ef56

59b71cb2c5a14186a5069d7935ebe28486f49b7961bddac0a818a021373a44a3

4d9cdd7526f05343fda35aca3e0e6939abed8a037a0a871ce9ccd0e69a3741f2

c8006013fc6a90d635f394c91637eae12706f58897a6489d40e663f46996c664

c69e558e5526feeb00ab90efe764fb0b93b3a09692659d1a57c652da81f1d123

45156ac4b40b7537f4e003d9f925746b848a939b2362753f6edbcc794ea8b36a

e68ce815ac0211303d2c38ccbb5ccead144909d295230df4b7a419dfdea12782

b24b36641fef3acbf3b643967d408b10bf8abfe1fe1f99d704a9a19f1dfc77e8

569aa6697083993d9c387426b827414a7ed225a3dd2e1e3eba1b49667573fdcb

60de2308ebfeadadc3e401300172013be27af5b7d816c49696bb3dedc208c54e

23458977440cccb8ac7d0d05c238d087d90f5bf1c42157fb3a161d41b741c39d

About Author

AndyC

Andy Curtis is an award-winning security consultant, researcher and public speaker. He has been working in the computer security industry since the early 1990s, having been employed by state and federal government, leading healthcare and banking providers across three continents. He has given talks about computer security for some of the world’s largest companies, worked with law enforcement agencies on investigations into hacking groups, and is a regular voice on TV and radio explaining IT security threats.
