GULoader Campaigns: A Deep Dive Analysis of a highly evasive Shellcode based loader


Authored by: Anandeshwar Unnikrishnan



Stage 1: GULoader Shellcode Deployment
 


In recent GULoader campaigns, we are seeing a rise in NSIS-based installers delivered via email as malspam that use plugin libraries to execute the GULoader shellcode on the victim system. The NSIS scriptable installer is a highly efficient software packaging utility. The installer's behavior is dictated by an NSIS script, and users can extend the functionality of the packager by adding custom libraries (DLLs) known as NSIS plugins. Since its inception, adversaries have abused the utility to deliver malware.
 


NSIS stands for Nullsoft Scriptable Install System. NSIS installer files are self-contained archives, enabling malware authors to include malicious assets along with junk data. The junk data is used as an anti-AV / AV evasion technique. The image below shows the structure of an NSIS GULoader staging executable archive.





The NSIS script, which is a file found in the archive, has the file extension .nsi, as shown in the image above. The deployment strategy employed by the threat actor can be studied by analyzing the NSIS script commands provided in the script file. The image shown below is an oversimplified view of the whole shellcode staging process.
 





The file that holds the encoded GULoader shellcode is dropped onto the victim’s disk, along with other data, based on the script configuration. Junk data is prepended to the encoded shellcode. The encoding style varies from sample to sample, but in almost all cases it is a simple XOR encoding. Because the shellcode follows the junk data, an offset is used to retrieve the encoded GULoader shellcode. In the image, the FileSeek NSIS command is used to seek to the proper offset. Some samples have unprotected GULoader shellcode appended to the junk data.
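For an analyst recovering the shellcode from the dropped file, the offset-then-XOR procedure described above can be sketched roughly as follows. This is a minimal sketch under assumptions: the function name `extract_shellcode`, the offset, and the key are hypothetical, and the real offset and encoding must be read from the NSIS script of each sample.

```python
def extract_shellcode(blob: bytes, offset: int, key: bytes) -> bytes:
    """Skip the junk prefix and XOR-decode the rest with a repeating key.

    `offset` corresponds to the value the NSIS script passes to FileSeek;
    `key` is whatever XOR key the sample uses (both are sample specific).
    """
    if not key:
        raise ValueError("key must not be empty")
    encoded = blob[offset:]
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(encoded))


if __name__ == "__main__":
    # Synthetic data only: 16 junk bytes followed by three encoded bytes.
    # The offset (16) and the key (0x5A) are made-up illustration values.
    key = b"\x5a"
    plain = b"\x90\x90\xc3"
    blob = b"J" * 16 + bytes(b ^ key[0] for b in plain)
    assert extract_shellcode(blob, 16, key) == plain
```

For samples that store the shellcode unprotected, slicing at the offset is enough and the XOR step can be skipped.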
 





A plugin used by the NSIS installer is nothing but a DLL that is loaded by the installer program at runtime; the installer then invokes functions exported by the library. Two DLL files are dropped in the user’s TEMP directory. In all analyzed samples, one DLL has the consistent name system.dll, and the name of the other one varies.
  



The system.dll library is responsible for allocating memory for the shellcode and for executing it. The following image shows how the NSIS script calls functions in plugin libraries.





The system.dll library has the exports shown in the image below. The function named “Call” is used to deploy the shellcode on the victim’s system.
 



  • The Call function exported by system.dll resolves the following functions dynamically and executes them to deploy the shellcode.

  • CreateFile

    To read the shellcode dumped onto disk by the installer. As part of installer setup, all the files seen in the installer archive earlier are dumped onto disk in a new directory created on the C: drive.

  • VirtualAlloc

    To hold the shellcode in RWX memory.

  • SetFilePointer

    To seek to the exact position of the shellcode in the dumped file.

  • ReadFile

    To read the shellcode.

  • EnumResourceTypesA

    Execution via a callback mechanism. The second parameter is of the type ENUMRESTYPEPROCA, which is simply a pointer to a callback routine. The address where the shellcode is allocated in memory is passed as the second argument to this API, leading to execution of the shellcode. Callback function parameters are a convenient means of indirect code execution.
     




Vectored Exception Handling in GULoader
 


The implementation of exception handling by the operating system provides an opportunity for the adversary to take over the execution flow. Vectored Exception Handling on Windows gives the user the ability to register a custom exception handler, which is simply code that gets executed in the event of an exception. The interesting thing about handling exceptions is the way in which the system resumes the normal execution flow of the program after the exception. Adversaries exploit this mechanism to take ownership of the execution flow: malware can divert the flow to code under its control when the exception occurs. Normally it is employed by malware to achieve the following goals:
 


  • Hooking

  • Covert code execution and anti-analysis
     


GULoader employs VEH mainly to obfuscate the execution flow and to slow down analysis. This section covers the internals of Vectored Exception Handling on Windows and investigates how GULoader abuses the VEH mechanism to thwart analysis efforts.
 


  • Vectored Exception Handling (VEH) is an extension of Structured Exception Handling (SEH) with which we can add a vectored exception handler that will be called regardless of our position in a call frame. Simply put, VEH is not frame-based.

  • VEH is abused by malware either to manipulate the control flow or to covertly execute user functions.

  • Windows provides the AddVectoredExceptionHandler Win32 API to add custom exception handlers. The function signature is shown below.
     



The Handler routine is of the type PVECTORED_EXCEPTION_HANDLER. Checking the documentation further, we can see that the handler function takes a pointer to the _EXCEPTION_POINTERS type as its input, as shown in the image below.
 



The _EXCEPTION_POINTERS type holds two important members, of types PEXCEPTION_RECORD and PCONTEXT. The structure referenced by PEXCEPTION_RECORD contains all the information related to the exception raised by the system, such as the exception code, and the structure referenced by PCONTEXT holds the CPU register values (such as RIP/EIP and the debug registers), that is, the state of the thread captured when the exception occurred.
 


  • This means the exception handler can access both ExceptionRecord and ContextRecord. From within the handler, one can tamper with the data stored in the ContextRecord, thus manipulating EIP/RIP to control the execution flow when the user application resumes from exception handling.

  • There is one interesting thing about exception handling: execution is given back to the application via the NtContinue native routine. The exception dispatch routines call the handler, and when the handler returns to the dispatcher, the dispatcher passes the ContextRecord to NtContinue and execution resumes from the EIP/RIP in the record. On a side note, this is an oversimplified explanation of the whole exception handling process.
     


Vectored Handler in GULoader
 


  • GULoader registers a vectored exception handler via the RtlAddVectoredExceptionHandler native routine. The image below shows the control flow of the handler code. Interestingly, most of the code blocks present here are junk added to thwart analysis efforts.

  • GULoader’s handler implementation is as follows (disregarding the junk code).

  • Reads the ExceptionInfo passed to the handler by the system.

  • Reads the ExceptionCode from the ExceptionRecord structure.

  • Checks the value of the ExceptionCode field against the computed exception codes for STATUS_ACCESS_VIOLATION, STATUS_BREAKPOINT and STATUS_SINGLE_STEP.

  • Based on the exception code, the malware takes a branch and executes code that modifies the EIP.
     


 


GULoader intentionally sets the trap flag to trigger single stepping in order to detect analysis. As discussed before, when the handler code gets executed, a block of code is selected based on the exception code. If the exception is a single-step exception (status code 0x80000004), the following actions take place:


  • GULoader reads the ContextRecord and retrieves the EIP value of the thread.

  • It increments the current EIP by 2 and reads one byte from that location.

  • It XORs the byte fetched in the previous step with a static value. The static value changes between samples; in our sample the value is 0x1A.

  • The XORed value is then added to the EIP fetched from the ContextRecord.

  • Finally, the modified EIP value from the prior step is saved in the ContextRecord, and control is returned to the system (dispatcher).

  • The malware uses the same logic for the access violation exception.
     


 


  • When the shellcode is executed without a debugger, the INT3 instruction invokes the vectored exception handler routine with an exception of EXCEPTION_BREAKPOINT. The handler computes the new EIP by incrementing the EIP by 1 and fetching the byte at the incremented location, then XORing the fetched byte with a constant (0x1A in our case). The result is added to the current EIP value. The logic implemented for handling INT3 exceptions also scans the program code for 0xCC instructions placed by researchers. If 0xCC bytes placed by researchers are found, the EIP is not calculated properly.
     


EIP Calculation Logic Summary

Trigger via interrupt instruction (INT3)

eip=((ReadByte(eip+1)^0x1A)+eip)

Trigger via Single Stepping (PUSHFD/POPFD)

eip=((ReadByte(eip+2)^0x1A)+eip)

*The value 0x1A changes with samples


Detecting Abnormal Execution Flow via VEH
 


  • The shellcode is structured in such a way that the malware can detect abnormal execution flow from the order in which exceptions occur at runtime. The pushfd/popfd instructions are followed by code that throws STATUS_ACCESS_VIOLATION when executed. When the program runs normally, execution never reaches the code that follows the pushfd/popfd block, so only STATUS_SINGLE_STEP is raised. When the pushfd/popfd block is accidentally stepped over in a debugger, STATUS_SINGLE_STEP is not delivered to the program, because the debugger, which is already single stepping through the code, suppresses it. The handler logic detects this when execution reaches the code following the pushfd/popfd block, which throws STATUS_ACCESS_VIOLATION. The program now runs into a nested exception situation (the access violation following the suppressed single-step exception raised via the trap flag). Because of this, whenever an access violation occurs, the handler routine checks for nested exception information in the _EXCEPTION_POINTERS structure discussed at the beginning.
     


The image below shows the carefully laid out code used to detect analysis.
 


The Egg Hunting: VEH Assisted Runtime Padding
 


One interesting feature seen in GULoader shellcode in the wild is runtime padding. Runtime padding is an evasive behavior to beat automated scanners and other security checks employed at runtime. It delays the malicious activities performed by the malware on the target system.
 


  • The egg value in the analyzed sample is 0xAE74B61.

  • The loader initiates a search for this value in the shellcode's own data segment.

  • Don’t forget that this is implemented via the VEH handler. The search itself adds roughly 0.3 million VEH iterations on top of the regular VEH control flow manipulation employed in the code.

  • The loader ends this search when it retrieves the address of the egg value. To make sure the value has not been manipulated in any way by the researcher, it performs two additional checks to validate the egg location.

  • If the check fails, the search continues. The process of retrieving the location of the egg is shown in the image below.
     


  • As mentioned above, the validity of the egg location is checked by retrieving byte values from two offsets: the byte 4 bytes away from the egg location must be 0xB8, and the byte 9 bytes away from the egg location must be 0xC3. This check needs to pass for the loader to proceed to the next stage of infection. Core malware activities are performed after this runtime padding loop.
     


 
The following images show the egg location validity checks performed by GULoader. The values 0xB8 and 0xC3 are checked by using proper offsets from the egg location.
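The egg search and its two validity checks can be reproduced offline on a dumped shellcode buffer. The sketch below is illustrative only: it assumes the egg is stored as a little-endian dword and that the offsets 4 and 9 are measured from the first byte of the egg, neither of which is stated explicitly above. The name `find_egg` is hypothetical.

```python
import struct

EGG = 0x0AE74B61  # egg value reported for the analyzed sample


def find_egg(data: bytes, egg: int = EGG) -> int:
    """Return the offset of the first validated egg, or -1 if none is found.

    A candidate is accepted only when the byte at +4 is 0xB8 and the byte
    at +9 is 0xC3, mirroring the two checks described in the text.
    """
    needle = struct.pack("<I", egg)  # assumption: little-endian dword
    pos = data.find(needle)
    while pos != -1:
        if (pos + 9 < len(data)
                and data[pos + 4] == 0xB8
                and data[pos + 9] == 0xC3):
            return pos
        pos = data.find(needle, pos + 1)  # check failed, keep searching
    return -1


if __name__ == "__main__":
    needle = struct.pack("<I", EGG)
    decoy = needle + b"\x00" * 10                 # egg without valid markers
    valid = needle + b"\xb8" + b"\x00" * 4 + b"\xc3"
    print(find_egg(decoy + valid))
```

Unlike the loader, which walks memory through hundreds of thousands of exception-driven iterations, this direct search returns immediately, which is the point of doing it outside the sample.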
 


Stage 2: Environment Check and Code Injection
 


In the second stage of the infection chain, GULoader performs anti-analysis checks and code injection. The major anti-analysis vectors are listed below. After making sure that the shellcode is not running in a sandbox, it proceeds to inject code into a newly spawned process, where stage 3 is initiated to download and deploy the actual payload. This payload can be either a commodity stealer or a RAT.
 


Anti-analysis Techniques
 


  • Employs runtime padding as discussed before.

  • Scans the whole process memory for analysis-tool-specific strings.

  • Uses DJB2 hashing for string checks and dynamic API address resolution.

  • Strings are decoded at runtime.

  • Checks if qemu is installed on the system by checking the installation path:

  • C:\Program Files\qqa\qqa.exe

  • Patches the following APIs:

  • DbgUiRemoteBreakin

  • The function’s prologue is patched with an ExitProcess call.

  • LdrLoadDll

  • The initial bytes are patched with the instruction “mov edi, edi”.

  • DbgBreakPoint

  • Patched with a nop instruction.

  • Clears hooks placed in ntdll.dll by security products or by researchers for analysis.

  • Window enumeration via EnumWindows.

  • Hides the shellcode thread from the debugger via ZwSetInformationThread by passing 0x11 (ThreadHideFromDebugger).

  • Device driver enumeration via EnumDeviceDrivers and GetDeviceDriverBaseNameA.

  • Installed software enumeration via MsiEnumProductsA and MsiGetProductInfoA.

  • System service enumeration via OpenSCManagerA and EnumServiceStatusA.

  • Checks for the use of debugging ports by passing the ProcessDebugPort (0x7) class to NtQueryInformationProcess.

  • Uses the CPUID and RDTSC instructions to detect virtual environments and instrumentation.
     


Anti-dump Protection
 


Whenever GULoader invokes a Win32 API, the call is sandwiched between two XOR loops, as shown in the image below. The loop prior to the call encodes the active shellcode region where the call is taking place, to prevent the memory from being dumped by security products that rely on event monitoring or API calls. Following the call, the shellcode region is decoded back to normal and execution resumes. The XOR key used is a word present in the shellcode itself.
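If a memory dump happens to be taken while an API call is in flight, the active region will still be encoded and has to be toggled back before analysis. The following sketch shows the in-place XOR toggle on a bytearray; the region bounds and the key are hypothetical placeholders, and the assumption that the key is applied as a repeating byte sequence from the region start should be checked against the sample.

```python
def xor_region(buf: bytearray, start: int, end: int, key: bytes) -> None:
    """XOR buf[start:end] in place with a repeating key.

    Applying the function twice with the same arguments restores the
    original bytes, which is what the loops before and after the API
    call rely on.
    """
    if not key:
        raise ValueError("key must not be empty")
    for i in range(start, end):
        buf[i] ^= key[(i - start) % len(key)]


if __name__ == "__main__":
    region = bytearray(b"ABCDEF")
    xor_region(region, 1, 5, b"\x01\x02")   # state during the API call
    print(bytes(region))
    xor_region(region, 1, 5, b"\x01\x02")   # state after the call returns
    print(bytes(region))
```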
 



String Decoding
 


This section covers the process GULoader uses to decode strings at runtime.
 


  • NtAllocateVirtualMemory is called to allocate a buffer to hold the encoded bytes.

  • The encoded bytes are computed by performing various arithmetic and logical operations on static values embedded as operands of assembly instructions. The image below shows the recovery of encoded bytes via these operations. EAX points to the memory buffer where the computed encoded values get stored.
     



The first byte/word is reserved to hold the size of the encoded bytes. The image below shows 12 bytes of encoded data being written to memory.
 



Later, the first word gets replaced by the first word of the actual encoded data. The image below shows the buffer after replacing the first word.
 



The encoded data is now fully recovered, and the malware proceeds to decode it. A simple XOR is employed for decoding, and the key is present in the shellcode. The assembly routine that does the decoding is shown in the image below. Each byte in the buffer is XORed with the key.
 



The result of the XOR operation is written to the same memory buffer that holds the encoded data. A final view of the memory buffer with decoded data is shown below.
 



The image shows the decoding of the string “psapi.dll”. This string is later used in fetching the addresses of various functions employed for anti-analysis.
 


Stage 2 culminates in code injection. To be specific, GULoader employs a variation of the process hollowing technique, in which a benign process is spawned in a suspended state by the malware stager process, which then overwrites the original content of the suspended process with malicious content. Later, the state of the thread in the suspended process is changed by modifying processor register values such as EIP, and finally the process resumes execution. By controlling EIP, the malware can direct the control flow in the spawned process to a desired code location. After a successful hollowing, the malware code runs under the cover of a legitimate application.
 


The variation of the hollowing technique employed by GULoader doesn’t replace the file contents, but instead injects the same shellcode and maps the memory in the suspended process. Interestingly, GULoader employs an additional technique if the hollowing attempt fails. More details are covered in the following section.
 


The native APIs listed below are dynamically resolved at runtime to perform the code injection.
 


  • NtCreateSection
     

  • ZwMapViewOfSection
     

  • NtWriteVirtualMemory
     

  • ZwGetContextThread
     

  • NtSetContextThread
     

  • NtResumeThread

      


Overview of Code Injection
 


  • Initially, the image “%windir%\Microsoft.NET\Framework\<version>\CasPol.exe” (on 32-bit systems) is spawned in suspended mode via the CreateProcessInternalW native API.

  • GULoader retrieves a handle to the file “C:\Windows\SysWOW64\iertutil.dll”, which is used in section creation. The section object created via NtCreateSection will be backed by iertutil.dll.

  • This behavior is mainly to avoid suspicion, as a section object that is not backed by any file may draw unwanted attention from security systems.

  • The next phase in the code injection is mapping a view of the section backed by iertutil.dll into the spawned CasPol.exe process. Once the view is successfully mapped into the process, the malware can inject the shellcode into the mapped memory and resume the process, thus initiating stage 3. The native API ZwMapViewOfSection is used to perform this task. Following the execution of this API, the malware checks the result of the function call against the status codes listed below.

  • C0000018 (STATUS_CONFLICTING_ADDRESSES)

  • C0000220 (STATUS_MAPPED_ALIGNMENT)

  • 40000003 (STATUS_IMAGE_NOT_AT_BASE)

  • If the mapping is unsuccessful and the status code returned by ZwMapViewOfSection matches any of the codes mentioned above, the loader has a backup plan.

  • GULoader calls NtAllocateVirtualMemory by directly invoking the system call stub, which is normally found in the ntdll.dll library, to bypass EDR/AV hooks. The memory is allocated in the remote CasPol.exe process with RWX protection. The following image shows the direct use of the NtAllocateVirtualMemory system call.
     



After memory allocation, it writes itself into the remote process via NtWriteVirtualMemory, as discussed above. GULoader shellcodes taken from the field are large; the samples taken for this analysis are all greater than 20 MB. In the samples analyzed, the size of the buffer allocated to hold the shellcode is 2950000 bytes. The image below shows the GULoader shellcode in memory.
 


Misleading Entry Point


 


  • GULoader is highly evasive in nature. If abnormal execution flow is detected with the help of the employed anti-analysis vectors, the EIP and EBX fields of the thread context structure (of the CasPol.exe process), which are required for stage 3 of malware execution, will be overwritten with a decoy address. The location ebp+4 is used to hold the entry point regardless of whether the program is being debugged or not.

  • GULoader uses the ZwGetContextThread and NtSetContextThread routines to modify the thread state. The CONTEXT structure is retrieved via ZwGetContextThread, and the value [ebp+14C] is used as the entry point address. The current EIP value held in the EIP field of the thread's context structure will be changed to a recalculated address based on the value at ebp+4. The image below shows the RVA calculation. The base address of the executing shellcode (stage 2) is subtracted from the virtual address [ebp+4] to obtain the RVA.
     



The RVA is added to the base address of the newly allocated memory in the CasPol.exe process to obtain a new VA that can be used in the remote process. The new VA is written into the EIP and EBX fields of the thread context structure of the CasPol.exe process retrieved via ZwGetContextThread. The image below shows the modified context structure and the value of EIP.
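The address translation described here is plain rebasing arithmetic. As a minimal sketch, assuming 32-bit addresses and using made-up base addresses for illustration (the name `rebase_entry_point` is hypothetical):

```python
def rebase_entry_point(entry_va: int, local_base: int, remote_base: int) -> int:
    """Translate an entry point from the stage 2 mapping to the remote one.

    RVA    = entry VA in the running shellcode - base of the running shellcode
    new VA = base of the memory allocated in the remote process + RVA
    """
    rva = entry_va - local_base
    if rva < 0:
        raise ValueError("entry point lies below the local base address")
    return (remote_base + rva) & 0xFFFFFFFF


if __name__ == "__main__":
    # Illustrative addresses only; real values come from the debugger.
    new_va = rebase_entry_point(0x00401234, 0x00400000, 0x02A00000)
    print(hex(new_va))
```

The resulting value is what should appear in the EIP and EBX fields of the remote thread's CONTEXT structure, so it is a convenient place for a breakpoint when attaching to the suspended process.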
 



Finally, by calling ZwSetContextThread, the changes made to the CONTEXT structure are committed to the target thread of the CasPol.exe process. The thread is resumed by calling NtResumeThread. CasPol.exe resumes execution and performs stage 3 of the infection chain.
 



Stage 3: Payload Deployment
 


The GULoader shellcode resumes execution from within a new host process. In the samples analyzed for this report, the shellcode is injected either into the same process spawned as a child process or into CasPol.exe. Stage 3 performs all the anti-analysis checks once again to make sure this stage is not being analyzed. After all checks, GULoader proceeds to perform stage 3 activities by decoding the encoded C2 string in memory, as shown in the image below. The decoding method is the same as discussed before.
 


Later, the addresses of the following functions are resolved dynamically by loading wininet.dll:
 


  • InternetOpenA
     

  • InternetSetOptionA
     

  • InternetOpenUrlA
     

  • InternetReadFile
     

  • InternetCloseHandle
     



The image below shows the response from the content delivery network (CDN) server where the final payload is stored. In this analysis, a payload of size 0x2E640 bytes is sent to the loader. Interestingly, the first 40 bytes are ignored by the loader. The actual payload starts from offset 40, which is highlighted in the image.
 




The CDN server is well protected; it only serves clients that send the proper headers and cookies. If these are not present in the HTTP request, the following message is shown to the user.
 



Final Payload
 


Quasi Key Generation
 


The first step GULoader takes in decoding the downloaded final payload is generating a quasi key, which is later used to decode the actual key embedded in the GULoader shellcode. The encoded embedded key is 371 bytes long in the analyzed sample. The process of quasi key generation is as follows:
 


  • The 40th and 41st bytes (a word) are retrieved from the download buffer in memory.

  • This word is XORed with the first word of the encoded embedded key and with a counter value.

  • The process is repeated until the word taken from the downloaded data fully decodes to the value 0x4D5A (“MZ”).

  • The value present in the counter when 0x4D5A gets decoded is taken as the quasi key. This key is shown as “key-1” in the image below. In the analyzed sample, the value of this key is 0x5448.
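The counter search described above can be sketched as follows. This is a hedged illustration rather than the loader's code: it assumes the word is read in memory byte order (so that the bytes 4D 5A compare equal to 0x4D5A, as written in the text), that "40th and 41st bytes" means zero-based offsets 40 and 41, and that the counter simply runs through all 16-bit values. The name `find_quasi_key` is hypothetical.

```python
MZ = 0x4D5A  # "MZ" in memory byte order, as written in the text


def find_quasi_key(download: bytes, encoded_key: bytes,
                   header_len: int = 40) -> int:
    """Return the counter value at which the first payload word decodes to MZ."""
    payload_word = int.from_bytes(download[header_len:header_len + 2], "big")
    key_word = int.from_bytes(encoded_key[:2], "big")
    for counter in range(0x10000):
        if payload_word ^ key_word ^ counter == MZ:
            return counter
    raise ValueError("no 16-bit counter value produces MZ")


if __name__ == "__main__":
    # Synthetic data: a made-up real key word 0xABCD, encoded with the
    # quasi key 0x5448 reported for the analyzed sample.
    real_key = bytes([0xAB, 0xCD])
    quasi = bytes([0x54, 0x48])
    encoded_key = bytes(a ^ b for a, b in zip(real_key, quasi))
    download = bytes(40) + bytes(a ^ b for a, b in zip(b"MZ", real_key))
    print(hex(find_quasi_key(download, encoded_key)))
```

Because XOR is its own inverse, the loop is equivalent to computing `payload_word ^ key_word ^ 0x4D5A` directly; the loader's iterative search arrives at the same value.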
     


Decoding Actual Key
 


The key embedded in the GULoader shellcode is 371 bytes long, as discussed before. The quasi key is used to decode the embedded key, as shown in the image below.
 


  • Each word in the embedded key is XORed with the quasi key key-1.

  • When the iteration counter exceeds the size value of 371 bytes, the loop stops and the loader proceeds to decode the downloaded payload with this new key.
     



The decoded 371 bytes of the embedded key are shown in the image below.
 


Decoding File
 


A byte-level decoding happens after the embedded key is decoded in memory. Each byte of the downloaded data is XORed with the key to obtain the actual data, which is a PE file. The decoded data is written over the same buffer that was used to download the encoded data.
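Putting the last two steps together, an offline decoder for a captured download could look roughly like the sketch below. Several points are assumptions rather than facts from the analysis: the quasi key is applied to the embedded key as a repeating two-byte sequence in memory byte order, the decoded key repeats cyclically over the payload (the payload is far longer than the 371-byte key), and the first 40 bytes of the download are skipped. The function names are hypothetical.

```python
def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR `data` with `key`, repeating the key as often as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def decode_payload(download: bytes, encoded_key: bytes, quasi_key: int,
                   header_len: int = 40) -> bytes:
    """Recover the PE file from the downloaded buffer.

    Step 1: decode the embedded key with the 16-bit quasi key.
    Step 2: XOR every payload byte (after the ignored header) with that key.
    """
    key = xor_repeat(encoded_key, quasi_key.to_bytes(2, "big"))
    return xor_repeat(download[header_len:], key)


if __name__ == "__main__":
    # Synthetic round trip with a made-up 7-byte key and a fake PE stub.
    real_key = bytes(range(1, 8))
    encoded_key = xor_repeat(real_key, (0x5448).to_bytes(2, "big"))
    plain = b"MZ" + b"\x90" * 20
    download = bytes(40) + xor_repeat(plain, real_key)
    assert decode_payload(download, encoded_key, 0x5448) == plain
```

A quick sanity check on real data is that the output starts with the bytes "MZ"; if it does not, one of the assumptions above (byte order, key repetition, header length) does not hold for that sample.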
 



The final decoded PE file residing in memory is shown in the image below:
 



Finally, the loader loads the PE file by allocating memory with RWX permission in the stage 3 process. Based on the analysis of multiple samples, this is either the same process as in stage 2, spawned as a child process, or CasPol.exe. The loading involves code relocation and IAT correction, as expected in such a scenario. The final payload resumes execution from within the hollowed stage 3 process. The malware families below are usually seen deployed by GULoader:
 


  • Vidar (Stealer)

  • Raccoon (Stealer)

  • Remcos RAT
     


The image below shows the injected memory regions in the stage 3 process, which is CasPol.exe in this report.
 


Conclusion 
 


The role played by malware loaders, popularly known as “crypters”, is significant in the deployment of Remote Administration Tools and stealer malware that target consumer data. The Personally Identifiable Information (PII) exfiltrated from compromised endpoints is largely collected and funneled to various underground data-selling marketplaces. This also impacts businesses, as critical information used for authentication is leaked from users' personal systems, leading to initial access to company networks. GULoader is heavily used in mass malware campaigns to infect users with popular stealer malware such as Raccoon, Vidar, and Redline. Commodity RATs such as Remcos are also seen delivered in such campaigns. On the bright side, it is not difficult to fingerprint malware specimens used in mass campaigns because of their volume and relevance, and detection rules and systems can be built around this very fact.
 


 


The following table summarizes all the dynamically resolved Win32 APIs.
 


Win32 API
 

RtlAddVectoredExceptionHandler
 

NtAllocateVirtualMemory
 

DbgUiRemoteBreakin
 

LdrLoadDll
 

DbgBreakPoint
 

EnumWindows
 

Nt/ZwSetInformationThread
 

EnumDeviceDrivers
 

GetDeviceDriverBaseNameA
 

MsiEnumProductsA
 

MsiGetProductInfoA
 

TerminateProcess
 

ExitProcess
 

NtSetContextThread
 

NtWriteVirtualMemory
 

NtCreateSection
 

NtMapViewOfSection
 

NtOpenFile
 

NtSetInformationProcess
 

NtClose
 

NtResumeThread
 

NtProtectVirtualMemory
 

CreateProcessInternal
 

GetLongPathNameW
 

Sleep
 

NtCreateThreadEx
 

WaitForSingleObject
 

TerminateThread
 

CreateFileW
 

WriteFile
 

CloseHandle
 

GetFileSize
 

ReadFile
 

ShellExecuteW
 

SHCreateDirectoryExW
 

RegCreateKeyExA
 

RegSetValueExA
 

OpenSCManagerA
 

EnumServiceStatusA
 

CloseServiceHandle
 

NtQueryInformationProcess
 

InternetOpenA
 

InternetSetOptionA
 

InternetOpenUrlA
 

InternetReadFile
 

InternetCloseHandle
 


IOC
 


889fddcb57ed66c63b0b16f2be2dbd7ec0252031cad3b15dfea5411ac245ef56
 


59b71cb2c5a14186a5069d7935ebe28486f49b7961bddac0a818a021373a44a3
 


4d9cdd7526f05343fda35aca3e0e6939abed8a037a0a871ce9ccd0e69a3741f2
 


c8006013fc6a90d635f394c91637eae12706f58897a6489d40e663f46996c664
 


c69e558e5526feeb00ab90efe764fb0b93b3a09692659d1a57c652da81f1d123
 


45156ac4b40b7537f4e003d9f925746b848a939b2362753f6edbcc794ea8b36a
 


e68ce815ac0211303d2c38ccbb5ccead144909d295230df4b7a419dfdea12782
 


b24b36641fef3acbf3b643967d408b10bf8abfe1fe1f99d704a9a19f1dfc77e8
 


569aa6697083993d9c387426b827414a7ed225a3dd2e1e3eba1b49667573fdcb
 


60de2308ebfeadadc3e401300172013be27af5b7d816c49696bb3dedc208c54e
 


23458977440cccb8ac7d0d05c238d087d90f5bf1c42157fb3a161d41b741c39d
 
