Userland hooking, IAT

I remember having a conversation a few weeks ago with tweekier (this guy is a pain in the ass, he always makes me think) when it suddenly struck me: I've never done userland hooking the rootkit way. I have used SetWindowsHookEx several times and I have hooked some kernel functions, but boy, I've never done it for a userland rootkit. I was dazed and confused during the x-mas hangovers so I didn't want to think. I found a very good article from awarenetwork.org:

[==============================================================================]
[---------------------------[ Userland API Hooking ]---------------------------]
 S0ban>   -  -- -   - - - = == ================================================]

       _.d####b._
     .############.
   .################.
  .##################.__ __ __ __ _ _ __ __
  ##############/ _`|#\ V  V // _` | '_/ -_)
  ##############\__,|# \_/\_/ \__,_|_| \___|
  ###########>"<######
  *#########(   )####*
   ##########>.<#####
    ################
     *############*
       "T######T"

0. Overview  ------------------------------------------------------------------]

API hooking is essentially the act of intercepting an API function call and
modifying its functionality somehow, either by redirecting it to a function
of our choice, or stopping the function from being called, or logging the
request... the possibilities are endless. This is useful for cracking
applications, especially when the application does annoying things like
hardcode API offsets (funny story, that...) and check itself for integrity in
memory.

Unfortunately, API hooking under Windows - without going down the rabbit hole
of Kernel Mode Programming - has long been a poorly documented subject. What
little documentation is available is often incomplete or erroneous, and based
more on theory than on practice, basically without adequate information for
someone to actually implement a working API hook.

Also, while it is already possible to monitor API calls with a debugger (or
use a pre-built library such as Microsoft Detours), implementing your own API
hooking system is good programming practice, and will help reinforce your
knowledge of the PE format - always a good thing, if you want to do some
reversing or hacking on Windows.

Without further ado, into the flames.

1. Preparation - Background Information, PE Format, Points of
   Attack ---------------------------------------------------------------------]

When a program is loaded into memory, a virtual memory space is created for it
(one per process), which holds the actual program and each DLL it needs
loaded at load time (i.e. the DLLs the program calls functions from, such as
kernel32.dll and user32.dll). The program itself (the core PE file) and the
DLLs are collectively referred to as modules. You can load a process in WinDbg
and observe this for yourself.

1.0 "I've Googled a bit, what's this DLL injection stuff?" --------------------]

DLL injection is a subject often mentioned in the same forum post as API
hooking, and for good reason.
When we do API hooking, we're effectively asking the target process to execute
our code instead of a given function - for example, by hooking MessageBoxW,
we're asking the program to run our code instead of the MessageBoxW contained
in user32.dll, a part of the Windows OS. However, the target process must have
the replacement code to execute within its own virtual memory space. DLL
injection is the cleanest way to get code into the target process's virtual
memory space. We avoid directly writing our code into the other process's
memory space, as we can't be sure what we're overwriting - what may seem to be
a long string of zeroes may in fact be used by the program for decompression,
etc.

1.1 Import Address Table Hooking ----------------------------------------------]

When the core PE file is loaded into memory, its structure is similar to the
PE structure on disk (see Iczelion's tutorials on PE format, they are quite
thorough). Unlike on disk, however, there is no need to convert virtual
addresses to file offsets, as everything is already at its appropriate
(virtual) address. By default, a 32-bit executable is loaded at its preferred
base address of 0x00400000 (relocation or ASLR can move it), starting with the
IMAGE_DOS_HEADER structure.

Following on from this, the IMAGE_NT_HEADERS structure is located at
  0x00400000 + IMAGE_DOS_HEADER.e_lfanew
(as on disk). This structure points to the import directory
 (IMAGE_NT_HEADERS.OptionalHeader.DataDirectory[1]),
whose IMAGE_IMPORT_DESCRIPTOR entries each reference, via FirstThunk, the
Import Address Table for one imported DLL. This table contains a list of the
API functions the program imports. This list is filled in at load time by the
Windows PE loader, which fills the list in with the actual in-memory locations
of these API functions. When the program wants to call an API function, it
simply looks up the location of the function from this Import Address Table.
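
As a rough sketch of the header walk described above, the following Python
snippet fabricates a tiny PE32 header in a byte buffer and then locates the
data directory entry at index 1 (the import directory) from it. The helper
build_fake_pe32 and the sample RVA/size values are made up for illustration;
only the field offsets (e_lfanew at 0x3C, data directories starting 96 bytes
into a PE32 optional header) come from the PE format. Against a live process
the buffer would be filled by reading the target's memory instead.

```python
import struct

E_LFANEW_OFFSET = 0x3C            # IMAGE_DOS_HEADER.e_lfanew
FILE_HEADER_SIZE = 20             # sizeof(IMAGE_FILE_HEADER)
PE32_DATA_DIR_OFFSET = 96         # DataDirectory offset in a PE32 optional header
IMAGE_DIRECTORY_ENTRY_IMPORT = 1


def build_fake_pe32(e_lfanew, import_rva, import_size):
    """Hypothetical helper: fabricate just enough header bytes to parse."""
    size = e_lfanew + 4 + FILE_HEADER_SIZE + PE32_DATA_DIR_OFFSET + 16 * 8
    image = bytearray(size)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, E_LFANEW_OFFSET, e_lfanew)
    image[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    opt = e_lfanew + 4 + FILE_HEADER_SIZE
    struct.pack_into("<H", image, opt, 0x10B)          # PE32 magic
    entry = opt + PE32_DATA_DIR_OFFSET + 8 * IMAGE_DIRECTORY_ENTRY_IMPORT
    struct.pack_into("<II", image, entry, import_rva, import_size)
    return bytes(image)


def find_import_directory(image):
    """Return (rva, size) of the import directory of a mapped PE32 image."""
    if image[0:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + FILE_HEADER_SIZE
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    entry = opt + PE32_DATA_DIR_OFFSET + 8 * IMAGE_DIRECTORY_ENTRY_IMPORT
    return struct.unpack_from("<II", image, entry)


fake = build_fake_pe32(e_lfanew=0x80, import_rva=0x2000, import_size=0x3C)
import_rva, import_size = find_import_directory(fake)
print(hex(import_rva), hex(import_size))
```

The RVA returned here is relative to the image base, so in a real process it
would still need the module's load address added to it.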

Assuming we have already injected a DLL into the target process containing
our code, we can redirect a function to our code by changing its entry in
the Import Address Table.

The original API is still loaded at its original place in memory, so to call
it, we simply save the address of the original API when hooking, look it up,
and call that function when we're done with our processing.

In summary, the process of hooking an API using IAT hooking is as follows:

- Open the target process
- Inject a DLL containing our custom function
- Locate the Import Address Table
- Locate the specific entry for the function we need
-- Save this entry, in case we want to call the original later
- Replace that entry with one pointing to a custom function
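
The save-then-replace steps above can be modelled in a few lines of Python.
This is only a simulation: the dictionary iat stands in for the Import
Address Table, Python callables stand in for function addresses, and every
name here (real_message_box, install_iat_hook, hook_message_box) is
hypothetical. A real hook would write a pointer into the target's memory.

```python
def real_message_box(text):
    """Stand-in for the genuine API living in user32.dll."""
    return "shown: " + text


iat = {"MessageBoxW": real_message_box}   # simulated IAT: name -> "address"
saved_entries = {}                        # originals, kept for later calls
intercepted = []                          # what our hook has seen so far


def install_iat_hook(table, name, replacement):
    """Save the original entry, then point the slot at the replacement."""
    saved_entries[name] = table[name]
    table[name] = replacement


def hook_message_box(text):
    intercepted.append(text)                          # our processing first
    return saved_entries["MessageBoxW"](text.upper())  # then the original API


install_iat_hook(iat, "MessageBoxW", hook_message_box)

# The "program" keeps calling through the table and never notices the swap.
result = iat["MessageBoxW"]("hello")
print(result, intercepted)
```

Keeping the saved entry in a separate table mirrors the "save this entry"
step: the hook can still reach the original without touching the IAT again.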

1.2 Inline Hooking ------------------------------------------------------------]

When the core PE file is loaded into memory, the PE loader conveniently also
loads any other DLLs the program needs (for example, if the program calls
MessageBoxA, user32.dll will be loaded). These DLLs are also mapped into the
process's memory space, as in the following diagram:

-----------------------------
 ----------------
  Notepad.exe
  [... Import Address Table ...]
    Main:
    printf("hi");
    exit(0);
    ...
 ----------------
 ----------------
  kernel32.dll
  [... Export Address Table ...]
    ExitProcess:
    add eax,1
    ...
 ----------------
 ----------------
  user32.dll
  [... Export Address Table ...]
    MessageBoxA:
    push ecx
    ...
 ----------------
 ----------------
  more.dlls
  [... Export Address Table ...]
   AnotherFunction:
   xor ecx,edx
   ...
 ----------------
-----------------------------

In the header structures of these DLLs are Export Address Tables, each of
which is a list of the functions the DLL exports, along with each function's
corresponding location in the DLL. Knowing where the DLL is located in the
host process's memory space (using WinAPI, we can enumerate the modules in a
process), and knowing where a function is in a DLL, we can locate where the
function is in the process's memory space.
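
The arithmetic behind that lookup is just "module base plus offset within the
module". A minimal sketch, assuming the module list and export table have
already been read out of the target; the load address and the offset below are
invented numbers, and resolve_export is a hypothetical helper:

```python
# Simulated results of enumerating modules and parsing one export table.
module_bases = {"user32.dll": 0x77D40000}        # made-up load address
user32_exports = {"MessageBoxA": 0x000407EA}     # made-up offset (RVA)


def resolve_export(module_base, exports, name):
    """Turn an export's offset inside the DLL into an address in the process."""
    return module_base + exports[name]


address = resolve_export(module_bases["user32.dll"], user32_exports, "MessageBoxA")
print(hex(address))
```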

Inline hooking involves locating a target function in this manner, then
modifying the code of the target function in order to make the target function
jump to a location the user specifies once it starts executing.

----------------------------------------------------------
MessageBoxA:                 MessageBoxA: [After]
  mov     edi, edi             push offset hookMessageBoxA
  push    ebp                  ret
  mov     ebp, esp             ...
  ...                          ...
----------------------------------------------------------
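
To make the "after" column concrete, here is a hedged Python sketch that
builds the six patch bytes (PUSH imm32 is opcode 0x68 followed by a
little-endian address, RET is 0xC3) and applies them to a bytearray standing
in for the target function's code. The target address 0x10001000 and the
helper names are assumptions for illustration; a real implementation would
write into the other process's memory and adjust page protection first.

```python
import struct


def build_push_ret(target):
    """Encode `push target` / `ret`: 5 + 1 = 6 bytes."""
    return b"\x68" + struct.pack("<I", target) + b"\xC3"


def install_inline_hook(code, offset, target):
    """Overwrite the function start with the patch; return the bytes replaced."""
    patch = build_push_ret(target)
    original = bytes(code[offset:offset + len(patch)])
    code[offset:offset + len(patch)] = patch
    return original


# mov edi,edi / push ebp / mov ebp,esp / push ecx, then filler NOPs
code = bytearray(b"\x8B\xFF\x55\x8B\xEC\x51" + b"\x90" * 4)
saved = install_inline_hook(code, 0, 0x10001000)
print(saved.hex(), bytes(code[:6]).hex())
```

The returned bytes are exactly what has to be kept around if the original
function is ever to be restored or emulated.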

The greatest advantage of this type of modification over IAT patching is that
it's fairly flexible, and evades a lot of common anti-debugging tricks such as
checking for IAT hooks. Additionally, we are able to hook APIs which aren't
imported by the target program (e.g. APIs resolved via a GetProcAddress call).
Also, there's greater flexibility in potential hook locations thanks to Win32's
API design:

-------------------------------------------------
ApiFunction() -> ApiFunctionEx() / ApiFunctionW()
-------------------------------------------------

Many Win32 APIs are simply wrappers for other APIs. For example, MessageBoxA
"subcontracts" its work to MessageBoxExA, which in turn calls
MessageBoxTimeoutA, which then calls MessageBoxTimeoutW. With inline hooking,
we are able to hook at any point of this call chain, including
MessageBoxTimeoutW, which will also catch MessageBoxW calls.
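
A toy model of that call chain shows why hooking the bottom of it is
attractive. The dictionary below is a simplification: each entry stands for
the code a wrapper jumps to, the wide path is collapsed into a single hop, and
all Python names are hypothetical.

```python
seen = []


def message_box_timeout_w(text):
    """Stand-in for the function that does the real work."""
    return "displayed " + text


api = {"MessageBoxTimeoutW": message_box_timeout_w}
api["MessageBoxTimeoutA"] = lambda t: api["MessageBoxTimeoutW"](t)
api["MessageBoxExA"] = lambda t: api["MessageBoxTimeoutA"](t)
api["MessageBoxA"] = lambda t: api["MessageBoxExA"](t)
api["MessageBoxW"] = lambda t: api["MessageBoxTimeoutW"](t)  # simplified path

original = api["MessageBoxTimeoutW"]


def hooked_timeout_w(text):
    seen.append(text)          # log every caller that ends up here
    return original(text)


api["MessageBoxTimeoutW"] = hooked_timeout_w   # one hook at the bottom

ansi_result = api["MessageBoxA"]("ansi")
wide_result = api["MessageBoxW"]("wide")
print(seen)
```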

In summary, the process of inline hooking is as follows:

- Open the target process
- Locate the DLL containing the function we want to hook within the
  target process's memory space
- Locate the target function within the target DLL, map that to the
  memory space
- Inject a DLL containing our custom function, locate our custom
  function within the newly injected DLL, map to memory space of target
  process
- Store first six bytes/prelude (implementation-dependent,
  explained below) of the "old" function
- Patch the "old" function to point to our custom function

2. Implementation Specifics ---------------------------------------------------]

The most important part of any technique is implementation; without
implementation we are nothing. This section is not a step-by-step guide to
implementing an API hooking system; it simply outlines some of the more
common pitfalls with import address table and inline hooking, and how to
avoid them.

2.0. How to get the location of our "replacement" function --------------------]

The easiest method is to parse the export address table of our DLL, either
in memory or on disk, and retrieve it from there. Alternatively, use
GetProcAddress and LoadLibrary on our custom DLL from within our custom DLL's
entry point function (DllMain).

2.1 I need to hook functions from when a program starts executing,
    I don't want to miss any API calls ----------------------------------------]

The obvious solution would be to use CreateProcess with the CREATE_SUSPENDED
flag to create the process and hook it before it runs - however, you can't
inject a DLL into a process with a suspended primary thread, because the
primary thread is suspended, and can't load your DLL (or any DLLs). Thus, we
implement a small hack - we parse the PEB of the target program (use
NtQueryInformationProcess to find the PEB) to find the image's base address
and thus its entrypoint, and save it. We then modify the first two bytes at
the entrypoint to simply loop back to itself ("\xEB\xFE", or "jmp $", a short
jump with a displacement of -2). We use Get/SetThreadContext to set our
thread's instruction pointer (just modify the EIP register in the CONTEXT
structure) to the entrypoint which we located earlier, then ResumeThread.
This puts the thread into an infinite loop, without truly suspending it - the
thread does nothing useful, but continues to execute and load modules. After
loading our DLLs, we SuspendThread again, restore the first two bytes at the
entry point, use Get/SetThreadContext to reset the primary thread's
instruction pointer, and use ResumeThread to wake our process, effectively
"unfreezing" it. In summary:

- Need to hook right from the beginning of a process?
-- No: ExitProcess(0);
-- Yes:
---- Use CreateProcess to create the target process with CREATE_SUSPENDED
---- Use NtQueryInformationProcess to find the PROCESS_BASIC_INFORMATION struct
---- Read the PEB from our target process's memory space (located at
     PROCESS_BASIC_INFORMATION.PebBaseAddress).
---- Find the ImageBaseAddress from the PEB - that's where the core process
     is located in memory. Parse the PE headers to find the process's entry
     point.
---- Save the first two bytes at the entry point, replace them with \xEB\xFE
---- Use GetThreadContext on the primary thread (still suspended) to get a
     CONTEXT structure. Modify EIP to point to the entry point.
---- Use SetThreadContext to install the modified CONTEXT structure
---- Resume the thread with ResumeThread. The primary thread will now run
     in an infinite loop at the entrypoint, but it won't be suspended.
---- Sleep() for a short period to let the DLLs that were originally
     going to load with the process load. When creating a process with
     CREATE_SUSPENDED, not all DLLs are fully loaded when you can first
     take control of the process.
---- [ Load DLLs, hook APIs, make general eliteness here ]
---- Suspend the thread again with SuspendThread
---- Restore the two bytes at the entry point
---- Use GetThreadContext on the primary thread to get CONTEXT, set EIP to the
     entry point
---- Use SetThreadContext to install the modified CONTEXT
---- Resume the thread with ResumeThread to "unfreeze" the process
---- Push button
---- ???
---- Receive 2 bacon, 6 internets and 1 win.
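
The two-byte trick in the steps above can be checked with a small Python
sketch. A short jump is opcode 0xEB plus a signed 8-bit displacement measured
from the end of the two-byte instruction, so a jump to itself needs a
displacement of -2, which is 0xFE. The bytearray below stands in for the
target's memory (in practice it would be read and written remotely), and the
helper names and sample bytes are assumptions for illustration.

```python
def short_jmp(from_addr, to_addr):
    """Encode `jmp rel8` placed at from_addr that lands on to_addr."""
    rel = to_addr - (from_addr + 2)        # relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target is out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


def park_entry_point(image, entry_offset):
    """Save the first two bytes at the entry point, replace them with a self-loop."""
    saved = bytes(image[entry_offset:entry_offset + 2])
    image[entry_offset:entry_offset + 2] = short_jmp(entry_offset, entry_offset)
    return saved


def unpark_entry_point(image, entry_offset, saved):
    """Put the original two bytes back before the thread is released."""
    image[entry_offset:entry_offset + 2] = saved


# Fake image: padding, then push ebp / mov ebp,esp at the "entry point".
image = bytearray(b"\x90" * 0x10 + b"\x55\x8B\xEC" + b"\x90" * 5)
entry_offset = 0x10
pristine = bytes(image)

saved = park_entry_point(image, entry_offset)
parked = bytes(image[entry_offset:entry_offset + 2])
unpark_entry_point(image, entry_offset, saved)
print(parked.hex(), saved.hex())
```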

2.2 DLL loading isn't instant, how do I tell when my DLL is loaded? -----------]

Generally, simply use some communications channel between your DLL and your
"injector" applet/loader, such as IPC pipes. Alternatively, use
WaitForDebugEvent with a timeout and wait for a LOAD_DLL_DEBUG_EVENT, or just
wait for a fixed length of time. Another option is to load your process in
a frozen state (as in 2.1) and hook the APIs you need from there.
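
The "signal with a timeout fallback" idea can be sketched without any Windows
machinery. In this simulation a threading.Event plays the communications
channel, a timer thread plays the injected DLL announcing that it is ready,
and wait_for_dll is the loader side. All names and the delays are
hypothetical.

```python
import threading

dll_ready = threading.Event()


def fake_dll_main():
    """Stand-in for the injected DLL reporting that its hooks are installed."""
    dll_ready.set()


def wait_for_dll(timeout_seconds):
    """Return True once the DLL has signalled, False if the timeout expires."""
    return dll_ready.wait(timeout_seconds)


worker = threading.Timer(0.05, fake_dll_main)   # the "DLL" loads a bit later
worker.start()
loaded = wait_for_dll(2.0)
worker.join()
print("DLL loaded" if loaded else "gave up waiting")
```

Waiting on a signal with an upper bound avoids both hanging forever on a
failed injection and guessing a fixed delay that may be too short.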

2.3 How do I call the original function when I use inline hooking? ------------]

There are different ways to restore the original functionality of the
hooked function when using inline hooking. The best method will depend on how
you patched the original call. One suggestion (my method) is to patch the
original call with a PUSH (offset of your replacement function), followed by
RET. This solves the problem of relative offset calculation (use an absolute
offset in the push), and preserves the stack (unlike CALL). Here's how to
restore the original functionality:

2.3.0 Patch the call (one-off hooks, functions you only need hooked once)
- Retrieve the location of the original hooked API (communicate with the
  process that did the hooking, or LoadLibrary and GetProcAddress)
- Retrieve the bytes overwritten during patching (or hardcode them)
- Restore the first few bytes of that hooked function, write directly to
  memory.
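
A hedged sketch of the restore step for a one-off hook, again using a
bytearray in place of the hooked process's memory. It assumes the six-byte
PUSH/RET patch described above and refuses to write if that patch is no
longer there; restore_prologue and the sample bytes are made up for
illustration.

```python
def restore_prologue(code, offset, saved):
    """Write the saved bytes back, but only over an intact push/ret patch."""
    patched = code[offset:offset + len(saved)]
    if len(saved) != 6 or patched[0] != 0x68 or patched[5] != 0xC3:
        raise ValueError("no push/ret patch at this offset")
    code[offset:offset + len(saved)] = saved


saved = bytes.fromhex("8bff558bec51")            # bytes taken when hooking
code = bytearray(bytes.fromhex("6800100010c3") + b"\x90\x90")

restore_prologue(code, 0, saved)
print(bytes(code).hex())
```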

2.3.1 Emulate the first few instructions of the original function
- Retrieve the location of the original hooked API (communicate with the
  process that did the hooking, or LoadLibrary and GetProcAddress)
- Recreate the stack as you got it, with the exception of the return addr:

---------------------------------------------------------------------]

For example, the hook function:

hook_recv PROC s:DWORD, buf:DWORD, bLen:DWORD, flags:DWORD

	;; stack is at state 1 (on the left, diagram below)

	push flags
	push bLen
	push buf
	push s

	;; the stack as you got it is effectively duplicated

	lea eax,retAddr
	push eax

	;; a new return address now sits on top of the duplicated arguments

	mov eax,recvOffset      ; address of the real recv

	;; emulate the first 6 bytes of recv (the bytes the patch overwrote)
	mov edi,edi
	push ebp
	mov ebp,esp
	push ecx
	add eax,6

	;; stack is at state 2 (on the right, below)
	;; eax is recv + 6, the number of bytes you emulated.
	push eax
	db 0C3h                 ; raw RET opcode: jumps to recv + 6

	retAddr: ret
hook_recv  ENDP

---------------------------------------------------------------------]

Stack frames:

Original stack frame     Recreated Stack Frame

[RETNADDR]               [retAddr]        ---\
[s]                      [s]                  |
[buf]                    [buf]                >-- destroyed by real recv func
[bLen]                   [bLen]               |
[flags]                  [flags]          ---/
                         [RETNADDR]
                         [s]
                         [buf]
                         [bLen]
                         [flags]
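
The two columns above can be reproduced with a list-based model of the stack
(top of stack first). This is only a bookkeeping sketch with hypothetical
names; it ignores the saved EBP and anything else a real prologue pushes.

```python
def recreate_frame(original_frame, new_return):
    """original_frame is [return address, arg1, arg2, ...], top of stack first."""
    args = original_frame[1:]
    # New return address, a copy of the arguments, then the untouched original.
    return [new_return] + args + original_frame


def after_callee_returns(frame, arg_count):
    """Model the real function popping its return address and its arguments."""
    return frame[1 + arg_count:]


original = ["RETNADDR", "s", "buf", "bLen", "flags"]
recreated = recreate_frame(original, "retAddr")
left_over = after_callee_returns(recreated, arg_count=4)
print(recreated)
print(left_over)
```

Once the real function has consumed the duplicated part, what remains is the
frame the hook was originally called with, so the hook can return normally.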

3. Summary --------------------------------------------------------------------]

In summary, API hooking is a powerful and flexible, but under-utilised,
technique. This document outlines two methods for implementing user-land API
hooking, and will hopefully help you write your own custom API hooking
system.

Happy hacking. Remember - cheat for life, cheat to win.                  [ EOF ]