Hooking


In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.


Hook methods are of particular importance in the Template Method Pattern where common code in an abstract class can be augmented by custom code in a subclass. In this case each hook method is defined in the abstract class with an empty implementation which then allows a different implementation to be supplied in each concrete subclass.
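The hook-method arrangement described above can be sketched in C++ as follows. This is a minimal illustration only; the class and method names (Report, beforeBody, afterBody) are hypothetical and chosen for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical abstract class: render() is the template method, and
// beforeBody()/afterBody() are the hook methods with empty defaults.
class Report {
public:
    virtual ~Report() = default;

    // Common code lives here; subclasses cannot change the overall sequence.
    std::vector<std::string> render() {
        std::vector<std::string> out;
        std::string text = body();
        assert(!text.empty() && "subclasses must supply a body");
        beforeBody(out);      // hook: does nothing unless overridden
        out.push_back(text);
        afterBody(out);       // hook: does nothing unless overridden
        return out;
    }

protected:
    virtual std::string body() const = 0;                  // required step
    virtual void beforeBody(std::vector<std::string>&) {}  // optional hook
    virtual void afterBody(std::vector<std::string>&) {}   // optional hook
};

// Uses the empty default hooks.
class PlainReport : public Report {
protected:
    std::string body() const override { return "body"; }
};

// Supplies its own hook implementations.
class FramedReport : public Report {
protected:
    std::string body() const override { return "body"; }
    void beforeBody(std::vector<std::string>& out) override { out.push_back("header"); }
    void afterBody(std::vector<std::string>& out) override { out.push_back("footer"); }
};
```

Calling render() on a PlainReport yields only the body, while a FramedReport surrounds it with the lines added by its hooks.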

Hooking is used for many purposes, including debugging and extending functionality. Examples include intercepting keyboard or mouse event messages before they reach an application, or intercepting operating system calls in order to monitor behavior or modify the function of an application or other component. It is also widely used in benchmarking programs, for example to measure frame rate in 3D games, where the program's input and output are intercepted through hooking.

Hooking can also be used by malicious code. For example, rootkits, pieces of software that try to make themselves invisible by faking the output of API calls that would otherwise reveal their existence, often use hooking techniques.

Methods

Typically hooks are inserted while software is already running, but hooking is a tactic that can also be employed prior to the application being started. Both these techniques are described in greater detail below.

Source modification

Hooking can be achieved by modifying the executable or library itself before the application runs, using reverse-engineering techniques. This is typically done to intercept function calls in order to monitor them or replace them entirely.

For example, a disassembler can be used to find the entry point of a function within a module. The entry point can then be altered to dynamically load some other library module and execute the desired methods within that library. Where applicable, a related approach is to alter the import table of an executable. This table can be modified to load additional library modules and to change what external code is invoked when the application calls a function.

An alternative method for achieving function hooking is by intercepting function calls through a wrapper library. A wrapper is a version of a library that an application loads, with all the same functionality of the original library that it will replace. That is, all the functions that are accessible are essentially the same between the original and the replacement. This wrapper library can be designed to call any of the functionality from the original library, or replace it with an entirely new set of logic.
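The wrapper idea can be sketched in portable C++ as below. This is a simplified illustration under stated assumptions: the two namespaces (original_lib and wrapper_lib) are hypothetical stand-ins for two separately built libraries, whereas a real wrapper library would export identical symbol names and locate the original through the platform's dynamic loader.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the original library.
namespace original_lib {
    int add(int a, int b) { return a + b; }
    int divide(int a, int b) {
        assert(b != 0);   // the original requires a non-zero divisor
        return a / b;
    }
}

// Hypothetical wrapper exposing the same interface as original_lib.
namespace wrapper_lib {
    std::vector<std::string> call_log;   // record of intercepted calls

    // Monitors the call, then forwards it to the original unchanged.
    int add(int a, int b) {
        call_log.push_back("add");
        return original_lib::add(a, b);
    }

    // Replaces part of the original logic: a zero divisor yields 0
    // instead of reaching the original function.
    int divide(int a, int b) {
        call_log.push_back("divide");
        if (b == 0) return 0;
        return original_lib::divide(a, b);
    }
}
```

An application built against wrapper_lib sees the same functions as before, but every call passes through the wrapper first.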

Runtime modification

Operating systems and software may provide the means to easily insert event hooks at runtime, provided that the process inserting the hook has sufficient permissions. Microsoft Windows, for example, allows users to insert hooks that process or modify system events and application events for dialogs, scrollbars, menus, and other items. It also allows a hook to insert, remove, process, or modify keyboard and mouse events. Linux provides another example, where hooks can be used in a similar manner to process network events within the kernel through Netfilter.
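As a rough model of such an event-hook facility, the sketch below keeps a chain of callbacks that each see an event before the application does. It is not the Windows or Netfilter API; HookChain, KeyEvent, and the convention that a hook returns false to swallow an event are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct KeyEvent {
    int keyCode;
    bool pressed;
};

// A hook may modify the event; it returns true to pass the event on
// and false to swallow it.
using KeyHook = std::function<bool(KeyEvent&)>;

class HookChain {
public:
    // Installs a hook and returns an id that can be used to remove it.
    std::size_t install(KeyHook hook) {
        hooks_.push_back(std::move(hook));
        return hooks_.size() - 1;
    }

    void remove(std::size_t id) {
        assert(id < hooks_.size());
        hooks_[id] = nullptr;   // leave the slot empty so other ids stay valid
    }

    // Runs the hooks, most recently installed first. Returns true if the
    // event reached the application.
    bool dispatch(KeyEvent ev) {
        for (auto it = hooks_.rbegin(); it != hooks_.rend(); ++it) {
            if (*it && !(*it)(ev)) return false;   // swallowed by a hook
        }
        delivered_.push_back(ev);   // stands in for the application
        return true;
    }

    const std::vector<KeyEvent>& delivered() const { return delivered_; }

private:
    std::vector<KeyHook> hooks_;
    std::vector<KeyEvent> delivered_;
};
```

Running the newest hook first is a design choice in this sketch; what matters is that every installed hook gets a chance to process, modify, or remove the event before it is delivered.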

When such functionality is not provided, hooking can instead be achieved by intercepting the library function calls made by a process. Function hooking is implemented by overwriting the first few instructions of the target function with a jump to injected code. Alternatively, on systems that use shared libraries, the interrupt vector table or the import descriptor table can be modified in memory. These tactics employ essentially the same ideas as source modification, but they alter instructions and structures in the memory of a process that is already running.
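The table-patching variant can be illustrated without touching real loader structures. In the sketch below, importTable is a hypothetical stand-in for an import table: call sites go through it rather than calling the function directly, so swapping one pointer redirects every later call. All names are invented for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>

using GreetFn = std::string (*)(const std::string&);

// The "imported" function that the program normally reaches via the table.
std::string realGreet(const std::string& name) { return "Hello, " + name; }

struct ImportEntry {
    const char* name;
    GreetFn target;
};

// Hypothetical stand-in for an in-memory import table.
ImportEntry importTable[] = { {"greet", &realGreet} };

// Saved pointer to the original, so the hook can still call it.
GreetFn originalGreet = nullptr;

std::string hookedGreet(const std::string& name) {
    assert(originalGreet != nullptr);
    return "[hooked] " + originalGreet(name);
}

// Replaces the table entry with the given name and returns the previous
// target, or nullptr if no such entry exists.
GreetFn patchImport(const char* name, GreetFn replacement) {
    for (auto& entry : importTable) {
        if (std::strcmp(entry.name, name) == 0) {
            GreetFn previous = entry.target;
            entry.target = replacement;
            return previous;
        }
    }
    return nullptr;
}

// A call site: it always calls through the table.
std::string callGreet(const std::string& name) {
    return importTable[0].target(name);
}
```

Storing the value returned by patchImport in originalGreet installs the hook, and passing that value back to patchImport removes it again.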

Sample code

Virtual method table hooking

Whenever a class defines or inherits a virtual function (or method), compilers add a hidden member variable to the class which points to a virtual method table (VMT or vtable). Most compilers place the hidden VMT pointer at the start of every instance of the class (the first 4 bytes on a 32-bit target). A VMT is essentially an array of pointers to all the virtual functions that instances of the class may call. The call is resolved through these pointers at runtime because at compile time it is not yet known whether the base function or an overridden version from a derived class is to be called (which is what allows for polymorphism). Therefore, virtual functions can be hooked by replacing the pointers to them within any VMT in which they appear. The code below shows an example of a typical VMT hook in Microsoft Windows, written in C++. [1]

```cpp
// Targets 32-bit x86 builds with the Microsoft compiler (__thiscall/__fastcall).
#include <iostream>
#include "windows.h"

using namespace std;

class VirtualClass
{
public:
    int number;

    virtual void VirtualFn1() // This is the virtual function that will be hooked.
    {
        cout << "VirtualFn1 called " << number++ << "\n\n";
    }
};

using VirtualFn1_t = void(__thiscall*)(void* thisptr);
VirtualFn1_t orig_VirtualFn1;

// This is our hook function, which the program will call instead of the
// original VirtualFn1 function once hooking is done.
void __fastcall hkVirtualFn1(void* thisptr, int edx)
{
    cout << "Hook function called" << "\n";

    orig_VirtualFn1(thisptr); // Call the original function.
}

int main()
{
    // Create a pointer to a dynamically allocated instance of VirtualClass.
    VirtualClass* myClass = new VirtualClass();

    // Find the address that points to the base of VirtualClass' VMT
    // (which then points to VirtualFn1) and store it in vTablePtr.
    void** vTablePtr = *reinterpret_cast<void***>(myClass);

    // Remove page protection at the start of the VMT so we can overwrite its first pointer.
    DWORD oldProtection;
    VirtualProtect(vTablePtr, sizeof(void*), PAGE_EXECUTE_READWRITE, &oldProtection);

    // Store the pointer to VirtualFn1 from the VMT in a global variable so that
    // it can still be called after its VMT entry has been overwritten.
    orig_VirtualFn1 = reinterpret_cast<VirtualFn1_t>(*vTablePtr);

    // Overwrite the pointer to VirtualFn1 within the virtual table with a
    // pointer to our hook function (hkVirtualFn1).
    *vTablePtr = reinterpret_cast<void*>(&hkVirtualFn1);

    // Restore the old page protection. VirtualProtect fails if its last
    // argument is null, so a dummy variable receives the previous value.
    DWORD unused;
    VirtualProtect(vTablePtr, sizeof(void*), oldProtection, &unused);

    // Call the virtual function from our class instance. Because it is now
    // hooked, this will actually call our hook function (hkVirtualFn1).
    myClass->VirtualFn1();
    myClass->VirtualFn1();
    myClass->VirtualFn1();

    delete myClass;

    return 0;
}
```

It is important to note that all virtual functions must be class member functions, and that on 32-bit x86 with the Microsoft compiler all (non-static) class member functions are called with the __thiscall calling convention by default (unless the member function takes a variable number of arguments, in which case it is called with __cdecl). The __thiscall calling convention passes a pointer to the calling class instance (commonly referred to as a "this" pointer) via the ECX register. Therefore, in order for a hook function to properly intercept the "this" pointer that is passed and take it as an argument, it must look into the ECX register. In the above example, this is done by setting the hook function (hkVirtualFn1) to use the __fastcall calling convention, which causes the hook function to take its first argument from the ECX register.

Also note that, in the above example, the hook function (hkVirtualFn1) is not a member function itself, so it cannot use the __thiscall calling convention. __fastcall is used instead because it also takes its first argument from the ECX register. __fastcall passes its second argument in EDX, which is why the hook declares an otherwise unused int edx parameter.

C# keyboard event hook

The following example will hook into keyboard events in Microsoft Windows using the Microsoft .NET Framework.

```csharp
using System;
using System.Runtime.InteropServices;

namespace Hooks;

public class KeyHook
{
    /* Member variables */
    protected static IntPtr Hook;
    protected static LowLevelKeyboardDelegate Delegate;
    protected static readonly object Lock = new object();
    protected static bool IsRegistered = false;

    /* DLL imports (handles are pointer-sized, so IntPtr is used instead of int) */
    [DllImport("user32")]
    private static extern IntPtr SetWindowsHookEx(int idHook, LowLevelKeyboardDelegate lpfn,
        IntPtr hmod, int dwThreadId);

    [DllImport("user32")]
    private static extern IntPtr CallNextHookEx(IntPtr hHook, int nCode, IntPtr wParam, ref KBDLLHOOKSTRUCT lParam);

    [DllImport("user32")]
    private static extern bool UnhookWindowsHookEx(IntPtr hHook);

    /* Types & constants */
    protected delegate IntPtr LowLevelKeyboardDelegate(int nCode, IntPtr wParam, ref KBDLLHOOKSTRUCT lParam);
    private const int HC_ACTION = 0;
    private const int WM_KEYDOWN = 0x0100;
    private const int WM_KEYUP = 0x0101;
    private const int WH_KEYBOARD_LL = 13;

    [StructLayout(LayoutKind.Sequential)]
    public struct KBDLLHOOKSTRUCT
    {
        public int vkCode;
        public int scanCode;
        public int flags;
        public int time;
        public IntPtr dwExtraInfo;
    }

    /* Methods */
    static private IntPtr LowLevelKeyboardHandler(int nCode, IntPtr wParam, ref KBDLLHOOKSTRUCT lParam)
    {
        if (nCode == HC_ACTION)
        {
            if (wParam == (IntPtr)WM_KEYDOWN)
                System.Console.Out.WriteLine("Key Down: " + lParam.vkCode);
            else if (wParam == (IntPtr)WM_KEYUP)
                System.Console.Out.WriteLine("Key Up: " + lParam.vkCode);
        }

        // Pass the event on to the next hook in the chain.
        return CallNextHookEx(Hook, nCode, wParam, ref lParam);
    }

    public static bool RegisterHook()
    {
        lock (Lock)
        {
            if (IsRegistered)
                return true;
            // Keep a reference to the delegate so it is not garbage collected while hooked.
            Delegate = LowLevelKeyboardHandler;
            Hook = SetWindowsHookEx(
                WH_KEYBOARD_LL, Delegate,
                Marshal.GetHINSTANCE(
                    System.Reflection.Assembly.GetExecutingAssembly().GetModules()[0]
                ),
                0
            );

            if (Hook != IntPtr.Zero)
                return IsRegistered = true;
            Delegate = null;
            return false;
        }
    }

    public static bool UnregisterHook()
    {
        lock (Lock)
        {
            if (!IsRegistered)
                return true;
            if (!UnhookWindowsHookEx(Hook))
                return false;
            Hook = IntPtr.Zero;
            Delegate = null;
            IsRegistered = false; // the hook is gone, so clear the flag
            return true;
        }
    }
}
```

API/function hooking using a JMP instruction (splicing)

The following source code is an example of an API/function hooking method which hooks by overwriting the first six bytes of a destination function with a five-byte relative JMP instruction to a new function, followed by a RET. The code is compiled into a DLL file and then loaded into the target process using any method of DLL injection. Because a backup of the original bytes is kept, the hook can restore the first six bytes whenever it needs to call the original function, so the call is not intercepted again. In this example the Win32 API function MessageBoxW is hooked on 32-bit x86. [2]

```c
/*
 This idea is based on chrom-lib approach, Distributed under GNU LGPL License.
 Source chrom-lib: https://github.com/linuxexp/chrom-lib
 Copyright (C) 2011  Raja Jamwal
*/
#include <windows.h>
#include <string.h>

#define SIZE 6

typedef int (WINAPI *pMessageBoxW)(HWND, LPCWSTR, LPCWSTR, UINT); // MessageBox prototype
int WINAPI MyMessageBoxW(HWND, LPCWSTR, LPCWSTR, UINT);           // Our detour

void BeginRedirect(LPVOID);

pMessageBoxW pOrigMBAddress = NULL; // address of original
BYTE oldBytes[SIZE] = {0};          // backup
BYTE JMP[SIZE] = {0};               // 6 byte JMP instruction
DWORD oldProtect, myProtect = PAGE_EXECUTE_READWRITE;

INT APIENTRY DllMain(HMODULE hDLL, DWORD Reason, LPVOID Reserved)
{
    switch (Reason)
    {
    case DLL_PROCESS_ATTACH: // if attached
        pOrigMBAddress = (pMessageBoxW)
            GetProcAddress(GetModuleHandleA("user32.dll"), // get address of original
                           "MessageBoxW");
        if (pOrigMBAddress != NULL)
            BeginRedirect((LPVOID)MyMessageBoxW); // start detouring
        break;

    case DLL_PROCESS_DETACH:
        if (pOrigMBAddress != NULL) // only undo the patch if it was applied
        {
            VirtualProtect((LPVOID)pOrigMBAddress, SIZE, myProtect, &oldProtect);  // assign read write protection
            memcpy((LPVOID)pOrigMBAddress, oldBytes, SIZE);                        // restore backup
            VirtualProtect((LPVOID)pOrigMBAddress, SIZE, oldProtect, &myProtect);  // reset protection
        }
        break;

    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
        break;
    }
    return TRUE;
}

void BeginRedirect(LPVOID newFunction)
{
    BYTE tempJMP[SIZE] = {0xE9, 0x90, 0x90, 0x90, 0x90, 0xC3}; // 0xE9 = JMP 0x90 = NOP 0xC3 = RET
    memcpy(JMP, tempJMP, SIZE);                                 // store jmp instruction to JMP
    // calculate jump distance (the pointer-to-DWORD casts assume a 32-bit process)
    DWORD JMPSize = ((DWORD)newFunction - (DWORD)pOrigMBAddress - 5);
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE,                // assign read write protection
                   PAGE_EXECUTE_READWRITE, &oldProtect);
    memcpy(oldBytes, (LPVOID)pOrigMBAddress, SIZE);             // make backup
    memcpy(&JMP[1], &JMPSize, 4);                               // fill the nop's with the jump distance (JMP,distance(4bytes),RET)
    memcpy((LPVOID)pOrigMBAddress, JMP, SIZE);                  // set jump instruction at the beginning of the original function
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, oldProtect, &myProtect); // reset protection
}

int WINAPI MyMessageBoxW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uiType)
{
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, myProtect, &oldProtect); // assign read write protection
    memcpy((LPVOID)pOrigMBAddress, oldBytes, SIZE);                       // restore backup
    int retValue = MessageBoxW(hWnd, lpText, lpCaption, uiType);          // get return value of original function
    memcpy((LPVOID)pOrigMBAddress, JMP, SIZE);                            // set the jump instruction again
    VirtualProtect((LPVOID)pOrigMBAddress, SIZE, oldProtect, &myProtect); // reset protection
    return retValue;                                                      // return original return value
}
```
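The jump-distance arithmetic used by this technique can be checked in isolation. The sketch below only builds and decodes the five bytes of an x86 JMP rel32 instruction in a buffer; it does not patch or execute anything. The function names encodeJmpRel32 and decodeJmpTarget are hypothetical, and 32-bit addresses are assumed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Builds the 5-byte x86 "JMP rel32" that, when placed at address `from`,
// transfers control to address `to`.
std::array<std::uint8_t, 5> encodeJmpRel32(std::uint32_t from, std::uint32_t to) {
    // The displacement is relative to the address of the NEXT instruction,
    // i.e. from + 5. Unsigned arithmetic wraps, which yields the correct
    // two's-complement encoding for backward jumps.
    std::uint32_t rel = to - (from + 5);

    std::array<std::uint8_t, 5> bytes{};
    bytes[0] = 0xE9;                                        // JMP rel32 opcode
    bytes[1] = static_cast<std::uint8_t>(rel & 0xFF);       // little-endian displacement
    bytes[2] = static_cast<std::uint8_t>((rel >> 8) & 0xFF);
    bytes[3] = static_cast<std::uint8_t>((rel >> 16) & 0xFF);
    bytes[4] = static_cast<std::uint8_t>((rel >> 24) & 0xFF);
    return bytes;
}

// Recovers the jump target from an encoded instruction located at `from`.
std::uint32_t decodeJmpTarget(std::uint32_t from, const std::array<std::uint8_t, 5>& bytes) {
    assert(bytes[0] == 0xE9);
    std::uint32_t rel = static_cast<std::uint32_t>(bytes[1])
                      | (static_cast<std::uint32_t>(bytes[2]) << 8)
                      | (static_cast<std::uint32_t>(bytes[3]) << 16)
                      | (static_cast<std::uint32_t>(bytes[4]) << 24);
    return from + 5 + rel;
}
```

For a function at 0x1000 redirected to 0x2000, the displacement is 0x2000 - 0x1005 = 0x0FFB, so the encoded bytes are E9 FB 0F 00 00. This is the same subtraction of 5 that appears in the jump-distance calculation of the listing above.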

Netfilter hook

This example shows how to use hooking to filter network traffic in the Linux kernel using Netfilter. The hook drops incoming TCP packets destined for port 25 and accepts everything else.

```c
/* Written against the hook API of newer kernels; older kernels used
 * nf_register_hook() and a different hook function signature. */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/skbuff.h>

#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/in.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <net/net_namespace.h>

/* Port we want to drop packets on */
static const uint16_t port = 25;

/* This is the hook function itself */
static unsigned int hook_func(void *priv,
                              struct sk_buff *skb,
                              const struct nf_hook_state *state)
{
    struct iphdr *iph = ip_hdr(skb);
    struct tcphdr *tcph, tcpbuf;

    if (iph->protocol != IPPROTO_TCP)
        return NF_ACCEPT;

    tcph = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(*tcph), &tcpbuf);
    if (tcph == NULL)
        return NF_ACCEPT;

    /* tcph->dest is in network byte order, so convert before comparing */
    return (tcph->dest == htons(port)) ? NF_DROP : NF_ACCEPT;
}

/* Used to register our hook function */
static struct nf_hook_ops nfho = {
    .hook     = hook_func,
    .hooknum  = NF_INET_PRE_ROUTING,
    .pf       = NFPROTO_IPV4,
    .priority = NF_IP_PRI_FIRST,
};

static __init int my_init(void)
{
    return nf_register_net_hook(&init_net, &nfho);
}

static __exit void my_exit(void)
{
    nf_unregister_net_hook(&init_net, &nfho);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");
```
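The decision such a hook makes can also be sketched in user space, which makes the byte-order handling easy to test. The function below inspects a raw IPv4 packet buffer and returns a verdict; filterPacket and Verdict are hypothetical names and no kernel API is involved. It assumes a plain IPv4 packet with no link-layer header in front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Verdict { Accept, Drop };

constexpr std::uint8_t kProtoTcp = 6;   // protocol number of TCP in the IPv4 header

// Drops TCP packets whose destination port equals blockedPort and accepts
// everything else, including packets too short to inspect.
Verdict filterPacket(const std::uint8_t* pkt, std::size_t len, std::uint16_t blockedPort) {
    assert(pkt != nullptr);
    if (len < 20) return Verdict::Accept;              // shorter than a minimal IPv4 header

    std::size_t ihl = (pkt[0] & 0x0F) * 4u;            // IPv4 header length in bytes
    if (ihl < 20 || pkt[9] != kProtoTcp) return Verdict::Accept;
    if (len < ihl + 4) return Verdict::Accept;         // TCP port fields not present

    // The destination port occupies bytes 2-3 of the TCP header and is
    // stored big-endian (network byte order).
    std::uint16_t dest = static_cast<std::uint16_t>((pkt[ihl + 2] << 8) | pkt[ihl + 3]);
    return dest == blockedPort ? Verdict::Drop : Verdict::Accept;
}
```

Reading the port byte by byte avoids any dependence on the host's endianness, which is the same concern a kernel hook addresses when it compares the header field against a port converted with htons.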

Internal IAT hooking

The following code demonstrates how to hook functions that are imported from another module. It can also be used to hook functions in a process other than the calling one; in that case the code must be compiled into a DLL file and loaded into the target process using any method of DLL injection. An advantage of this method is that it tends to be less detectable by antivirus and anti-cheat software, and it can be turned into an external hook that makes no suspicious calls. The Portable Executable header contains the Import Address Table (IAT), which can be manipulated as shown in the source below. The source below runs under Microsoft Windows.

```c
#include <windows.h>
#include <stdio.h>
#include <string.h>

typedef int (__stdcall *pMessageBoxA)(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType); // This is the 'type' of the MessageBoxA call.
pMessageBoxA RealMessageBoxA; // This will store a pointer to the original function.

void DetourIATptr(const char *function, void *newfunction, HMODULE module);

int __stdcall NewMessageBoxA(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType) // Our fake function
{
    printf("The String Sent to MessageBoxA Was : %s\n", lpText);
    return RealMessageBoxA(hWnd, lpText, lpCaption, uType); // Call the real function
}

int main(int argc, CHAR *argv[])
{
    DetourIATptr("MessageBoxA", (void *)NewMessageBoxA, 0);         // Hook the function
    MessageBoxA(NULL, "Just A MessageBox", "Just A MessageBox", 0); // Call the function -- this will invoke our fake hook.
    return 0;
}

void **IATfind(const char *function, HMODULE module) // Find the IAT (Import Address Table) entry specific to the given function.
{
    if (module == 0)
        module = GetModuleHandle(0);

    PIMAGE_DOS_HEADER pImgDosHeaders = (PIMAGE_DOS_HEADER)module;
    if (pImgDosHeaders->e_magic != IMAGE_DOS_SIGNATURE) // validate before following e_lfanew
    {
        printf("libPE Error : e_magic is no valid DOS signature\n");
        return 0;
    }
    PIMAGE_NT_HEADERS pImgNTHeaders = (PIMAGE_NT_HEADERS)((LPBYTE)pImgDosHeaders + pImgDosHeaders->e_lfanew);
    PIMAGE_IMPORT_DESCRIPTOR pImgImportDesc = (PIMAGE_IMPORT_DESCRIPTOR)((LPBYTE)pImgDosHeaders + pImgNTHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);

    for (IMAGE_IMPORT_DESCRIPTOR *iid = pImgImportDesc; iid->Name != 0; iid++)
    {
        if (iid->OriginalFirstThunk == 0)
            continue; // no import name table for this module
        for (int funcIdx = 0; *(funcIdx + (LPVOID *)(iid->FirstThunk + (SIZE_T)module)) != NULL; funcIdx++)
        {
            SIZE_T thunk = *(funcIdx + (SIZE_T *)(iid->OriginalFirstThunk + (SIZE_T)module));
            if (IMAGE_SNAP_BY_ORDINAL(thunk))
                continue; // imported by ordinal, so there is no name to compare
            char *modFuncName = (char *)(thunk + (SIZE_T)module + 2); // skip the 2-byte hint
            if (!_stricmp(function, modFuncName))
                return funcIdx + (LPVOID *)(iid->FirstThunk + (SIZE_T)module);
        }
    }
    return 0;
}

void DetourIATptr(const char *function, void *newfunction, HMODULE module)
{
    void **funcptr = IATfind(function, module);
    if (funcptr == NULL || *funcptr == newfunction) // not found, or already hooked
        return;

    DWORD oldrights, newrights = PAGE_READWRITE;
    // Update the protection to READWRITE
    VirtualProtect(funcptr, sizeof(LPVOID), newrights, &oldrights);

    RealMessageBoxA = (pMessageBoxA)*funcptr; // Some compilers require the cast (like "MinGW"), not sure about MSVC though
    *funcptr = newfunction;

    // Restore the old memory protection flags.
    VirtualProtect(funcptr, sizeof(LPVOID), oldrights, &newrights);
}
```


References

Windows

Linux

Emacs

OS X and iOS

In Depth API Hooking