Ntoskrnl.exe

ntoskrnl.exe (short for Windows NT operating system kernel executable), also known as the kernel image, contains the kernel and executive layers of the Microsoft Windows NT kernel, and is responsible for hardware abstraction, process handling, and memory management. In addition to the kernel and executive layers, it contains the cache manager, security reference monitor, memory manager, scheduler (Dispatcher), and blue screen of death (the prose and portions of the code). [1]

Overview

x86 versions of ntoskrnl.exe depend on bootvid.dll, hal.dll and kdcom.dll (x64 variants of ntoskrnl.exe have these DLLs embedded in the kernel to improve performance). It is not a native application, however; in other words, it is not linked against ntdll.dll. Instead, ntoskrnl.exe contains a standard "start" entry point that calls the architecture-independent kernel initialization function. Because it requires a static copy of the C Runtime objects, the executable is usually about 10 MB in size.

In Windows XP and earlier, the Windows installation source ships four kernel image files to support uniprocessor systems, symmetric multiprocessor (SMP) systems, CPUs with PAE, and CPUs without PAE. Windows Setup decides whether the system is uniprocessor or multiprocessor, then installs both the PAE and non-PAE variants of the kernel image for that kind of system. On a multiprocessor system, Setup installs ntkrnlmp.exe and ntkrpamp.exe but renames them to ntoskrnl.exe and ntkrnlpa.exe respectively.

Starting with Windows Vista, Microsoft began unifying the kernel images as multi-core CPUs took to the market and PAE became mandatory.

Kernel image filenames

32-bit kernel (32-bit Windows)

| Filename | Supports SMP | Supports PAE |
|---|---|---|
| ntoskrnl.exe | No | No |
| ntkrnlmp.exe | Yes | No |
| ntkrnlpa.exe | No | Yes |
| ntkrpamp.exe | Yes | Yes |

64-bit kernel (x64 editions)

| Filename | Supports SMP | Supports 57-bit VA |
|---|---|---|
| ntkrnlmp.exe | Yes | No |
| ntkrla57.exe | Yes | Yes |
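
The selection and renaming that Setup performs on 32-bit Windows XP and earlier can be sketched as a small lookup. This is only an illustration of the rule described above: the filenames come from the table, while `KERNEL_IMAGES` and `installed_kernel_images` are hypothetical names, and the uniprocessor case is inferred from the text rather than stated in it.

```python
# Filenames are taken from the 32-bit table; all Python names are illustrative.
KERNEL_IMAGES = {
    # (supports_smp, supports_pae): filename on the installation source
    (False, False): "ntoskrnl.exe",
    (True, False): "ntkrnlmp.exe",
    (False, True): "ntkrnlpa.exe",
    (True, True): "ntkrpamp.exe",
}


def installed_kernel_images(multiprocessor: bool) -> dict:
    """Map each source file to the name it has on disk after Setup runs."""
    non_pae = KERNEL_IMAGES[(multiprocessor, False)]
    pae = KERNEL_IMAGES[(multiprocessor, True)]
    # Both variants end up under the uniprocessor names,
    # whichever source files were chosen.
    return {non_pae: "ntoskrnl.exe", pae: "ntkrnlpa.exe"}


print(installed_kernel_images(multiprocessor=True))
```

On a multiprocessor system, the mapping sends ntkrnlmp.exe to ntoskrnl.exe and ntkrpamp.exe to ntkrnlpa.exe, which matches the renaming described in the text.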

Routines in ntoskrnl use prefixes on their names to indicate in which component of ntoskrnl they are defined.

Because not all functions are exported by the kernel, functions whose prefixes end in i or p (such as Mi, Obp, Iop) are internal and are not intended to be called by outside code. These functions contain the core code and implement important checks, such as for vulnerabilities, missing arguments and exception handling.

The following table lists some of these prefixes.

NT function prefixes

| Prefix | Meaning |
|---|---|
| Cc | File system cache [2] |
| Cm | Configuration Manager, the kernel mode side of Windows Registry |
| Csr | Functions used to communicate with the Win32 subsystem process, csrss.exe (csrss stands for client/server runtime sub-system) |
| Dbg | Debugging aid functions, such as a software break point |
| Ex | Windows executive, an "outer layer" of Ntoskrnl.exe |
| Exp | Windows executive private (routines intended for the internal use of Windows Executive) |
| FsRtl | File system runtime library [3] |
| Io | I/O manager [4] |
| Iop | Internal and base code for the I/O Manager |
| Ke | Core kernel routines [5] |
| Ki | Internal core routines |
| Kx | Interrupt handling, semaphores, spinlocks, multithreading and context switching functions |
| Ky | Internal and stub functions that generate a trap frame and call Kx-prefixed functions |
| Ks | Kernel streaming |
| Kx | Internal functions for acquiring spinlock and semaphore implementations |
| Ldr | NT's PE Executables loader |
| Ldrp | Internal base code for the PE Loader, usually just PE relocate implementation. |
| Lpc | Local Procedure Call, an internal, undocumented, interprocess or user/kernel message passing mechanism |
| Lsa | Local Security Authority |
| Mm | Memory management |
| Mi | Memory management routines not exported for call outside the memory manager (i = internal) |
| Nls | Nls for Native Language Support (similar to code pages). |
| Ob | Object Manager |
| Obp | Internal base code for the Object Manager |
| Pfx | Prefix handling |
| Po | Plug-and-play and power management [6] |
| Ps | Process and thread management (task management) |
| Psp | Internal base code for the Task manager |
| Rtl | Runtime library, i.e., many utility functions that can be used by native applications, yet don't directly involve kernel support |
| Rtlp | Runtime library private (for internal use only) |
| Se | Security Manager, access token for the Win32 API |
| Sep | Internal base code for the Security Manager |
| Vf | Driver Verifier |
| Vi | Driver Verifier routines not exported for call outside the driver verifier |
| Zw | Nt or Zw are system calls declared in ntdll.dll and ntoskrnl.exe. When called from ntdll.dll in user mode, these groups are almost exactly the same; they trap into kernel mode and call the equivalent function in ntoskrnl.exe via the SSDT. When calling the functions directly in ntoskrnl.exe (only possible in kernel mode), the Zw variants ensure kernel mode, whereas the Nt variants do not. [7] |
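
The naming convention lends itself to a simple longest-prefix lookup. The sketch below uses a subset of the prefixes from the table. The function name `classify`, the shortened component descriptions, and the rule that an uppercase letter must follow the prefix are assumptions made for illustration. They are not part of any Microsoft tool.

```python
# A subset of the prefixes listed above, with shortened descriptions.
PREFIXES = {
    "Ex": "Windows executive",
    "Exp": "Windows executive private",
    "Io": "I/O manager",
    "Iop": "I/O manager internal",
    "Ke": "Core kernel routines",
    "Ki": "Internal core routines",
    "Mm": "Memory management",
    "Mi": "Memory management internal",
    "Ob": "Object Manager",
    "Obp": "Object Manager internal",
    "Rtl": "Runtime library",
    "Rtlp": "Runtime library private",
}


def classify(routine: str):
    """Return (prefix, component, is_internal), or None if nothing matches."""
    # Try longer prefixes first so that "Obp" wins over "Ob".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        rest = routine[len(prefix):]
        # Assumption: the name continues with an uppercase letter.
        if routine.startswith(prefix) and rest[:1].isupper():
            # Prefixes ending in "i" or "p" mark internal routines.
            return prefix, PREFIXES[prefix], prefix[-1] in "ip"
    return None


print(classify("KiSystemService"))
print(classify("IoCreateDevice"))
```

Sorting by length before matching keeps a name with an internal prefix such as Iop from being attributed to the shorter Io prefix.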

Initialization

When the kernel receives control, it gets a pointer to a structure from the boot loader. The structure contains information about the hardware, the path to the Windows Registry file, kernel parameters containing boot preferences or options that change the behavior of the kernel, and the paths of the files loaded by the boot loader (the SYSTEM Registry hive, NLS files for character encoding conversion, and the VGA font). [8] The definition of this structure can be retrieved by using the kernel debugger or by downloading it from the Microsoft symbol database. [9]

In the x86 architecture, the kernel receives the system already in protected mode, with the Global Descriptor Table (GDT), Interrupt Descriptor Table (IDT) and Task State Segment (TSS) ready. Because it does not know the address of each one, it has to retrieve them one by one to fill in the processor control region (PCR) structure.

The main entry point of ntoskrnl.exe performs some system-dependent initialization, then calls a system-independent initialization routine, and finally enters an idle loop.

Interrupt handling

Modern operating systems use interrupts instead of I/O port polling to wait for information from devices.

In the x86 architecture, interrupts are handled through the Interrupt Descriptor Table (IDT). When a device triggers an interrupt and the interrupt flag (IF) in the FLAGS register is set, the processor's hardware looks for an interrupt handler in the table entry corresponding to the interrupt number, which in turn has been translated from the IRQ by PIC chips or, in more modern hardware, the APIC. Interrupt handlers usually save some subset of the register state before handling the interrupt and restore the original values when done.

The interrupt table contains handlers for hardware interrupts, software interrupts, and exceptions. For some IA-32 versions of the kernel, one example of such a software interrupt handler (of which there are many) is in IDT entry 0x2E (46 in decimal), used in assembly language as INT 2EH for system calls. In the real implementation, the entry points to an internal subroutine named (as per symbol information published by Microsoft) KiSystemService. Newer versions instead use different mechanisms based on the SYSENTER instruction and, on x86-64, the SYSCALL instruction.
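
The table lookup described above can be modeled in a few lines. This is a toy sketch, not how the processor or the kernel represents the table: a real IDT entry is a gate descriptor, not a string, and the names `build_idt` and `dispatch` are hypothetical. Only the vector number and the KiSystemService symbol come from the text.

```python
SYSTEM_CALL_VECTOR = 0x2E  # the INT 2EH entry described above


def build_idt():
    """Return a 256-entry table; unused vectors hold None."""
    idt = [None] * 256
    idt[SYSTEM_CALL_VECTOR] = "KiSystemService"
    return idt


def dispatch(idt, vector, interrupt_flag=True, maskable=False):
    """Return the handler that would receive control, or None if held off."""
    # Maskable hardware interrupts wait while IF is clear;
    # software interrupts such as INT 2EH are not affected by IF.
    if maskable and not interrupt_flag:
        return None
    return idt[vector]


idt = build_idt()
print(SYSTEM_CALL_VECTOR, dispatch(idt, SYSTEM_CALL_VECTOR))
```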

One notable feature of NT's interrupt handling is that interrupts are usually conditionally masked based on their priority (called "IRQL"), instead of disabling all IRQs via the interrupt flag. This permits various kernel components to carry on critical operations without necessarily blocking services of peripherals and other devices. [10]
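
The difference between priority-based masking and disabling all interrupts can be illustrated with a toy model. The class below is a sketch under simplifying assumptions: the numeric levels are arbitrary, a single processor is modeled, and `Processor`, `raise_to`, `lower` and `request` are hypothetical names, not kernel routines.

```python
class Processor:
    """Toy model of IRQL-style masking on one processor."""

    def __init__(self):
        self.irql = 0        # current level; 0 masks nothing
        self.pending = []    # (level, name) pairs waiting for a lower level
        self.serviced = []   # names in the order they were handled

    def raise_to(self, level):
        self.irql = level

    def request(self, level, name):
        # Only interrupts above the current level are delivered immediately.
        if level > self.irql:
            self._service(level, name)
        else:
            self.pending.append((level, name))

    def _service(self, level, name):
        previous, self.irql = self.irql, level  # raise while the handler runs
        self.serviced.append(name)
        self.lower(previous)

    def lower(self, new_level):
        self.irql = new_level
        # Deliver deferred interrupts that now exceed the level, highest first.
        while True:
            ready = [p for p in self.pending if p[0] > self.irql]
            if not ready:
                break
            entry = max(ready, key=lambda p: p[0])
            self.pending.remove(entry)
            self._service(*entry)


cpu = Processor()
cpu.raise_to(5)            # a critical operation begins
cpu.request(3, "disk")     # at or below the current level: deferred
cpu.request(8, "clock")    # above the current level: handled at once
cpu.lower(0)               # the deferred interrupt is delivered now
print(cpu.serviced)
```

In this model the higher-priority request is serviced while the critical operation is still in progress, and the lower-priority one is delivered only after the level drops.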

Memory manager

The entire physical memory (RAM) address range is broken into many small blocks called pages, each 4 KB in size, which are mapped to virtual addresses. Some of the properties of each page are stored in structures called page table entries, which are managed by the OS and accessed by the processor's hardware. Page tables are organized into a tree structure, and the physical page number of the top-level table is stored in control register 3 (CR3).
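
As a worked example of the tree walk, the sketch below assumes the classic two-level layout of 32-bit x86 without PAE, in which an address splits into a 10-bit directory index, a 10-bit table index and a 12-bit page offset. Other modes use more levels and different field widths. The nested dictionaries stand in for the hardware tables, and the function names are hypothetical.

```python
PAGE_SIZE = 4096   # 4 KB pages
PAGE_SHIFT = 12    # log2(PAGE_SIZE)


def split_virtual_address(va: int):
    """Split a 32-bit address into (directory index, table index, offset)."""
    offset = va & (PAGE_SIZE - 1)              # low 12 bits
    table_index = (va >> PAGE_SHIFT) & 0x3FF   # next 10 bits
    directory_index = (va >> 22) & 0x3FF       # top 10 bits
    return directory_index, table_index, offset


def translate(va: int, page_directory: dict) -> int:
    """Walk nested dicts: directory index -> table index -> frame number."""
    d, t, offset = split_virtual_address(va)
    frame = page_directory[d][t]   # a KeyError stands in for a page fault
    return (frame << PAGE_SHIFT) | offset


# Hypothetical mapping: one page at virtual 0x80001000 backed by frame 0x1A2B.
page_directory = {0x200: {0x001: 0x1A2B}}
print(hex(translate(0x80001234, page_directory)))
```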


Microsoft Windows divides virtual address space into two regions. The lower part, starting at zero, is instantiated separately for each process and is accessible from both user and kernel mode. Application programs run in processes and supply code that runs in user mode. The upper part is accessible only from kernel mode, and with some exceptions, is instantiated just once, system-wide. Ntoskrnl.exe is mapped into this region, as are several other kernel mode components. This region also contains data used by kernel mode code, such as the kernel mode heaps and the file system cache.

Virtual Address Space Layouts [9]

| Arch | MmHighestUserAddress | MmSystemRangeStart |
|---|---|---|
| x86 [note 1] | 0x7fffffff | 0x80000000 |
| ARM | 0x7fffffff | 0x80000000 |
| x86-64 | 0x000007ff'ffffffff | 0xffff8000'00000000 |
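
The two boundaries are enough to decide which region an address belongs to. The sketch below uses the x86 and x86-64 values from the table, assuming the default x86 layout. `LAYOUTS` and `classify_address` are illustrative names only.

```python
LAYOUTS = {
    # arch: (MmHighestUserAddress, MmSystemRangeStart)
    "x86": (0x7FFFFFFF, 0x80000000),
    "x86-64": (0x000007FF_FFFFFFFF, 0xFFFF8000_00000000),
}


def classify_address(arch: str, address: int) -> str:
    highest_user, system_start = LAYOUTS[arch]
    if address <= highest_user:
        return "user"
    if address >= system_start:
        return "kernel"
    return "neither"   # the gap between the two ranges on x86-64


print(classify_address("x86", 0x00400000))
print(classify_address("x86-64", 0xFFFFF800_00000000))
```

On x86 the two regions are adjacent, so every 32-bit address falls into one of them. On x86-64 there is a large range between the two boundaries that belongs to neither.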

Registry

The Windows Registry is a repository for configuration and settings information for the operating system and for other software, such as applications. It can be thought of as a filesystem optimized for small files. [11] However, it is not accessed through file system-like semantics, but rather through a specialized set of APIs, implemented in kernel mode and exposed to user mode.

The registry is stored on disk as several different files called "hives." One, the System hive, is loaded early in the boot sequence and provides configuration information required at that time. Additional registry hives, providing software-specific and user-specific data, are loaded during later phases of system initialization and during user login, respectively.

Drivers

The list of drivers to be loaded from the disk is retrieved from the Services key of the current control set's key in the SYSTEM registry hive. That key stores device drivers, kernel processes and user processes. They are all collectively called "services" and are stored together in the same place.

During initialization or upon driver load request, the kernel traverses that tree looking for services tagged as kernel services.
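
The filtering step can be sketched over an in-memory snapshot of the Services key. The sketch relies on the conventional `Type` and `Start` values of service entries (type 0x1 for a kernel driver, 0x2 for a file system driver, 0x10 for a user-mode service process; start 0 for boot, 1 for system, 2 for automatic), which the text above does not spell out. The service names and the `drivers_to_load` function are hypothetical, and no real registry is read.

```python
SERVICE_KERNEL_DRIVER = 0x1
SERVICE_FILE_SYSTEM_DRIVER = 0x2
DRIVER_TYPES = SERVICE_KERNEL_DRIVER | SERVICE_FILE_SYSTEM_DRIVER

# Hypothetical snapshot of a Services key: subkey name -> values.
services = {
    "exampledisk": {"Type": 0x1, "Start": 0},   # kernel driver, boot start
    "examplefs": {"Type": 0x2, "Start": 0},     # file system driver, boot start
    "examplenet": {"Type": 0x1, "Start": 1},    # kernel driver, system start
    "examplesvc": {"Type": 0x10, "Start": 2},   # user-mode service, skipped
}


def drivers_to_load(entries: dict, start: int) -> list:
    """Names of entries tagged as drivers with the given start value."""
    return sorted(
        name
        for name, values in entries.items()
        if values["Type"] & DRIVER_TYPES and values["Start"] == start
    )


print(drivers_to_load(services, start=0))
```

Because drivers and user-mode services share the same key, the type check is what separates the entries the kernel cares about from the rest.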

Notes

  1. Tunable via /userva or /3gb switch.

References

  1. Russinovich, M: Systems Internals Tips and Trivia, SysInternals Information
  2. Microsoft Corporation (2009). "Cache Manager Routines". Microsoft Corporation. Retrieved 2009-06-13.
  3. Microsoft Corporation (2009). "File System Runtime Library Routines". Microsoft Corporation. Retrieved 2009-06-13.
  4. Microsoft Corporation (2009). "I/O Manager Routines". Microsoft Corporation. Retrieved 2009-06-13.
  5. Microsoft Corporation (2009). "Core Kernel Library Support Routines". Microsoft Corporation. Retrieved 2009-06-13.
  6. Microsoft Corporation (2009). "Power Manager Routines". Microsoft Corporation. Retrieved 2009-06-13.
  7. The NT Insider (August 27, 2003). "Nt vs. Zw - Clearing Confusion On The Native API". OSR Online. 10 (4). OSR Open Systems Resources. Retrieved 2013-09-16.
  8. "struct LOADER_PARAMETER_BLOCK". www.nirsoft.net.
  9. Practical Reverse Engineering Using X86, X64, Arm, Windows Kernel, and Reversing Tools. John Wiley & Sons Inc. 2014. ISBN 978-1118787311.
  10. CC Hameed (January 22, 2008). "What is IRQL and why is it important? | Ask the Performance Team Blog". Microsoft Corporation. Retrieved 2018-11-11.
  11. Tanenbaum, Andrew S. (2008). Modern operating systems (3rd ed.). Upper Saddle River, N.J.: Pearson Prentice Hall. p. 829. ISBN 978-0136006633.
