DOS MZ executable

Filename extension: .exe, .com, .dll
Internet media type: application/x-dosexec, application/x-msdos-program, application/x-ms-dos-executable
Magic number: MZ
Type of format: Binary, executable
Extended to: New Executable, Linear Executable, Portable Executable

The DOS MZ executable format is the executable file format used for .EXE files in DOS.

The file can be identified by the ASCII string "MZ" (hexadecimal: 4D 5A) at the beginning of the file (the "magic number"). "MZ" are the initials of Mark Zbikowski, one of the leading developers of MS-DOS. [1]
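As a minimal sketch of the signature check described above, the following Python snippet tests whether a buffer starts with the two bytes 4D 5A. The function name `has_mz_magic` is hypothetical and chosen only for illustration.

```python
MZ_MAGIC = b"MZ"  # the two bytes 0x4D 0x5A


def has_mz_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the two-byte MZ signature."""
    return data[:2] == MZ_MAGIC


# Read as a little-endian 16-bit word, the same two bytes give 0x5A4D.
assert int.from_bytes(MZ_MAGIC, "little") == 0x5A4D
```

A matching signature only shows that the file starts with an MZ header; the New Executable, Linear Executable and Portable Executable formats begin with the same two bytes.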

The MZ executable format is newer than the COM executable format and differs from it. The MZ header contains relocation information, which allows multiple segments to be loaded at arbitrary memory addresses, and the format supports executables larger than 64 KB; however, programs in this format are still subject to the relatively low memory limits of DOS. These limits were later bypassed using DOS extenders.
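The role of the relocation information can be sketched in Python as follows. The field names follow the commonly documented IMAGE_DOS_HEADER naming, and the layout used here (fourteen little-endian 16-bit words, followed by a table of offset/segment pairs) is an assumption that the text above does not spell out. The helper names are hypothetical.

```python
import struct
from collections import namedtuple

# Assumed layout: fourteen little-endian 16-bit words at the start of the file.
MZHeader = namedtuple(
    "MZHeader",
    "e_magic e_cblp e_cp e_crlc e_cparhdr e_minalloc e_maxalloc "
    "e_ss e_sp e_csum e_ip e_cs e_lfarlc e_ovno",
)
HEADER_FORMAT = "<14H"


def parse_header(data: bytes) -> MZHeader:
    """Unpack the fixed part of the header and check the signature."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("too short for an MZ header")
    hdr = MZHeader._make(struct.unpack_from(HEADER_FORMAT, data, 0))
    if hdr.e_magic != 0x5A4D:  # "MZ" read as a little-endian word
        raise ValueError("missing MZ signature")
    return hdr


def relocation_entries(data: bytes, hdr: MZHeader):
    """Return the relocation table as a list of (offset, segment) pairs."""
    return [
        struct.unpack_from("<HH", data, hdr.e_lfarlc + 4 * i)
        for i in range(hdr.e_crlc)
    ]


def apply_relocations(image: bytearray, entries, load_segment: int) -> None:
    """Add the load segment to the 16-bit word at each relocation target.

    `image` is the load module, i.e. the file contents after the header.
    """
    for offset, segment in entries:
        pos = segment * 16 + offset
        (value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (value + load_segment) & 0xFFFF)
```

Each relocation entry points at a word in the program that holds a segment value relative to the start of the image; adding the segment at which the image was actually loaded turns it into a usable segment address, which is why the program can be placed anywhere in memory.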

Segment handling

The environment of an EXE program run by DOS is found in its Program Segment Prefix.

EXE files normally have separate segments for the code, data, and stack. Program execution begins at the entry point given by the initial CS and IP values in the header (often offset 0 of the code segment), and the stack pointer register is set to the value contained in the header (thus if the header specifies a 512-byte stack, the stack pointer is set to 200h). It is possible to omit the separate stack segment and simply use the code segment for the stack if desired.
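The arithmetic behind these initial register values can be sketched as follows. Real-mode addresses are formed as segment × 16 + offset; the load segment 1000h and the header values used in the usage note are hypothetical, and the parameter names `e_cs`, `e_ip`, `e_ss` and `e_sp` assume the usual header field naming.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: the segment is shifted left by 4 bits, then the offset is added."""
    return (segment << 4) + offset


def initial_registers(load_segment: int, e_cs: int, e_ip: int, e_ss: int, e_sp: int) -> dict:
    """Compute the startup CS:IP and SS:SP for an image loaded at load_segment.

    The segment values stored in the header are relative to the start of the
    loaded image, so the load segment is added to them; IP and SP are used as-is.
    """
    return {
        "CS": (load_segment + e_cs) & 0xFFFF,
        "IP": e_ip,
        "SS": (load_segment + e_ss) & 0xFFFF,
        "SP": e_sp,
    }
```

For an image loaded at segment 1000h with the code segment first, a stack segment 10h paragraphs later and a 512-byte stack, `initial_registers(0x1000, 0, 0, 0x10, 0x200)` gives CS:IP = 1000h:0000h and SS:SP = 1010h:0200h, so `linear_address(0x1010, 0x200)` places the top of the stack at 10300h.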

When an EXE file starts, the DS (data segment) register is not loaded with the segment address of the program's data segment; like ES, it initially points to the segment containing the PSP. The programmer must therefore set it explicitly, generally with the following instructions:

MOV AX, @DATA
MOV DS, AX

Termination

In the original DOS 1.x API, it was also necessary to have the CS register pointing to the segment with the PSP at program termination; this was arranged by pushing the PSP segment (held in DS at startup) and a zero offset onto the stack at the start of the program:

PUSH DS
XOR AX, AX
PUSH AX

Program termination would then be performed by a RETF instruction, which would retrieve the original segment address with the PSP from the stack and then jump to address 0, which contained an INT 20h instruction.

The DOS 2.x API introduced a new program termination function, INT 21h function 4Ch, which does not require saving the PSP segment address at the start of the program. Microsoft advised against the use of the older DOS 1.x method.
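To illustrate how little a program needs when it terminates through INT 21h function 4Ch, the following sketch assembles a tiny MZ file by hand in Python. The header layout (fourteen little-endian 16-bit words, padded to two paragraphs) and the chosen field values are assumptions for illustration, the function name `build_minimal_exe` is hypothetical, and the resulting file has not been tested on a DOS system here.

```python
import struct

# Hand-assembled 8086 machine code:
#   B8 00 4C   MOV AX, 4C00h   ; AH = 4Ch (terminate), AL = 00h (exit code)
#   CD 21      INT 21h
TERMINATE_CODE = bytes([0xB8, 0x00, 0x4C, 0xCD, 0x21])

HEADER_PARAGRAPHS = 2  # header padded to 32 bytes (a paragraph is 16 bytes)


def build_minimal_exe(code: bytes = TERMINATE_CODE) -> bytes:
    """Return the bytes of a small MZ file whose entry point is offset 0 of `code`."""
    header_size = HEADER_PARAGRAPHS * 16
    total = header_size + len(code)
    header = struct.pack(
        "<14H",
        0x5A4D,               # signature "MZ"
        total % 512,          # bytes used in the last 512-byte page
        (total + 511) // 512, # number of 512-byte pages in the file
        0,                    # no relocation entries
        HEADER_PARAGRAPHS,    # header size in paragraphs
        0x10,                 # minimum extra paragraphs (room for the stack)
        0xFFFF,               # maximum extra paragraphs
        0,                    # initial SS, relative to the load segment
        0x0100,               # initial SP
        0,                    # checksum (left as zero)
        0,                    # initial IP
        0,                    # initial CS, relative to the load segment
        28,                   # offset of the (empty) relocation table
        0,                    # overlay number
    )
    return header.ljust(header_size, b"\x00") + code
```

Because the program exits with function 4Ch, it does not need the `PUSH DS` / `XOR AX, AX` / `PUSH AX` prologue or a final `RETF`. The bytes returned by `build_minimal_exe()` could be written to a file with an .exe extension and tried in an emulator such as DOSBox.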

Compatibility

MZ DOS executables can be run from DOS and Windows 9x-based operating systems. 32-bit Windows NT-based operating systems can execute them using their built-in Virtual DOS machine (although some graphics modes are unsupported). 64-bit versions of Windows cannot execute them. Alternative ways to run these executables include DOSBox and DOSEMU.

MZ DOS executables can be created by linkers, like Digital Mars Optlink, MS linker, VALX or Open Watcom's WLINK; additionally, FASM can create them directly.

References

  1. "Inside Windows: An In-Depth Look into the Win32 Portable Executable File Format". MSDN Magazine, February 2002. Archived 2018-07-11 at the Wayback Machine. "Every PE file begins with a small MS-DOS executable. ... The first bytes of a PE file begin with the traditional MS-DOS header, called an IMAGE_DOS_HEADER. The only two values of any importance are e_magic and e_lfanew. ... The e_magic field (a WORD) needs to be set to the value 0x5A4D. ... In ASCII representation, 0x5A4D is MZ, the initials of Mark Zbikowski, one of the original architects of MS-DOS."