DOS MZ executable

Last updated

DOS MZ executable
Filename extension
.exe, .com, .dll
Internet media type application/x-dosexec, application/x-msdos-program
Magic number MZ
Type of format Binary, executable
Extended to New Executable
Linear Executable
Portable Executable

The DOS MZ executable format is the executable file format used for .EXE files in DOS.

Contents

The file can be identified by the ASCII string "MZ" (hexadecimal: 4D 5A) at the beginning of the file (the "magic number"). "MZ" are the initials of Mark Zbikowski, one of the leading developers of MS-DOS. [1]

The MZ DOS executable file is newer than the COM executable format and differs from it. The DOS executable header contains relocation information, which allows multiple segments to be loaded at arbitrary memory addresses, and it supports executables larger than 64k; however, the format still requires relatively low memory limits. These limits were later bypassed using DOS extenders.

Segment handling

The environment of an EXE program run by DOS is found in its Program Segment Prefix.

EXE files normally have separate segments for the code, data, and stack. Program execution begins at address 0 of the code segment, and the stack pointer register is set to whatever value is contained in the header information (thus if the header specifies a 512 byte stack, the stack pointer is set to 200h). It is possible to not use a separate stack segment and simply use the code segment for the stack if desired.

The DS (data segment) register normally contains the same value as the CS (code segment) register and is not loaded with the actual segment address of the data segment when an EXE file is initialized; it is necessary for the programmer to set it themselves, generally done via the following instructions:

MOVAX,@DATAMOVDS,AX

Termination

In the original DOS 1.x API, it was also necessary to have the CS register pointing to the segment with the PSP at program termination; this was done via the following instructions:

PUSHDSXORAX,AXPUSHAX

Program termination would then be performed by a RETF instruction, which would retrieve the original segment address with the PSP from the stack and then jump to address 0, which contained an INT 20h instruction.

The DOS 2.x API introduced a new program termination function, INT 21h Function 4Ch which does not require saving the PSP segment address at the start of the program, and Microsoft advised against the use of the older DOS 1.x method.

Compatibility

MZ DOS executables can be run from DOS and Windows 9x-based operating systems. 32-bit Windows NT-based operating systems can execute them using their built-in Virtual DOS machine (although some graphics modes are unsupported). 64-bit versions of Windows cannot execute them. Alternative ways to run these executables include DOSBox and DOSEMU.

MZ DOS executables can be created by linkers, like Digital Mars Optlink, MS linker, VALX or Open Watcom's WLINK; additionally, FASM can create them directly.

See also

Further reading

Related Research Articles

<span class="mw-page-title-main">Netwide Assembler</span> Assembler for the Intel x86 architecture

The Netwide Assembler (NASM) is an assembler and disassembler for the Intel x86 architecture. It can be used to write 16-bit, 32-bit (IA-32) and 64-bit (x86-64) programs. It is considered one of the most popular assemblers for Linux and x86 chips.

The Portable Executable (PE) format is a file format for executables, object code, DLLs and others used in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data. On NT operating systems, the PE format is used for EXE, DLL, SYS, MUI and other file types. The Unified Extensible Firmware Interface (UEFI) specification states that PE is the standard executable format in EFI environments.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

<span class="mw-page-title-main">A20 line</span> Signal in the system bus of an x86-based computer system

The A20, or address line 20, is one of the electrical lines that make up the system bus of an x86-based computer system. The A20 line in particular is used to transmit the 21st bit on the address bus.

In computer programming, a magic number is any of the following:

.exe is a common filename extension denoting an executable file for Microsoft Windows, OS/2, and DOS.

A fat binary is a computer executable program or library which has been expanded with code native to multiple instruction sets which can consequently be run on multiple processor types. This results in a file larger than a normal one-architecture binary file, thus the name.

<span class="mw-page-title-main">COM file</span> Type of simple executable file

A COM file is a type of simple executable file. On the Digital Equipment Corporation (DEC) VAX operating systems of the 1970s, .COM was used as a filename extension for text files containing commands to be issued to the operating system. With the introduction of Digital Research's CP/M, the type of files commonly associated with COM extension changed to that of executable files. This convention was later carried over to DOS. Even when complemented by the more general EXE file format for executables, the compact COM files remained viable and frequently used under DOS.

Turbo Assembler is an assembler for software development published by Borland in 1989. It runs on and produces code for 16- or 32-bit x86 MS-DOS and compatible on Microsoft Windows. It can be used with Borland's other language products: Turbo Pascal, Turbo Basic, Turbo C, and Turbo C++. The Turbo Assembler package is bundled with Turbo Linker and is interoperable with Turbo Debugger.

The Program Segment Prefix (PSP) is a data structure used in DOS systems to store the state of a program. It resembles the Zero Page in the CP/M operating system. The PSP has the following structure:

The Microsoft Macro Assembler (MASM) is an x86 assembler that uses the Intel syntax for MS-DOS and Microsoft Windows. Beginning with MASM 8.0, there are two versions of the assembler: One for 16-bit & 32-bit assembly sources, and another (ML64) for 64-bit sources only.

In CP/M-86, Concurrent CP/M-86, Personal CP/M-86, S5-DOS, DOS Plus, Concurrent DOS, FlexOS, Multiuser DOS, System Manager and REAL/32 as well as by SCP1700, CP/K and K8918-OS, CMD is the filename extension used by CP/M-style executable programs. It corresponds to COM in CP/M-80 and EXE in DOS. The same extension is used by the command-line interpreter CMD.EXE in OS/2 and Windows for batch files.

Mark "Zibo" Joseph Zbikowski is a former Microsoft Architect and an early computer hacker. He started working at the company only a few years after its inception, leading efforts in MS-DOS, OS/2, Cairo and Windows NT. In 2006, he was honored for 25 years of service with the company, the third employee to reach this milestone, after Bill Gates and Steve Ballmer. He retired the same year from Microsoft.

The New Executable is a 16-bit .exe file format, a successor to the DOS MZ executable format. It was used in Windows 1.0–3.x, Windows 9x, multitasking MS-DOS 4.0, OS/2 1.x, and the OS/2 subset of Windows NT up to version 5.0. A NE is also called a segmented executable.

INT 13h is shorthand for BIOS interrupt call 13hex, the 20th interrupt vector in an x86-based computer system. The BIOS typically sets up a real mode interrupt handler at this vector that provides sector-based hard disk and floppy disk read and write services using cylinder-head-sector (CHS) addressing. Modern PC BIOSes also include INT 13h extension functions, originated by IBM and Microsoft in 1992, that provide those same disk access services using 64-bit LBA addressing; with minor additions, these were quasi-standardized by Phoenix Technologies and others as the EDD BIOS extensions.

A program information file (PIF) defines how a given DOS program should be run in a multi-tasking environment, especially in order to avoid giving it unnecessary resources which could remain available to other programs. TopView was the originator of PIFs; they were then inherited and extended by DESQview and Microsoft Windows, where they are most often seen. PIFs are seldom used today in software due to the absence of DOS applications.

In computing, the x86 memory models are a set of six different memory models of the x86 CPU operating in real mode which control how the segment registers are used and the default size of pointers.

<span class="mw-page-title-main">Debug (command)</span> Line-oriented debug utility in DOS

The line-oriented debugger DEBUG.EXE is an external command in operating systems such as DOS, OS/2 and Windows.

On many computer operating systems, a computer process terminates its execution by making an exit system call. More generally, an exit in a multithreading environment means that a thread of execution has stopped running. For resource management, the operating system reclaims resources that were used by the process. The process is said to be a dead process after it terminates.

This is a comparison of binary executable file formats which, once loaded by a suitable executable loader, can be directly executed by the CPU rather than being interpreted by software. In addition to the binary application code, the executables may contain headers and tables with relocation and fixup information as well as various kinds of meta data. Among those formats listed, the ones in most common use are PE, ELF, Mach-O and MZ.

References

  1. Inside Windows: An In-Depth Look into the Win32 Portable Executable File Format - MSDN Magazine, February 2002 Archived 2018-07-11 at the Wayback Machine . "Every PE file begins with a small MS-DOS executable. ... The first bytes of a PE file begin with the traditional MS-DOS header, called an IMAGE_DOS_HEADER. The only two values of any importance are e_magic and e_lfanew. ... The e_magic field (a WORD) needs to be set to the value 0x5A4D. ... In ASCII representation, 0x5A4D is MZ, the initials of Mark Zbikowski, one of the original architects of MS-DOS."