Program Segment Prefix

Last updated

The Program Segment Prefix (PSP) is a data structure used in DOS systems to store the state of a program. It resembles the Zero Page in the CP/M operating system. The PSP has the following structure:

Contents

OffsetSizeContents
00h–01h2 bytes (code) CP/M-80-like exit (always contains INT 20h) [1] [2]
02h–03h word (2 bytes)Segment of the first byte beyond the memory allocated to the program
04hbyteReserved
05h–09h5 bytes (code) CP/M-80-like far call entry into DOS, and program segment size [1] [3]
0Ah–0Dh dword (4 bytes)Terminate address of previous program (old INT 22h)
0Eh–11hdwordBreak address of previous program (old INT 23h)
12h–15hdwordCritical error address of previous program (old INT 24h)
16h–17hwordParent's PSP segment (usually COMMAND.COM - internal)
18h–2Bh20 bytes Job File Table (JFT) (internal)
2Ch–2Dhword Environment segment
2Eh–31hdwordSS:SP on entry to last INT 21h call (internal)
32h–33hwordJFT size (internal)
34h–37hdwordPointer to JFT (internal)
38h–3BhdwordPointer to previous PSP (only used by SHARE in DOS 3.3 and later)
3Ch–3Fh4 bytesReserved
40h–41hwordDOS version to return (DOS 5 and later, alterable via SETVER in DOS 5 and later)
42h–4Fh14 bytesReserved
50h–52h3 bytes (code) Unix-like far call entry into DOS (always contains INT 21h + RETF)
53h–54h2 bytesReserved
55h–5Bh7 bytesReserved (can be used to make first FCB into an extended FCB)
5Ch–6Bh16 bytesUnopened Standard FCB 1
6Ch–7Fh20 bytesUnopened Standard FCB 2 (overwritten if FCB 1 is opened)
80h1 byteNumber of bytes on command-line
81h–FFh127 bytesCommand-line tail (terminated by a 0Dh) [4] [5]

The PSP is most often used to get the command line arguments of a DOS program; for example, the command "FOO.EXE /A /F" executes FOO.EXE with the arguments '/A' and '/F'.

If the PSP entry for the command line length is non-zero and the pointer to the environment segment is neither 0000h nor FFFFh, programs should first try to retrieve the command line from the environment variable %CMDLINE% before extracting it from the PSP. This way, it is possible to pass command lines longer than 126 characters to applications.

The segment address of the PSP is passed in the DS register when the program is executed. It can also be determined later by using Int 21h function 51h or Int 21h function 62h. Either function will return the PSP address in register BX. [6]

Alternatively, in .COM programs loaded at offset 100h, one can address the PSP directly just by using the offsets listed above. Offset 000h points to the beginning of the PSP, 0FFh points to the end, etc.

For example, the following code displays the command line arguments:

org100h; .COM - not using ds; INT 21h subfunction 9 requires '$' to terminate stringxorbx,bxmovbl,[80h]cmpbl,7Ehjaexit; preventing overflowmovbyte[bx+81h],'$'; print the stringmovah,9movdx,81hint21hexit:movax,4C00h; subfunction 4Cint21h

In DOS 1.x, it was necessary for the CS (Code Segment) register to contain the same segment as the PSP at program termination, thus standard programming practice involved saving the DS register (since the DS register is loaded with the PSP segment) along with a zero word to the stack at program start and terminating the program with a RETF instruction, which would pop the saved segment value off the stack and jump to address 0 of the PSP, which contained an INT 20h instruction.

; savepushdsxorax,axpushax; move to the default data group (@data)movax,@datamovds,ax; print message in mess1 (21h subfunction 9)movdx,mess1movah,9int21hretf

If the executable was a .COM file, this procedure was unnecessary and the program could be terminated merely with a direct INT 20h instruction or else calling INT 21h function 0. However, the programmer still had to ensure that the CS register contained the segment address of the PSP at program termination. Thus,

jmpstartmess1db'Hello world!$'start:movdx,mess1movah,9int21hint20h

In DOS 2.x and higher, program termination was accomplished instead with INT 21h function 4Ch which did not require the CS register to contain the segment value of the PSP.

See also

Related Research Articles

x86 memory segmentation refers to the implementation of memory segmentation in the Intel x86 computer instruction set architecture. Segmentation was introduced on the Intel 8086 in 1978 as a way to allow programs to address more than 64 KB (65,536 bytes) of memory. The Intel 80286 introduced a second version of segmentation in 1982 that added support for virtual memory and memory protection. At this point the original mode was renamed to real mode, and the new version was named protected mode. The x86-64 architecture, introduced in 2003, has largely dropped support for segmentation in 64-bit mode.

<span class="mw-page-title-main">86-DOS</span> Discontinued computer operating system for x86 processors and predecessor to MS-DOS/PC DOS

86-DOS is a discontinued operating system developed and marketed by Seattle Computer Products (SCP) for its Intel 8086-based computer kit.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

<span class="mw-page-title-main">A20 line</span> Signal in the system bus of an x86-based computer system

The A20, or address line 20, is one of the electrical lines that make up the system bus of an x86-based computer system. The A20 line in particular is used to transmit the 21st bit on the address bus.

<span class="mw-page-title-main">High memory area</span>

In DOS memory management, the high memory area (HMA) is the RAM area consisting of the first 65520 bytes above the one megabyte in an IBM AT or compatible computer.

Fetching the instruction opcodes from program memory well in advance is known as prefetching and it is served by using a prefetch input queue (PIQ). The pre-fetched instructions are stored in a queue. The fetching of opcodes well in advance, prior to their need for execution, increases the overall efficiency of the processor boosting its speed. The processor no longer has to wait for the memory access operations for the subsequent instruction opcode to complete. This architecture was prominently used in the Intel 8086 microprocessor.

A fat binary is a computer executable program or library which has been expanded with code native to multiple instruction sets which can consequently be run on multiple processor types. This results in a file larger than a normal one-architecture binary file, thus the name.

<span class="mw-page-title-main">COM file</span> Type of simple executable file

A COM file is a type of simple executable file. On the Digital Equipment Corporation (DEC) VAX operating systems of the 1970s, .COM was used as a filename extension for text files containing commands to be issued to the operating system. With the introduction of Digital Research's CP/M, the type of files commonly associated with COM extension changed to that of executable files. This convention was later carried over to DOS. Even when complemented by the more general EXE file format for executables, the compact COM files remained viable and frequently used under DOS.

Turbo Assembler is an assembler for software development published by Borland in 1989. It runs on and produces code for 16- or 32-bit x86 MS-DOS and compatible on Microsoft Windows. It can be used with Borland's other language products: Turbo Pascal, Turbo Basic, Turbo C, and Turbo C++. The Turbo Assembler package is bundled with Turbo Linker and is interoperable with Turbo Debugger.

A File Control Block (FCB) is a file system structure in which the state of an open file is maintained. A FCB is managed by the operating system, but it resides in the memory of the program that uses the file, not in operating system memory. This allows a process to have as many files open at one time as it wants, provided it can spare enough memory for an FCB per file.

The Zero Page is a data structure used in CP/M systems for programs to communicate with the operating system. In 8-bit CP/M versions it is located in the first 256 bytes of memory, hence its name.

INT 13h is shorthand for BIOS interrupt call 13hex, the 20th interrupt vector in an x86-based computer system. The BIOS typically sets up a real mode interrupt handler at this vector that provides sector-based hard disk and floppy disk read and write services using cylinder-head-sector (CHS) addressing. Modern PC BIOSes also include INT 13h extension functions, originated by IBM and Microsoft in 1992, that provide those same disk access services using 64-bit LBA addressing; with minor additions, these were quasi-standardized by Phoenix Technologies and others as the EDD BIOS extensions.

In computing, the x86 memory models are a set of six different memory models of the x86 CPU operating in real mode which control how the segment registers are used and the default size of pointers.

The DOS MZ executable format is the executable file format used for .EXE files in DOS.

A Trace Vector Decoder (TVD) is computer software that uses the trace facility of its underlying microprocessor to decode encrypted instruction opcodes just-in-time prior to execution and possibly re-encode them afterwards. It can be used to hinder reverse engineering when attempting to prevent software cracking as part of an overall copy protection strategy.

<span class="mw-page-title-main">Debug (command)</span> Line-oriented debug utility in DOS

The line-oriented debugger DEBUG.EXE is an external command in operating systems such as DOS, OS/2 and Windows.

<span class="mw-page-title-main">Overlay (programming)</span>

In a general computing sense, overlaying means "the process of transferring a block of program code or other data into main memory, replacing what is already stored". Overlaying is a programming method that allows programs to be larger than the computer's main memory. An embedded system would normally use overlays because of the limitation of physical memory, which is internal memory for a system-on-chip, and the lack of virtual memory facilities.

On many computer operating systems, a computer process terminates its execution by making an exit system call. More generally, an exit in a multithreading environment means that a thread of execution has stopped running. For resource management, the operating system reclaims resources that were used by the process. The process is said to be a dead process after it terminates.

The DOS API is an API which originated with 86-DOS and is used in MS-DOS/PC DOS and other DOS-compatible operating systems. Most calls to the DOS API are invoked using software interrupt 21h. By calling INT 21h with a subfunction number in the AH processor register and other parameters in other registers, various DOS services can be invoked. These include handling keyboard input, video output, disk file access, program execution, memory allocation, and various other activities. In the late 1980s, DOS extenders along with the DOS Protected Mode Interface (DPMI) allow the programs to run in either 16-bit or 32-bit protected mode and still have access to the DOS API.

A trap flag permits operation of a processor in single-step mode. If such a flag is available, debuggers can use it to step through the execution of a computer program.

References

  1. 1 2 Taylor, Roger; Lemmons, Phil (June 1982). "Upward migration - Part 1: Translators - Using translation programs to move CP/M-86 programs to CP/M and MS-DOS" [Using translation programs to move CP/M programs to CP/M-86 and MS-DOS](PDF). BYTE . Vol. 7, no. 6. BYTE Publications Inc. pp. 321–322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 [342, 344]. ISSN   0360-5280. CODEN   BYTEDJ. Archived (PDF) from the original on 2020-01-16. Retrieved 2020-01-15. […] Gaining Access to CP/M-86 […] Gaining access to CP/M-86 requires placing the function code in the CL register, placing the byte parameter in the DL register or placing the word parameter in the DX register, placing the data segment in the DS register (the data segment is usually not changed for a converted program), and executing a software interrupt, INT #224. The result is returned in the AL register if it is a byte value; if the result is a word value, it is returned in both the AX and BX registers. Double-word values are returned with the offset in the BX registers and the segment in the ES register. Conversion of programs from CP/M-80 to CP/M-86, then, requires replacing the call to location 5 with the software interrupt INT #224. Another necessary change involves the warm boot. Under CP/M-80, the warm boot may be accessed by a system call with a function code of 0 for a jump to location 0. CP/M-86, however, does not support the jump to location 0. As a result, you must change this program exit in the translated program if the program is to run correctly. Provided that the call to location 5 is replaced with INT #224, that the warm boot change is made, and that the registers are mapped correctly, there should be little problem in getting the translated program to access the CP/M-86 system functions. […] Gaining Access to MS-DOS […] Although MS-DOS has a "preferred" mechanism through a soft-ware interrupt, INT #33, for accessing the system, an additional mechanism is provided for "preexisting" programs that is compatible with CP/M-80 calling conventions, at least for functions in the range of 0-36. As far as system calls within the allowed function range are concerned, the programmer doesn't have to do anything to translated programs to get them to run under MS-DOS other than to correctly map the registers. MS-DOS also supports the warm boot function of CP/M-80. A jump to location 0 under MS-DOS executes a software interrupt, INT #32, which is functionally a program end and the normal way to exit from a program. […] (13 pages)
  2. Paul, Matthias R. (2002-10-07) [2000]. "Re: Run a COM file". Newsgroup:  alt.msdos.programmer. Archived from the original on 2017-09-03. Retrieved 2017-09-03. (NB. Has details on DOS COM program calling conventions.)
  3. Necasek, Michal (2011-09-13). "Who needs the address wraparound, anyway?". OS/2 Museum. Archived from the original on 2020-02-19. Retrieved 2020-02-19. […] 86-DOS, and hence PC DOS/MS-DOS, used a clever trick. The byte at offset 5 of the PSP contained a far call opcode (9Ah); the word at offset 6 of the PSP contained the appropriate value to indicate program segment size, and also the offset part of the far call. The word at offset 8, which served as the segment part of the far call, was crafted such that when combined with the offset, it would wrap around (a well understood feature of the 8086 CPU) and point to address 0:C0h, which contains interrupt vector 30h. […] the CALL 5 interface works even in DOS emulation under Windows NT and OS/2, and those systems most certainly cannot run with the A20 line disabled. How does that work then? […] Rather than chopping off address bits, the system mirrors the five bytes at 0:C0h at 1000C0h. The same technique had been in fact used in DOS 5 and above running with DOS=HIGH. In that case, DOS makes sure that linear address 1000C0h contains the appropriate far call. […]
  4. Paul, Matthias R. (1997-07-01) [1994-01-01]. MSDOSTIPs Tips für den Umgang mit MS-DOS 5.0-7. MPDOSTIP (in German). Archived from the original on 2017-08-22. Retrieved 2013-10-25. (NB. MSDOSTIP.TXT is part of MPDOSTIP.ZIP, maintained up to 2001 and distributed on many sites at the time. The provided link points to a HTML-converted older version of the MSDOSTIP.TXT file.)
  5. Paul, Matthias R. (1997-05-01) [1995-03-01]. "Hinweise zu JPSofts 4DOS 5.5b/c, 5.51, 5.52a und NDOS". MPDOSTIP (in German). Archived from the original on 2016-11-04. Retrieved 2015-05-08. (NB. The provided link points to a HTML-converted version of the 4DOS5TIP.TXT file, which is part of the MPDOSTIP.ZIP collection.)
  6. "INT 21h,62h - Get PSP address (DOS 3.x)". Archived from the original on 2012-02-07.
  7. 1 2 3 Schulman, Andrew; Brown, Ralf D.; Maxey, David; Michels, Raymond J.; Kyle, Jim (1994) [November 1993]. Undocumented DOS: A programmer's guide to reserved MS-DOS functions and data structures - expanded to include MS-DOS 6, Novell DOS and Windows 3.1 (2 ed.). Reading, Massachusetts, USA: Addison Wesley. ISBN   0-201-63287-X. (xviii+856+vi pages, 3.5"-floppy) Errata:

Further reading