Burroughs large systems descriptors

Descriptors [1] [2] are an architectural feature of Burroughs large systems, including the current (as of 2024) Unisys Clearpath/MCP systems. Besides being stack-based and tag-based, these systems are descriptor-based: descriptors are the means of referring to data that does not reside on the stack, such as arrays and objects. Descriptors are also used for string data, as in compilers and commercial applications.

Descriptors are integral to the automatic memory management system and virtual memory. Descriptors contain metadata about memory blocks, including address, length and machine type (word or byte, the latter for strings). Descriptors provide essential memory protection, security and safety by catching all attempts at out-of-bounds access and buffer overflow. Descriptors are a form of capability system. [3]

History

The descriptor was Burroughs's means of implementing memory management, allocation and deallocation, as well as virtual memory. In 1958, Robert S. Barton, then at Shell Research, suggested that main storage should be allocated automatically, rather than requiring the programmer to manage overlays from secondary memory; this is, in effect, virtual memory. [4] :49 [5] Virtual memory was originally developed for the Atlas project at the University of Manchester in the late 1950s.

By 1960, Barton had become lead architect on the Burroughs B5000 project. From 1959 to 1961, W.R. Lonergan was manager of the Burroughs Product Planning Group which included Barton, Donald Knuth as consultant, and Paul King. According to Lonergan, in May 1960, UCLA ran a two-week seminar ‘Using and Exploiting Giant Computers’ to which Paul King and two others were sent. Stan Gill gave a presentation on virtual memory in the Atlas I computer. Paul King took the ideas back to Burroughs and it was determined that virtual memory should be designed into the core of the B5000. [4] :3 Burroughs Corporation released the B5000 in 1964 as the first commercial computer with virtual memory. [6]

In the mid-1960s, other vendors, including RCA and General Electric, developed machines that supported virtual memory, with operating systems oriented towards time-sharing. IBM did not support virtual memory in machines intended for general data processing until 1972. However, these other systems are not descriptor-based; they added virtual memory on top of the basic processor architecture.

The descriptor was an essential part of the development of the B5000 with automatic memory management and virtual memory at Burroughs. While the details have changed since 1964, the idea of the descriptor essentially remains the same up until the current Unisys Clearpath MCP (Libra) machines which are direct descendants of the B5000.

Details

Descriptors describe storage areas, I/O requests and I/O results. They contain fields indicating, e.g. the type of descriptor, the address, the length, whether the data are present in storage. The details differ depending on the product line and the type of descriptor. The following text numbers the leftmost (most significant) bit as 0, in accordance with the Burroughs documentation.

Program and data descriptors have a bit called the "presence bit" or "p-bit"; it indicates whether the object referred to by the descriptor is currently in memory or in secondary storage. This bit is used to implement virtual memory.

When a descriptor is referenced, the hardware checks the p-bit. If it is 1, the data is present in memory at the location indicated in the address field. If the p-bit is 0, the data block is not present, an interrupt (p-bit interrupt) is raised, and MCP code is entered to make the block present. In this case, if the address field is 0, the data block has not been allocated (init p-bit) and the MCP searches for a free block, the size of which is given in the length field.

The last p-bit scenario is when the p-bit is 0, indicating that the data is not in memory, but the address is non-zero, indicating that the data has been allocated and in this case the address represents a disk address in the virtual memory area on disk. In this case a p-bit interrupt is raised and it is noted as an "other" p-bit.
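The three cases described above can be sketched in a few lines of Python. This is only an illustration of the decision the text describes, not MCP code; the `Descriptor` class and the `classify_reference` function are hypothetical names introduced here, and the real descriptor is a tagged 48-bit word rather than an object with named fields.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical, simplified stand-in for a data descriptor."""
    present: bool  # the p-bit
    address: int   # memory address if present, otherwise a disk address (0 = never allocated)
    length: int    # block length in words


def classify_reference(d: Descriptor) -> str:
    """Return what a reference through this descriptor would trigger."""
    if d.present:
        # p-bit is 1: the data is in memory at d.address.
        return "access"
    if d.address == 0:
        # p-bit is 0 and no address: first touch, a block of d.length words must be found.
        return "init p-bit"
    # p-bit is 0 and the address is non-zero: the block was rolled out to disk.
    return "other p-bit"
```

For example, `classify_reference(Descriptor(False, 0, 100))` yields `"init p-bit"`, while the same descriptor with a non-zero address yields `"other p-bit"`.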

B5000, B5500 and B5700

The B5000 [7] was the first descriptor-based computer. Every descriptor has a flag (bit 0) of 1. Data descriptors contain a 15-bit storage address and a 10-bit size, as do most I/O descriptors. Program descriptors and External Result descriptors have a 15-bit address but no size field.

B5x00 Program Descriptors

Program descriptors are used to define program segments. Either an Operand Call or a Descriptor Call that refers to a Program Descriptor will cause a subroutine call if the presence bit is 1. Otherwise, it will cause a presence interrupt.

B5x00 Program Descriptors [7] :4-2

Bits    Field     Value
0       Flag      1
1       I         1
2       Presence  0=not in memory, 1=in memory
3       I         1
4       Mode      0=word, 1=character
5       A         0=argument not required, 1=argument required
6-32
33-47   Address

Data Descriptors

Data Descriptors refer either to a block of data present in memory (P=1) or to a block of data that must be read into memory (P=0), depending on the value of the Presence bit. Either an Operand Call or a Descriptor Call that refers to a Data Descriptor will check the presence bit and the size field; if the presence bit is 0 then a presence interrupt will occur; if the size field is nonzero then the second word on the stack must be within range or an index interrupt will occur.
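The checks made on a reference through a Data Descriptor can be sketched as follows. This is a simplified model, not the hardware algorithm: the function name and exception classes are hypothetical, "within range" is assumed to mean an index from 0 up to but not including the size, and a zero size is assumed to mean that no indexing takes place.

```python
class PresenceInterrupt(Exception):
    """Stands in for the presence interrupt (block not in memory)."""


class IndexInterrupt(Exception):
    """Stands in for the index interrupt (index out of range)."""


def element_address(present: bool, size: int, address: int, index: int) -> int:
    """Return the address of the indexed word, or raise the matching interrupt."""
    if not present:
        raise PresenceInterrupt("block must be read into memory first")
    if size == 0:
        # Assumption: a zero size field means the descriptor is not indexed.
        return address
    if not 0 <= index < size:
        raise IndexInterrupt(f"index {index} outside 0..{size - 1}")
    return address + index
```

With a present 10-word block at address 1024, an index of 3 gives address 1027, while an index of 10 raises `IndexInterrupt`.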

B5x00 Data Descriptors [7] :4-3

Bits    Field     Value
0       Flag      1
1       I         0
2       Presence  0=not in memory, 1=in memory
3-7               Reserved for MCP
8-17    Size
18                Reserved for MCP
19      I         0=Floating, 1=Integer
20      C         [D5 1]
21-32             Reserved for MCP
33-47   Address

  1. Continuity bit - for controlling the type of interrupt caused by a program release operator
    0=Set the Program Release Interrupt - I/O areas not tanked or last I/O area
    1=Set the Continuity Interrupt - I/O areas are tanked
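As an illustration of the bit numbering, the following Python sketch extracts the named fields of a B5x00 data descriptor from a 48-bit integer, using the bit ranges listed above with bit 0 as the most significant bit. The helper names `field` and `decode_data_descriptor` are hypothetical, and the reserved bits are simply ignored.

```python
WORD_BITS = 48


def field(word: int, first: int, last: int) -> int:
    """Extract bits first..last of a 48-bit word, where bit 0 is the most significant."""
    width = last - first + 1
    shift = WORD_BITS - 1 - last
    return (word >> shift) & ((1 << width) - 1)


def decode_data_descriptor(word: int) -> dict:
    """Split a data descriptor into its named fields (reserved bits are skipped)."""
    return {
        "flag": field(word, 0, 0),
        "presence": field(word, 2, 2),
        "size": field(word, 8, 17),         # 10-bit size
        "integer": field(word, 19, 19),     # 0=Floating, 1=Integer
        "continuity": field(word, 20, 20),
        "address": field(word, 33, 47),     # 15-bit address
    }


# A present 100-word block at octal address 1234 (example values only).
example = (1 << 47) | (1 << 45) | (100 << 30) | 0o1234
decoded = decode_data_descriptor(example)
```

Here `decoded["size"]` is 100 and `decoded["address"]` is 668 (octal 1234), with the flag and presence bits both 1.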

I/O Descriptors

B5x00 I/O Descriptors [7] :4-4–4-13

Bits    Field               Value
0       Flag                1
1       I                   1
2       Alternate External  0=write, 1=read
3-7     Unit
8-17    Size
18-32   Device Dependent
33-47   Address

External Result Descriptors

An External Result Descriptor contains the I/O Descriptor used to initiate the operation with some fields replaced.

B5x00 External Result Descriptors [7] :4-14–4-15

Bits    Field               Value
0       Flag                1
1       I                   1
2       Alternate External  0=write, 1=read
3-7     Unit
8-25    Irrelevant
26-32   Device Dependent    error conditions
33-47   Address             last location

      B6500, B7500 and successors

Descriptors describe data blocks. Each descriptor contains a 20-bit address field referencing the data block. Each block has a length, which is stored in the descriptor and is also 20 bits. A three-bit field gives the size of the data elements: 4, 6, 8 or 48 bits.

The first computer with this architecture was the B6500. In that implementation, the meaning of the various status bits was:

      In later implementations, these status bits evolved to keep up with growing memory sizes and insights gained over time.

      Usage in compilers

In ALGOL, the bounds of an array are completely dynamic and can be taken from values computed at run time. In Pascal, which came later and was based on ALGOL, the size of arrays is fixed at compile time. This is the main weakness of Pascal as defined in its standard, but it has been removed in many commercial implementations of Pascal, notably the Burroughs implementations (both the University of Tasmania version by Arthur Sale and Roy Freak, and the later implementation on the Burroughs Slice compiler system by Matt Miller et al.).

      In a program in the Burroughs/Unisys MCP environment, an array is not allocated when it is declared, but only when it is touched at run time for the first time – thus arrays can be declared and the overhead of allocating them avoided if they are not used. Memory management is thus completely dynamic.
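The allocate-on-first-touch behaviour can be imitated in Python as a rough analogy. The `LazyArray` class below is hypothetical and purely illustrative: on the real systems the deferral is done by the hardware and the MCP through the presence bit, not by library code.

```python
class LazyArray:
    """Array whose storage is allocated only when it is first touched."""

    def __init__(self, length: int):
        self.length = length
        self._block = None  # declaring the array allocates nothing

    @property
    def allocated(self) -> bool:
        return self._block is not None

    def _touch(self, index: int) -> None:
        if not 0 <= index < self.length:
            raise IndexError(f"index {index} outside 0..{self.length - 1}")
        if self._block is None:
            # Stands in for the MCP finding a free block of the declared length.
            self._block = [0] * self.length

    def __getitem__(self, index: int) -> int:
        self._touch(index)
        return self._block[index]

    def __setitem__(self, index: int, value: int) -> None:
        self._touch(index)
        self._block[index] = value


a = LazyArray(1000)
declared_only = a.allocated   # still unallocated after declaration
a[3] = 7
after_first_touch = a.allocated
```

An array that is declared but never indexed never pays for its storage, which is the point made above.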

Also, low-level memory allocation system calls such as the malloc class of calls of C and Unix are not needed, since arrays are automatically allocated as used. This relieves the programmer of the error-prone activity of manual memory management, which matters greatly in mainframe applications.

When programs are ported from lower-level languages such as C, the C memory structure is dealt with by the C program doing its own memory allocation within a large allocated block; thus the security of the rest of the system cannot be compromised by buffer overflows in errant C programs. In fact, many buffer overruns in apparently otherwise correctly running C programs have been caught when ported to the Unisys MCP architecture. [8] C, like Pascal, is also implemented using the Slice compiler system, which uses a common code generator and optimizer for all languages. The C compiler, run-time system, POSIX interfaces, and ports of many Unix tools were done by Steve Bartels. An Eiffel compiler was also developed using Slice.

For object-oriented programs which require more dynamic creation of objects than the MCP architecture provides, objects are best allocated within a single block. Such object allocation is higher level than C's malloc and is best implemented with a modern efficient garbage collector.

      Integration in memory architecture

The address field in the B5000, B5500 and B5700 was only 15 bits, which meant that only 32K words (192 KB) of memory could be addressed by descriptors. The address field in the B6500 grew to 20 bits, or 1M words (6 MB). By the mid-seventies this was still a significant restriction of the architecture. To overcome this, two solutions were implemented:

      1. Swapper – this solution actually implements another layer on top of memory management, moving large clusters of related data in and out of memory at once.
      2. ASN – this solution allowed physically more memory to be configured in a system, divided into separately addressable chunks. This architecture became known as ASN (Address Space Number) memory. Memory was logically divided into two areas, allocating low memory addresses to a Global address space for the operating system and support software and high memory addresses to several parallel Local address spaces for individual programs. Address spaces are numbered, zero indicating Global, 1..n indicating the local address spaces. Programs sharing data are automatically placed in the same address space.

No program code modifications were necessary for these features to be utilized. Both solutions could be combined, but eventually the MCP memory requirements and program data sharing requirements outgrew the maximum size of the address spaces themselves.
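The limits quoted for the 15-bit and 20-bit address fields follow directly from the 48-bit (6-byte) word size. A short calculation, with the function name `addressable` chosen here only for illustration:

```python
BYTES_PER_WORD = 6  # a 48-bit word holds six 8-bit bytes


def addressable(address_bits: int) -> tuple[int, int]:
    """Return (words, bytes) reachable through an address field of the given width."""
    words = 2 ** address_bits
    return words, words * BYTES_PER_WORD


b5000_words, b5000_bytes = addressable(15)  # 32,768 words, 196,608 bytes (192 KB)
b6500_words, b6500_bytes = addressable(20)  # 1,048,576 words, 6,291,456 bytes (6 MB)
```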

With the advent of the A Series in the early 1980s, the meaning of this field was changed to contain the address of a master descriptor. This meant that 1 megabyte data blocks could be allocated, while the machine memory could be greatly expanded to gigabytes or perhaps terabytes. This architecture was named ASD (Advanced Segment Descriptors) memory. It required a new common microcode specification, referred to as Beta. The main visionary behind ASD memory was John McClintock. Later the 3-bit memory tag was increased to a 4-bit specification, allowing the segment descriptor to grow from 20 to 23 bits in size, so that even more memory could be addressed simultaneously. This microcode specification became known as level Gamma.

      Memory management

      Another significant advantage was realized for virtual memory. In the B5000 design, if a data block were rolled out, all descriptors referencing that block needed to be found in order to update the presence bit and address. With the master descriptor, only the presence bit in the master descriptor needs changing. Also the MCP can move blocks around in memory for compaction and only needs to change the address in the master descriptor.
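The benefit of the extra level of indirection can be shown with a small Python model. The class and function names are hypothetical and the fields are heavily simplified; the only point illustrated is that every copy reads its state through the single master descriptor, so rolling a block out or moving it is one update.

```python
class MasterDescriptor:
    """Single authoritative record for one data block."""

    def __init__(self, address: int, length: int):
        self.present = True
        self.address = address
        self.length = length


class DescriptorCopy:
    """A reference held by a program; it stores no presence or address of its own."""

    def __init__(self, master: MasterDescriptor):
        self.master = master

    @property
    def present(self) -> bool:
        return self.master.present

    @property
    def address(self) -> int:
        return self.master.address


def roll_out(master: MasterDescriptor, disk_address: int) -> None:
    """Mark the block absent; no search for copies is needed."""
    master.present = False
    master.address = disk_address


def relocate(master: MasterDescriptor, new_address: int) -> None:
    """Move the block during compaction; again only the master changes."""
    master.address = new_address


master = MasterDescriptor(address=4096, length=50)
copies = [DescriptorCopy(master) for _ in range(3)]
relocate(master, 8192)
moved = [c.address for c in copies]
roll_out(master, disk_address=77)
still_present = [c.present for c in copies]
```

After `relocate` and `roll_out`, all three copies see the new address and the cleared presence bit without having been visited individually.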

A difference between the B5000 and most other systems is that other systems mainly used paged virtual memory; that is, pages are swapped out in fixed-size chunks regardless of the structure of the information in them. B5000 virtual memory works with varying-size segments as described by the descriptors.

      When the memory is filled to a certain capacity, an OS process called the "Working Set Sheriff" is invoked to either compact memory or start moving segments out of memory. It chooses code segments first, since these cannot change and can be reloaded from the original in the code file, so do not need writing out, and then data segments which are written out to the virtual memory file.
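The selection order described above can be sketched as follows. This is not the Working Set Sheriff's actual algorithm: the `Segment` class and `choose_victims` function are hypothetical, compaction is left out, and the order within each class of segment (which the text does not specify) is simply the input order.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    is_code: bool
    words: int


def choose_victims(segments: list[Segment], words_needed: int) -> tuple[list[str], list[str]]:
    """Pick segments to roll out, code first; return (rolled_out, written_to_disk)."""
    # sorted() is stable, so code segments come first and input order is otherwise kept.
    ordered = sorted(segments, key=lambda s: not s.is_code)
    rolled_out, written, freed = [], [], 0
    for seg in ordered:
        if freed >= words_needed:
            break
        rolled_out.append(seg.name)
        if not seg.is_code:
            # Data may have changed, so it is written to the virtual memory file.
            written.append(seg.name)
        # Code is discarded without writing; it can be reloaded from the code file.
        freed += seg.words
    return rolled_out, written


segments = [Segment("D1", False, 100), Segment("C1", True, 50), Segment("C2", True, 30)]
small_request = choose_victims(segments, 60)
large_request = choose_victims(segments, 120)
```

A request for 60 words is met from the two code segments alone, with nothing written to disk; a request for 120 words also rolls out the data segment, which must be written.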

      P-bit interrupts [9] are also useful to measure system performance. For first-time allocations, 'init p-bits' indicate a potential performance problem in a program, for example if a procedure allocating an array is continually called. Reloading blocks from virtual memory on disk can significantly degrade system performance and is not the fault of any specific task. This is why many of today's computers may gain increased system performance by adding memory. On B5000 machines, 'other p-bits' indicate a system problem, which can be solved either by better balancing the computing load across the day, or by adding more memory.

      Thus the Burroughs large systems architecture helps optimization of both individual tasks and the system as a whole.

      Buffer overflow protection

      The last and maybe most important point to note about descriptors is how they affect the complementary notions of system security and program correctness. One of the best tools a hacker has to compromise operating systems of today is the buffer overflow. C, in particular, uses the most primitive and error-prone way to mark the end of strings, using a null byte as an end-of-string sentinel in the data stream itself.

Pointers are implemented on the Unisys MCP systems by indexed descriptors. During indexing operations, pointers are checked at each increment to make sure that neither the source nor the destination blocks are out of bounds. During a scan or replace operation, the mechanisms used to read or copy large blocks of memory, both source and destination are checked at each word increment for a valid memory tag. Each memory segment is bounded by tag 3 words, which would make such an operation fail. Each memory segment containing integrity-sensitive data, such as program code, is stored in tag 3 words, making an uncontrolled read, let alone modification, impossible. Thus a significant source of program errors can be detected early, before software goes into production, and a more significant class of attacks on system security is not possible.
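The effect of tag 3 boundary words on a block copy can be modelled in Python. This is a toy model under stated assumptions: memory is a list of (tag, value) pairs, tag 0 is used for ordinary data words, and `copy_words` and `TagError` are hypothetical names rather than real operators or interrupts.

```python
PROTECTED_TAG = 3  # tag of the words that bound each segment
DATA_TAG = 0       # assumption for this sketch: ordinary data words carry tag 0


class TagError(Exception):
    """Stands in for the fault raised when a copy runs into a tag 3 word."""


def copy_words(memory: list[tuple[int, int]], src: int, dst: int, count: int) -> None:
    """Copy count words, checking the tag of source and destination at every step."""
    for i in range(count):
        src_tag, value = memory[src + i]
        dst_tag, _ = memory[dst + i]
        if src_tag == PROTECTED_TAG or dst_tag == PROTECTED_TAG:
            raise TagError(f"tag {PROTECTED_TAG} word reached at step {i}")
        memory[dst + i] = (dst_tag, value)


# Two 2-word segments, each followed by a tag 3 boundary word.
memory = [(DATA_TAG, 11), (DATA_TAG, 22), (PROTECTED_TAG, 0),
          (DATA_TAG, 0), (DATA_TAG, 0), (PROTECTED_TAG, 0)]
copy_words(memory, src=0, dst=3, count=2)      # stays inside both segments
try:
    copy_words(memory, src=0, dst=3, count=3)  # third word would cross the boundary
    overran = True
except TagError:
    overran = False
```

The two-word copy succeeds, while the three-word copy stops at the boundary instead of reading or overwriting the neighbouring word. Unlike a null-terminated C string, the limit here does not depend on the data being copied.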

References

1. THE DESCRIPTOR - a definition of the B 5000 Information Processing System (PDF). Burroughs Corporation. February 1961. Retrieved September 26, 2024 via Bitsavers.
2. Organick, Elliott Irving (1973). Computer System Organization - The B5700/B6700 Series. ACM Monograph Series. Academic Press Inc. LCCN 72-88334.
3. Levy, Henry M. (1984). Capability-based Computer Systems.
4. Waychoff, Richard. "Stories About the B5000 and People Who Were There" (PDF). Computer History Museum.
5. "IEEE Computer August 1977 David Bulman's Letter to the Editor". IEEE.
6. Cragon, Harvey G. (1996). Memory Systems and Pipelined Processors. Jones and Bartlett Publishers. p. 113. ISBN 978-0-86720-474-2.
7. The Operational Characteristic of the Processors for the Burroughs B 5000 (PDF) (A ed.), Detroit: Burroughs, 1962, 5000-21005A
8. "Bounds checking in C". Unisys.
9. "Pbits". Unisys.