NLTSS

Network Livermore Timesharing System (NLTSS)
Developer: Lawrence Livermore Laboratory
Written in: Model (Pascal extension)
OS family: Capability-based
Working state: Discontinued
Source model: Closed source
Initial release: 1979
Final release: 1988
Marketing target: Supercomputers
Available in: English
Update method: Compile from source code
Platforms: CDC 7600, Cray-1, Cray X-MP, Cray Y-MP
Kernel type: Microkernel
License: Proprietary

The Network Livermore Timesharing System (NLTSS, also sometimes called the New Livermore Time Sharing System) is an operating system that was actively developed at Lawrence Livermore Laboratory (now Lawrence Livermore National Laboratory) from 1979 until about 1988, though it continued to run production applications until 1995. An earlier system, the Livermore Time Sharing System (LTSS), had been developed more than a decade before.


NLTSS initially ran on a CDC 7600 computer, but it ran in production only from about 1985 until 1994, on Cray computers including the Cray-1, Cray X-MP, and Cray Y-MP models.

Characteristics

The NLTSS operating system was unusual in many respects and unique in some.

Low-level architecture

NLTSS was a microkernel message-passing system. It was unique in that the kernel supported only one system call. That system call, which might be called "communicate" (it didn't have a name because it didn't need to be distinguished from other system calls), accepted a list of "buffer tables" (e.g., see The NLTSS Message System Interface) [1] that contained control information for message communication – either sends or receives. Such communication, both locally within the system and across a network, was all that the kernel supported directly for user processes. The "message system" (supporting the one call and the network protocols) and the drivers for the disks and the processor made up the entire kernel of the system.
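
As a rough illustration, the interface might be sketched in C as below; the structure fields and the nltss_communicate name are assumptions for illustration, not the actual NLTSS declarations:

    /* Hypothetical buffer table: one entry per pending send or receive. */
    #include <stddef.h>
    #include <stdint.h>

    enum bt_direction { BT_SEND, BT_RECEIVE };

    struct buffer_table {
        enum bt_direction direction;        /* send or receive                  */
        uint64_t          peer_address;     /* network address of the other end */
        void             *data;             /* message buffer                   */
        size_t            length;           /* buffer size in bytes             */
        unsigned          done : 1;         /* set by the kernel on completion  */
        unsigned          wake_on_done : 1; /* unblock the process when done    */
    };

    /* The single kernel entry point: submit a list of buffer tables
       describing pending sends and receives.  Every other service (files,
       processes, directories) is reached by messaging the relevant server. */
    int nltss_communicate(struct buffer_table *tables, size_t count);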

Mid-level architecture

NLTSS was a client–server system with capability-based security. The two primary servers were the file server and the process server. The file server was a process privileged to be trusted by the drivers for local storage (disk storage), and the process server was a process privileged to be trusted by the processor driver (the software that switched time-sharing control between processes in the "alternator", handled interrupts for processes other than the "communicate" call, provided access to memory and process state for the process server, etc.).

NLTSS was a true network operating system in that resource requests could come from local processes or from remote processes anywhere on the network, and the servers didn't distinguish between them. A server's only means of making such a distinction would have been the network address, and the servers had no reason to do so. All requests to the servers appeared as network requests.

Communication between processes in NLTSS by convention used the Livermore Interactive Network Communication System (LINCS) protocol suite, which defined a protocol stack along the lines of the OSI reference model. The transport-level protocol for NLTSS and LINCS was named Delta-T. At the presentation level, LINCS defined standards for communicating numbered parameters as tokens (integers, capabilities, etc.) that were stored in a session-level record for processing in a remote-procedure-call style of mechanism.
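
A minimal sketch of how a presentation-level token and a session-level record might be represented in memory; the names, types, and layout here are assumptions, not the LINCS wire format:

    #include <stddef.h>
    #include <stdint.h>

    enum token_type { TOK_INTEGER, TOK_CAPABILITY, TOK_STRING };

    /* One numbered parameter carried as a token. */
    struct token {
        uint16_t        parameter;  /* which numbered parameter this is     */
        enum token_type type;       /* kind of value that follows           */
        const void     *value;      /* pointer to the encoded value         */
        size_t          length;     /* length of the encoded value in bytes */
    };

    /* A session-level record bundles the tokens of one request or reply,
       which the receiving server decodes in an RPC-like fashion. */
    struct session_record {
        uint32_t      request_id;   /* matches a reply to its request */
        size_t        token_count;
        struct token *tokens;
    };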

The notion of a "user" was only peripherally defined in NLTSS. There was an "account server" that kept track of which users were using which resources (e.g., requests to create objects such as files or processes required an account capability). Access control was managed entirely with capabilities (communicable authority tokens).

File server

Any process could make requests to the file server to create files (returning a file capability), to read or write files (by presenting a file capability), and so on. For example, the act of reading a file generally required three buffer tables: one to send the request to the file server, one to receive the reply from the file server, and one to receive the data from the file. These three requests were generally submitted to the message system at one time, sometimes bundled with other requests. Control bits could be set in the buffer tables to awaken (unblock) a process whenever any of the submitted buffer tables was marked "Done". A library call to read a file would typically block until the control reply was received from the file server, though asynchronous I/O would of course not block and could check or block later. Any such differences on the user side were invisible to the file server.
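
The three-buffer-table pattern might look roughly like the following C sketch, with an assumed buffer-table layout and an assumed nltss_communicate entry point; only the overall pattern comes from the description above:

    #include <stddef.h>
    #include <stdint.h>

    enum bt_direction { BT_SEND, BT_RECEIVE };

    struct buffer_table {
        enum bt_direction direction;
        uint64_t          peer_address;     /* here, the file server's address */
        void             *data;
        size_t            length;
        unsigned          done : 1;
        unsigned          wake_on_done : 1;
    };

    int nltss_communicate(struct buffer_table *tables, size_t count);

    /* Read a file: send the request (carrying the file capability), then
       receive the control reply and the file data. */
    void read_file_sketch(uint64_t file_server,
                          void *request, size_t request_len,
                          void *reply, size_t reply_len,
                          void *file_data, size_t data_len)
    {
        struct buffer_table tables[3] = {
            { BT_SEND,    file_server, request,   request_len, 0, 0 },
            { BT_RECEIVE, file_server, reply,     reply_len,   0, 1 },
            { BT_RECEIVE, file_server, file_data, data_len,    0, 1 },
        };
        nltss_communicate(tables, 3);   /* submit all three at once */
        /* A blocking library read would now wait until the reply table is
           marked done; asynchronous I/O would return and check later. */
    }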

Process server

In NLTSS the process server was quite similar to the file server in that user processes could ask for the creation of processes, the starting or stopping of processes, the reading or writing of process memory or registers, and notification of process faults. The process server was an ordinary user-mode process that was simply trusted to communicate with the CPU driver, just as the file server was trusted to communicate with the disk driver. The process server stored process state in files provided by the file server and in that regard appeared to the file server like any other user process.
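
The request set might be summarized by an enumeration such as the following sketch; the names, the request structure, and the capability size are assumptions:

    #include <stddef.h>
    #include <stdint.h>

    enum proc_request_type {
        PROC_CREATE,            /* create a process (returns a capability) */
        PROC_START,
        PROC_STOP,
        PROC_READ_MEMORY,
        PROC_WRITE_MEMORY,
        PROC_READ_REGISTERS,
        PROC_WRITE_REGISTERS,
        PROC_NOTIFY_FAULTS      /* ask to be told when the process faults  */
    };

    struct proc_request {
        enum proc_request_type type;
        uint8_t  process_capability[16];  /* names the target process    */
        uint64_t address;                 /* for memory reads and writes */
        size_t   length;
    };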

Directory server

An example of a higher-level server in NLTSS was the directory server. This server's task was essentially to turn files (invisible to the user) into directories that could be used to store and retrieve capabilities by name. Since capabilities were simply data, this wasn't a particularly difficult task, consisting mostly of manipulating access permissions on the capabilities according to the conventions defined in the LINCS protocol suite. One place where this got a bit interesting was an access permission named inheritance. If this bit was on (allowed), capabilities could be fetched from the directory with their full access. If it was off (disallowed), any permissions turned off in the directory capability were likewise turned off in the capability being fetched before it was returned to the requesting application. This mechanism allowed people to store, for example, read/write files in a directory but give other users only permission to fetch read-only instances of them.
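
A minimal sketch of that masking rule, assuming capabilities carry a small set of permission bits (the particular bit values are hypothetical):

    #include <stdint.h>

    #define PERM_READ        0x1u
    #define PERM_WRITE       0x2u
    #define PERM_INHERITANCE 0x8u   /* hypothetical bit position */

    /* Returns the permission bits the fetched capability should carry. */
    uint32_t fetch_permissions(uint32_t directory_perms, uint32_t stored_perms)
    {
        if (directory_perms & PERM_INHERITANCE)
            return stored_perms;               /* full stored access         */
        return stored_perms & directory_perms; /* strip anything the         */
                                               /* directory capability lacks */
    }

Under this rule, a read/write file capability stored in a directory that is handed out with only read permission (and inheritance off) is fetched as a read-only capability.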

Development

The bulk of the programming for NLTSS was done in a Pascal extension developed at Los Alamos National Laboratory known as "Model". Model extended Pascal to include an abstract data type (object) mechanism and some other features.

NLTSS was saddled with a compatibility legacy. NLTSS followed the development and deployment of the Livermore Time Sharing System (LTSS) in the Livermore Computer Center at LLNL (~1968–1988?). NLTSS development began about the same time LTSS was ported to the Cray-1 to become the Cray Time Sharing System. To stay backward compatible with the many scientific applications at LLNL, NLTSS was forced to emulate the prior LTSS operating system's system calls. This emulation was implemented in the form of a compatibility library named "baselib". As one example, while the directory structure and thus the process structure for NLTSS was naturally a directed graph (process capabilities could be stored in directories just like file capabilities or directory capabilities), the baselib library emulated a simple linear (controller – controllee) process structure (not even a tree structure as in Unix) to stay compatible with the prior LTSS. Since scientific users never accessed NLTSS services outside the baselib library, NLTSS ended up looking almost exactly like LTSS to its users. Most users weren't aware of capabilities, didn't realize that they could access resources across the network, and generally weren't aware that NLTSS offered any services beyond those of LTSS. NLTSS did support shared-memory symmetric multiprocessing, paralleling a similar development in the Cray Time Sharing System.

Even the name NLTSS was something of a legacy. The "New Livermore Time Sharing System" name was initially considered a temporary name to use during development. Once the system began to run some applications in a dual-system mode (sort of a virtual machine sharing drivers with LTSS), a more permanent name, LIncs Network Operating System (LINOS), was chosen by the developers. Unfortunately, the management at LLNL decided that the name couldn't be changed at that point (seemingly because the prior name had been used in budget requests), so the temporary development name NLTSS stayed with the system throughout its lifetime.

A mass storage system that used the LINCS protocols (the same file and directory protocols as NLTSS) was also developed in parallel with NLTSS. This software was later commercialized as the Unitree product. Unitree was generally superseded by the High Performance Storage System (HPSS), which could loosely be considered a legacy of LINCS and NLTSS. For example, LINCS and NLTSS introduced a form of third-party transfer (to copy one file to another in NLTSS, a process could send two requests to file servers, one to read and one to write, directing the file servers to transfer the data between themselves) that carried through in modified form to Unitree and HPSS.
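
The third-party transfer idea can be sketched with hypothetical request structures; the field names and the capability size are assumptions:

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical read/write request sent to a file server.  data_peer
       names where the data should flow to (for a read) or from (for a
       write), which is what makes a third-party transfer possible. */
    struct transfer_request {
        uint8_t  file_capability[16];
        uint64_t offset;
        uint64_t length;
        uint64_t data_peer;
    };

    /* Build the two requests for a server-to-server copy: the client sends
       read_req to the source file server and write_req to the destination
       file server; the bulk data never passes through the client. */
    void third_party_copy(const uint8_t src_cap[16], uint64_t src_server,
                          const uint8_t dst_cap[16], uint64_t dst_server,
                          uint64_t length,
                          struct transfer_request *read_req,
                          struct transfer_request *write_req)
    {
        memcpy(read_req->file_capability, src_cap, 16);
        read_req->offset    = 0;
        read_req->length    = length;
        read_req->data_peer = dst_server;    /* data flows to the destination */

        memcpy(write_req->file_capability, dst_cap, 16);
        write_req->offset    = 0;
        write_req->length    = length;
        write_req->data_peer = src_server;   /* data arrives from the source */
    }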

Implementation and design issues

The biggest knock against NLTSS during its production lifetime was performance. The performance issue that affected users most was file access latency. This generally wasn't a significant problem for disk input/output (I/O), but the systems that NLTSS ran on also supported a significant complement of very low-latency solid-state disks with access times under 10 microseconds. The initial latencies for file operations under NLTSS were comparable to the latency of solid-state disk access and significantly higher than the LTSS latency for such access. To improve file access latency under NLTSS, the implementation was changed significantly to put the most latency-sensitive processes (in particular the file server) "in the kernel". This effort wasn't as significant as it might at first sound, because all NLTSS servers worked on a multithreading model. What the change really amounted to was moving the threads responsible for file server services from a separate file server process into the kernel "process". Communication to users was unchanged (still through buffer tables, LINCS tokens, etc.), but file operations avoided some significant context switches that were the primary cause of the higher latencies compared with the older LTSS and the competing Cray Time Sharing System. The change did significantly (~3x) improve the latency of file I/O operations, but it also meant that the file server became a trusted part of the kernel (by implementation, not by design).

A second implementation issue with NLTSS related to the security and integrity of its capabilities-as-data implementation. This implementation used a password capability model (e.g., see Control by Password). [2] With this model, any person or process that could gain access to the memory space of a process would have the authority to access the capabilities represented by the data found in that memory. Some system architects (e.g., Andrew S. Tanenbaum, the architect of the Amoeba distributed operating system) have suggested that this property of access to memory implying access to capabilities is not an inherent problem. In the NLTSS environment, however, it sometimes happened that people took program memory dumps to others for analysis. Because of this and other concerns, such password capabilities were considered a vulnerability in NLTSS. A design, the Control by Public Key Encryption mechanism, [3] was produced to protect against this vulnerability. This mechanism wasn't put into production in NLTSS, both because of its significant performance cost and because users weren't aware of the vulnerability from password capabilities. Modern advances in cryptography would make such protection for capabilities practical, especially for Internet/Web capabilities (e.g., see YURLs [4] or WideWORD [5]).
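
The password capability model can be sketched as follows; the layout and the check are illustrative assumptions. The point is that the capability is nothing but data, so anyone who can read it (for example, from a memory dump) holds the authority it represents:

    #include <stdint.h>
    #include <string.h>

    struct capability {
        uint64_t object_id;     /* which file, process, directory, ...     */
        uint32_t permissions;   /* what the holder may do with the object  */
        uint8_t  password[16];  /* large random value chosen by the server */
    };

    /* Server-side check: the request is honored if the presented data
       matches what the server recorded when the object was created. */
    int capability_valid(const struct capability *presented,
                         const struct capability *stored)
    {
        return presented->object_id == stored->object_id &&
               memcmp(presented->password, stored->password, 16) == 0;
    }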

A design issue with NLTSS that wasn't considered until years after it was removed from production was its open network architecture. In NLTSS, processes were treated as virtual processors in a network, with no firewalls or other restrictions: any process could communicate freely with any other. This meant that confinement was not possible even in the sense of limiting direct communication, let alone in the sense of limiting covert channels such as "wall banging". To correct this problem, NLTSS would have had to require capabilities to enable communication. Late development work on NLTSS, such as "stream numbers", was getting close to such a facility, but by the time active development stopped in 1988, communication in NLTSS was still unconfined.

See also

Related Research Articles

Cache (computing) – Additional storage that enables faster access to main storage

In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs.

Operating system – Software that manages computer hardware resources

An operating system (OS) is system software that manages computer hardware and software resources, and provides common services for computer programs.

Web server – Computer software that distributes web pages

A web server is computer software and underlying hardware that accepts requests via HTTP or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initiates communication by making a request for a web page or other resource using HTTP, and the server responds with the content of that resource or an error message. A web server can also accept and store resources sent from the user agent if configured to do so.

DNIX is a discontinued Unix-like real-time operating system from the Swedish company Dataindustrier AB (DIAB). A version named ABCenix was developed for the ABC 1600 computer from Luxor. Daisy Systems also had a system named Daisy DNIX on some of their computer-aided design (CAD) workstations. It was unrelated to DIAB's product.

DECnet is a suite of network protocols created by Digital Equipment Corporation. Originally released in 1975 in order to connect two PDP-11 minicomputers, it evolved into one of the first peer-to-peer network architectures, thus transforming DEC into a networking powerhouse in the 1980s. Initially built with three layers, it later (1982) evolved into a seven-layer OSI-compliant networking protocol.

Exokernel – Operating system kernel developed by the MIT Parallel and Distributed Operating Systems group

Exokernel is an operating system kernel developed by the MIT Parallel and Distributed Operating Systems group, and also a class of similar operating systems.

Server Message Block – Network communication protocol for providing shared access to resources

Server Message Block (SMB) is a communication protocol mainly used by computers running Microsoft Windows to share files, printers, and serial ports, and for miscellaneous communications between nodes on a network. The SMB implementation consists of two vaguely named Windows services: "Server" and "Workstation". It uses the NTLM or Kerberos protocols for user authentication. It also provides an authenticated inter-process communication (IPC) mechanism.

NetWare – Computer network operating system developed by Novell, Inc.

NetWare is a discontinued computer network operating system developed by Novell, Inc. It initially used cooperative multitasking to run various services on a personal computer, using the IPX network protocol.

Banyan VINES is a discontinued network operating system developed by Banyan Systems for computers running AT&T's UNIX System V.

Network-attached storage – Computer data storage server

Network-attached storage (NAS) is a file-level computer data storage server connected to a computer network providing data access to a heterogeneous group of clients. The term "NAS" can refer to both the technology and systems involved, or a specialized device built for such functionality.

Diskless node – Computer workstation operated without disk drives

A diskless node is a workstation or personal computer without disk drives, which employs network booting to load its operating system from a server.

Spring is a discontinued project in building an experimental microkernel-based object-oriented operating system (OS) developed at Sun Microsystems in the early 1990s. Using technology substantially similar to concepts developed in the Mach kernel, Spring concentrated on providing a richer programming environment supporting multiple inheritance and other features. Spring was also more cleanly separated from the operating systems it would host, divorcing it from its Unix roots and even allowing several OSes to be run at the same time. Development faded out in the mid-1990s, but several ideas and some code from the project was later re-used in the Java programming language libraries and the Solaris operating system.

The Cray Time Sharing System, also known in the Cray user community as CTSS, was developed as an operating system for the Cray-1 or Cray X-MP line of supercomputers in 1978. CTSS was developed by the Los Alamos Scientific Laboratory in conjunction with the Lawrence Livermore Laboratory. CTSS was popular with Cray sites in the United States Department of Energy (DOE), but was used by several other Cray sites, such as the San Diego Supercomputing Center.

Lustre is a type of parallel distributed file system, generally used for large-scale cluster computing. The name Lustre is a portmanteau word derived from Linux and cluster. Lustre file system software is available under the GNU General Public License and provides high performance file systems for computer clusters ranging in size from small workgroup clusters to large-scale, multi-site systems. Since June 2005, Lustre has consistently been used by at least half of the top ten, and more than 60 of the top 100 fastest supercomputers in the world, including the world's No. 1 ranked TOP500 supercomputer in November 2022, Frontier, as well as previous top supercomputers such as Fugaku, Titan and Sequoia.

OpenVZ – Operating-system-level virtualization technology

OpenVZ is an operating-system-level virtualization technology for Linux. It allows a physical server to run multiple isolated operating system instances, called containers, virtual private servers (VPSs), or virtual environments (VEs). OpenVZ is similar to Solaris Containers and LXC.

The Parallel Virtual File System (PVFS) is an open-source parallel file system. A parallel file system is a type of distributed file system that distributes file data across multiple servers and provides for concurrent access by multiple tasks of a parallel application. PVFS was designed for use in large-scale cluster computing. PVFS focuses on high-performance access to large data sets. It consists of a server process and a client library, both of which consist entirely of user-level code. A Linux kernel module and pvfs-client process allow the file system to be mounted and used with standard utilities. The client library provides for high-performance access via the message passing interface (MPI). PVFS is jointly developed by The Parallel Architecture Research Laboratory at Clemson University, the Mathematics and Computer Science Division at Argonne National Laboratory, and the Ohio Supercomputer Center. PVFS development has been funded by NASA Goddard Space Flight Center, the DOE Office of Science Advanced Scientific Computing Research program, NSF PACI and HECURA programs, and other government and private agencies. PVFS is now known as OrangeFS in its newest development branch.

In computing, a shared resource, or network share, is a computer resource made available from one host to other hosts on a computer network. It is a device or piece of information on a computer that can be remotely accessed from another computer transparently as if it were a resource in the local machine. Network sharing is made possible by inter-process communication over the network.

A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

Kernel (operating system) – Core of a computer operating system

The kernel is a computer program at the core of a computer's operating system and generally has complete control over everything in the system. The kernel is also responsible for preventing and mitigating conflicts between different processes. It is the portion of the operating system code that is always resident in memory and facilitates interactions between hardware and software components. A full kernel controls all hardware resources via device drivers, arbitrates conflicts between processes concerning such resources, and optimizes the utilization of common resources such as CPU and cache usage, file systems, and network sockets. On most systems, the kernel is one of the first programs loaded on startup. It handles the rest of startup as well as memory, peripherals, and input/output (I/O) requests from software, translating them into data-processing instructions for the central processing unit.

The Livermore Time Sharing System (LTSS) was a supercomputer operating system originally developed by the Lawrence Livermore Laboratories for the Control Data Corporation 6600 and 7600 series of supercomputers in 1965.

References

  1. "Components of a Network Operating System". webstart.com.
  2. "Managing Domains in a Network Operating System". webstart.com.
  3. "Managing Domains in a Network Operating System". webstart.com.
  4. "YURL". Waterken Inc.
  5. "Home". wideword.net.

Further reading