FPS AP-120B

Last updated

The FPS AP-120B was a 38-bit, pipeline-oriented array processor manufactured by Floating Point Systems. It was designed to be attached to a host computer such as a DEC PDP-11 as a fast number-cruncher. Data transfer was accomplished using direct memory access.

In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set containing instructions that operate on one-dimensional arrays of data called vectors, compared to the scalar processors, whose instructions operate on single data items. Vector processors can greatly improve performance on certain workloads, notably numerical simulation and similar tasks. Vector machines appeared in the early 1970s and dominated supercomputer design through the 1970s into the 1990s, notably the various Cray platforms. The rapid fall in the price-to-performance ratio of conventional microprocessor designs led to the vector supercomputer's demise in the later 1990s.

Floating Point Systems Inc. (FPS) was a Beaverton, Oregon vendor of attached array processors and minisupercomputers. The company was founded in 1970 by former Tektronix engineer Norm Winningstad, with partners Tom Prince, Frank Bouton and Robert Carter. Carter was a salesman for Data General Corp. who persuaded Bouton and Prince to leave Tektronix to start the new company. Winningstad was the fourth partner.

PDP-11 Series of 16-bit minicomputers

The PDP-11 is a series of 16-bit minicomputers sold by Digital Equipment Corporation (DEC) from 1970 into the 1990s, one of a succession of products in the PDP series. In total, around 600,000 PDP-11s of all models were sold, making it one of DEC's most successful product lines. The PDP-11 is considered by some experts to be the most popular minicomputer ever.

Contents

Processor cycle time was 167 nanoseconds, giving a speed of 6 MHz. Since it could present two floating point results per cycle, one from the adder and the other from the multiplier, a capacity of 12 Megaflops was claimed for the processor.

In computing, floating point operations per second is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases it is a more accurate measure than measuring instructions per second.

Architecture

The processor was designed around the concept of multiple parallel processing units operating in synchronization. A single 64-bit instruction word was divided into fields, each of which instructed a particular module under the control of the CPU. The modules were as follows:

The processor had access to dual-interleaved core memory in which odd numbered addresses were stored in one physical bank, and even numbered addresses were stored in the other. This represented an attempt to take advantage of typical sequential fetching of memory words. Fetching sequentially from one physical bank would result in a latency of two instruction cycles before the data was loaded into the destination data pad. Interleaving allowed a sequential access to occur immediately after the previous one. Both accesses took two cycles to complete, but the overlap and dual destination pads maximized the use of the data channel.

The floating point arithmetic modules were both multi-stage processors which were driven by explicit instructions. In the two-stage adder an assembler instruction such as FADD DX,DY would load values from data pads DX and DY into stage one of the adder. A subsequent FADD instruction would be required to present the result at the adder's output. This second FADD could be a dummy with no arguments, or it could be the next calculation in a sequence. In this fashion a stream of FADD operations could be performed in a pipeline, with a new result in every instruction cycle though every addition requires two cycles.

Similarly the multiplier, a three-stage unit, required one FMUL DX,DY to begin a multiplication, followed by two more FMUL instructions to produce the result. Careful programming of the pipeline allowed the production of one result per cycle, with each calculation taking three cycles in itself.

For maximum efficiency all calculations were programmed using the assembler language supplied with the hardware. A high-level language resembling Fortran was provided for coordinating tasks and controlling data transfers to and from the host computer.

Fortran General-purpose programming language

Fortran is a general-purpose, compiled imperative programming language that is especially suited to numeric computation and scientific computing.

Lookup tables

In order to support typical applications in signal processing, the hardware was delivered with a pre-calculated lookup table of sine and cosine values. Sines and cosines for angles from 0 to π/2 radians were stored in alternate addresses to take advantage of the interleaving described above. Values for all other angles could be calculated by using one or other of the values from the lookup table, negating if necessary, using well-known rules.

Sine ratio between the lengths of the opposite side and the hypotenuse in a right triangle

In mathematics, the sine is a trigonometric function of an angle. The sine of an acute angle is defined in the context of a right triangle: for the specified angle, it is the ratio of the length of the side that is opposite that angle to the length of the longest side of the triangle.

Typical programming style

This was unusual, being driven by the synchronous parallel processing architecture. The basic philosophy can be summarized as follows:

The final item was accomplished as follows: assume that the entire calculation requires 15 cycles, and the minimum loop size is 5 cycles. The first 5 instruction words begin iteration 1 of the calculation. The second 5 words contain both iteration 1, and the beginning of iteration 2 in parallel. This usually would be a copy of the operations beginning iteration 1. The next 5 words contain the final steps of iteration 1, the middle of iteration 2, and the beginning of iteration 3. These five words form the body of the loop which repeats until the desired number of data points have been processed.

Application

As an attached processor, the AP-120B was typically used as a low cost/cost-effective adjunct to systems like diagnostic medical imaging systems, and more. In the early 80's a VAX 11/780 or 11/785 with an FPS-AP-120B and a Versatec plotter were the workhorse systems for seismic data processing in the oil industry. Commercial seismic processing packages were written so that they could call FPS AP-120B routines if one were present.

Related Research Articles

Central processing unit Central component of any computer system which executes input/output, arithmetical, and logical operations

A central processing unit (CPU), also called a central processor or main processor, is the electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logic, controlling, and input/output (I/O) operations specified by the instructions. The computer industry has used the term "central processing unit" at least since the early 1960s. Traditionally, the term "CPU" refers to a processor, more specifically to its processing unit and control unit (CU), distinguishing these core elements of a computer from external components such as main memory and I/O circuitry.

P5 (microarchitecture) Intel microporocessor

The first Pentium microprocessor was introduced by Intel on March 22, 1993. Dubbed P5, its microarchitecture was the fifth generation for Intel, and the first superscalar IA-32 microarchitecture. As a direct extension of the 80486 architecture, it included dual integer pipelines, a faster floating-point unit, wider data bus, separate code and data caches and features for further reduced address calculation latency. In 1996, the Pentium with MMX Technology was introduced with the same basic microarchitecture complemented with an MMX instruction set, larger caches, and some other enhancements.

In computing, Streaming SIMD Extensions (SSE) is a single instruction, multiple data (SIMD) instruction set extension to the x86 architecture, designed by Intel and introduced in 1999 in their Pentium III series of Central processing units (CPUs) shortly after the appearance of Advanced Micro Devices (AMD's) 3DNow!. SSE contains 70 new instructions, most of which work on single precision floating point data. SIMD instructions can greatly increase performance when exactly the same operations are to be performed on multiple data objects. Typical applications are digital signal processing and graphics processing.

CDC 6600 computer

The CDC 6600 was the flagship of the 6000 series of mainframe computer systems manufactured by Control Data Corporation. Generally considered to be the first successful supercomputer, it outperformed the industry's prior recordholder, the IBM 7030 Stretch, by a factor of three. With performance of up to three megaFLOPS, the CDC 6600 was the world's fastest computer from 1964 to 1969, when it relinquished that status to its successor, the CDC 7600.

Digital signal processor specialized microprocessor optimized for digital signal processing

A digital signal processor (DSP) is a specialized microprocessor, with its architecture optimized for the operational needs of digital signal processing.

In the history of computer hardware, some early reduced instruction set computer central processing units used a very similar architectural solution, now called a classic RISC pipeline. Those CPUs were: MIPS, SPARC, Motorola 88000, and later the notional CPU DLX invented for education.

CDC 7600

The CDC 7600 was the Seymour Cray-designed successor to the CDC 6600, extending Control Data's dominance of the supercomputer field into the 1970s. The 7600 ran at 36.4 MHz and had a 65 Kword primary memory using magnetic core and variable-size secondary memory. It was generally about ten times as fast as the CDC 6600 and could deliver about 10 MFLOPS on hand-compiled code, with a peak of 36 MFLOPS. In addition, in benchmark tests in early 1970 it was shown to be slightly faster than its IBM rival, the IBM System/360, Model 195. When the system was released in 1969, it sold for around $5 million in base configurations, and considerably more as options and features were added.

In computer science and particularly in compiler design, loop nest optimization (LNO) is an optimization technique that applies a set of loop transformations for the purpose of locality optimization or parallelization or other loop overhead reduction of the loop nests. One classical usage is to reduce memory access latency or the cache bandwidth necessary due to cache reuse for some common linear algebra algorithms.

CDC Cyber Range of mainframe-class supercomputers manufectured by Control Data Corporation (CDC) during the 1970s and 1980s

The CDC Cyber range of mainframe-class supercomputers were the primary products of Control Data Corporation (CDC) during the 1970s and 1980s. In their day, they were the computer architecture of choice for scientific and mathematically intensive computing. They were used for modeling fluid flow, material science stress analysis, electrochemical machining analysis, probabilistic analysis, energy and academic computing, radiation shielding modeling, and other applications. The lineup also included the Cyber 18 and Cyber 1000 minicomputers. Like their predecessor, the CDC 6600, they were unusual in using the ones' complement binary representation.

Intel 8087

The Intel 8087, announced in 1980, was the first x87 floating-point coprocessor for the 8086 line of microprocessors.

CDC STAR-100

The CDC STAR-100 is a vector supercomputer that was designed, manufactured, and marketed by Control Data Corporation (CDC). It was one of the first machines to use a vector processor to improve performance on appropriate scientific applications. It was also the first supercomputer to use integrated circuits and the first to be equipped with one million words of computer memory.

RISC Single Chip

The RISC Single Chip, or RSC, is a single-chip microprocessor developed and fabricated by International Business Machines (IBM). The RSC was a feature-reduced single-chip implementation of the POWER1, a multi-chip central processing unit (CPU) which implemented the POWER instruction set architecture (ISA). It was used in entry-level workstation models of the IBM RS/6000 family, such as the Model 220 and 230.

The Orion was a series of 32-bit super-minicomputers designed and produced in the 1980s by High Level Hardware Limited (HLH), a company based in Oxford, UK. The company produced four versions of the machine:

In computer science, software pipelining is a technique used to optimize loops, in a manner that parallels hardware pipelining. Software pipelining is a type of out-of-order execution, except that the reordering is done by a compiler instead of the processor. Some computer architectures have explicit support for software pipelining, notably Intel's IA-64 architecture.

R10000 MIPS microprocessor

The R10000, code-named "T5", is a RISC microprocessor implementation of the MIPS IV instruction set architecture (ISA) developed by MIPS Technologies, Inc. (MTI), then a division of Silicon Graphics, Inc. (SGI). The chief designers are Chris Rowen and Kenneth C. Yeager. The R10000 microarchitecture is known as ANDES, an abbreviation for Architecture with Non-sequential Dynamic Execution Scheduling. The R10000 largely replaces the R8000 in the high-end and the R4400 elsewhere. MTI was a fabless semiconductor company; the R10000 was fabricated by NEC and Toshiba. Previous fabricators of MIPS microprocessors such as Integrated Device Technology (IDT) and three others did not fabricate the R10000 as it was more expensive to do so than the R4000 and R4400.

R4000 RISC microprocessor

The R4000 is a microprocessor developed by MIPS Computer Systems that implements the MIPS III instruction set architecture (ISA). Officially announced on 1 October 1991, it was one of the first 64-bit microprocessors and the first MIPS III implementation. In the early 1990s, when RISC microprocessors were expected to replace CISC microprocessors such as the Intel i486, the R4000 was selected to be the microprocessor of the Advanced Computing Environment (ACE), an industry standard that intended to define a common RISC platform. ACE ultimately failed for a number of reasons, but the R4000 found success in the workstation and server markets.

Saturn Launch Vehicle Digital Computer computer of the Saturn V rocket

The Saturn Launch Vehicle Digital Computer (LVDC) was a computer that provided the autopilot for the Saturn V rocket from launch to Earth orbit insertion. Designed and manufactured by IBM's Electronics Systems Center in Owego, N.Y., it was one of the major components of the Instrument Unit, fitted to the S-IVB stage of the Saturn V and Saturn IB rockets. The LVDC also supported pre- and post-launch checkout of the Saturn hardware. It was used in conjunction with the Launch Vehicle Data Adaptor (LVDA) which performed signal conditioning to the sensor inputs to the computer from the launch vehicle.

Multithreading (computer architecture) ability of a central processing unit (CPU) or a single core in a multi-core processor to execute multiple processes or threads concurrently

In computer architecture, multithreading is the ability of a central processing unit (CPU) to provide multiple threads of execution concurrently, supported by the operating system. This approach differs from multiprocessing. In a multithreaded application, the threads share the resources of a single or multiple cores, which include the computing units, the CPU caches, and the translation lookaside buffer (TLB).

The Mill architecture is a novel belt machine-based computer architecture for general-purpose computing. It has been under development since about 2003 by Ivan Godard and his startup Mill Computing, Inc., formerly named Out Of The Box Computing, in East Palo Alto, California. Mill Computing claims it has a "10x single-thread power/performance gain over conventional out-of-order superscalar architectures" but "runs the same programs, without rewrite".

References

The CRC Press, LLC is a publishing group based in the United States that specializes in producing technical books. Many of their books relate to engineering, science and mathematics. Their scope also includes books on business, forensics and information technology. CRC Press is now a division of Taylor & Francis, itself a subsidiary of Informa.

International Standard Book Number Unique numeric book identifier

The International Standard Book Number (ISBN) is a numeric commercial book identifier which is intended to be unique. Publishers purchase ISBNs from an affiliate of the International ISBN Agency.