TI Advanced Scientific Computer

Last updated

The Advanced Scientific Computer (ASC) is a supercomputer designed and manufactured by Texas Instruments (TI) between 1966 and 1973. [1] The ASC's central processing unit (CPU) supported vector processing, a performance-enhancing technique which was key to its high-performance. The ASC, along with the Control Data Corporation STAR-100 supercomputer (which was introduced in the same year), were the first computers to feature vector processing. However, this technique's potential was not fully realized by either the ASC or STAR-100 due to an insufficient understanding of the technique; it was the Cray Research Cray-1 supercomputer, announced in 1975 that would fully realize and popularize vector processing. The more successful implementation of vector processing in the Cray-1 would demarcate the ASC (and STAR-100) as first-generation vector processors, with the Cray-1 belonging in the second.

Contents

History

TI began as a division of Geophysical Service Incorporated (GSI), a company that performed seismic surveys for oil exploration companies. GSI was now a subsidiary of TI, and TI wanted to apply the latest computer technology to the processing and analysis of seismic datasets. The ASC project started as the Advanced Seismic Computer. As the project developed, TI decided to expand its scope. "Seismic" was replaced by "Scientific" in the name, allowing the project to retain the designation ASC.

Originally the software, including an operating system and a FORTRAN compiler, were done under contract by Computer Usage Company, under direction of George R. Trimble, Jr. [2] [3] but later taken over by TI itself. Southern Methodist University in Dallas developed an ALGOL compiler for the ASC.

Architecture

The ASC was based around a single high-speed shared memory, which was accessed by the CPU and eight I/O channel controllers, in an organization similar to Seymour Cray's groundbreaking CDC 6600. Memory was accessed solely under the control of the memory control unit (MCU). The MCU was a two-way, 256-bit per channel parallel network that could support up to eight independent processors, with a ninth channel for accessing "main memory" (referred to as "extended memory"). The MCU also acted as a cache controller, offering high-speed access to a semiconductor-based memory for the eight processor ports, and handling all communications to the 24-bit address space in main memory. The MCU was designed to operate asynchronously, allowing it to work at a variety of speeds and scale across a number of performance points. For instance, main memory could be constructed out of slower but less expensive core memory, although this was not used in practice. At the fastest, it could sustain transfer rates of 80 million 32-bit words per second per port, for a total transfer rate of 640 million words per second. This was well beyond the capabilities of even the fastest memories of the era.

The CPU had a 60 ns clock cycle (16.67 MHz clock frequency) and its logic was built from 20-gate emitter-coupled logic integrated circuits originally developed by TI for the ILLIAC IV supercomputer. The CPU had an extremely advanced architecture and organization for its era, supporting microcoded arithmetic and mathematical instructions that operated on scalars, vectors, or matrices. The vector processing facilities had a memory-to-memory architecture; where the vector operands were read from, and the resulting vector written to, memory. The CPU could have one, two, or four vector lanes, allowing the CPU to produce one to four vector results every cycle, depending on the number of vector lanes installed. The vector lanes were also used for scalar instructions, and each lane could keep up to 12 scalar instructions in-flight simultaneously. The CPU, with four lanes, allowed up to 36 instructions in total across the entire CPU.

The processor had forty-eight 32-bit registers, a huge number for the time. 16 of the registers were used for addressing, 16 for scalar operations, 8 for index offsets, and 8 for specifying the various parameters for vector instructions. Data was moved between the registers and memory by load/store instructions, which could transfer from 4–64 bits (two registers) at a time.

Most vector processors tended to be memory bandwidth-limited, that is, they could process data faster than they could get it from memory. This remains a major problem on modern SIMD designs as well, which is why considerable effort has been put into increasing memory throughput in modern computer designs (although largely unsuccessfully). In the ASC this was improved somewhat with a lookahead unit that predicted upcoming memory accesses and loaded them into the scalar registers invisibly, using a memory interface in the CPU called the memory buffer unit (MBU).

The "Peripheral Processor" was a separate system dedicated entirely to quickly running the operating system and programs running within it, as well as feeding data to the CPU. The PP was built out of eight "virtual processors" (VPs), which were designed to handle instructions and basic integer arithmetic only. Each VP had its own program counter and registers, and the system could thus run eight programs at the same time, limited only by memory accesses. Keeping eight programs running allowed the system to shuffle execution of programs on the CPU depending on what data was available on the memory bus at that time, minimizing "dead time" where the CPU had to wait for data from the memory.

The PP also included a set of sixty-four 32-bit communications registers (CRs). The CRs stored the state required for communication between the various parts of the ASC: the CPU, VPs, and channel controllers.

The ASC instruction set include a bit-reverse instruction that was intended to speed up the calculation of fast Fourier transforms (FFTs). By the time the ASC was in production, better FFT algorithms had been developed that did not require this operation. TI offered a bounty to the first person to come up with a valid use for this instruction, but was never collected.

Market reception

When ASC machines first became available in the early 1970s, they outperformed almost all other machines, including the CDC STAR-100, and under certain conditions matched that of the one-off ILLIAC IV. However, only seven had been installed when the Cray-1 was announced in 1975. The Cray-1 dedicated almost all of its design to sustained high-speed access to memory,[ clarification needed ][ citation needed ] including over one million 64-bit words of semiconductor memory and a cycle time that was one-fifth that of the ASC (12.5 ns). Although the ASC was in some ways a more expandable design, in the supercomputer market speed is preferred,[ clarification needed ] and the Cray-1 was much faster. ASC sales ended almost overnight, and although an upgraded ASC had been designed with a cycle time one-fifth that of the original, Texas Instruments decided to exit the market.

Vector processing applications

The ASC #1 prototype was a one pipe system and brought up in Austin, Texas, off site from TI's main plant for proprietary information reasons. It was later upgraded to two pipes and renamed as ASC # 1A. It was then used by TI's GSI division for seismic data processing. ASC #2 was leased to Shell Oil Company in the Netherlands and also used for seismic data processing. ASC #3 was installed at the Redstone Arsenal in Huntsville, Alabama, for Anti Ballistic Missile Interception technology development. With the SALT Treaty, the system was later redeployed to the Army Corps of Engineers in Vicksburg, Mississippi, for dam stress analysis. ASC #4 was used by NOAA at Princeton University for developing weather forecasting models. ASC systems #5 and #6 were installed at TI's main plant in Austin and also used by GSI for seismic data processing. ASC #7 went to the Naval Research Lab in Washington, D.C. [4] for plasma physics studies.

Related Research Articles

<span class="mw-page-title-main">Cray-1</span> Supercomputer manufactured by Cray Research

The Cray-1 was a supercomputer designed, manufactured and marketed by Cray Research. Announced in 1975, the first Cray-1 system was installed at Los Alamos National Laboratory in 1976. Eventually, eighty Cray-1s were sold, making it one of the most successful supercomputers in history. It is perhaps best known for its unique shape, a relatively small C-shaped cabinet with a ring of benches around the outside covering the power supplies and the cooling system.

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called vectors. This is in contrast to scalar processors, whose instructions operate on single data items only, and in contrast to some of those same scalar processors having additional single instruction, multiple data (SIMD) or SIMD within a register (SWAR) Arithmetic Units. Vector processors can greatly improve performance on certain workloads, notably numerical simulation and similar tasks. Vector processing techniques also operate in video-game console hardware and in graphics accelerators.

<span class="mw-page-title-main">CDC 6600</span> Mainframe computer by Control Data

The CDC 6600 was the flagship of the 6000 series of mainframe computer systems manufactured by Control Data Corporation. Generally considered to be the first successful supercomputer, it outperformed the industry's prior recordholder, the IBM 7030 Stretch, by a factor of three. With performance of up to three megaFLOPS, the CDC 6600 was the world's fastest computer from 1964 to 1969, when it relinquished that status to its successor, the CDC 7600.

<span class="mw-page-title-main">ILLIAC IV</span> First massively parallel computer

The ILLIAC IV was the first massively parallel computer. The system was originally designed to have 256 64-bit floating point units (FPUs) and four central processing units (CPUs) able to process 1 billion operations per second. Due to budget constraints, only a single "quadrant" with 64 FPUs and a single CPU was built. Since the FPUs all processed the same instruction – ADD, SUB etc. – in modern terminology, the design would be considered to be single instruction, multiple data, or SIMD.

Alliant Computer Systems Corporation was a computer company that designed and manufactured parallel computing systems. Together with Pyramid Technology and Sequent Computer Systems, Alliant's machines pioneered the symmetric multiprocessing market. One of the more successful companies in the group, over 650 Alliant systems were produced over their lifetime. The company was hit by a series of financial problems and went bankrupt in 1992.

<span class="mw-page-title-main">CDC 7600</span> 1967 supercomputer

The CDC 7600 was designed by Seymour Cray to be the successor to the CDC 6600, extending Control Data's dominance of the supercomputer field into the 1970s. The 7600 ran at 36.4 MHz and had a 65 Kword primary memory using magnetic core and variable-size secondary memory. It was generally about ten times as fast as the CDC 6600 and could deliver about 10 MFLOPS on hand-compiled code, with a peak of 36 MFLOPS. In addition, in benchmark tests in early 1970 it was shown to be slightly faster than its IBM rival, the IBM System/360, Model 195. When the system was released in 1967, it sold for around $5 million in base configurations, and considerably more as options and features were added.

<span class="mw-page-title-main">CDC Cyber</span> Range of mainframe-class supercomputers

The CDC Cyber range of mainframe-class supercomputers were the primary products of Control Data Corporation (CDC) during the 1970s and 1980s. In their day, they were the computer architecture of choice for scientific and mathematically intensive computing. They were used for modeling fluid flow, material science stress analysis, electrochemical machining analysis, probabilistic analysis, energy and academic computing, radiation shielding modeling, and other applications. The lineup also included the Cyber 18 and Cyber 1000 minicomputers. Like their predecessor, the CDC 6600, they were unusual in using the ones' complement binary representation.

<span class="mw-page-title-main">CDC 8600</span>

The CDC 8600 was the last of Seymour Cray's supercomputer designs while he worked for Control Data Corporation. As the natural successor to the CDC 6600 and CDC 7600, the 8600 was intended to be about 10 times as fast as the 7600, already the fastest computer on the market. The design was essentially four 7600's, packed into a very small chassis so they could run at higher clock speeds.

<span class="mw-page-title-main">CDC STAR-100</span>

The CDC STAR-100 is a vector supercomputer that was designed, manufactured, and marketed by Control Data Corporation (CDC). It was one of the first machines to use a vector processor to improve performance on appropriate scientific applications. It was also the first supercomputer to use integrated circuits and the first to be equipped with one million words of computer memory.

<span class="mw-page-title-main">Cray-3</span> Supercomputer by Cray research

The Cray-3 was a vector supercomputer, Seymour Cray's designated successor to the Cray-2. The system was one of the first major applications of gallium arsenide (GaAs) semiconductors in computing, using hundreds of custom built ICs packed into a 1 cubic foot (0.028 m3) CPU. The design goal was performance around 16 GFLOPS, about 12 times that of the Cray-2.

<span class="mw-page-title-main">NEC SX-6</span>

The SX-6 is a NEC SX supercomputer built by NEC Corporation that debuted in 2001; the SX-6 was sold under license by Cray Inc. in the U.S. Each SX-6 single-node system contains up to eight vector processors, which share up to 64 GB of computer memory. The SX-6 processor is a single chip implementation containing a vector processor unit and a scalar processor fabricated in a 0.15 μm CMOS process with copper interconnects, whereas the SX-5 was a multi-chip implementation. The Earth Simulator is based on the SX-6 architecture.

The VP2000 was the second series of vector supercomputers from Fujitsu. Announced in December 1988, they replaced Fujitsu's earlier FACOM VP Model E Series. The VP2000 was succeeded in 1995 by the VPP300, a massively parallel supercomputer with up to 256 vector processors.

A barrel processor is a CPU that switches between threads of execution on every cycle. This CPU design technique is also known as "interleaved" or "fine-grained" temporal multithreading. Unlike simultaneous multithreading in modern superscalar architectures, it generally does not allow execution of multiple instructions in one cycle.

The Cray-3/SSS was a pioneering massively parallel supercomputer project that bonded a two-processor Cray-3 to a new SIMD processing unit based entirely in the computer's main memory. It was later considered as an add-on for the Cray T90 series in the form of the T94/SSS, but there is no evidence this was ever built.

<span class="mw-page-title-main">CDC 6000 series</span> Family of 1960s mainframe computers

The CDC 6000 series is a discontinued family of mainframe computers manufactured by Control Data Corporation in the 1960s. It consisted of the CDC 6200, CDC 6300, CDC 6400, CDC 6500, CDC 6600 and CDC 6700 computers, which were all extremely rapid and efficient for their time. Each is a large, solid-state, general-purpose, digital computer that performs scientific and business data processing as well as multiprogramming, multiprocessing, Remote Job Entry, time-sharing, and data management tasks under the control of the operating system called SCOPE. By 1970 there also was a time-sharing oriented operating system named KRONOS. They were part of the first generation of supercomputers. The 6600 was the flagship of Control Data's 6000 series.

<span class="mw-page-title-main">Cray T90</span>

The Cray T90 series was the last of a line of vector processing supercomputers manufactured by Cray Research, Inc, superseding the Cray C90 series. The first machines were shipped in 1995, and featured a 2.2 ns (450 MHz) clock cycle and two-wide vector pipes, for a peak speed of 1.8 gigaflops per processor; the high clock speed arises from the CPUs being built using ECL logic. As with the Cray J90, each CPU contained a scalar data cache, in addition to the instruction buffering/caching which has always been in Cray architectures.

<span class="mw-page-title-main">Cray J90</span>

The Cray J90 series was an air-cooled vector processor supercomputer first sold by Cray Research in 1994. The J90 evolved from the Cray Y-MP EL minisupercomputer, and is compatible with Y-MP software, running the same UNICOS operating system. The J90 supported up to 32 CMOS processors with a 10 ns clock. It supported up to 4 GB of main memory and up to 48 GB/s of memory bandwidth, giving it considerably less performance than the contemporary Cray T90, but making it a strong competitor to other technical computers in its price range. All input/output in a J90 system was handled by an IOS called IOS Model V. The IOS-V was based on the VME64 bus and SPARC I/O processors (IOPs) running the VxWorks RTOS. The IOS was programmed to emulate the IOS Model E, used in the larger Cray Y-MP systems, in order to minimize changes in the UNICOS operating system. By using standard VME boards, a wide variety of commodity peripherals could be used.

The HITAC S-810 is a family of vector supercomputers developed, manufactured and marketed by Hitachi. The S-810, first announced in August 1982, was the second Japanese supercomputer, following the Fujitsu VP-200 but predating the NEC SX-2. The S-810 was Hitachi's first supercomputer, although the company had previously built a vector processor, the IAP.

Gather/scatter is a type of memory addressing that at once collects (gathers) from, or stores (scatters) data to, multiple, arbitrary indices. Examples of its use include sparse linear algebra operations, sorting algorithms, fast Fourier transforms, and some computational graph theory problems. It is the vector equivalent of register indirect addressing, with gather involving indexed reads, and scatter, indexed writes. Vector processors have hardware support for gather and scatter operations, as do many input/output systems, allowing large data sets to be transferred to main memory more rapidly.

References

  1. Electronics. McGraw-Hill Publishing Company. 1973. p. 36.
  2. George R. Trimble Jr. (June 24, 2005). "CUC History". Computer History Museum. Retrieved May 30, 2010.
  3. George R. Trimble Jr. (Summer 2001). "A brief history of computing. Memoirs of living on the edge". IEEE Annals of the History of Computing. 23 (3). IEEE Computer Society: 44–59. doi:10.1109/85.948905. S2CID   5259268.
  4. "ASC No. 6 used for GSI data" . Retrieved August 10, 2024.