Wafer-scale integration

Last updated April 21, 2024

Wafer-scale integration (WSI) is a system of building very large integrated circuit (commonly called a "chip") networks from an entire silicon wafer to produce a single "super-chip". By combining large size with reduced packaging, WSI was expected to dramatically reduce costs for some systems, notably massively parallel supercomputers; it is now being employed for deep learning. The name is taken from the term very-large-scale integration, the state of the art when WSI was being developed.

Overview

In the normal integrated circuit manufacturing process, a single large cylindrical crystal (boule) of silicon is produced and then cut into disks known as wafers. The wafers are then cleaned and polished in preparation for the fabrication process. A photographic process is used to pattern the surface, defining where material should be deposited on the wafer and where it should not. The desired material is deposited and the photographic mask is removed for the next layer. The wafer is then repeatedly processed in this fashion, building up layer after layer of circuitry on the surface.

Multiple copies of these patterns are deposited in a grid across the surface of the wafer. After all the possible locations are patterned, the wafer surface looks like a sheet of graph paper, with grid lines delineating the individual chips. Each of these grid locations is tested for manufacturing defects by automated equipment. Locations found to be defective are recorded and marked with a dot of paint (this process is referred to as "inking a die"; more modern wafer fabrication techniques no longer require physical markings to identify defective dies). The wafer is then sawed apart to cut out the individual chips. The defective chips are thrown away or recycled, while the working chips are placed into packaging and re-tested for any damage that might have occurred during the packaging process.

Flaws on the surface of the wafers and problems during the layering and depositing process are impossible to avoid, and cause some of the individual chips to be defective. The revenue from the remaining working chips has to pay for the entire cost of the wafer and its processing, including the discarded defective chips. Thus, the higher the yield (the fraction of chips that work), the lower the cost of each individual chip. To maximize yield, one wants to make the chips as small as possible: a single flaw then ruins only a small part of the wafer, so a larger fraction of the wafer area ends up in working chips.
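The link between chip size, yield and cost can be illustrated with a simple Poisson yield model, in which the probability that a chip of area A has no defect is exp(-A × D) for a defect density D. The following is a minimal sketch under that assumption; the model choice, the function names and all numbers are illustrative and do not come from the article's sources, and the chip count ignores edge loss on a round wafer.

```python
import math


def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dies expected to be defect-free (Poisson model, an assumption)."""
    return math.exp(-die_area_cm2 * defects_per_cm2)


def cost_per_good_die(wafer_cost, wafer_area_cm2, die_area_cm2, defects_per_cm2):
    """Wafer cost spread over the expected number of working dies.

    Simplification: dies per wafer is area divided by area, ignoring edge loss.
    """
    dies_per_wafer = int(wafer_area_cm2 // die_area_cm2)
    good_dies = dies_per_wafer * poisson_yield(die_area_cm2, defects_per_cm2)
    return wafer_cost / good_dies


if __name__ == "__main__":
    # Hypothetical values: a 300 mm wafer (about 706.9 cm^2), 0.1 defects per cm^2.
    for area in (0.5, 1.0, 4.0, 8.0):
        y = poisson_yield(area, 0.1)
        c = cost_per_good_die(5000.0, 706.9, area, 0.1)
        print(f"die {area:4.1f} cm^2  yield {y:.3f}  cost per good die {c:.2f}")
```

Under this model, yield falls exponentially as the chip area grows, which is why a single defect-intolerant chip covering a whole wafer would almost never work.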

Lowering cost

A significant fraction of the cost of fabrication (typically 30%-50%)^[citation needed] is related to testing and packaging the individual chips. Further cost is associated with connecting the chips into an integrated system (usually via a printed circuit board). Wafer-scale integration seeks to reduce this cost, as well as improve performance, by building larger chips in a single package; in principle, chips as large as a full wafer.^[citation needed]

This is difficult: given the flaws on the wafers, a single large design printed onto a wafer would almost never work. An ongoing goal has been to develop methods that handle faulty areas of the wafer through logic, rather than by sawing them out of the wafer. Generally, this approach uses a grid pattern of sub-circuits and "rewires" around the damaged areas using appropriate logic. If the resulting wafer has enough working sub-circuits, it can be used despite its faults.
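The idea of routing around damaged sub-circuits can be sketched in a few lines. The example below is a hypothetical illustration, not a description of any real WSI design: the wafer is modelled as a grid of strings in which "." is a working sub-circuit and "X" a defective one, and a breadth-first search looks for a connection between two sub-circuits that passes only through working neighbours. The names `count_working`, `route` and `wafer_usable` are invented for this sketch.

```python
from collections import deque


def count_working(grid):
    """Number of working sub-circuits ('.') on the wafer grid."""
    return sum(row.count(".") for row in grid)


def route(grid, start, goal):
    """Shortest path of working cells from start to goal (BFS), or None."""
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] != "." or grid[goal[0]][goal[1]] != ".":
        return None
    prev = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in prev:
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def wafer_usable(grid, min_working):
    """A wafer is kept if enough sub-circuits survived (threshold is hypothetical)."""
    return count_working(grid) >= min_working


if __name__ == "__main__":
    wafer = [
        "..X.",
        ".X..",
        "....",
        "X...",
    ]
    print(count_working(wafer), wafer_usable(wafer, 12))
    print(route(wafer, (0, 0), (0, 3)))
```

A real design would perform this reconfiguration in hardware or at test time with far more constraints (wire length, timing, power), so the search here only conveys the principle.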

Challenges

Most yield loss in chipmaking comes from defects in the transistor layers or in the high-density lower metal layers. Another approach, silicon-interconnect fabric (Si-IF), has neither on the wafer. Si-IF puts only relatively low-density metal layers on the wafer, roughly the same density as the upper layers of a system on a chip, and uses the wafer only for interconnects between tightly packed small bare chiplets.^[1] Si-IF-based processors^[2] and network switches^[3] have been studied.

Production attempts

Many companies attempted to develop WSI production systems in the 1970s and 1980s, but all failed. Texas Instruments and ITT Corporation both saw it as a way to develop complex pipelined microprocessors and re-enter a market where they were losing ground, but neither released any products.

Gene Amdahl also attempted to develop WSI as a method of making a supercomputer, starting Trilogy Systems in 1980^[4]^[5]^[6] and garnering investments from Groupe Bull, Sperry Rand and Digital Equipment Corporation, who (along with others) provided an estimated $230 million in financing. The design called for a 2.5" square chip with 1200 pins on the bottom.

The effort was plagued by a series of disasters, including floods that delayed the construction of the plant and later ruined the clean-room interior. After burning through about one third of the capital with nothing to show for it, Amdahl eventually declared that the idea would only work with a 99.99% yield, which would not happen for 100 years. He used Trilogy's remaining seed capital to buy Elxsi, a maker of superminicomputers, in 1985. The Trilogy efforts were eventually ended and "became" Elxsi.^[7]

In 1989 Anamartic developed a wafer stack memory based on the technology of Ivor Catt,^[8] but the company was unable to ensure a large enough supply of silicon wafers and folded in 1992.

Wafer-scale devices in production

Cerebras Systems processor

On August 19, 2019, American computer systems company Cerebras Systems presented its progress on WSI for deep learning acceleration. Cerebras' Wafer-Scale Engine (WSE-1) chip is 46,225 mm² (215 mm × 215 mm), around 56× larger than the largest GPU die. It is manufactured by TSMC using their 16 nm process. The WSE-1 features 1.2 trillion transistors, 400,000 AI cores, 18 GB of on-chip SRAM, 100 Pbit/s on-wafer fabric bandwidth, and 1.2 Pbit/s off-wafer I/O bandwidth. The price and clock rate have not been disclosed.^[9] In 2020, the company's product, the CS-1, was tested in computational fluid dynamics simulations. Compared to the Joule Supercomputer at NETL, the CS-1 was 200 times faster, while using much less power.^[10]

In April 2021, Cerebras announced the WSE-2, with twice the number of transistors and a claimed 100% yield, which is achieved by designing a system in which any manufacturing defect can be bypassed.^[11] The Cerebras CS-2 system, which incorporates the WSE-2, is in serial production.
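How bypassing defects can push wafer yield towards 100% can be illustrated with a simple redundancy calculation. The sketch below assumes, purely for illustration, that each core fails independently with the same probability and that the wafer is good if at least a required number of cores work; it then sums the binomial probabilities. The function name and every number are hypothetical and do not describe Cerebras' actual design or spare-core counts.

```python
def prob_enough_cores(total, needed, p_defect):
    """Probability that at least `needed` of `total` cores work.

    Assumes independent core failures with probability p_defect (0 <= p < 1).
    """
    if not 0.0 <= p_defect < 1.0:
        raise ValueError("p_defect must be in [0, 1)")
    max_bad = total - needed      # defective cores that can be bypassed
    q = 1.0 - p_defect
    term = q ** total             # P(exactly 0 defective cores)
    acc = 0.0
    for k in range(max_bad + 1):
        acc += term
        # Binomial recurrence: P(k+1 bad) from P(k bad).
        term *= (total - k) / (k + 1) * (p_defect / q)
    return acc


if __name__ == "__main__":
    # Hypothetical wafer: 1000 cores required, 1% chance that any core is bad.
    for spares in (0, 5, 10, 20, 30):
        p = prob_enough_cores(1000 + spares, 1000, 0.01)
        print(f"{spares:2d} spare cores -> wafer yield {p:.4f}")
```

With no spare cores the wafer works only if every core is defect-free, which is very unlikely; a modest fraction of spares raises the probability sharply under these assumptions.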

In March 2024, Cerebras announced the WSE-3, with twice the performance of the previous record-holder, the Cerebras WSE-2, at the same power draw and for the same price. It is aimed at AI training and built on TSMC's 5 nm process.^[12]

References

[1] Puneet Gupta and Subramanian S. Iyer. "Goodbye, Motherboard. Hello, Silicon-Interconnect Fabric". 2019.
[2] Saptadeep Pal, Daniel Petrisko, Matthew Tomei, Puneet Gupta, Subbu Iyer, and Rakesh Kumar. "Architecting a Waferscale Processor - A GPU Case Study". 2019.
[3] Shuangliang Chen, Saptadeep Pal, and Rakesh Kumar. "Waferscale Network Switches". 2024.
[4] Fortune Magazine article on Trilogy's history. 1986-09-01.
[5] Eric N. Berg. "Can Troubled Trilogy Fulfill Its Dream?". The New York Times. July 8, 1984.
[6] Trilogy definition in PCMag Encyclopedia.
[7] Ivor Catt. "Dinosaur Computers". Electronics World. June 2003.
[8] "Anamartic Wafer Stack". Computing History. Retrieved 27 September 2020.
[9] Cutress, Dr Ian. "Hot Chips 31 Live Blogs: Cerebras' 1.2 Trillion Transistor Deep Learning Processor". www.anandtech.com. Retrieved 2019-08-29.
[10] "Cerebras' wafer-size chip is 10,000 times faster than a GPU". VentureBeat. 2020-11-17. Retrieved 2020-11-26.
[11] Cutress, Dr Ian. "Cerebras Unveils Wafer Scale Engine Two (WSE2): 2.6 Trillion Transistors, 100% Yield". www.anandtech.com. Retrieved 2021-07-26.
[12] "Cerebras Systems Unveils World's Fastest AI Chip with Whopping 4 Trillion Transistors". Cerebras Systems. 2024-03-11. Retrieved 2024-03-19.

External links

"Giant microcircuits for superfast computers", Jim Schefter, Popular Science, January 1984, pp. 66–67, 155

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
