Quasi-delay-insensitive circuit

Last updated September 19, 2024

Overview

Pros

Robust to process variation, temperature fluctuation, circuit redesign, and FPGA remapping.
Natural event sequencing facilitates complex control circuitry.
Automatic clock gating and compute-dependent cycle time can save dynamic power and increase throughput by optimizing for average-case workload characteristics instead of worst-case.

Cons

Delay insensitive encodings generally require twice as many wires for the same data.
Communication protocols and encodings generally require twice as many devices for the same functionality.
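
The wire overhead can be illustrated with dual-rail encoding, used here as an assumed example of a delay-insensitive code (the list above does not name a specific encoding). In this sketch each bit is carried on two wires, and the helper names are hypothetical.

```python
def dual_rail_encode(bits):
    """Encode each bit as a (true_rail, false_rail) pair of wires."""
    return [(1, 0) if b else (0, 1) for b in bits]


def dual_rail_valid(wires):
    """The word is valid once exactly one rail of every pair is high."""
    return all(t + f == 1 for t, f in wires)


def dual_rail_neutral(wires):
    """The word is neutral (no data present) when every rail is low."""
    return all(t == 0 and f == 0 for t, f in wires)


word = [1, 0, 1, 1]
wires = dual_rail_encode(word)
wire_count = 2 * len(wires)  # two wires per data bit
```

A four-bit word occupies eight wires in this encoding, which is the doubling described above. The validity check is what lets a receiver detect the arrival of data without a clock.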

Chips

QDI circuits have been used to manufacture a large number of research chips, a small selection of which follows.

Theory

The simplest QDI circuit is a ring oscillator implemented using a cycle of inverters. Each gate drives two events on its output node: either the pull-up network drives the node's voltage from GND to Vdd, or the pull-down network drives it from Vdd to GND. A ring of three inverters therefore has six events in total.
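
The event count can be checked with a small simulation. This is a minimal sketch that assumes a ring of three inverters, unit-step firing, and node names n0 to n2, none of which come from the text.

```python
def ring_oscillator_events(start=(0, 1, 0), max_events=100):
    """Fire enabled inverters one at a time until the ring returns to `start`."""
    state = list(start)
    stages = len(state)
    events = []
    while len(events) < max_events:
        # Inverter i reads node i-1 and drives node i. It is enabled when
        # its output still equals its input, i.e. it has not yet inverted it.
        enabled = [i for i in range(stages) if state[i] == state[i - 1]]
        if not enabled:
            break
        i = enabled[0]
        state[i] ^= 1
        events.append(f"n{i}{'+' if state[i] else '-'}")
        if tuple(state) == tuple(start):
            break
    return events


events = ring_oscillator_events()
```

Each of the three nodes rises once and falls once before the ring returns to its starting state, giving one period of six events.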

Multiple cycles may be connected using a multi-input gate. A c-element, which waits for its inputs to match before copying the value to its output, may be used to synchronize multiple cycles. If one cycle reaches the c-element before another, it is forced to wait. Synchronizing three or more of these cycles creates a pipeline allowing the cycles to trigger one after another.

If cycles are known to be mutually exclusive, then they may be connected using combinational logic (AND, OR). This allows the active cycle to continue regardless of the inactive cycles, and is generally used to implement delay insensitive encodings.

Larger systems are too complex to manage at the level of individual cycles, so they are partitioned into processes. Each process describes the interaction between a set of cycles grouped into channels, and the process boundary breaks these cycles into channel ports. Each port has a set of request nodes that tend to encode data and acknowledge nodes that tend to be dataless. The process that drives the request is the sender, while the process that drives the acknowledgement is the receiver. The sender and receiver communicate using certain protocols,^{[synthesis 1]} and the sequential triggering of communication actions from one process to the next is modeled as a token traversing the pipeline.

Stability and non-interference

The correct operation of a QDI circuit requires that events be limited to monotonic digital transitions. Instability (glitch) or interference (short) can force the system into illegal states causing incorrect/unstable results, deadlock, and circuit damage. The previously described cyclic structure that ensures stability is called acknowledgement. A transition T1 acknowledges another T2 if there is a causal sequence of events from T1 to T2 that prevents T2 from occurring until T1 has completed.^{[timing 1]}^{[timing 2]}^{[timing 3]} For a DI circuit, every transition must acknowledge every input to its associated gate. For a QDI circuit, there are a few exceptions in which the stability property is maintained using timing assumptions guaranteed with layout constraints rather than causality.^{[layout 1]}
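
Non-interference can be sketched as a check that a gate's pull-up and pull-down conditions are never true at the same time. The function below is a hypothetical helper that enumerates every input combination; a real check would only need to consider reachable states, so this is a conservative illustration rather than an established tool.

```python
from itertools import product


def find_interference(pull_up, pull_down, inputs):
    """Return the first input state that enables both networks, or None."""
    for values in product([0, 1], repeat=len(inputs)):
        state = dict(zip(inputs, values))
        if pull_up(state) and pull_down(state):
            return state  # both networks conduct: a short from Vdd to GND
    return None


# A C-element's two networks are mutually exclusive.
ok = find_interference(
    lambda s: s["a"] and s["b"],
    lambda s: not s["a"] and not s["b"],
    ["a", "b"],
)

# A deliberately broken gate: both networks conduct when a=1 and b=0.
bad = find_interference(
    lambda s: s["a"],
    lambda s: not s["b"],
    ["a", "b"],
)
```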

Isochronic fork assumption

An isochronic fork is a wire fork in which one end does not acknowledge the transition driving the wire. A good example of such a fork can be found in the standard implementation of a pre-charge half buffer. There are two types of isochronic forks. An asymmetric isochronic fork assumes that the transition on the non-acknowledging end happens before or when the transition has been observed on the acknowledging end. A symmetric isochronic fork ensures that both ends observe the transition simultaneously. In QDI circuits, every transition that drives a wire fork must be acknowledged by at least one end of that fork. This concept was first introduced by A. J. Martin to distinguish between asynchronous circuits that satisfy QDI requirements and those that do not. Martin also established that it is impossible to design useful systems without including at least some isochronic forks, given reasonable assumptions about the available circuit elements.^{[timing 3]} Isochronic forks were long thought to be the weakest compromise away from fully delay-insensitive systems.

In fact, every CMOS gate has one or more internal isochronic forks between the pull-up and pull-down networks. The pull-down network only acknowledges the up-going transitions of the inputs while the pull-up network only acknowledges the down-going transitions.

Adversarial path assumption

The adversarial path assumption also deals with wire forks, but is ultimately weaker than the isochronic fork assumption. At some point in the circuit after a wire fork, the two paths must merge back into one. The adversarial path is the one that fails to acknowledge the transition on the wire fork. The assumption states that the transition propagating down the acknowledging path reaches the merge point after the same transition would have arrived by way of the adversarial path.^{[timing 2]} This effectively extends the isochronic fork assumption beyond the confines of the forked wire and into the connected paths of gates.

Half-cycle timing assumption

This assumption relaxes the QDI requirements a little further in the quest for performance. The c-element is effectively three gates (the logic, the driver, and the feedback) and is non-inverting. This becomes cumbersome and expensive if a large amount of logic is needed. The acknowledgement theorem states that the driver must acknowledge the logic. The half-cycle timing assumption assumes that the driver and feedback will stabilize before the inputs to the logic are allowed to switch.^{[timing 4]} This allows the designer to use the output of the logic directly, bypassing the driver and making shorter cycles for higher frequency processing.

Atomic complex gates

A large amount of the automatic synthesis literature uses atomic complex gates. A tree of gates is assumed to transition completely before any of the inputs at the leaves of the tree are allowed to switch again.^{[timing 5]}^{[timing 6]} While this assumption allows automatic synthesis tools to bypass the bubble reshuffling problem, the reliability of these gates tends to be difficult to guarantee.

Relative timing

Relative timing is a framework for making and implementing arbitrary timing assumptions in QDI circuits. It represents a timing assumption as a virtual causality arc that completes a broken cycle in the event graph. This allows designers to reason about timing assumptions as a method to realize circuits with higher throughput and energy efficiency by systematically sacrificing robustness.^{[timing 7]}^{[timing 8]}

Representations

Communicating hardware processes (CHP)

Communicating hardware processes (CHP) is a program notation for QDI circuits inspired by Tony Hoare's communicating sequential processes (CSP) and Edsger W. Dijkstra's guarded commands. The syntax is described below in descending precedence.^{[synthesis 2]}

Skip: skip does nothing. It simply acts as a placeholder for pass-through conditions.
Dataless assignment: a+ sets the voltage of the node a to Vdd while a- sets the voltage of a to GND.
Assignment: a := e evaluates the expression e then assigns the resulting value to the variable a.
Send: X!e evaluates the expression e then sends the resulting value across the channel X. X! is a dataless send.
Receive: X?a waits until there is a valid value on the channel X then assigns that value to the variable a. X? is a dataless receive.
Probe: #X returns the value waiting on the channel X without executing the receive.
Simultaneous composition: S * T executes the process fragments S and T at the same time.
Internal parallel composition: S, T executes the process fragments S and T in any order.
Sequential composition: S; T executes the process fragments S followed by T.
Parallel composition: S || T executes the process fragments S and T in any order. This is functionally equivalent to internal parallel composition but with lower precedence.
Deterministic selection: [G0 -> S0[]G1 -> S1[]...[]Gn -> Sn] implements choice in which G0, G1, ..., Gn are guards, which are dataless boolean expressions or data expressions that are implicitly cast using a validity check, and S0, S1, ..., Sn are process fragments. Deterministic selection waits until one of the guards evaluates to Vdd, then proceeds to execute the guard's associated process fragment. If two guards evaluate to Vdd during the same window of time, an error occurs. [G] is shorthand for [G -> skip] and simply implements a wait.
Non-deterministic selection: [G0 -> S0:G1 -> S1:...:Gn -> Sn] is the same as deterministic selection except that more than one guard is allowed to evaluate to Vdd. Only the process fragment associated with the first guard to evaluate to Vdd is executed.
Repetition: *[G0 -> S0[]G1 -> S1[]...[]Gn -> Sn] or *[G0 -> S0:G1 -> S1:...:Gn -> Sn] is similar to the associated selection statements except that the action is repeated while any guard evaluates to Vdd. *[S] is shorthand for *[Vdd -> S] and implements infinite repetition.
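
As an illustration of how the operators combine, a one-place buffer is commonly written as *[L?x; R!x]: repeat forever, receive a value from channel L into x, then send x on channel R. The sketch below models that process in Python. The function names are hypothetical, the repetition is bounded so the program terminates, and a `queue.Queue` of size one only approximates a CHP channel, since it adds a slot of buffering that a true rendezvous channel does not have.

```python
import queue
import threading


def buffer_process(left, right, count):
    """Python stand-in for *[ L?x; R!x ], bounded to `count` iterations."""
    for _ in range(count):
        x = left.get()   # L?x : wait for a value on L and store it in x
        right.put(x)     # R!x : send the value of x on R


def run_buffer(values):
    left = queue.Queue(maxsize=1)
    right = queue.Queue(maxsize=1)
    worker = threading.Thread(
        target=buffer_process, args=(left, right, len(values))
    )
    worker.start()
    received = []
    for v in values:
        left.put(v)                   # environment performs L!v
        received.append(right.get())  # environment performs R?y
    worker.join()
    return received


result = run_buffer([3, 1, 4])
```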

Hand-shaking expansions (HSE)

Hand-shaking expansions are a subset of CHP in which channel protocols are expanded into guards and assignments and only dataless operators are permitted. This is an intermediate representation toward the synthesis of QDI circuits.
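
A dataless channel action expanded with a four-phase handshake gives a feel for what an HSE looks like. The sketch below assumes that protocol and the node names Lr (request) and La (acknowledge); the sender is taken to be *[Lr+; [La]; Lr-; [~La]] and the receiver *[[Lr]; La+; [~Lr]; La-]. The interpreter is a hypothetical round-robin scheduler, not a real tool.

```python
def run_hse(processes, state, max_events):
    """Interleave processes made of ('set'|'wait', node, value) steps."""
    counters = [0] * len(processes)
    trace = []
    while len(trace) < max_events:
        progressed = False
        for p, steps in enumerate(processes):
            kind, node, value = steps[counters[p]]
            if kind == "wait":
                if state[node] != value:
                    continue  # guard is false: this process stays blocked
            else:
                state[node] = value
                trace.append(node + ("+" if value else "-"))
            counters[p] = (counters[p] + 1) % len(steps)
            progressed = True
            if len(trace) >= max_events:
                break
        if not progressed:
            break  # every process is blocked: deadlock
    return trace


sender = [("set", "Lr", 1), ("wait", "La", 1), ("set", "Lr", 0), ("wait", "La", 0)]
receiver = [("wait", "Lr", 1), ("set", "La", 1), ("wait", "Lr", 0), ("set", "La", 0)]
trace = run_hse([sender, receiver], {"Lr": 0, "La": 0}, 8)
```

Every transition in the trace is enabled only by the one before it, so the order of events does not depend on the scheduler.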

Petri nets (PN)

A Petri net (PN) is a bipartite graph of places and transitions used as a model for QDI circuits. Transitions in the Petri net represent voltage transitions on nodes in the circuit. Places represent the partial states between transitions. A token inside a place acts as a program counter identifying the current state of the system, and multiple tokens may exist in a Petri net simultaneously. However, for QDI circuits, having multiple tokens in the same place is an error.

When a transition has tokens on every input place, that transition is enabled. When the transition fires, the tokens are removed from the input places and new tokens are created on all of the output places. This means that a transition that has multiple output places is a parallel split and a transition with multiple input places is a parallel merge. If a place has multiple output transitions, then any one of those transitions could fire. However, doing so would remove the token from the place and prevent any other transition from firing. This effectively implements choice. Therefore, a place with multiple output transitions is a conditional split and a place with multiple input transitions is a conditional merge.
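
The firing rule can be sketched in a few lines. The net, place names, and transition names below are hypothetical; the example is a parallel split followed by a parallel merge, and the simulator raises an error if a place would ever hold more than one token.

```python
def enabled(marking, transitions):
    """List the transitions that have a token on every input place."""
    return [
        name
        for name, (inputs, _outputs) in transitions.items()
        if all(marking.get(p, 0) > 0 for p in inputs)
    ]


def fire(marking, transitions, name):
    """Consume one token per input place and produce one per output place."""
    inputs, outputs = transitions[name]
    if not all(marking.get(p, 0) > 0 for p in inputs):
        raise ValueError(f"transition {name} is not enabled")
    new_marking = dict(marking)
    for p in inputs:
        new_marking[p] -= 1
    for p in outputs:
        new_marking[p] = new_marking.get(p, 0) + 1
        if new_marking[p] > 1:
            raise ValueError(f"multiple tokens in place {p}")
    return new_marking


# fork is a parallel split, join is a parallel merge.
net = {
    "fork": (["p0"], ["p1", "p2"]),
    "a": (["p1"], ["p3"]),
    "b": (["p2"], ["p4"]),
    "join": (["p3", "p4"], ["p0"]),
}
m0 = {"p0": 1}
m1 = fire(m0, net, "fork")
after_fork = enabled(m1, net)   # both branches may now proceed
m2 = fire(m1, net, "a")
after_a = enabled(m2, net)      # join must still wait for b
m3 = fire(fire(m2, net, "b"), net, "join")
marked = {p for p, n in m3.items() if n > 0}
```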

Event-rule systems (ER)

Event-rule systems (ER) use a similar notation to implement a restricted subset of Petri net functionality in which there are transitions and arcs, but no places. This means that the baseline ER system lacks choice as implemented by conditional splits and merges in a Petri net and disjunction implemented by conditional merges. The baseline ER system also does not allow feedback.

While Petri nets are used to model the circuit logic, an ER system models the timing and execution trace of the circuit, recording the delays and dependencies of each transition. This is generally used to determine which gates need to be faster and which gates can be slower, optimizing the sizing of devices in the system.^{[sizing 1]}

Repetitive event-rule systems (RER) add feedback by folding the trace back on itself, marking the fold point with a tick mark.^{[sizing 1]} Extended event-rule systems (XER) add disjunction.^{[sizing 2]}
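
The timing role of a baseline ER system can be sketched as follows, assuming each rule is a (source event, target event, delay) triple and that an event occurs once all of its incoming rules are satisfied. The event names and delays are made up for illustration, and the representation is a hypothetical one rather than the notation used in the cited work.

```python
def event_times(rules, order):
    """Compute the occurrence time of each event in an acyclic ER system.

    rules: list of (source, target, delay) triples.
    order: events listed so that every source precedes its targets.
    """
    times = {}
    for event in order:
        arrivals = [times[s] + d for s, t, d in rules if t == event]
        # An event with no incoming rule starts the trace at time 0;
        # otherwise it waits for its slowest cause (conjunction only).
        times[event] = max(arrivals, default=0)
    return times


rules = [
    ("a+", "c+", 2),
    ("b+", "c+", 5),
    ("c+", "d+", 1),
]
times = event_times(rules, ["a+", "b+", "c+", "d+"])
```

Here the rule from b+ to c+ determines when c+ occurs, so the gate behind it is the one worth speeding up, while the rule from a+ has three time units of slack.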

Production rule set (PRS)

A production rule specifies either the pull-up or pull-down network of a gate in a QDI circuit and follows the syntax G -> S, in which G is a guard as described above and S is one or more dataless assignments in parallel as described above. In states not covered by the guards, it is assumed that the assigned nodes remain at their previous states. This can be achieved using a staticizer with either weak or combinational feedback. The most basic example is the C-element, in which the guards do not cover the states where A and B are not the same value.
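
For the C-element, the two production rules can be written as a & b -> c+ and ~a & ~b -> c-. The sketch below evaluates them in Python, with the state-holding behavior of the staticizer modeled by leaving c unchanged when neither guard is true. The rule representation and function names are hypothetical.

```python
# Each rule is (guard, node, value): guard -> node+ (value 1) or node- (value 0).
C_ELEMENT_RULES = [
    (lambda s: s["a"] and s["b"], "c", 1),          # a & b   -> c+
    (lambda s: not s["a"] and not s["b"], "c", 0),  # ~a & ~b -> c-
]


def step(rules, state):
    """Apply every rule whose guard holds; uncovered nodes keep their value."""
    new_state = dict(state)
    for guard, node, value in rules:
        if guard(state):
            new_state[node] = value
    return new_state


def drive(rules, state, inputs):
    """Apply a sequence of (a, b) input pairs and record c after each one."""
    outputs = []
    for a, b in inputs:
        state = step(rules, {**state, "a": a, "b": b})
        outputs.append(state["c"])
    return outputs


outputs = drive(
    C_ELEMENT_RULES,
    {"a": 0, "b": 0, "c": 0},
    [(1, 0), (1, 1), (0, 1), (0, 0)],
)
```

The output rises only after both inputs are high and falls only after both are low; in the two mismatched states it holds its previous value.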

Synthesis

There are many techniques for constructing QDI circuits, but they can generally be classified into two strategies.

Formal synthesis

Formal synthesis was introduced by Alain Martin in 1991.^{[synthesis 2]} The method involves making successive program transformations which are proven to maintain program correctness. The goal of these transformations is to convert the original sequential program into a parallel set of communicating processes, each of which maps well to a single pipeline stage. The possible transformations include:

Projection splits a process which has disparate, non-interacting sets of variables into a separate process per set.^{[synthesis 3]}
Process decomposition splits a process with minimally interacting variable sets into a separate process per set in which each process communicates to another only as necessary across channels.
Slack matching involves adding pipeline stages between two communicating processes in order to increase overall throughput.^{[synthesis 4]}

Once the program is decomposed into a set of small communicating processes, it is expanded into hand-shaking expansions (HSE). Channel actions are expanded into their constituent protocols and multi-bit operators are expanded into their circuit implementations. These HSE are then reshuffled to optimize the circuit implementation by reducing the number of dependencies.^{[synthesis 5]} Once the reshuffling is decided upon, state variables are added to disambiguate circuit states for a complete state encoding.^{[synthesis 6]} Next, minimal guards are derived for each signal assignment, producing production rules. There are multiple methods for doing this including guard strengthening, guard weakening, and others.^{[synthesis 2]} The production rules are not necessarily CMOS implementable at this point, so bubble reshuffling moves signal inversions around the circuit in an attempt to make it so. However, bubble reshuffling is not guaranteed to succeed. This is where atomic complex gates are generally used in automated synthesis programs.

Syntax directed translation

The second strategy, syntax directed translation, was first introduced in 1988 by Steven Burns. This seeks a simpler approach at the expense of circuit performance by mapping each CHP syntactic construct to a hand-compiled circuit template.^{[synthesis 7]} Synthesizing a QDI circuit using this method strictly implements the control flow as dictated by the program. This was later adopted by Philips Research Laboratories in their implementation of Tangram. Unlike Steven Burns' approach using circuit templates, Tangram mapped the syntax to a strict set of standard cells, facilitating layout as well as synthesis.^{[synthesis 8]}

Templated synthesis

A hybrid approach introduced by Andrew Lines in 1998 transforms the sequential specification into parallel specifications as in formal synthesis, but then uses predefined pipeline templates to implement those parallel processes similar to syntax-directed translation.^{[synthesis 9]} Lines outlined three efficient logic families or reshufflings.

Weak condition half buffer (WCHB)

Weak condition half buffer (WCHB) is the simplest and fastest of the logic families with a 10 transition pipeline cycle (or 6 using the half-cycle timing assumption). However, it is also limited to simpler computations because more complex computations tend to necessitate long chains of transistors in the pull-up network of the forward driver. More complex computations can generally be broken up into simpler stages or handled directly with one of the pre-charge families. The WCHB is a half buffer meaning that a pipeline of N stages can contain at most N/2 tokens at once. This is because the reset of the output request Rr must wait until after the reset of the input Lr.

Pre-charge half buffer (PCHB)

Pre-charge half buffer (PCHB) uses domino logic to implement a more complex computational pipeline stage. This removes the long pull-up network problem, but also introduces an isochronic fork on the input data which must be resolved later in the cycle. This causes the pipeline cycle to be 14 transitions long (or 10 using the half-cycle timing assumption).

Pre-charge full buffer (PCFB)

Pre-charge full buffers (PCFB) are very similar to PCHB, but adjust the reset phase of the reshuffling to implement full buffering. This means that a pipeline of N PCFB stages can contain at most N tokens at once. This is because the reset of the output request Rr is allowed to happen before the reset of the input Lr.
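
The figures quoted for the three families can be collected into a small lookup. This is only a summary sketch: the cycle lengths are the ones stated above, no cycle length is given here for PCFB so it is left as `None`, and rounding N/2 down for an odd number of stages is an assumption.

```python
FAMILIES = {
    # cycle: transitions per pipeline cycle
    # relaxed: transitions using the half-cycle timing assumption
    "WCHB": {"cycle": 10, "relaxed": 6, "full_buffer": False},
    "PCHB": {"cycle": 14, "relaxed": 10, "full_buffer": False},
    "PCFB": {"cycle": None, "relaxed": None, "full_buffer": True},
}


def max_tokens(family, stages):
    """Upper bound on tokens held by a pipeline of `stages` identical stages."""
    if FAMILIES[family]["full_buffer"]:
        return stages
    return stages // 2  # half buffers hold at most N/2 tokens


capacities = {name: max_tokens(name, 8) for name in FAMILIES}
```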

Verification

Along with the normal verification techniques of testing, coverage, etc., QDI circuits may be verified formally by inverting the formal synthesis procedure to derive a CHP specification from the circuit. This CHP specification can then be compared against the original to prove correctness.^{[verification 1]}^{[verification 2]}

References

Synthesis

1. Tse, Jonathan; Hill, Benjamin; Manohar, Rajit (May 2013). "A Bit of Analysis on Self-Timed Single-Bit On-Chip Links" (PDF). 2013 IEEE 19th International Symposium on Asynchronous Circuits and Systems. Proceedings of the 19th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC). pp. 124–133. CiteSeerX 10.1.1.649.294. doi:10.1109/ASYNC.2013.26. ISBN 978-1-4673-5956-6. S2CID 11196963.
2. Martin, Alain (1991). Synthesis of Asynchronous VLSI Circuits (PDF) (Report). California Institute of Technology.
3. Manohar, Rajit; Lee, Tak-Kwan; Martin, Alain (1999). "Projection: A synthesis technique for concurrent systems". Proceedings. Fifth International Symposium on Advanced Research in Asynchronous Circuits and Systems (PDF). pp. 125–134. CiteSeerX 10.1.1.49.2264. doi:10.1109/ASYNC.1999.761528. ISBN 978-0-7695-0031-7. S2CID 11051137.
4. Manohar, Rajit; Martin, Alain J. (1998-06-15). "Slack elasticity in concurrent computing". Mathematics of Program Construction (PDF). Lecture Notes in Computer Science. Vol. 1422. Springer, Berlin, Heidelberg. pp. 272–285. CiteSeerX 10.1.1.396.2277. doi:10.1007/bfb0054295. ISBN 9783540645917.
5. Manohar, R. (2001). "An analysis of reshuffled handshaking expansions" (PDF). Proceedings Seventh International Symposium on Asynchronous Circuits and Systems. ASYNC 2001. pp. 96–105. CiteSeerX 10.1.1.11.55. doi:10.1109/async.2001.914073. ISBN 978-0-7695-1034-7. S2CID 5156531. Archived from the original (PDF) on 2017-10-14.
6. Cortadella, J.; Kishinevsky, M.; Kondratyev, A.; Lavagno, L.; Yakovlev, A. (March 1996). "Complete state encoding based on the theory of regions". Proceedings Second International Symposium on Advanced Research in Asynchronous Circuits and Systems (PDF). pp. 36–47. doi:10.1109/async.1996.494436. hdl:2117/129509. ISBN 978-0-8186-7298-9. S2CID 14297152.
7. Burns, Steven; Martin, Alain (1988). "Syntax-Directed Translation of Concurrent Programs into Self-Timed Circuits" (PDF). California Institute of Technology.
8. Berkel, Kees van; Kessels, Joep; Roncken, Marly; Saeijs, Ronald; Schalij, Frits (1991). "The VLSI-programming language Tangram and its translation into handshake circuits" (PDF). Proceedings of the European Conference on Design Automation. IEEE Design Automation. pp. 384–389. doi:10.1109/EDAC.1991.206431. S2CID 34437785.
9. Lines, Andrew (1998). "Pipelined Asynchronous Circuits" (PDF) (M.S.). California Institute of Technology. doi:10.7907/z92v2d4z.

Timing

1. Manohar, R.; Moses, Y. (May 2015). "Analyzing Isochronic Forks with Potential Causality". 2015 21st IEEE International Symposium on Asynchronous Circuits and Systems (PDF). pp. 69–76. doi:10.1109/async.2015.19. ISBN 978-1-4799-8716-0. S2CID 10262182.
2. Keller, S.; Katelman, M.; Martin, A. J. (May 2009). "A Necessary and Sufficient Timing Assumption for Speed-Independent Circuits". 2009 15th IEEE Symposium on Asynchronous Circuits and Systems (PDF). pp. 65–76. doi:10.1109/async.2009.27. ISBN 978-0-7695-3616-3. S2CID 6612621.
3. Martin, Alain J. (1990). "The Limitations to Delay-Insensitivity in Asynchronous Circuits" (PDF). Sixth MIT Conference on Advanced Research in VLSI. MIT Press.
4. LaFrieda, C.; Manohar, R. (May 2009). "Reducing Power Consumption with Relaxed Quasi Delay-Insensitive Circuits". 2009 15th IEEE Symposium on Asynchronous Circuits and Systems (PDF). pp. 217–226. CiteSeerX 10.1.1.153.3557. doi:10.1109/async.2009.9. ISBN 978-0-7695-3616-3. S2CID 6282974.
5. Meng, T. H. Y.; Brodersen, R. W.; Messerschmitt, D. G. (November 1989). "Automatic synthesis of asynchronous circuits from high-level specifications". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 8 (11): 1185–1205. doi:10.1109/43.41504. ISSN 0278-0070.
6. Pastor, E.; Cortadella, J.; Kondratyev, A.; Roig, O. (November 1998). "Structural methods for the synthesis of speed-independent circuits" (PDF). IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 17 (11): 1108–1129. doi:10.1109/43.736185. hdl:2117/125785. ISSN 0278-0070.
7. Stevens, K. S.; Ginosar, R.; Rotem, S. (February 2003). "Relative timing" (PDF). IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 11 (1): 129–140. doi:10.1109/tvlsi.2002.801606. ISSN 1063-8210.
8. Manoranjan, J. V.; Stevens, K. S. (May 2016). "Qualifying Relative Timing Constraints for Asynchronous Circuits". 2016 22nd IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC) (PDF). pp. 91–98. doi:10.1109/async.2016.23. ISBN 978-1-4673-9007-1. S2CID 6239093.

Verification

1. Longfield, S. J.; Manohar, R. (May 2013). "Inverting Martin Synthesis for Verification". 2013 IEEE 19th International Symposium on Asynchronous Circuits and Systems (PDF). pp. 150–157. CiteSeerX 10.1.1.645.9939. doi:10.1109/async.2013.10. ISBN 978-1-4673-5956-6. S2CID 762078.
2. Longfield, Stephen; Nkounkou, Brittany; Manohar, Rajit; Tate, Ross (2015). "Preventing glitches and short circuits in high-level self-timed chip specifications". Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PDF). PLDI '15. New York, NY, USA: ACM. pp. 270–279. doi:10.1145/2737924.2737967. ISBN 9781450334686. S2CID 6363535.

Sizing

1. Burns, Steven (1991). Performance Analysis and Optimization of Asynchronous Circuits (Ph.D.). California Institute of Technology.
2. Lee, Tak-Kwan (1995). A General Approach to Performance Analysis and Optimization of Asynchronous Circuits (Ph.D.). Defense Technical Information Center. [dead link]

Layout

↑ Karmazin, R.; Longfield, S.; Otero, C. T. O.; Manohar, R. (May 2015). "Timing Driven Placement for Quasi Delay-Insensitive Circuits". 2015 21st IEEE International Symposium on Asynchronous Circuits and Systems (PDF). pp. 45–52. doi:10.1109/async.2015.16. ISBN 978-1-4799-8716-0. S2CID 10745504.

Chips

↑ Martin, Alain; Burns, Steven; Lee, Tak-Kwan (1989). "The design of an asynchronous microprocessor". ACM SIGARCH Computer Architecture News. 17 (4): 99–110. doi:10.1145/71317.1186643.
↑ Martin, Alain; Lines, Andrew; Manohar, Rajit; Nystrom, Mika; Penzes, Paul; Southworth, Robert; Cummings, Uri; Lee, Tak-Kwan (1997). "The design of an asynchronous MIPS R3000 microprocessor". Proceedings Seventeenth Conference on Advanced Research in VLSI. pp. 164–181. doi:10.1109/ARVLSI.1997.634853. ISBN 0-8186-7913-1.
↑ Nanya, T.; Ueno, Y.; Kagotani, H.; Kuwako, M.; Takamura, A. (Summer 1994). "TITAC: design of a quasi-delay-insensitive microprocessor" (PDF). IEEE Design and Test of Computers. 11 (2): 50–63. doi:10.1109/54.282445. ISSN 0740-7475. S2CID 9351043.
↑ Takamura, A.; Kuwako, M.; Imai, M.; Fujii, T.; Ozawa, M.; Fukasaku, I.; Ueno, Y.; Nanya, T. (October 1997). "TITAC-2: An asynchronous 32-bit microprocessor based on scalable-delay-insensitive model". Proceedings International Conference on Computer Design VLSI in Computers and Processors (PDF). pp. 288–294. CiteSeerX 10.1.1.53.7359. doi:10.1109/iccd.1997.628881. ISBN 978-0-8186-8206-3. S2CID 14119246. Archived from the original (PDF) on 2017-10-14.

External links

Tools

"Petrify: a tool for synthesis of Petri Nets and asynchronous circuits". UPC/DAC VLSI CAD Group. Retrieved 6 October 2017.
Fang, David. "The Hierarchical Asynchronous Circuit Kompiler Toolkit". Retrieved 6 October 2017.
"Balsa Asynchronous Synthesis System". GitHub. Retrieved 6 October 2017.
Manohar, Rajit. "The ACT language and core tools". GitHub. Retrieved 14 February 2020.
Bingham, Ned (14 September 2024). "Loom". Broccoli. Retrieved 14 September 2024.

Tutorials

Introduction to Self Timed Circuits (web, slides, videos on YouTube)
ASYNC 2022 Summer School (web)
Silicon Compilation at Yale (web)

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
