main-content

## Über dieses Buch

In 2002, the International Conference on Computer Aided Design (ICCAD) celebrates its 20th anniversary. This book commemorates contributions made by ICCAD to the broad field of design automation during that time. The foundation of ICCAD in 1982 coincided with the growth of Large Scale Integration. The sharply increased functionality of board-level circuits led to a major demand for more powerful Electronic Design Automation (EDA) tools. At the same time, LSI grew quickly and advanced circuit integration became widely avail­ able. This, in turn, required new tools, using sophisticated modeling, analysis and optimization algorithms in order to manage the evermore complex design processes. Not surprisingly, during the same period, a number of start-up com­ panies began to commercialize EDA solutions, complementing various existing in-house efforts. The overall increased interest in Design Automation (DA) re­ quired a new forum for the emerging community of EDA professionals; one which would be focused on the publication of high-quality research results and provide a structure for the exchange of ideas on a broad scale. Many of the original ICCAD volunteers were also members of CANDE (Computer-Aided Network Design), a workshop of the IEEE Circuits and Sys­ tem Society. In fact, it was at a CANDE workshop that Bill McCalla suggested the creation of a conference for the EDA professional. (Bill later developed the name).

## Inhaltsverzeichnis

### Formal Methods for Functional Verification

Formal hardware verification ranges from proving that two combinational circuits compute the same functions to the much more ambitious task of proving that a sequential circuit obeys some abstract property expressed in temporal logic. In tracing the history of work in this area, we find a few efforts in the 1970s and 1980s, with a big increase in verification capabilities the late 1980s up through today. The advent of efficient Boolean inference methods, starting with Binary Decision Diagrams (BDDs) and more recently with efficient Boolean satisfiability (SAT) checkers has provided the enabling technology for these advances.

Randal E. Bryant, James H. Kukula

### Automating the Diagnosis and the Rectification of Design Errors with PRIAM

This paper presents the original extensions brought to Priam to automate both the diagnosis and the rectification of the design errors detected by this tool. Priam is an industrial automated formal verifier used to check the functional correctness of digital circuits of up to 20000 transistors. These extensions implement a new approach to diagnosis based on Boolean equation solving. In particular, no enumeration of the faulty patterns is necessary to find out the incorrect gates in the circuit. The diagnosis system can handle any circuit that can be verified by Priam.

Jean Christophe Madre, Olivier Coudert, Jean Paul Billon

### Functional Comparison of Logic Designs for VLSI Circuits

Determining whether or not two circuits are functionally equivalent is of fundamental importance in many phases of the design of computer logic. We describe a new method for circuit equivalence which proceeds by reducing the question of whether two circuits are equivalent to a number of more easily answered questions concerning the equivalence of smaller, related circuits. This method can be used to extend the power of any given equivalence-checking algorithm. We report the results of experiments evaluating our technique.

C. Leonard Berman, Louise H. Trevillyan

### A Unified Framework for the Formal Verification of Sequential Circuits

Hardware description languages (HDLs) dramatically change the way circuit designers work. These languages can be used to describe circuits at a very high level of abstraction, which allows the designers to specify the behavior of a circuit before realizing it. The validation of these specifications is currently done by executing them, which is very costly [2]. This cost motivates the research [3, 5, 7, 10] done on the automatic verification of temporal properties of finite state machines.

### Dynamic Variable Ordering for Ordered Binary Decision Diagrams

The Ordered Binary Decision Diagram (OBDD) has proven useful in many applications as an efficient data structure for representing and manipulating Boolean functions. A serious drawback of OBDD’S is the need for application-specific heuristic algorithms to order the variables before processing. Further, for many problem instances in logic synthesis, the heuristic ordering algorithms which have been proposed are insufficient to allow OBDD operations to complete within a limited amount of memory. In this paper, I propose a solution to these problems based on having the OBDD package itself determine and maintain the variable order. This is done by periodically applying a minimization algorithm to reorder the variables of the OBDD to reduce its size. A new OBDD minimization algorithm, called the sifting algorithm, is proposed and appears especially effective in reducing the size of the OBDD. Experiments with dynamic variable ordering on the problem of forming the OBDD’S for the primary outputs of a combinational circuit show that many computations complete using dynamic variable ordering when the same computation fails otherwise.

R. Rudell

### Verification of Large Synthesized Designs

The problem of checking equality of Boolean functions can be solved successfully using existing techniques for only a limited range of examples. We extend the range by using a test generator and the divide and conquer paradigm.

Daniel Brand

### Grasp—A New Search Algorithm for Satisfiability

This paper introduces GRASP (Generic seaRch Algorithm for the Satisfiability Problem), an integrated algorithmic framework for SAT that unifies several previously proposed search-pruning techniques and facilitates identification of additional ones. GRASP is premised on the inevitability of conflicts during search and its most distinguishing feature is the augmentation of basic backtracking search with a powerful conflict analysis procedure. Analyzing conflicts to determine their causes enables GRASP to backtrack non-chronologically to earlier levels in the search tree, potentially pruning large portions of the search space. In addition, by “recording” the causes of conflicts, GRASP can recognize and preempt the occurrence of similar conflicts later on in the search. Finally, straightforward bookkeeping of the causality chains leading up to conflicts allows GRASP to identify assignments that are necessary for a solution to be found. Experimental results obtained from a large number of benchmarks, including many from the field of test pattern generation, indicate that application of the proposed conflict analysis techniques to SAT algorithms can be extremely effective for a large number of representative classes of SAT instances.

João P. Marques Silva, Karem A. Sakallah

### System Design and Analysis Overview

At the inception of ICCAD in 1983, system-level design was only a small fish in the EDA pond. In the earlier conferences, only one or at most 2 sessions were dedicated to the topic. This has changed dramatically over the years, and today system design is one of the pillars of the conference. In this paper, we describe the major trends in the field as can be traced from the papers published in the conference as well as other seminal publications. While doing so, we put the papers selected for this volume in the context of the ongoing trends at the time of publication, and their impact on the field. In addition, we provide some background that may help to determine why some proposed approaches did or did not succeed in the long term. We conclude the paper with some reflections on the past and the future.

Hugo De Man, Jan Rabaey

### An Efficient Microcode-Compiler for Custom DSP-Processors

In this paper, a microcode compiler for custom DSP-processors is presented. This tool is part of the CATHEDRAL II silicon compiler. Two optimization problems in the microcode compilation process are highlighted: microprogram scheduling and memory allocation. Algorithms to solve them, partly based on heuristics, are presented. Our compiler successfully handles repetitive programs, and is able to decide on hardware binding. In most practical examples, optimal solutions are found. Whenever possible, indications of the complexity are given.

Gert Goossens, Jan Rabaey, Joos Vandewalle, Hugo De Man

### HYPER-LP: A System for Power Minimization Using Architectural Transformations

The increasing demand for “portable” computing and communication, has elevated power consumption to be the most critical design parameter. An automated high-level synthesis system, HYPER-LP, is presented for minimizing power consumption in application specific datapath intensive CMOS circuits using a variety of architectural and computational transformations. The sources of power consumption are reviewed and the effects of architectural transformations on the various power components are presented. The synthesis environment consists of high-level estimation of power consumption, a library of transformation primitives (local and global), and heuristic/probabilistic optimization search mechanisms for fast and efficient scanning of the design space. Examples with varying degree of computational complexity and structures are optimized and synthesized using the HYPER-LP system. The results indicate that an order of magnitude reduction in power can be achieved over current-day design methodologies while maintaining the system throughput; in some cases this can be accomplished while preserving or reducing the implementation area.

Anantha P. Chandrakasan, Miodrag Potkonjak, Jan Rabaey, Robert W. Brodersen

### Power Analysis of Embedded Software: First Step Towards Software Power Minimization

Embedded computer systems are characterized by the presence of a dedicated processor and the software that runs on it. Power constraints are increasingly becoming the critical component of the design specification of these systems. At present, however, power analysis tools can only be applied at the lower levels of the design — the circuit or gate level. It is either impractical or impossible to use the lower level tools to estimate the power cost of the software component of the system. This paper describes the first systematic attempt to model this power cost. A power analysis technique is developed that has been applied to two commercial microprocessors — Intel 486DX2 and Fujitsu SPARClite 934. This technique can be employed to evaluate the power cost of embedded software and also be used to search the design space in software power optimization.

Vivek Tiwari, Sharad Malik, Andrew Wolfe

### A Methodology for Correct-by-Construction Latency Insensitive Design

In Deep Sub-Micron (DSM) designs, performance will depend critically on the latency of long wires. We propose a new synthesis methodology for synchronous systems that makes the design functionally insensitive to the latency of long wires. Given a synchronous specification of a design, we generate a functionally equivalent synchronous implementation that can tolerate arbitrary communication latency between latches. By using latches we can break a long wire in short segments which can be traversed while meeting a single clock cycle constraint. The overall goal is to obtain a design that is robust with respect to delays of long wires, in a shorter time by reducing the multiple iterations between logical and physical design, and with performance that is optimized with respect to the speed of the single components of the design. In this paper we describe the details of the proposed methodology as well as report on the latency insensitive design of PDLX, an out-of-order microprocessor with speculative-execution.

Luca P. Carloni, Kenneth L. McMillan, Alexander Saldanha, Alberto L. Sangiovanni-Vincentelli

### Exploring Performance Tradeoffs for Clustered VLIW ASIPs

VLIW ASIPs provide an attractive solution for increasingly pervasive real-time multimedia and signal processing embedded applications. In this paper we propose an algorithm to support trade-off exploration during the early phases of the design/specialization of VLIW ASIPs with clustered datapaths. For purposes of an early exploration step, we define a parameterized family of clustered datapaths D(m, n), where m and n denote interconnect capacity and cluster capacity constraints on the family. Given a kernel, the proposed algorithm explores the space of feasible clustered datapaths and returns: a datapath configuration; a binding and scheduling for the operations; and a corresponding estimate for the best achievable latency over the specified family. Moreover, we show how the parameters m and n, as well as a target latency optionally specified by the designer, can be used to effectively explore trade-offs among delay, power/energy, and latency. Extensive empirical evidence is provided showing that the proposed approach is strikingly effective at attacking this complex optimization problem.

Margarida F. Jacome, Gustavo de Veciana, Viktor Lapinskii

### Logic Synthesis Overview

The dream of generating efficient logic implementations from higher level specifications originated with the works of Boole,[57] Shannon [58], Quine and McCluskey[35, 36]. Interest in logic synthesis grew during the 60’s and 70’s, as the computers being designed became more complex. Although many theoretical advances were made, the first examples of practical synthesis did not occur until the later 70’s. Programmable logic arrays, PLAs, were minimized with the program, MINI, [15] and used on many product chips in IBM. LSS[10, 9] was the first example of production synthesis of gate array chips. It was based on local transformations and compiler techniques for optimizing logic, mapping gates to a specific technology. It was used on hundreds of product chips in IBM and was rewritten later with many significant refinements as BooleDozer[11]. These optimization methods were also used in the first offerings from Synop-sys[12, 13], a company formed in 1986 (originally Optimal Solutions) to market synthesis technology developed at General Electric. Synopsys succeeded in bringing synthesis to the commercial market enabling a dramatic advance in design productivity. The years that followed saw many important developments that produced improvements in execution speed, quality of results and ability to deal with real technologies. Today, logic synthesis is a critical part of almost all chip development projects.

Robert K. Brayton, John A. Darringer

### Multiple-Level Logic Optimization System

MIS is a multi-level logic synthesis and minimization system and is an integral part of the Berkeley Synthesis Project. MIS starts from a description of a combinational logic macro-cell and produces an optimized set of logic equations which preserves the input-output behavior of the macro-cell. The system includes algorithms for minimizing the area required to implement the logic equations, and a global timing optimization step which is used to change the form of the logic equations along the critical path in order to meet system-level timing constraints. This paper provides an overview of the optimization system including the input language, the algorithms which minimize the area of the implementation, and the algorithms used to re-structure the logic network to meet the system-level timing constraints. Although the system is still under development, pieces of an industrially designed chip have been re-synthesized with MIS and the results compare favorably with the manual designs.

R. Brayton, E. Detjens, N. Phillips, S. Krishna, T. Ma, P. McGeer, L. Pei, N. Phillips, R. Rudell, R. Segal, A. Wang, R. Yung, A. Sangiovanni-Vincentelli

### Exact Minimization of Multiple-Valued Functions for PLA Optimization

We present an algorithm for determining the minimum representation of an incompletely-specified, multiple-valued input, binary-valued output, function. The overall strategy is similar to the well-known Quine-McCluskey algorithm; however, the techniques used to solve each step are new. The advantages of the algorithm include a fast technique for detecting and eliminating from further consideration the essential prime implicants and the totally redundant prime implicants, and a fast technique for generating a reduced form of the prime implicant table. The minimum cover problem is solved with a branch and bound algorithm using a maximal independent set heuristic to control the selection of a branching variable and the bounding. Using this algorithm, we have derived minimum representations for several mathematical functions whose unsuccessful exact minimization has been previously reported in the literature. The exact algorithm has been used to determine the efficiency and solution quality provided by the heuristic minimize Espresso-MV [11] Also, a detailed comparison with McBoole [2] shows that the algorithm presented here is able to solve a larger percentage of the problems from a set of industrial examples within a fixed allocation of computer resources.

R. Rudell, A. Sangiovanni-Vincentelli

### Improved Logic Optimization Using Global-Flow Analysis

This paper is concerned with techniques for automatically reducing circuit size and improving testability. In an earlier paper [2], we introuced a new method for circuit optimization based on ideas of global-flow analysis. In this paper, we describe two extensions to the method. The first is a basic improvement in the primary result on which the earlier optimization was based, the second extends the applicability of the method to “conditional” optimizations as well. Together, these enhancements result in improved performance for the original algorithm, as well as the ability to handle designer-specified “don’t cares” and redundancy removal uniformly in the framework of a graph-based synthesis system, such as LSS[8].

C. Leonard Berman, Louise H. Trevillyan

### A Method for Concurrent Decomposition and Factorization of Boolean Expressions

This paper describes efficient algorithms for decomposition and factorization of Boolean expressions. The method uses only two-literal single-cube divisors and double-cube divisors considered concurrently with their complementary expressions. We demonstrate that these objects, despite their simplicity, provide a very good framework to reason about common algebraic divisors and the duality relations between expressions. The algorithm has been implemented and excellent results on several benchmark circuits illustrate its efficiency and effectiveness.

### An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs

In this paper we present a polynomial time technology mapping algorithm, called Flow-Map, that optimally solves the LUT based FPGA technology mapping problem for depth minimization for general Boolean networks. This theoretical breakthrough makes a sharp contrast with the fact that conventional technology mapping problem in library based designs is NP-hard. A key step in Flow-Map is to compute a minimum height K-feasible cut in a network, solved by network flow computation. Our algorithm also effectively minimizes the number of LUTs by maximizing the volume of each cut and by several postprocessing operations. We tested the Flow-Map algorithm on a set of benchmarks and achieved reductions on both the network depth and the number of LUTs in mapping solutions as compared with previous algorithms.

Jason Cong, Yuzheng Ding

### Logic Decomposition during Technology Mapping

A problem in technology mapping is that quality of the final implementation depends significantly on the initially provided circuit structure. To resolve this problem, conventional techniques iteratively but separately apply technology independent transformations and technology mapping.In this paper, we propose a procedure which performs logic decomposition and technology mapping simultaneously. We show that the procedure effectively explores all possible algebraic decompositions. It finds an optimal tree implementation over all the circuit structures examined, while the run time is typically logarithmic in the number of decompositions.

Eric Lehman, Yosinori Watanabe, Joel Grodstein, Heather Harkness

### Highlights in Analog and Digital Circuit Design and Synthesis at ICCAD

When ICCAD began in 1983, we had no robust tools for device modelling, analog circuit synthesis, electrical timing simulation, transistor-to-logic abstraction, or large-scale custom circuit tuning. Today, all these techniques are in common industrial usage, most commercially available. The seven papers in this section fundamentally transformed the way in which we model, manipulate, and solve these difficult tasks today. They are linked by a common thread of “deep circuits innovation”.

Ramesh Harjani, Philippe Magarshack, Gerard Mas, Rob A. Rutenbar

### An Interactive Device Characterization and Model Development System

A computer aided design system for integrated circuits is not complete without accurate circuit simulation capabilities. Accurate simulation is not possible without accurate models and model parameters. TECAP2 (Transistor Electrical Characterization and Analysis Program) is a program that greatly simplifies the development of models, the extraction of model parameters from measured data, and the comparison of the effectiveness of various models.

Ebrahim Khalily, Peter H. Decher, Darrell A. Teegarden

### TILOS: A Posynomial Programming Approach to Transistor Sizing

A new transistor sizing algorithm, which couples synchronous timing analysis with convex optimization techniques, is presented. Let A be the sum of transistor sizes, T the longest delay through the circuit, and K a positive constant. Using a distributed RC model, each of the following three programs is shown to be convex: 1) Minimize A subject to T < K. 2) Minimize T subject to A < K. 3) Minimize ATK. The convex equations describing T are a particular class of functions called posynomials. Convex programs have many pleasant properties, and chief among these is the fact that any point found to be locally optimal is certain to be globally optimal TILOS (Timed Logic Synthesizer) is a program that sizes transistors in CMOS circuits. Preliminary results of TILOS’s transistor sizing algorithm are presented.

J. P. Fishburn, A. E. Dunlop

### SPECS2: An Integrated Circuit Timing Simulator

SPECS2 is a prototype implementation of a new timing simulation and modeling methodology. SPECS2 (Simulation Program for Electronic Circuits and Systems 2) is a tree/link based, event-driven, timing simulator. A modeling technique, which is predicated on the conservation of charge and energy, is employed to produce table models for device evaluation. The tables may be constructed to model devices at any desired level of detail. Thus, SPECS2 is a variable accuracy simulator. Grossly differing accuracy requirements may be specified for different runs and also mixed over different parts of the same circuit. SPECS2 implements a novel oscillation detection and suppression scheme that prevents algorithmic oscillation, while leaving real circuit results undistorted. SPECS2 takes advantage of the tree/link formulation of the circuit equations to provide a formal and general approach to timing simulation. It encounters no special problems with floating capacitors or transmission gates. Further, SPECS2 provides the framework for a generalized macromodeling and simulation capability.

Chandu Visweswariah, Ronald A. Rohrer

### Automatic Synthesis of Operational Amplifiers based on Analytic Circuit Models

An automatic synthesis tool for CMOS op amps (OPASYN) has been developed. The program starts from one of a number of op amp circuits and proceeds to optimize various device sizes and bias currents to meet a given set of design specifications. Because it uses analytic circuit models in its inner optimization loop, it can search efficiently through a large part of the possible solution space. The program has a SPICE interface that automatically performs circuit simulations for the candidate solutions to verify the results of the synthesis and optimization procedure. The simulation results are also used to fine-tune the analytic circuit descriptions in the database. OPASYN has been implemented in Franz Lisp and demonstrated for three different basic circuits with a conventional 3 µm process and a more advanced 1.5 µm process. Experiments have shown that OPASYN quickly produces practical designs which will meet reasonable design objectives.

Han Young Koh, Carlo H. Séquin, Paul R. Gray

### Analog Circuit Synthesis for Performance in OASYS

This paper describes mechanisms needed to meet aggressive performance demands in a hierarchically-structured analog circuit synthesis tool. Experiences with adding a high-speed comparator design style to the OASYS synthesis tool are discussed. It is argued that design iteration — the process of making a heuristic design choice, following it through to possible failure, then diagnosing the failure and modifying the overall plan of attack for the synthesis —is essential to meet stringent performance demands. Examples of high-speed comparators automatically synthesized by OASYS are presented. Designs competitive in quality with manual expert designs, e.g., with response time of 6 ns and input drive of 1 mV, can be synthesized in under 5 seconds on a workstation.1

Ramesh Harjani, Rob A. Rutenbar, L. Richard Carley

### Extraction of Gate-Level Models from Transistor Circuits by Four-Valued Symbolic Analysis

The program TRANALYZE generates a gate-level representation of an MOS transistor circuit. The resulting model contains only four-valued unit and zero delay logic primitives, suitable for evaluation by conventional gate-level simulators and hardware simulation accelerators. TRANA-LYZE has the same generality and accuracy as switch-level simulation, generating models for a wide range of technologies and design styles, while expressing the detailed effects of bidirectional transistors, stored charge, and multiple signal strengths. It produces models with size comparable to ones generated by hand.

Randal E. Bryant

### Optimization of Custom MOS Circuits by Transistor Sizing

Optimization of a circuit by transistor sizing is often a slow, tedious and iterative manual process which relies on designer intuition. Circuit simulation is carried out in the inner loop of this tuning procedure. Automating the transistor sizing process is an important step towards being able to rapidly design high-performance, custom circuits. Jiffy Tune is a new circuit optimization tool that automates the tuning task. Delay, rise/fall time, area and power targets are accommodated. Each (weighted) target can be either a constraint or an objective function. Minimax optimization is supported. Transistors can be ratioed and similar structures grouped to ensure regular layouts. Bounds on transistor widths are supported.Jiffy Tune uses LANCELOT, a large-scale nonlinear optimization package with an augmented Lagrangian formulation. Simple bounds are handled explicitly and trust region methods are applied to minimize a composite objective function. In the inner loop of the optimization, the fast circuit simulator SPECS is used to evaluate the circuit. SPECS is unique in its ability to efficiently provide time-domain sensitivities, thereby enabling gradient-based optimization. Both the adjoint and direct methods of sensitivity computation have been implemented in SPECS.To assist the user, interfaces in the Cadence and SLED design systems have been constructed. These interfaces automate the specification of the optimization task, the running of the optimizer and the back-annotation of the results on to the circuit schematic.Jiffy Tune has been used to tune over 100 circuits for a custom, high-performance microprocessor that makes use of dynamic logic circuits. Circuits with over 250 tunable transistors have been successfully optimized. Automatic circuit tuning has been found to facilitate design re-use. The designers’ focus shifts from solving the optimization problem to specifying it correctly and completely. This paper describes the algorithms of Jiffy Tune, the environment in which it is used and presents a case study of the application of Jiffy Tune to individual circuits of the microprocessor.

Andrew R. Conn, Paula K. Coulman, Ruud A. Haring, Gregory L. Morrill, Chandu Visweswariah

### Highlights in Physical Simulation and Analysis at ICCAD

Six papers were chosen to represent twenty years of research in physical simulation and analysis, three papers addressing the problem of extracting and simulating interconnect effects and three papers describing techniques for simulating steady-state and noise behavior in RF circuits. In this commentary paper we will try to describe the contribution of each paper and place that contribution in some historical context.

Kenneth S. Kundert, Jacob White

### Nonlinear Circuit Simulation in the Frequency-Domain

Simulation in the frequency-domain avoids many of the severe problems experienced when trying to use traditional time-domain simulators such as Spice [1] to find the steady-state behavior of analog, RF, and microwave circuits. In particular, frequency-domain simulation eliminates problems from distributed components and high-Q circuits by forgoing a nonlinear differential equation representation of the circuit in favor of a complex algebraic representation.This paper describes the spectral Newton technique for performing simulation of nonlinear circuits in the frequency-domain, and its implementation in Harmonica. Also described are the techniques used by Harmonica to exploit both the structure of the spectral Newton formulation and the characteristics of the circuits that would be typically seen by this type of simulator. These techniques allow Harmonica to be used on much larger circuits than were normally attempted by previous nonlinear frequency-domain simulators, making it suitable for use on Monolithic Microwave Integrated Circuits (MMICs).

Kenneth S. Kundert, Alberto Sangiovanni-Vincentelli

### Modeling the Driving-Point Characteristic of Resistive Interconnect for Accurate Delay Estimation

In recent years, on-chip interconnect has had an increasingly important impact on overall system performance. Much work has been done to develop algorithms which can efficiently and accurately predict delay through on-chip interconnect. These algorithms compute a reduced-order approximation (usually based on the “Elmore delay”) for voltage-transfer ratios (from source to loads) in an RC-tree model for the interconnect. However, much less emphasis has been placed on accurately approximating the driving-point characteristic at the root of an RC-tree. A good driving-point approximation is needed to accurately predict how delay through a gate is influenced by the interconnect which that gate must drive. Macromodels for on-chip gates typically consider only total capacitance of the driven interconnect, completely ignoring series resistance. In this paper, we present an efficient algorithm which accounts for series resistance by computing a reduced-order approximation for the driving-point admittance of an RC-tree. Using an ECL clock buffer as an example, we demonstrate a significant improvement in accuracy.

Peter R. O’Brien, Thomas L. Savarino

### Efficient Techniques for Inductance Extraction of Complex 3-D Geometries

In this paper we describe combining a mesh analysis equation formulation technique with a preconditioned GMRES matrix solution algorithm to accelerate the determination of inductances of complex three-dimensional structures. Results from FASTHENRY, our 3-D inductance extraction program, demonstrate that the method is more than an order of magnitude faster than the standard solution techniques for large problems.

M. Kamon, M. J. Tsuk, C. Smithhisler, J. White

### Time-Domain Non-Monte Carlo Noise Simulation for Nonlinear Dynamic Circuits with Arbitrary Excitations

A new, time-domain, non-Monte Carlo method for computer simulation of electrical noise in nonlinear dynamic circuits with arbitrary excitations is presented. This time-domain noise simulation method is based on the results from the theory of stochastic differential equations. The noise simulation method is general in the sense that any nonlinear dynamic circuit with any kind of excitation, which can be simulated by the transient analysis routine in a circuit simulator, can be simulated by our noise simulator in time-domain to produce the noise variances and covariances of circuit variables as a function of time, provided that noise models for the devices in the circuit are available. Noise correlations between circuit variables at different time points can also be calculated. Previous work on computer simulation of noise in integrated circuits is reviewed with comparisons to our method. Shot, thermal and flicker noise models for integrated-circuit devices, in the context of our time-domain noise simulation method, are described. The implementation of this noise simulation method in a circuit simulator (SPICE) is described. Two examples of noise simulation (a CMOS ring-oscillator and a BJT active mixer) are given.

Alper Demir, Edward W. Y. Liu, Alberto L. Sangiovanni-Vincentelli

### PRIMA: Passive Reduced-Order Interconnect Macromodeling Algorithm

This paper describes PRIMA, an algorithm for generating provably passive reduced order N-port models for RCL interconnect circuits. It is demonstrated that, in addition to requiring macro- model stability, macromodel passivity is needed to guarantee the overall circuit stability once the active and passive driver/load models are connected. PRIMA extends the block Arnoldi technique to include guaranteed passivity. Moreover, it is empirically observed that the accuracy is superior to existing block Arnoldi methods. While the same passivity extension is not possible for MPVL, we observed comparable accuracy in the frequency domain for all examples considered. Additionally, a path tracing algorithm is used to calculate the reduced order macromodel with the utmost efficiency for generalized RLC interconnects.

Altan Odabasioglu, Mustafa Celik, Lawrence T. Pileggi

### Circuit Noise Evaluation by Padé Approximation Based Model-Reduction Techniques

This paper introduces a new circuit noise analysis and modeling method. The noise analysis method computes an analytic expression of frequency, in rational form, which represents the Pad6 approximation of the noise power spectral density. The approximation can be carried out efficiently, to the required accuracy, using a variant of the PVL [1] or MPVL [2] algorithms. The new method is significantly more efficient than traditional methods for noise computation at numerous frequency points. In addition, it allows for a compact and cascadable modeling of noise that can be used in system-level simulations.

Peter Feldmann, Roland W. Freund

### Physical Design Overview

Since the early 70’s, physical design has been a crucial part of chip design. Its origin came from the need for obtaining optimum electronic packaging, which involves the placement of components and modules on boards and interconnecting the pins on modules. Most early approaches were by and large empirical and ad. hoc. One major exception is perhaps Lee’s maze router [8], which has proven to be a powerful computational tool for routing. In the late 70’s more analytical tools for layout began to evolve both from industry and the universities. The field of physical design gradually attracted sizeable research interests.

Ernest S. Kuh, Hsu Chi-Ping

### Floorplan Design Using Annealing

An application of simulated annealing to floorplan design or macro placement is described. It uses a minimum size configuration space without invalid solutions, a constant object function, an adaptive control schedule, and an indicator for proper convergence. Fast convergence and improved flexibility are the salient features of this approach.

Ralph H. J. M. Otten, Lukas P. P. P. van Ginneken

### GOALIE: A Space-Efficient System for VLSI Artwork Analysis

This paper deals with the algorithmic foundations of the GOALIE artwork analysis system. GOALIE includes programs for boolean geometric operations, connectivity analysis, transistor extraction and measurement, circuit extraction, parasitic capacitance measurement, and design rule checking. One of our major results is showing how the expected main memory requirement for all these tasks can be limited to $$O(\sqrt {n} )$$ where n is the number of edges in the artwork, while still running at least as fast as previously published algorithms. GOALIE can therefore handle large layouts on small computers, or even on personal workstations.

Thomas G. Szymanski, Christopher J. Van Wyk

### Gordian: A New Global Optimization/ Rectangle Dissection Method for Cell Placement

A new placement method for cell-based layout styles is presented. It is composed of alternating and interacting global optimization and partitioning phases. In contrast to other methods using the divide-and-conquer paradigm, it maintains the simultaneous treatment of all cells during optimization over all levels of partitioning. In the global optimization phases, constrained quadratic optimization problems are solved having unique global minima. Their solutions induce the assignment of cells to regions during the partitioning phases. For general-cell circuits, a highly efficient exhaustive slicing procedure is applied to small subsets of cells. The designer may choose a configuration from a menu to meet his requirements on chip area, chip aspect ratio and wire length. Placements with high area utilization are obtained within low computation times. The method has been applied to general-cell and standard-cell circuits with up to 3000 cells and nets.

Jürgen M. Kleinhans, Georg Sigl, Frank M. Johannes

### Exact Zero Skew

An exact zero skew clock routing algorithm using Elmore delay model is presented. Recursively in a bottom-up fashion, two zero-skewed subtrees are merged into a new tree with zero skew. The algorithm can be applied to single-staged clock trees, multi-staged clock trees, and multi-chip system clock trees. It is ideal for hierarchical methods of constructing large systems. All subsystems can be constructed in parallel and independently, then interconnected with exact zero skew method.

Ren-Song Tsay

### Efficient Network Flow Based Min-Cut Balanced Partitioning

We consider the problem of bipartitioning a circuit into two balanced components that minimizes the number of crossing nets. Previously, Kernighan and Lin type (K&L) heuristics, simulated annealing approach, and analytical methods were given to solve the problem. However, network flow techniques were overlooked as viable approaches to min-cut balanced bipartition due to their high complexity. In this paper we propose a balanced bipartition heuristic based on repeated max-flow min-cut techniques, and give an efficient implementation that has the same asymptotic time complexity as that of one max-flow computation. We implemented our heuristic algorithm in a package called FBB. The experimental results demonstrate that FBB outperforms K&L heuristics and analytical methods in terms of the number of crossing nets, and our efficient implementation makes it possible to partition large circuit netlists with reasonable runtime. For example, the average elapsed time for bipartitioning a circuit S35932 of almost 20K gates is less than 20 minutes on a SPARC10 with 32MB memory.

Honghua Hannah Yang, D. F. Wong

### Rectangle-Packing-Based Module Placement

The first and the most critical stage in VLSI layout design is the placement, the background of which is the rectangle packing problem: Given many rectangular modules of arbitrary size, place them without overlapping on a layer in the smallest bounding rectangle. Since the variety of the packing is infinitely many (two-dimensionally continuous), the key issue for successful optimization is in the introduction of a P-admissible solution space, which is a finite set of solutions at least one of which is optimal. This paper proposes such a solution space where each packing is represented by a pair of module name sequences. Searching this space by simulated annealing, hundreds of modules could be successfully packed as demonstrated. Combining a conventional wiring method, the biggest MCNC benchmark ami49 is challenged.

Hiroshi Murata, Kunihiro Fujiyoshi, Shigetoshi Nakatake, Yoji Kajitani

### Timing, Test and Manufacturing Overview

Timing analysis is concerned with estimating and optimizing the performance of integrated circuits. It encompasses a wide range of activities including physical modeling of transistors and interconnect wires, derivation of analytical and empirical gate and wire delay models, accurate estimation of long- and short-path delays through combinational logic, detection of setup and hold violations in sequential circuits, as well as a variety of combinational and sequential circuit transformations aimed at maximizing operation speed. The three papers reviewed here represent particularly significant contributions to the field of timing analysis and optimization: Brand and Iyengar [4] built the foundation for the field of false-path analysis; Szymanski and Shenoy [33] had the last word on the field of timing verification of latch-based circuits; and Shenoy and Rudell [29] are credited with making retiming viable for industrial-sized circuits.

Karem A. Sakallah, Duncan M. Walker, Sani R. Nassif

### A Methodology for Worst Case Design of Integrated Circuits

This paper presents a formal approach to the worst-case design of Integrated Circuits, yielding realistic estimates of variations in device performances. The worst-case analysis is performed in terms of statistically independent process disturbances and employs the statistical process simulator, FABRICS II.

A. J. Strojwas, S. R. Nassif, S. W. Director

### Timing Analysis using Functional Relationships

The usual block oriented timing analysis for logic circuits does not take into account functional relations between signals. If we take functional relations into consideration we may find that a long path is never activated. This observation can be used to calculate improved and more accurate delays. It is not practical to consider the complete truth table with all the relationships between signals. We use a procedure that considers only a subset of the relationships between signals and checks for non-functional paths. The delay calculation then takes into account the discovered non-functional paths to determine less pessimistic delays through the logic.

Daniel Brand, Vijay S. Iyengar

### On the Design of Robust Multiple Fault Testable CMOS Combinational Logic Circuits

It is known that circuit delays and timing skews in input changes influence choice of tests to detect delay faults. Tests for stuck-open faults in CMOS logic circuits could also be invalidated by circuit delays and timing skews in input changes. Tests that detect modeled faults independent of the delays in the circuit under test are called robust tests. An integrated approach to the design of combinational logic circuits in which all single stuck-open faults and path delay faults are detectable by robust tests was presented by the authors earlier. This paper considers design of CMOS combinational logic circuits in which all multiple stuck-at, stuck-open and all multiple path delay faults are robustly testable.

Sandip Kundu, Sudhakar M. Reddy, Niraj K. Jha

### Circuit Optimization Driven by Worst-Case Distances

In this paper, a new method for circuit optimization in face of manufacturing process variations is presented. It is based on the characterization of the feasible design space by worst-case points and related gradients. The expense for this characterization is linear with the number of circuit performances. On the contrary, the complexity of other geometric approaches to tolerance oriented circuit designs increases exponentially with the dimension of the design space. A deterministic optimization procedure based on the so-called “worst-case distances” will be introduced, combining nominal and tolerance design in a single design objective. The entire optimization process with regard to performance, yield, and robustness uses sensitivity analyses and requires a much smaller number of simulations than the Monte Carlo based approaches. Moreover, the new method takes account of the partitioning of the parameter space into deterministic and statistical parameter, which is an inherent property of integrated circuit design.

Kurt J. Antreich, Helmut E. Graeb

### Verifying Clock Schedules

Many recent papers have formulated both timing verification and optimization as mathematical programming problems. Such formulations correctly handle level-sensitive latches, long and short path considerations, and sophisticated multi-phase clocking schemes.This paper deals with the computational aspects of using such a formulation for verifying clock schedules. We show that the formulation can have multiple solutions, and that these extraneous solutions can cause previously published algorithms to produce incorrect or misleading results. We characterize the conditions under which multiple solutions exist, and show that even when the solution is unique, the running times of these previous algorithms can be unbounded. By contrast, we exhibit a simple polynomial time algorithm for clock schedule verification. The algorithm was implemented and used to check the timing of all the circuits in the ISCAS-89 suite. Observed running times are linear in circuit size and quite practical.

Thomas G. Szymanski, Narendra Shenoy

### Efficient Implementation of Retiming

Retiming is a technique for optimizing sequential circuits. It repositions the registers in a circuit leaving the combinational cells untouched. The objective of retiming is to find a circuit with the minimum number of registers for a specified clock period. More than ten years have elapsed since Leiserson and Saxe first presented a theoretical formulation to solve this problem for single-clock edge-triggered sequential circuits. Their proposed algorithms have polynomial complexity; how- ever naive implementations of these algorithms exhibit O(n3) time complexity and O(n2) space complexity when applied to digital circuits with n combinational cells. This renders retiming ineffective for circuits with more than 500 combinational cells. This paper addresses the implementation issues required to exploit the sparsity of circuit graphs to allow min-period retiming and constrained min-area retiming to be applied to circuits with as many as 10,000 combinational cells. We believe this is the first paper to address these issues and the first to report retiming results for large circuits.

Narendra Shenoy, Richard Rudell

### Industry Viewpoints

#### Frontmatter

ICCAD is the premier technical conference for CAD tools for ICs, and Cadence is the largest commercial vendor of such tools. Therefore it is not surprising that the two have many connections. Research from Cadence has been presented almost every year at ICCAD, and Cadence contributes heavily to other facets of the conference such as tutorials and panels. There is by no means a uni-directional flow of information, though — considerable research first reported at ICCAD has made its way into commercial products sold by Cadence. This paper covers some of the contributions to and from Cadence in the areas of timing, circuit simulation, layout analysis, physical design, logic verification, synthesis, and place and route.

Louis K. Scheffer

Hiromu Hayashi

Over the last twenty years, ICCAD has been a major source of innovative ideas and valuable technical interactions for IBM, as well as a showcase for the many IBM advances in design automation. The quality of papers and presentations has been unparalleled, as is exemplified by the selected papers in this book. As a company that depends on advanced design automation tools for its products, and as an innovator in the design automation arena [7], IBM congratulates ICCAD on its 20th Anniversary and applauds the authors, tutorial presenters, and program and executive committees that have made ICCAD the best technical conference in design automation.

John A. Darringer, Leon Stok, Louise H. Trevillyan

Magma Design Automation, founded in 1997, sells a complete IC design system (from RTL to GDSII in one tool) under the name Blast Fusion™. This tool contains RTL HDL (VHDL or Verilog) entry with RTL synthesis and data path generation, floorplanning, netlist level optimization, placement and routing, clock optimization, as well as a multitude of analysis engines. Among these the most important are: an incremental static timing analyzer, parasitic extraction, crosstalk noise analysis, power analysis, rail (voltage drop) analysis.

Michel Berkelaar

### Designers Face Critical Challenges and Discontinuities of Analog/Mixed Signal Design and Physical Verification

Mentor Graphics Corporation is recognized as a mainstay in the electronic de- sign automation (EDA) industry, having developed leading-edge products that enable the design of electronic products for more than 20 years. Mentor’s break- through research in EDA ranges from high-speed board design to resolution enhancement technologies for subwavelength manufacturability.

Walden C. Rhines

### NEC and ICCAD — EDA Partners in Success

NEC is a premier semiconductor company with a tradition of strong internal technology and innovation in EDA. NEC has been an active contributor to developments in the EDA community for more than 20 years. In particular, NEC researchers have contributed significantly as organizers, reviewers and program committee members at ICCAD. Notably, Dr. Satoshi Goto of NEC was the Program Chair and General Chair of ICCAD in 1990 and 1991, respectively. NEC has pro-actively developed in-house tools, published papers in premier EDA conferences and journals, and assisted EDA vendors with technology and funds to develop EDA tools. Like other companies with semiconductor operations, NEC has, also been a beneficiary of innovations showcased at premier international technical conferences like ICCAD. These innovations have consistently fuelled semiconductor design methodologies world wide across diverse industry segments - whether at the system level (personal computers, hand-held devices etc.) or at the device level (processors, memory, logic, MEMS, RF devices etc.).

P. Ashar, S. Chakradhar, A. Gupta, J. Henkel, A. Raghunathan, K. Wakabayashi

### The Strong Mutual Impact between Philips Research and ICCAD

We briefly discuss the influence of the International Conference on Computer-Aided Design (ICCAD) on the research program carried out at Philips Research over the past twenty years. We highlight some of the developments in the areas of simulated annealing, media-processor design, logic synthesis, and formal verification.

Emile Aarts, Frans Theeuwen

### Contributions from the “Best of ICCAD” to Synopsys

This paper highlights some examples of contributions from the selected ICCAD papers to Synopsys. We do not attempt to be exhaustive. Given the breadth of the topics addressed at ICCAD and the number of products offered by Synopsys this would be very difficult. We also try not to make any value judgments on the importance of a topic: In terms of (current) economic relevance markets speak for themselves, predicting future relevance accurately is impossible, and judgment on the technical relevance is subjective. So the given examples merely reflect the papers and products the authors are most familiar with. We hope however, that they do illustrate what it takes to create practical design technology (tools) starting from outstanding ideas.

Raul Camposano, Ahsan Bootehsaz, Debashis Chowdhury, Brent Gregory, Jim Kukula, Narendra Shenoy, Tom Williams