Design Technology Co-Optimization in the Era of Sub-Resolution IC Scaling

Lars W. Liebmann Kaushik Vaidyanathan Lawrence Pileggi

Tutorial Texts Series

Design Technology Co-Optimization in the Era of Sub-Resolution IC Scaling, Lars W. Liebmann, Kaushik C. Vaidyanathan, and Lawrence Pileggi, Vol. TT104 Special Functions for Optical Science and Engineering, Vasudevan Lakshminarayanan and Srinivasa Varadharajan, Vol. TT103 Discrimination of Subsurface Unexploded Ordnance, Kevin O’Neill, Vol. TT102 Introduction to Metrology Applications in IC Manufacturing, Bo Su, Eric Solecky, and Alok Vaid, Vol. TT101 Introduction to Liquid Crystals for Optical Design and Engineering, Sergio Restaino and Scott Teare, Vol. TT100 Design and Implementation of Autostereoscopic Displays, Byoungho Lee, Soon-gi Park, Keehoon Hong, and Jisoo Hong, Vol. TT99 Ocean Sensing and Monitoring: Optics and Other Methods, Weilin Hou, Vol. TT98 Digital Converters for Image Sensors, Kenton T. Veeder, Vol. TT97 Laser Beam Quality Metrics, T. Sean Ross, Vol. TT96 Military Displays: Technology and Applications, Daniel D. Desjardins, Vol. TT95 Interferometry for Precision Measurement, Peter Langenbeck, Vol. TT94 Aberration Theory Made Simple, Second Edition, Virendra N. Mahajan, Vol. TT93 Modeling the Imaging Chain of Digital Cameras, Robert D. Fiete, Vol. TT92 Bioluminescence and Fluorescence for In Vivo Imaging, Lubov Brovko, Vol. TT91 Polarization of Light with Applications in Optical Fibers, Arun Kumar and Ajoy Ghatak, Vol. TT90 Digital Fourier Optics: A MATLAB Tutorial, David G. Voeltz, Vol. TT89 Optical Design of Microscopes, George Seward, Vol. TT88 Analysis and Evaluation of Sampled Imaging Systems, Richard H. Vollmerhausen, Donald A. Reago, and Ronald Driggers, Vol. TT87 Nanotechnology: A Crash Course, Raúl J. Martin-Palma and Akhlesh Lakhtakia, Vol. TT86 Direct Detection LADAR Systems, Richard Richmond and Stephen Cain, Vol. TT85 Optical Design: Applying the Fundamentals, Max J. Riedl, Vol. TT84 Infrared Optics and Zoom Lenses, Second Edition, Allen Mann, Vol. TT83 Optical Engineering Fundamentals, Second Edition, Bruce H. Walker, Vol. TT82 Fundamentals of Polarimetric Remote Sensing, John Schott, Vol. TT81 The Design of Plastic Optical Systems, Michael P. Schaub, Vol. TT80 Fundamentals of Photonics, Chandra Roychoudhuri, Vol. TT79 Radiation Thermometry: Fundamentals and Applications in the Petrochemical Industry, Peter Saunders, Vol. TT78 Matrix Methods for Optical Layout, Gerhard Kloos, Vol. TT77 Fundamentals of Infrared Detector Materials, Michael A. Kinch, Vol. TT76 Practical Applications of Infrared Thermal Sensing and Imaging Equipment, Third Edition, Herbert Kaplan, Vol. TT75 Bioluminescence for Food and Environmental Microbiological Safety, Lubov Brovko, Vol. TT74 Introduction to Image Stabilization, Scott W. Teare and Sergio R. Restaino, Vol. TT73 Logic-based Nonlinear Image Processing, Stephen Marshall, Vol. TT72 The Physics and Engineering of Solid State Lasers, Yehoshua Kalisky, Vol. TT71 Thermal Infrared Characterization of Ground Targets and Backgrounds, Second Edition, Pieter A. Jacobs, Vol. TT70 Introduction to Confocal Fluorescence Microscopy, Michiel Müller, Vol. TT69 Artificial Neural Networks: An Introduction, Kevin L. Priddy and Paul E. Keller, Vol. TT68 Basics of Code Division Multiple Access (CDMA), Raghuveer Rao and Sohail Dianat, Vol. TT67 Optical Imaging in Projection Microlithography, Alfred Kwok-Kit Wong, Vol. TT66 Metrics for High-Quality Specular Surfaces, Lionel R. Baker, Vol. TT65 Field Mathematics for Electromagnetics, Photonics, and Materials Science, Bernard Maxum, Vol. 
TT64 (For a complete list of Tutorial Texts, see http://spie.org/tt.)

Design Technology Co-Optimization in the Era of Sub-Resolution IC Scaling

Lars W. Liebmann Kaushik Vaidyanathan Lawrence Pileggi

Tutorial Texts in Optical Engineering Volume TT104

SPIE PRESS Bellingham, Washington USA

Library of Congress Cataloging-in-Publication Data Names: Liebmann, Lars W., author. | Vaidyanathan, Kaushik, author. | Pileggi, Lawrence, 1962- author. Title: Design technology co-optimization in the era of sub-resolution IC scaling / Lars W. Liebmann, Kaushik Vaidyanathan, and Lawrence Pileggi. Other titles: Tutorial texts in optical engineering ; v. TT 104. Description: Bellingham, Washington : SPIE, [2016] | © 2016 | Series: Tutorial texts in optical engineering ; volume TT104 | Includes bibliographical references and index. Identifiers: LCCN 2015032506| ISBN 9781628419054 (alk. paper) | ISBN 1628419059 (alk. paper) Subjects: LCSH: Integrated circuits—Design and construction. | Lithography, Electron beam. Classification: LCC TK7874.L455 2015 | DDC 621.3815—dc23 LC record available at http://lccn.loc.gov/2015032506

Published by SPIE P.O. Box 10 Bellingham, Washington 98227-0010 USA Phone: +1 360.676.3290 Fax: +1 360.647.1445 Email: [email protected] Web: http://spie.org

Copyright © 2016 Society of Photo-Optical Instrumentation Engineers (SPIE) All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means without written permission of the publisher. The content of this book reflects the work and thought of the authors. Every effort has been made to publish reliable and accurate information herein, but the publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon. Printed in the United States of America. First Printing.

We dedicate this book to all of the challenges that have made this journey interesting and to all of the outstanding colleagues who have kept semiconductor scaling going far beyond anyone’s expectations.

Introduction to the Series

Since its inception in 1989, the Tutorial Texts (TT) series has grown to cover many diverse fields of science and engineering. The initial idea for the series was to make material presented in SPIE short courses available to those who could not attend and to provide a reference text for those who could. Thus, many of the texts in this series are generated by augmenting course notes with descriptive text that further illuminates the subject. In this way, the TT becomes an excellent stand-alone reference that finds a much wider audience than only short course attendees.

Tutorial Texts have grown in popularity and in the scope of material covered since 1989. They no longer necessarily stem from short courses; rather, they are often generated independently by experts in the field. They are popular because they provide a ready reference to those wishing to learn about emerging technologies or the latest information within their field. The topics within the series have grown from the initial areas of geometrical optics, optical detectors, and image processing to include the emerging fields of nanotechnology, biomedical optics, fiber optics, and laser technologies. Authors contributing to the TT series are instructed to provide introductory material so that those new to the field may use the book as a starting point to get a basic grasp of the material. It is hoped that some readers may develop sufficient interest to take a short course by the author or pursue further research in more advanced books to delve deeper into the subject.

The books in this series are distinguished from other technical monographs and textbooks in the way in which the material is presented. In keeping with the tutorial nature of the series, there is an emphasis on the use of graphical and illustrative material to better elucidate basic and advanced concepts. There is also heavy use of tabular reference data and numerous examples to further explain the concepts presented. The publishing time for the books is kept to a minimum so that the books will be as timely and up-to-date as possible. Furthermore, these introductory books are competitively priced compared to more traditional books on the same subject.

When a proposal for a text is received, each proposal is evaluated to determine the relevance of the proposed topic. This initial reviewing process has been very helpful to authors in identifying, early in the writing process, the need for additional material or other changes in approach that would serve to strengthen the text. Once a manuscript is completed, it is peer reviewed to ensure that chapters communicate accurately the essential ingredients of the science and technologies under discussion.

It is my goal to maintain the style and quality of books in the series and to further expand the topic areas to include new emerging fields as they become of interest to our reading audience.

James A. Harrington
Rutgers University

Table of Contents

Preface
List of Acronyms and Abbreviations

1 The Escalating Design Complexity of Sub-Resolution Scaling
  1.1 k1 > 0.6: The Good Old Days
  1.2 0.6 > k1 > 0.5: Optical Proximity Correction
  1.3 0.5 > k1 > 0.35: Off-Axis Illumination
  1.4 0.35 > k1 > 0.25: Asymmetric Off-Axis Illumination
  1.5 0.25 > k1 > 0.125: Double Patterning
  1.6 k1 < 0.125: Higher-Order Frequency Multiplication

2 Multiple-Exposure-Patterning-Enhanced Digital Logic Design
  2.1 Introduction to Physical Design
  2.2 The Evolution of Standard Cell Layouts
    2.2.1 Attributes of standard logic cells
    2.2.2 Impact of patterning-driven design constraints
  2.3 Standard Cell Layout in the Era of Double Patterning
  2.4 Two Generations of Double-Patterning-Enhanced P&R
    2.4.1 Color-aware placement
    2.4.2 Color-aware routing
  2.5 Beyond P&R: The Impact of Double Patterning on Floor Planning

3 Design for Manufacturability
  3.1 Critical Area Optimization
  3.2 Recommended Design Rules
  3.3 Chemical Mechanical Polishing
  3.4 Lithography-Friendly Design
  3.5 Prescriptive Design Rules
  3.6 Case Study: Design Implications of Restrictive Patterning
    3.6.1 Background
    3.6.2 Impact of extremely restrictive lithography on SoC design

4 Design Technology Co-Optimization
  4.1 The Four Phases of DTCO
    4.1.1 Phase 1: establish scaling targets
    4.1.2 Phase 2: first architecture definition
    4.1.3 Phase 3: cell-level refinement
    4.1.4 Phase 4: block-level refinement
  4.2 Case Study 1: Leaf-Cell DTCO at N14
    4.2.1 Technology definition
    4.2.2 Embedded memory
    4.2.3 Standard cell logic
    4.2.4 Analog design blocks
    4.2.5 Leaf-cell DTCO effective at N14: holistic DTCO provides further improvements
  4.3 Case Study 2: Holistic DTCO at N14
    4.3.1 Holistic DTCO for embedded memories
    4.3.2 Holistic standard-cell DTCO
    4.3.3 Holistic DTCO for analog components
    4.3.4 Test chip and experimental results
  4.4 Case Study 3: Using DTCO Techniques to Quantify the Scaling of N7 with EUV and 193i
    4.4.1 Introduction
    4.4.2 Scaling targets
    4.4.3 Comparison of RET implications
    4.4.4 Objectives for power, performance, and area scaling
    4.4.5 Cell architecture comparison
    4.4.6 Macro-level scaling assessment
    4.4.7 Cell-area-limited scaling
    4.4.8 Router-limited scaling
  4.5 Conclusion

References
Index

Preface

Design technology co-optimization (DTCO), at its core, is not a specific solution or even a rigorous engineering approach; it is fundamentally a mediation process between designers and process engineers that aims to ensure a competitive technology architecture definition while avoiding schedule or yield risks caused by unrealistically aggressive process assumptions. The authors of this book represent the two parties that come together in these discussions:

Lars Liebmann joined IBM in 1991, when enhancements in the physical lithography resolution through a reduction of wavelength and an increase in numerical aperture were still keeping up with the semiconductor industry’s relentless pace of transistor density scaling. However, even back then, the advances in exposure hardware lagged behind the need for higher resolution in support of early device and process development. Liebmann started his career developing design solutions for layout-intensive resolution enhancement techniques (RETs). One such RET, alternating phase-shifted mask (altPSM) lithography, had just become lithographically viable, and one of Liebmann’s first jobs involved drawing phase shapes onto transistors in an early exploratory device test-chip at IBM’s Advanced Technology Lab. Naturally, this tedious work led him to explore means of automating the layout manipulations necessary to implement such RETs and introduced him to the engineering discipline of electronic design automation (EDA). He joined his colleagues Mark Lavin and Bill Leipold, who had just begun work on a piece of code that could very well be the original ancestor to all optical proximity correction (OPC) solutions on the market today. This simple piece of EDA code, which they called ShrinkLonelyGates and which located and biased isolated transistors in a chip design, laid the foundation in 1992 for what many years later would become known as computational lithography.1 Access to these early (and by today’s standards extremely limited) shape-manipulation functions in IBM’s internal EDA engine, Niagara, not only opened the door for Liebmann to explore automatic altPSM design2 and more-complex OPC solutions but also led to spin-offs such as code to generate sub-resolution assist features (SRAFs).3 Although these automatic layout-manipulation routines were extremely useful in driving the adoption of these RETs for increasingly complex


chip designs, equally important was the observation that it was quite easy for designers to design shapes that were perfectly legal by that technology node’s design rules but would cause the automatic generation routines to fail. Successful and efficient implementation of strong RETs required negotiations with the designers and forced the conversations that many years later grew into DTCO. Soon after the advancements in the fundamental exposure-tool resolution slowed down and eventually stopped entirely with the introduction of 193-nm immersion lithography, scaling through increasingly complex and design-restrictive RETs became the semiconductor industry’s only path forward. Even though altPSM was never adopted as a semiconductor manufacturing solution for IBM or the majority of the semiconductor industry, much of what Liebmann learned in those early years of computational lithography held true for many technology nodes to follow:

• The design space is enormously complicated, and designers operate under crushing time pressure. Maintaining design efficiency must be paramount in any restriction the process engineers intend to impose on designers.
• Very few designers actually draw transistors and wires; the design space consists entirely of a complex set of automated design solutions. Any process-driven constraints or required design manipulations have to seamlessly integrate into established design flows.
• Any substantial design constraints must be negotiated early in the technology node and implemented far upstream in the design flow to avoid design re-spins that put a product’s time-to-market schedule at risk.

Liebmann’s interest in exploring the extent of semiconductor scaling by taking DTCO to its extreme limit caused him to cross paths with a research team at Carnegie Mellon University.

Kaushik Vaidyanathan started his career as an application-specific integrated circuit (ASIC) designer at IBM in 2007. It was the time when designers across the industry were becoming increasingly reliant on electronic design automation (EDA) tools. Specifically, at IBM, their in-house EDA tools had matured to a point where someone straight from undergraduate study could be trained to design a multi-million-gate 17 mm × 17 mm N90 ASIC. Physical-design tools and methodologies were complex, and maneuvering them to accomplish design goals was challenging. However, after a year, Vaidyanathan found himself asking many questions about the inner workings of these tools and methodologies, only to realize that he did not have the background to seek or understand the answers. So, he decided to attend graduate school in 2009. Thanks to Prof. Pileggi (his Ph.D. advisor), Vaidyanathan had the opportunity to build a background and work alongside industry veterans such


as Liebmann, seeking answers and solutions to a daunting problem facing the IC industry, i.e., affordable and efficient scaling of systems-on-chip (SoCs) past N20. In their quest for answers, Vaidyanathan and his collaborators started with rigorous DTCO at the N14 technology node for different components of a SoC. This work lasted a couple of years, and they developed several insights; the two most important ones for Vaidyanathan were

1. There is no substitute for experience—without input from experts such as Liebmann, a sensible exploration of the vast design and manufacturability tradeoff space quickly becomes unmanageable; and
2. Opportunities are hidden amidst challenges, a notion that Vaidyanathan learned from his Ph.D. advisor and that enabled them to exploit the technology challenges to develop frameworks for affordable design beyond N20, such as construct-based design and smart memory synthesis.

Much of the collaborative work between Carnegie Mellon and IBM is presented in this book as case studies in Sections 3.6, 4.2, and 4.3.

Lawrence Pileggi began his career at Westinghouse Research and Development as an IC designer in 1984. His first chip project was an ASIC elevator controller in a 2-μm CMOS process operating at a blazing clock frequency of 1 MHz. Intrigued by the challenges of performing very-large-scale design with somewhat unreliable CAD tools, he entered the Ph.D. program at Carnegie Mellon University in 1986 to participate in electronic design automation (EDA) research, with a specific focus on simulation algorithms for his thesis work. After six years as a faculty member at the University of Texas–Austin, he returned to Carnegie Mellon with the objective of working on research at the boundary between circuit design and design methodologies. In 1997, the Focus Center Research Program (FCRP) was launched by the Semiconductor Research Corporation (SRC), a consortium of US semiconductor companies. The FCRP was established to create the funding and collaboration that would be needed to perform long-range research. Pileggi became a member of one of the first FCRP programs, the Gigascale Silicon Research Center (GSRC), led by Richard Newton at Berkeley. While there were many challenges faced by the semiconductor community at that time, Pileggi chose to focus his GSRC research on a problem professed by two of his Carnegie Mellon colleagues, Wojtek Maly and Andrzej Strojwas, i.e., the impending manufacturability challenges due to subwavelength lithography. His colleagues had developed tools and methods at Carnegie Mellon to evaluate the difficult-to-print patterns and to count the number of unique patterns as a function of the lithography radius of influence. Captivated by the impact of these patterns on design methods and circuit topologies, Pileggi and his group created a regular-fabrics-based design methodology that followed a simple philosophy: rather than asking lithographers to


print any circuit patterns, as they had been doing for decades, instead ask them what patterns they can print well and then develop circuits and methodologies that best utilize those patterns. Pileggi and his group worked with researchers from IBM, Intel, and other sponsoring members of the GSRC to explore the benefits and possibilities of regular fabric design. Through some of those interactions Pileggi met Lars Liebmann at IBM. While Pileggi and his students worked with various companies, both through Carnegie Mellon and later via a small start-up company, Fabbrix (which was acquired by PDF Solutions in 2007), the deepest collaborations occurred with IBM, and Liebmann in particular. In 2010 a partnership with Lars and IBM to work on the DARPA GRATE program produced much of the work that comprises the later sections of this book. Now, as the industry deploys the 14-nm FinFET technology node, the regular-fabrics approach, pattern templates, and construct-based design methods that were proposed are clearly evident.

Because DTCO has evolved from lithography-friendly design (LFD) and design for manufacturability (DFM), this book starts the DTCO discussion by first reviewing the impact that increasingly invasive RETs and multiple-exposure patterning (MEP) techniques have had on design. It then covers the major DFM techniques and highlights competing optimization goals of LFD and DFM. However, DTCO differs from LFD and DFM in that the goal of DTCO is not just to communicate process-driven constraints to the designers but to negotiate a more optimal tradeoff between designers’ needs and process developers’ concerns. To facilitate this co-optimization, it is important for the process engineers to understand the high-level goals of the design community. To that end, this book reviews the fundamental SoC design objectives as well as the resulting topological constraints on different building blocks of a SoC, such as standard cells, embedded memories, analog components, and place and route flows. Finally, the mechanics of the DTCO process are explained as a series of steps that incrementally refine the technology architecture using concepts such as a design-driven rules definition, design-rule-arc analysis, and construct-based technology definition. The efficacy of DTCO is illustrated using detailed case studies at N14 contrasting leaf-cell optimization against a more-comprehensive holistic DTCO approach. The final case study illustrates how DTCO can be applied to quantifying the achievable scaling to N7 under different lithography assumptions. While it is impossible to present a simple “how to” manual for DTCO, the goal of this book is to break down the abstract concept of DTCO into specific actionable components that collectively play an increasingly important role in maintaining the industry’s aggressive pace of semiconductor scaling.

Lars W. Liebmann
Kaushik Vaidyanathan
Lawrence Pileggi
December 2015

List of Acronyms and Abbreviations

altPSM      Alternating phase-shift mask
ASIC        Application-specific integrated circuit
BEOL        Back end of line
CAD         Computer-aided design
CAM         Content addressable memory
CMOS        Complementary metal–oxide semiconductor
DDL         Double dipole lithography
DFM         Design for manufacturability
DOF         Depth of focus
DRC         Design rule check
DRM         Design rule manual
DSA         Directed self-assembly
DTCO        Design technology co-optimization
EDA         Electronic design automation
eDRAM       Embedded dynamic random access memory
EUV         Extreme ultraviolet lithography
FEOL        Front end of line
FinFET      Fin-based field effect transistor
FPU         Floating point unit
GDSII       Graphic Database System (format for physical design data)
GL1         IBM’s old internal standard (format for physical design data)
GOPS per W  Giga-operations per second per watt
HDL         Hardware description language
IDM         Integrated device manufacturer
IP          Intellectual property
LE3         Litho–etch–litho–etch–litho–etch triple patterning
LELE        Litho–etch–litho–etch double patterning
LFD         Lithography-friendly design
LVS         Layout versus schematic
MEP         Multiple-exposure patterning
MP          Microprocessor
OASIS       Open Artwork System Interchange Standard (format for physical design data)
OPC         Optical proximity correction
ORC         Optical rule check
OTA         Operational transconductance amplifier
PDK         Process design kit
PEX         Parasitic extraction
PPA         Power-performance-area metric
PRAF        Printing assist features
PWQ         Process window qualification
RDR         Restricted design rule
RET         Resolution enhancement technique
RTL         Register transfer level
SADP        Self-aligned double patterning
SAQP        Self-aligned quadruple patterning
SAR-ADC     Successive-approximation analog-to-digital converter
SES         Statistical element selection
SIT         Sidewall image-transfer patterning
SIT2        Sidewall image-transfer double patterning
SMO         Source mask optimization
SMSF        Smart memory synthesis framework
SoC         System on chip
SRAF        Sub-resolution assist features
SRAM        Static random access memory

Chapter 1

The Escalating Design Complexity of Sub-Resolution Scaling

To set the stage for a discussion about the increasing design impact of sub-resolution patterning, Fig. 1.1 illustrates the steady decline in the k1 factor over ten technology nodes. In a simple reformulation of the well-known Rayleigh resolution criterion, k1 is calculated as

k1 = Resolution × (NA/λ),   (1.1)

where Resolution is given as the smallest half-pitch of the design (in nanometers), NA is the numerical aperture of the projection lens (i.e., the refractive index of the ambient material multiplied by the sine of the maximum angle of the diffracted orders that can be captured by the lens), and λ is the wavelength of the illumination source (in nanometers). The x axis of Fig. 1.1 is labeled “Technology Node,” which identifies discrete increments in semiconductor scaling. The goal of the industry as a whole is to produce new technology definitions at intervals of two to three years, roughly doubling the transistor density and improving chip performance between 10% and 20%. The naming of these technology nodes is deliberately vague. Although the origins of the naming convention trace back to the physical gate length of the transistors used in a particular node, any direct correlation to physical dimensions has been lost in the more-recent technology nodes, and the naming scheme has become a significant component of the competitive marketing in the semiconductor foundry business. The ratio of two node names, however, is meant to indicate the linear scale factor between these nodes, i.e., N7 is roughly a 70% linear scaling of N10. As presented in Fig. 1.1, k1 can be seen as a unitless measure of lithographic complexity. Because k1 < 0.6 is beyond the practical resolution limit of conventional lithography in semiconductor manufacturing, all of the technology nodes shown in Fig. 1.1 take place in the sub-resolution domain.
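To make Eq. (1.1) concrete, the short script below computes k1 for a few half-pitches printed with a 193-nm immersion scanner at NA = 1.35 (an illustrative sketch; the half-pitch values are assumed round numbers, not data from this book).

```python
# Illustrative k1 calculation per Eq. (1.1); half-pitch values are assumed round numbers.
WAVELENGTH_NM = 193.0   # ArF immersion exposure wavelength
NA = 1.35               # water-immersion numerical aperture

def k1_factor(half_pitch_nm: float, na: float = NA, wavelength_nm: float = WAVELENGTH_NM) -> float:
    """Rayleigh k1 = half-pitch * NA / wavelength, per Eq. (1.1)."""
    return half_pitch_nm * na / wavelength_nm

for half_pitch in (90.0, 45.0, 32.0, 20.0):
    print(f"half-pitch {half_pitch:5.1f} nm  ->  k1 = {k1_factor(half_pitch):.3f}")
```

Under these assumptions, a 90-nm half-pitch sits just above the k1 = 0.6 comfort zone, whereas a 20-nm half-pitch lands near k1 ≈ 0.14, deep in the multiple-patterning regimes plotted in Fig. 1.1.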

[Figure 1.1: Rayleigh factor k1 (y axis, 0 to 0.6) plotted against technology node (x axis, N130 down to N5), with the successive imaging regimes labeled: conventional lithography; off-axis illumination and immersion lithography; asymmetric off-axis illumination; double patterning; higher-order frequency multiplication.]

Figure 1.1 A plot of the steadily decreasing k1 factor for ten technology nodes, showing five increasingly challenging sub-resolution domains.

Furthermore, several additional “hard” resolution barriers at k1 = 0.5, 0.35, 0.25, and 0.125 have been overcome by the use of increasingly complex resolution enhancement techniques. Figure 1.2 ties these three parameters together by k1 (the diagonal of the matrix) to help visualize the link between the physical patterning challenges encountered at the various resolution barriers (left border), the resolution enhancement techniques implemented to overcome these barriers (right border), and the resulting design impact (bottom row). Because k1 is a unitless measure of patterning complexity, if and when a new wavelength is finally introduced, the correlation between k1 and the design impact is largely preserved, although some adjustment must be made to the practical resolution limits to account for the many years of learning at the 193-nm wavelength that allow foundries to successfully operate extremely close to the fundamental physical resolution limits. The following sections briefly review each resolution barrier to provide further justification for the design restrictions that the design community had to endure as the semiconductor industry scaled further into the sub-resolution domain.

Figure 1.2 The “Rosetta Stone” of sub-resolution patterning relating physical lithography challenges to resolution enhancement techniques and design impact.

1.1 k1 > 0.6: The Good Old Days

As stated earlier, the resolution of a lithographic exposure system is determined by three parameters: wavelength λ, numerical aperture NA, and k1, as expressed by Rayleigh’s resolution equation:

Resolution = k1 × (λ/NA).   (1.2)

For many technology generations, the exposure-tool resolution was intended to maintain k1 > 0.65, ensure sufficient patterning-limited yield, and minimize rework in the semiconductor manufacturing line. To maintain a constant level of lithographic complexity, the resolution had to be improved by increasing the NA and reducing λ. Scaling resolution by decreasing the wavelength becomes challenging in the UV wavelength spectrum. Light sources of sufficient intensity and spectral quality become increasingly difficult to obtain and, as the steps in wavelength become smaller, provide diminishing resolution benefit. The main components of an optical lithography system along with the λ and NA scaling history are identified in Fig. 1.3. Because it became increasingly difficult to generate sufficient light intensity at shorter wavelengths, new photoresists with higher sensitivity had to be invented. Chemically amplified resists4 were key in developing the ability to scale into the deep-UV (248-nm) wavelength regime. Preparing these new resist systems for volume manufacturing took the industry about ten years and forced major improvements in the process and environmental control in the manufacturing line. Further scaling of the wavelength then forced the introduction of calcium fluoride as the main material for all optical elements at λ = 157 nm, and issues with birefringence5 caused enough delays in the development of this solution to effectively remove it from consideration at the time of 193-nm immersion lithography. At λ = 126 nm, a transition to all reflective optics in a vacuum environment would have been required.

[Figure 1.3 schematic: illumination source, mask, lens (numerical aperture NA), and wafer, shown as a 1× system for simplification, accompanied by the following scaling history.]

Wavelength scaling history:

  Source   λ (nm)   Δλ     Ready
  G-line   436      –      –
  I-line   365      19%    1984
  KrF      248      47%    1989
  ArF      193      28%    2001
  F2       157      23%    n/a
  Ar2      126      25%    n/a

Numerical aperture scaling history:

  Medium   NA       ΔNA    Ready
  air      0.5      –      –
  air      0.75     50%    1999
  air      0.85     13%    2003
  air      0.93     9%     2005
  water    1.2      29%    2007
  water    1.35     13%    2008
  ?        ~1.55    4%     n/a

Figure 1.3 Main components of an optical lithography system (typically operating at 4× magnification but shown here at 1×) along with the semiconductor industry’s wavelength (λ) and numerical aperture (NA) scaling history.

At the time, the industry consensus deemed that such a major disruption in the exposure tool should come with a bigger reduction in wavelength. Thus 126-nm lithography was abandoned in favor of projection x-ray lithography, now known as extreme ultraviolet (EUV), operating at λ = 13.5 nm. As indicated in Fig. 1.3, wavelength-based resolution scaling ended for optical lithography with 193 nm and went on a long hiatus as x-ray-based lithography systems were developed.

Improving the resolution by increasing the NA in conventional lithography has the undesirable side effect of requiring disproportionately tighter process control. The depth of focus (DOF), i.e., the degree to which the exposure system can tolerate vertical offsets from local topography or wafer deformation, is inversely proportional to the square of the NA. To put that concept into perspective, at a NA of 0.5, focus variations must be controlled to 0.25λ. After the transition to two-beam imaging is made (Section 1.3), the DOF constraint is relaxed, making it possible to take advantage of further NA increases. To increase the NA beyond the fundamental geometric limit of 1, the refractive index of the material between the lens and the wafer must be increased, leading to the introduction of immersion lithography. NA-based resolution scaling ended with the introduction of water immersion at a NA of 1.35 because the semiconductor industry failed to find a material with all of the right chemical and physical parameters to provide a further increase in the refractive index.

In addition to maintaining a constant level of patterning-limited yield, hardware-based scaling at k1 > 0.6 allowed for easy layout migration from one node to the next. Most design rules scaled linearly by the chosen scale factor, and designs could be reused from one node to the next simply by scaling the design grid on which they were drawn. This principle is illustrated in Fig. 1.4, where a logic cell is rendered for node N on a design grid of 1 and then migrated to node N+1 by scaling the design grid to 0.7.

Figure 1.4 Layout migration by simple design-grid scaling is possible when a sufficiently large k1 factor is maintained through timely λ and NA improvements.

As shown in Fig. 1.3, wavelength scaling ended at 193 nm and started the semiconductor industry on the difficult journey into the sub-resolution domain that found a brief reprieve with the introduction of immersion lithography. Because NA scaling also ended with water immersion at a NA of 1.35, the resolution gap kept expanding. The end of λ- and NA-based resolution scaling left k1 as the primary scaling knob, which increased design restrictions and forced closer collaboration between lithographers and designers to enable increasingly complex resolution enhancement techniques.
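As a rough numerical illustration of these two Rayleigh relations (a sketch only; the k1 and k2 prefactors below are generic assumptions, not values quoted in this chapter), the following script compares half-pitch resolution and relative depth of focus for a few wavelength/NA combinations drawn from Fig. 1.3.

```python
# Rough Rayleigh scaling sketch: Resolution = k1 * (lambda / NA), DOF ~ k2 * lambda / NA^2.
# k1 and k2 are assumed generic prefactors for illustration only.
K1 = 0.4   # assumed lithographic aggressiveness
K2 = 1.0   # assumed DOF prefactor

def resolution_nm(wavelength_nm: float, na: float, k1: float = K1) -> float:
    return k1 * wavelength_nm / na

def dof_nm(wavelength_nm: float, na: float, k2: float = K2) -> float:
    # Depth of focus shrinks with the square of NA (Section 1.1).
    return k2 * wavelength_nm / na**2

tools = [("KrF, dry", 248.0, 0.75), ("ArF, dry", 193.0, 0.93), ("ArF, water immersion", 193.0, 1.35)]
for name, wl, na in tools:
    print(f"{name:22s} resolution ~{resolution_nm(wl, na):6.1f} nm, DOF ~{dof_nm(wl, na):6.1f} nm")
```

In this simple model, moving from NA 0.93 to NA 1.35 buys roughly 30% in resolution but cuts the usable focus budget roughly in half, which is the quadratic NA penalty described above.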

1.2 0.6 > k1 > 0.5: Optical Proximity Correction

k1 = 0.5 is the point at which the first diffracted orders, the minimum required to reconstruct an image, are at the outer limits of the projection lens, as shown in Fig. 1.5(a). As k1 approaches this resolution limit, fewer diffracted orders are captured by the lens, causing two effects collectively referred to as proximity effects.

Figure 1.5 Light beams diffracted by a mask and captured by the imaging lens: (a) diffracted light at a pitch of k1 = 0.5, the smallest pitch that can be imaged by this lens; (b) diffracted light for larger pitches in the same lens.

The fidelity with which images can be replicated depends on the number of diffracted orders that can be captured by the projection lens. At large k1 factors, several diffracted orders combine to crisply resolve 2D details. However, because fewer diffracted orders are captured at smaller k1, corner rounding and line-end variation become more problematic yield limiters. The vast majority of designs do not consist only of minimum-pitch features, and thus a range of pitches must be printed at the same time. Because more diffracted orders are captured for features at larger pitches, as illustrated in Fig. 1.5(b), a pitch-dependent exposure-intensity variation is introduced. This effect essentially causes different features to print at their target dimension at different exposure-dose settings. Pitch-dependent linewidth variation causes systematic patterning inaccuracy because the exposure dose must be set to one specific value for the entire exposure field. One response to this degradation in patterning robustness involves continuous tightening of the control limits on mask dimensions, focus, and exposure dose. These tighter process-control limits come at a steep cost in tooling enhancements and increased rework, so an alternative was introduced in the form of optical proximity correction (OPC), which simply pre-distorts the mask pattern to account for systematic inter- and intrashape proximity effects, as shown in Fig. 1.6. In the early days of OPC, rule-based corrections were the industry standard. Correction tables could be generated to apply the appropriate feature bias to the mask image by simply characterizing the pitch-dependent feature size variations, line-end shortening effects, and corner-rounding radii for a specific exposure condition, mask write process, and wafer resist system.
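For a flavor of what such a rule-based correction looked like in practice, the sketch below classifies each gate by the space to its nearest neighbor and applies a tabulated per-edge mask bias (all thresholds and bias values are hypothetical illustrations, not the actual IBM code or any real process data).

```python
# Simplified rule-based OPC sketch: bias features according to a spacing-lookup table.
# Threshold and bias values are hypothetical illustrations, not real correction data.
BIAS_TABLE_NM = [          # (maximum space to nearest neighbor in nm, bias per edge in nm)
    (200.0, 0.0),          # nested gates: no correction
    (400.0, 2.0),          # semi-isolated gates: small bias
    (float("inf"), 5.0),   # isolated gates: largest correction
]

def rule_based_bias(space_to_neighbor_nm: float) -> float:
    """Return the per-edge mask bias for a gate, given its nearest-neighbor spacing."""
    for max_space, bias in BIAS_TABLE_NM:
        if space_to_neighbor_nm <= max_space:
            return bias
    return 0.0

def corrected_mask_width(drawn_width_nm: float, space_to_neighbor_nm: float) -> float:
    return drawn_width_nm + 2.0 * rule_based_bias(space_to_neighbor_nm)

for space in (150.0, 300.0, 1000.0):
    print(f"neighbor space {space:6.1f} nm -> mask width {corrected_mask_width(90.0, space):.1f} nm")
```

The three failure modes listed next are exactly what breaks this table-driven approach at smaller k1.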

Figure 1.6 Improved image quality through OPC, a systematic pre-distortion of the mask to compensate for intershape and intrashape proximity effects at low-k1 imaging.

The following three challenges brought rule-based OPC to an end for critical levels on leading-edge technology nodes:

• A nonlinear response of the print bias, i.e., a nonlinear relationship between the change in mask size and the corresponding change in wafer size. This effect became known as the mask error factor (MEF) or mask error enhancement factor (MEEF) and led to over- or undercorrection at smaller k1. Once these nonlinearities introduced sufficiently significant errors in the final wafer dimension, they had to be incorporated into increasingly complex correction tables.
• Rule-based correction worked well for coarse effects, such as ‘ShrinkLonelyGates,’ that simply distinguished between nested and isolated transistors. However, when tighter control limits forced OPC to correct at a much finer granularity, second-order effects, i.e., the fact that the correction itself changed the proximity environment, became significant and introduced correction errors.
• Gaps in the rules, i.e., unanticipated layout configurations that were not covered by the correction table (such as line ends with small edge jogs introduced by poorly placed via landing pads), caused catastrophic yield failures and put an end to rule-based OPC.

Rule-based OPC was replaced by model-based OPC, which uses semiempirical patterning models either to iteratively optimize the mask layout by minimizing the offset between the predicted and desired wafer image or to invert the patterning model to derive an ideal mask image that is made manufacturable in iterative clean-ups. At the time, the computational efficiency of the forward iterative model-based OPC surpassed the inverse lithography approach,6 which only very recently made a comeback. OPC was implemented by the wafer manufacturers based on their characterization of systematic patterning errors with the specific goal of accurately replicating the layouts provided by designers; thus, for the most part, designers were not affected by the introduction of OPC.

Figure 1.7 Poly over diffusion, with contacts on the poly along with the predicted wafer images. The layouts show very similar layout characteristics and follow the same design rules, but the layout on the right shows a distinct hotspot where OPC is overconstrained and yields marginal results (PV bands are explained in Section 3.4).

However, the existence of fast and reasonably accurate predictive patterning models, along with the ever-present danger of rogue layout configurations that eluded accurate correction, led to the idea of model-based “hotspot” prevention as a component of the overall effort to achieve a more lithography-friendly design (LFD). Model-based layout checking and optimization was implemented on various commercial platforms under different names so that designers could use the patterning models to prescan their layouts for hotspots, i.e., layout configurations that passed design rule checking (DRC) but that had significant errors even after OPC. As shown in Fig. 1.7, this approach is very effective in detecting layout configurations where OPC is overconstrained by a complex confluence of edges and proximity changes that would be impossible to outlaw in conventional design rules. Although model-based layout optimization had almost immediate success in improving the manufacturability of very dense, sensitive layouts, such as memory cells, and then became a major component of model-based design for manufacturability (DFM), it took several more technology generations for this approach to be broadly used by the design community. More details on model-based layout optimization for LFD will be covered in Section 3.4.
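To make the model-based flow concrete, the fragment below sketches the forward iterative loop in one dimension (a toy sketch: the Gaussian blur and resist threshold stand in for a real calibrated lithography model, and edge fragmentation, MEEF, and 2D effects are ignored).

```python
# Toy forward-iterative model-based OPC in 1D: adjust the mask linewidth until the
# "simulated" printed linewidth matches the target.
import math

BLUR_SIGMA_NM = 30.0   # assumed optical blur of the stand-in model
THRESHOLD = 0.4        # assumed resist threshold; below 0.5 so the uncorrected line prints oversized

def printed_width(mask_width_nm: float) -> float:
    """Width of the region where the blurred image of an isolated line exceeds THRESHOLD."""
    def phi(z: float) -> float:                      # standard normal CDF
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    def intensity(x: float) -> float:                # Gaussian-blurred image of the line
        half = mask_width_nm / 2.0
        return phi((x + half) / BLUR_SIGMA_NM) - phi((x - half) / BLUR_SIGMA_NM)

    lo, hi = 0.0, mask_width_nm / 2.0 + 6.0 * BLUR_SIGMA_NM
    for _ in range(60):                              # bisection for the threshold crossing
        mid = (lo + hi) / 2.0
        if intensity(mid) > THRESHOLD:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo

def opc_iterate(target_nm: float, iterations: int = 8) -> float:
    mask = target_nm                                 # start from the drawn (target) width
    for _ in range(iterations):
        mask += target_nm - printed_width(mask)      # move edges to cancel the predicted error
    return mask

mask = opc_iterate(90.0)
print(f"mask width {mask:.1f} nm prints at {printed_width(mask):.1f} nm (target 90.0 nm)")
```

Each pass predicts the printed width and moves the mask edges to cancel the residual error, which is the essence of the forward iterative correction described above.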

1.3 0.5 > k1 > 0.35: Off-Axis Illumination

This section more accurately refers to the two-beam imaging regime because there are two means of penetrating the k1 = 0.5 resolution barrier.

Figure 1.8 Interference fringes and diffracted orders for (a) conventional three-beam imaging at k1 = 0.5 and (b) a generic two-beam imaging solution with light beams emanating from alternate mask openings offset by 0.5λ.

As the simple interference fringes in Fig. 1.8 show, the resolution limit of k1 = 0.5 can be overcome by effectively shifting the light on adjacent mask openings by half a wavelength, which re-centers the image-forming diffracted orders, as shown in Fig. 1.8(b). Cutting the angle between the diffracted orders in half doubles the resolution to a theoretical limit of k1 = 0.25. It also eliminates the zeroth order (hence the term “two-beam imaging”) and thereby significantly improves the depth of focus. Because all imaging beams travel roughly the same distance, pathlength differences that degrade the image quality in response to vertical offsets of the imaging plane are significantly reduced. One way to achieve this new diffracted pattern introduces a 0.5λ difference in the pathlength that light travels in the high-refractive-index mask material between alternating openings on the mask. The pathlength difference that achieves the desired phase shift can be introduced by either adding to or subtracting material from alternating mask openings. Figure 1.9 illustrates the subtractive process for alternating phase-shifted masks (altPSMs), which was widely adopted for its ease of process control. An alternative view of altPSMs is presented in Fig. 1.10, showing a simple planar layout along with the corresponding mask cross-section and amplitude and intensity plots. Traversing Fig. 1.10 backwards: the excellent image quality afforded by altPSMs is created by the intensity profile that is pinned to 0 in the dark regions of the design; this effect is achieved by a sign shift in the imaging amplitude (i.e., a 180-deg phase shift) across the dark regions of the design, which is created by recessing the mask by 0.5λ/(n − 1) on one side of the dark regions in the design. Finally, an additional designed layer is required that identifies regions to be recessed, as shown in the simple layout in Fig. 1.10(a).

Figure 1.9 Two-beam imaging achieved by alternating phase-shifted mask (altPSM) lithography. The 0.5λ pathlength difference is achieved by etching into the mask.

Figure 1.10 Conceptual details of altPSMs: (a) top-down layout and (b) corresponding mask, amplitude, and intensity profiles.
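As a quick numerical check of the 0.5λ/(n − 1) recess mentioned above (a sketch; the refractive indices are assumed nominal values for fused silica, not numbers quoted in this chapter):

```python
# Phase-shifter etch depth for a 180-deg shift: depth = 0.5 * wavelength / (n - 1).
# Refractive indices are assumed nominal values for fused silica at these wavelengths.
def phase_etch_depth_nm(wavelength_nm: float, refractive_index: float) -> float:
    return 0.5 * wavelength_nm / (refractive_index - 1.0)

print(f"248 nm, n = 1.51: ~{phase_etch_depth_nm(248.0, 1.51):.0f} nm recess")
print(f"193 nm, n = 1.56: ~{phase_etch_depth_nm(193.0, 1.56):.0f} nm recess")
```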

Even though the absence of a shape that identifies the 180-deg phase-shifted regions implies a 0-deg phase region, it helps to visualize the design challenges as a two-color mapping problem with the simple goal of adding 0-deg and 180-deg phase regions on opposite sides of narrow features (note that for dark-field layers, i.e., shapes formed by clear openings in the mask, identical arguments apply to the shape rather than the background of the shape). It is easy to create layouts that pass conventional design rules but that are not two-color mappable according to the aforementioned rules.7 Two classic examples are shown in Fig. 1.11: the T-shape and the odd–even cycle.

Figure 1.11 Two classic two-color mapping conflicts: the T and the odd cycle.

When the semiconductor industry struggled to adopt altPSMs, much of the design work focused on efficient two-color mapping algorithms that minimized layout conflicts (i.e., odd cycles), but very little work was done to develop the physical design infrastructure necessary to seamlessly integrate these altPSM constraints into existing design solutions.8 For the majority of the industry, that challenge was deprioritized until it reappeared in almost identical fashion for double patterning a few technology nodes later.

An alternative means of achieving the pattern of diffracted orders shown in Fig. 1.8 tilts the illumination off-axis by sin(θ) = 0.5λ/pitch, as illustrated in Fig. 1.12. In reality, this off-axis illumination is achieved not by tilting the illuminator but by adding an aperture or diffractive optical element above the mask to cause oblique illumination.

Figure 1.12 Two-beam imaging achieved by tilting the illumination angle off-normal, causing the image-forming light plane to hit one mask opening 0.5λ ahead of the adjacent one.

Figure 1.13 Schematic illustration of why larger pitches have a worse process window than the tightest pitch for which the illumination angle is arranged.

The ideal illumination angle is a function of the feature pitch, introducing a challenge similar in nature to what drove the need for OPC but significantly more severe in its design implications. As shown in Fig. 1.13(a), the illumination is perfectly balanced for minimum-pitch features, essentially forming the image by interfering the zeroth and first diffracted orders (the intensity difference between these two orders is rebalanced by allowing a small amount of light to leak through the opaque areas of the mask at a 180-deg phase shift, employing what is commonly referred to as attenuated phase-shifted mask technology). However, for pitches larger than the minimum, the image becomes very unbalanced because higher diffracted orders are captured only on one side of the optical axis, as shown in Fig. 1.13(b). With this strong pitch dependence, off-axis illumination (OAI) introduced the semiconductor industry to nonmonotonic resolution: the smallest features at the tightest pitches in previous technology nodes were naturally the hardest to print, and any relaxation in pitch resulted in an improvement in patterning; with OAI this is no longer the case. In aggressive OAI, as the feature pitch increases, the process window, i.e., the ability to tolerate variations in the exposure dose and focus at acceptable dimensional control, rapidly decreases. Sub-resolution assist features (SRAFs) are added to the design to counteract this phenomenon. As the name implies, these SRAFs are placed to assist the patterning by mimicking the feature pitch for which the illumination angle is set, but they are sized such that they do not leave an image on the wafer. The minimum SRAF size and spacing are constrained by mask manufacturing and inspection capability, and thus the primary feature pitch must be allowed to grow large enough to insert the smallest manufacturable SRAF. Because the maximum size of the SRAF is constrained by the requirement to remain sub-resolution, a single SRAF can only assist a finite pitch range before two SRAFs must be inserted, again constrained by the minimum manufacturable width and space. The final process window after SRAF insertion typically exhibits a sawtooth character, as shown in Fig. 1.14. The dip in the process window in the transition pitches—where the feature space is too large for a given illumination angle but not large enough to add a SRAF, or where one SRAF would have to grow to a size that would leave an image on the wafer yet there is not enough room for two SRAFs—resulted in a sharp rise in design rule complexity. Figure 1.15 compares the wire space rules in N90 to N45. A simple design rule to prevent sub-minimum metal spaces became a complex set of width-dependent spacing rules. Of course, the design rules are not published in the compact matrix shown in Fig. 1.15; the desired and necessary design behavior was enforced with numerous conditional design rules.

Figure 1.14

Nonmonotonic process window resulting from using SRAFs with OAI.

[Figure 1.15: comparison of the simple minimum-space rule at N90 with the complex width-dependent spacing matrix at N45; both panels plot the minimum required space against the width of the neighboring line.]

1.5 0.25 > k1 > 0.125: Double Patterning

The resolution limit of fully optimized, single-orientation, dipole OAI is reached when the off-axis zeroth and first diffracted orders can no longer be captured by the projection lens, or when k1 = 0.25. To push the resolution beyond this limit, multiple optically decoupled exposures must be used in conjunction with intermediate wafer processing to reconstruct the desired feature pitch. The reason to optically decouple the subsequent exposures (usually via an intermediate memorization etch) is shown in Fig. 1.22. An isolated feature [Fig. 1.22(a)] projects an image with significant intensity tails into the photoresist. The photoresist then thresholds that grayscale image to yield a well resolved feature after etch. For nested features exposed in a single exposure process, Fig. 1.22(b), the contrast between neighboring features at k1 < 0.25 is too low for the resist to threshold, and the images do not resolve. Because the intensities of subsequent exposures add in the photoresist, separating the dense pattern onto two lower-density masks and exposing them into the same photoresist [Fig. 1.22(c)] does not improve the resolution. Only by optically isolating the two exposures in a litho–etch–litho–etch (LELE) sequence can the k1 = 0.25 resolution limit be beaten.

Figure 1.22 Illustration of the need to optically decouple interdigitated pitches to achieve k1 < 0.25 imaging.
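A one-line calculation shows why the two exposures cannot simply share one resist layer, as in Fig. 1.22(c) (a toy sketch using idealized sinusoidal aerial images; real images are not pure sinusoids):

```python
# Toy illustration of Fig. 1.22(c): two interleaved gratings exposed into the same resist.
# Each exposure alone is an idealized sinusoidal aerial image at pitch p; the second is
# shifted by p/2. Their intensities add, and the modulation at the combined p/2 pitch vanishes.
import math

def contrast(samples):
    return (max(samples) - min(samples)) / (max(samples) + min(samples))

p = 100.0                                     # assumed pitch of each individual exposure, nm
xs = [i * p / 200.0 for i in range(400)]      # sample positions across a few pitches
i1 = [1.0 + math.cos(2 * math.pi * x / p) for x in xs]                # exposure 1
i2 = [1.0 + math.cos(2 * math.pi * (x - p / 2) / p) for x in xs]      # exposure 2, shifted by p/2
total = [a + b for a, b in zip(i1, i2)]       # dose accumulated in a shared resist layer

print(f"contrast of each exposure alone : {contrast(i1):.2f}")
print(f"contrast of the summed exposures: {contrast(total):.2f}")
```

Each exposure alone has full modulation, but the accumulated dose is essentially flat, so the resist has nothing to threshold; only an intermediate etch (or freeze) step that records the first pattern before the second exposure breaks this limit.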

An alternative process uses a bake operation to neutralize the first layer of photoresist before applying the second layer, but this litho–freeze–litho–etch operation is conceptually identical to LELE. Figure 1.23 compares three types of double-patterning techniques: DDL, used to overcome k1 = 0.35 single-orientation restrictions; LELE, used to penetrate below k1 = 0.25; and self-aligned double patterning (SADP), which helps overcome pattern-placement constraints in LELE. As can be seen in Fig. 1.23, LELE requires neighboring shapes to be mapped onto different masks. This layout decomposition operation resembles a two-color mapping operation that simply enforces a larger space between same-color features and allows for smaller space between different-color features.16 While this decomposition is trivial for the simple L-bars shown in Fig. 1.23, it is easy to see how very basic layout configurations can yield nontrivial decomposition conflicts, as illustrated in Fig. 1.24. Once the layout shapes are decomposed by mapping neighboring shapes onto different colors, it is easy to identify same-color-space violations. However, unlike DRC, which has a clear cause-and-effect relationship between the identified design-rule violation and the required corrective action, the same-color-space violations simply identify a cluster of shapes that, in totality, cannot be decomposed; they do not necessarily pinpoint the location where corrective action must be taken. Several options are available to resolve decomposition errors (also called color conflicts or odd cycles), as shown in Fig. 1.25. In some cases LELE can take advantage of the intermediate memorization layer and allow stitching, i.e., introduce a localized double exposure of a feature segment that allows a color transition within the feature.

[Figure 1.23 panels: double dipole lithography (DDL), resolution ~80 nm (layout, exposure 1, exposure 2, etch); litho–etch–litho–etch (LELE), resolution ~50 nm (layout, exposure 1, etch 1, exposure 2, etch 2); self-aligned double patterning (SADP), resolution ~40 nm (layout, exposure 1, sidewall deposition, exposure 2, etch).]

Figure 1.23 Top-down illustrations of double dipole lithography (DDL), litho–etch–litho–etch (LELE), and self-aligned double patterning (SADP).

Figure 1.24 Three pairs of layouts, each with a two-color mappable layout and an unmappable version. Although undecomposable layouts are easy to identify after two-color mapping is attempted, conventional design rules fail to predict these odd cycles.
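The decomposition check itself is a small graph problem. The sketch below attempts a two-color mapping with a breadth-first search and reports the offending pair when an odd cycle makes the cluster undecomposable (a minimal illustration; building the conflict graph from real polygons, spacing rules, and stitch candidates is omitted).

```python
# Minimal two-color (LELE) decomposition check: nodes are layout shapes, edges connect
# shapes closer than the same-mask spacing limit. BFS 2-coloring succeeds unless the
# conflict graph contains an odd cycle, which is exactly the undecomposable case.
from collections import deque

def two_color(conflict_edges, num_shapes):
    """Return (colors, None) if decomposable, or (None, conflicting_pair) if not."""
    adjacency = {v: [] for v in range(num_shapes)}
    for a, b in conflict_edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    colors = {}
    for start in range(num_shapes):
        if start in colors:
            continue
        colors[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in colors:
                    colors[w] = 1 - colors[v]      # alternate masks across each conflict edge
                    queue.append(w)
                elif colors[w] == colors[v]:       # same-color conflict: an odd cycle exists
                    return None, (v, w)
    return colors, None

# Three mutually close shapes form the classic odd cycle of Fig. 1.24.
print(two_color([(0, 1), (1, 2), (2, 0)], 3))
# A chain of four shapes decomposes cleanly.
print(two_color([(0, 1), (1, 2), (2, 3)], 4))
```

Production decomposition engines work on polygon fragments and weigh stitch insertion as well, but the odd-cycle structure of the failure is the same.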

Figure 1.25 Three ways to legalize undecomposable layouts: (a) introduce a stitch, (b) introduce a third color, or (c) flag conflicts during the design.

Because a minimum overlap region must be maintained between the two feature segments to account for overlay errors, i.e., unavoidable misalignment between the two exposures, as well as patterning variations, it is not possible to introduce a stitch into nested lines without causing a color violation with the neighboring line.

Figure 1.26 Lithography simulation of a stitch introduced to resolve a color conflict in a single-track jog. Poor overlap between the stitched line-ends causes localized narrowing of the composite wire, whereas tip-to-side spacing in the stitch region already shows signs of bridging. The localized pinching will be exaggerated when the two separate exposures become misaligned due to overlay errors.

Furthermore, because the final dimension of the stitched region depends on the accuracy of the alignment of the two mask levels, stitching inevitably introduces additional localized dimensional variation that may not be tolerable by all design levels. To prevent stitch regions from becoming yield concerns, design rule constraints must be introduced that prevent many configurations that designers would prefer to use to resolve color conflicts. The example shown in Fig. 1.26 is one where being able to stitch a one-track jog in a wiring pattern would be tremendously useful for LELE-aware routing (covered in detail in Section 2.4), yet the lithography simulation shows that there is no margin in the process window between failure due to poor overlap and failure due to shorting to the neighboring same-color wire. For layouts where stitching is not possible or not supported by the foundry due to yield concerns, decomposition conflicts can be resolved by introducing a third color, as shown in Fig. 1.25(b). Adding a third mask and upgrading to a LELELE (LE3) process has obvious and significant cost and yield implications but also complicates the design space. Figure 1.27(a) shows a simple layout that cannot be decomposed into two masks without a color conflict. Because the exact location of a same-color space violation depends on the specific but arbitrary color assignment, it is useful to show the designers the odd cycle that must be broken, resulting in the triangular error marker. Breaking any of the three color dependencies by increasing the feature space to the same-color value will resolve this odd cycle. The same conflict can also be resolved by adding a third color, as in Fig. 1.27(b). However, the third color does not guarantee decomposability for all possible layout configurations, as in Fig. 1.27(c).

Figure 1.27 Three simple layouts: (a) an odd cycle that makes this layout undecomposable for LELE, (b) introducing a third color in LE3 resolves the conflict, and (c) three colors do not guarantee that all layouts can be decomposed.

The good news is that introducing a third color significantly reduces the number of undecomposable layouts, but the bad news is that these undecomposable layouts are even harder to detect and present in an actionable manner to the designers. Figure 1.28 shows the complex shape-dependency diagram that must be presented to the designer to communicate the presence of a collection of shapes that cannot be mapped into three colors. What was a simple triangle in LELE becomes a “Chrysler star” configuration, i.e., an odd number of interconnected odd cycles, in LE3. As shown in Fig. 1.28, there are multiple ways to break up the Chrysler star by either increasing selective spaces or adding a stitch, but error reporting and correction guidance become vastly more complicated in LE3 compared to LELE. The additional cost, process risk, and design complexity associated with three-color LE3 force this option for LELE conflict resolution to be used sparingly. The third option for handling LELE decomposition errors, shown in Fig. 1.25(c), is to prevent nondecomposable layout configurations through double-patterning-aware design rules, tools, and methodologies; this approach has a profound impact on the design environment and is the primary topic of Sections 2.3–2.5. The resolution limit of LELE is reached when the overlay error between the two interdigitated exposures becomes large enough to cause yield or electrical variability concerns. There is no first-principle hard limit for this resolution barrier, but it is typically set to 50 nm for state-of-the-art 193-nm immersion tools patterning wiring levels, which means that LELE is sufficient for the N14 node but is insufficient for N10.


Figure 1.28 A three-color conflict represented as an odd number of interconnected odd cycles. Breaking any one leg of the complex conflict graph legalizes the layout.
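In practice, decomposition checkers find these conflicts by extracting a conflict graph (shapes as nodes, sub-same-color spaces as edges) and testing whether it can be mapped onto two masks. The sketch below is only a minimal illustration of that idea, not any particular vendor's engine; it assumes the conflict graph is already available as a symmetric adjacency dictionary, attempts a two-coloring by breadth-first traversal, and, on failure, reports one odd cycle in the actionable form discussed above.

```python
from collections import deque

def two_color(conflict_graph):
    """Try to map a LELE conflict graph onto two masks.

    conflict_graph: dict mapping a shape id to the shape ids that sit
    closer than the same-mask spacing limit (assumed symmetric).
    Returns ("colorable", {shape: 0 or 1}) on success, or
    ("odd_cycle", [shapes forming one odd cycle]) on failure.
    """
    color, parent = {}, {}
    for start in conflict_graph:
        if start in color:
            continue
        color[start], parent[start] = 0, None
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in conflict_graph[u]:
                if v not in color:
                    color[v], parent[v] = 1 - color[u], u
                    queue.append(v)
                elif color[v] == color[u]:
                    # Two shapes at minimum space landed on the same mask:
                    # walk both back through the traversal tree to expose
                    # the odd cycle that the designer has to break.
                    return "odd_cycle", _odd_cycle(u, v, parent)
    return "colorable", color

def _odd_cycle(u, v, parent):
    path_u, path_v = [u], [v]
    while parent[path_u[-1]] is not None:
        path_u.append(parent[path_u[-1]])
    while parent[path_v[-1]] is not None:
        path_v.append(parent[path_v[-1]])
    # Drop the shared ancestry so only the cycle itself remains.
    while len(path_u) > 1 and len(path_v) > 1 and path_u[-2] == path_v[-2]:
        path_u.pop()
        path_v.pop()
    return path_u + list(reversed(path_v[:-1]))

# The triangular conflict of Fig. 1.27(a): three wires, each at minimum
# space to the other two, cannot be split across two masks.
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(two_color(triangle))   # -> ('odd_cycle', ['B', 'A', 'C'])
```

No comparably simple test exists for three colors, which is one reason the interconnected odd cycles of Fig. 1.28 are so much harder to detect and report.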

To get past this overlay-induced resolution limit for N10, the industry must move to self-aligned double patterning (SADP), also referred to as sidewall image transfer (SIT), introduced in Fig. 1.23(c). To better understand the design constraints associated with SADP, Fig. 1.29 provides a more-detailed cartoon of the patterning operation. The general concept of SADP is that a relief pattern (referred to as a mandrel) is built on the wafer. Sidewall spacers are then deposited onto these mandrel patterns. Because each mandrel line has two sides, this sidewall deposition doubles the number of features transferred onto the wafer, so a k1 = 0.25 exposure can effectively create a grating at k1 = 0.125. Depending on the polarity of the decomposition and patterning process, the deposited sidewalls can form high-resolution features or spaces between features. This polarity choice is very significant because the deposited sidewalls are manufactured at a single dimension. For some levels, e.g., the poly conductor level, fixed width with very tight linewidth uniformity is advantageous; for other levels, e.g., wiring levels, fixed space is more tolerable than fixed width. The latter polarity of SADP is illustrated in Fig. 1.29. Given a sample layout [Fig. 1.29(a)] and keeping in mind that the goal is to form all dielectric spaces between metal wires by deposited sidewalls, the decomposition performs a two-color mapping operation that separates the layout into alternating mandrel and not-mandrel shapes.


Figure 1.29 Cartoon of SADP patterning steps to form a wiring pattern: (a) original layout, (b) decomposed layout including two-color mapping and dummy feature insertion, (c) mandrel pattern to be built on the wafer, (d) sidewalls deposited onto the mandrel, (e) sidewalls after the mandrel has been removed, (f) second exposure to fill spaces between sidewalls that are not intended to be metal, and (g) final image after all open spaces are filled with metal.

The mandrel shapes will ultimately be manufactured on the wafer and provide a relief pattern for sidewall deposition. The not-mandrel shapes will be patterned by virtue of their neighbor providing a sidewall-formed space. Thus, it is ultimately the absence of sidewalls that forms the wires. If a shape is mapped onto “not-mandrel” and does not have a neighbor at minimum space, a dummy neighbor must be provided, as illustrated in Fig. 1.29(b). The resulting mandrel shape, the deposited sidewalls, and the sidewall image after the mandrel is removed are shown in Figs. 1.29(c)–(e). Because the white space between the sidewalls in Fig. 1.29(e) only loosely resembles the original layout, a second patterning operation, called a block mask here, is used to effectively add dielectric to areas that are not intended to be metal but are not covered by sidewalls, as shown in Fig. 1.29(f). Finally, all of the white space is filled with metal, and the resulting wiring pattern matches the original layout. In addition to driving significant process cost and complexity, SADP raises the bar for patterning-induced design restrictions by a very substantial margin. To ensure layout decomposability into mandrel and not-mandrel, just like LELE, neighboring features at minimum space must be mappable to different colors.

Table 1.1 Comparison of decomposition rules for LELE and SADP.

Space      | LELE      | SADP
1          | opposite  | opposite
1 < S < 2  | opposite  | forbidden
2 ≤ S < 3  | arbitrary | forbidden
3 ≤ S < 5  | arbitrary | same
S ≥ 5      | arbitrary | arbitrary

More accurately, because the space is formed by sidewall deposition, it is not so much the minimum space but simply “the space.” Larger spaces in the layout are filled with a dummy mandrel, which must be designed at least at minimum width, and thus the next-largest allowable space is 3, and layout features forming this 3 space must be mappable onto the same color (because a different-color dummy will be inserted between them). Only at a feature space of 5 does the color dependence of shapes end because the decomposition can now insert either one 3-wide dummy between same-color features or two 1-wide dummies between different-color features. The complexity and interaction range of the SADP decomposition rules are considerably higher than for LELE, as the comparison in Table 1.1 shows. In addition to the complex, nonmonotonic, and long-reaching decomposition rules that are necessary to ensure proper mandrel generation, the block mask must also be manufacturable. Much like the line end and PRAF cut mask described in Section 1.4, designers have to either deal with complex layout restrictions on their primary design level to ensure the manufacturable auto-generation of the block mask or create the block mask explicitly in the design space. The staggering design complexity of SADP is generally managed either by “correct by construction” design rules that ensure automatic decomposability by allowing only simple gratings to be designed (an approach that is acceptable for levels like the poly or fins) or by incorporating the design rules into automatic wiring tools (referred to as routers), which is the topic of Section 2.4.
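The nonmonotonic reach of these constraints can be restated compactly in code. The sketch below simply encodes Table 1.1 (spaces expressed in units of the minimum space); the breakpoints follow the table above, not any specific foundry rule deck.

```python
def sadp_color_relation(space):
    """Color relationship required between two features separated by
    `space` (in units of the minimum, sidewall-defined space), per the
    SADP column of Table 1.1."""
    if space == 1:
        return "opposite"     # the space itself is formed by the deposited sidewall
    if space < 3:
        return "forbidden"    # no room to insert a minimum-width dummy mandrel
    if space < 5:
        return "same"         # exactly one dummy mandrel fits between the features
    return "arbitrary"        # one wide dummy or two narrow dummies fit

def lele_color_relation(space):
    """LELE analog: below the same-mask space limit (2 in these units)
    the features must land on opposite masks; beyond it, no constraint."""
    return "opposite" if space < 2 else "arbitrary"

for s in (1, 2, 3, 4, 5):
    print(s, lele_color_relation(s), sadp_color_relation(s))
```

The forbidden band between one and three minimum spaces, and the color dependence that persists out to five, are what make SADP so much harder to absorb into design tools than the simple opposite/arbitrary dichotomy of LELE.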

1.6 k1 < 0.125: Higher-Order Frequency Multiplication

Venturing into what at the time of this writing would have to be considered the ‘exploratory domain’, scaling beyond the limits of SADP requires higher-order frequency multiplication. One approach simply adds one more pitch division to SADP by using the deposited sidewalls as mandrel shapes. The name, self-aligned quadruple patterning (SAQP), is a misnomer because only two exposures are involved: the top mandrel and the block mask. The process sequence, in the polarity in which it might be used for wiring pitches at k1 < 0.25, is illustrated in Fig. 1.30.


Figure 1.30 Conceptual cross-section of SAQP, also known as sidewall on sidewall (SIT2). Depositing sidewalls onto the deposited sidewalls provides fourfold resolution enhancement but limits the degree to which feature sizes can be varied.
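Before walking through those constraints in prose, the geometry of Fig. 1.30 can be summarized in a short, idealized sketch. It assumes perfectly conformal, rectangular spacers and the wiring polarity of the figure (the final deposited sidewalls form the dielectric spaces between wires); the α/β/γ naming anticipates the width classes discussed in the next paragraph.

```python
def saqp_wire_widths(alpha, gamma, beta, sidewall):
    """Metal wire widths produced across one top-mandrel pitch.

    alpha    : printed top-mandrel width
    gamma    : printed top-mandrel space
    beta     : first spacer thickness = bottom-mandrel width (deposited)
    sidewall : second spacer thickness = dielectric space (deposited)
    """
    return [
        ("alpha class", alpha - 2 * sidewall),              # under the former top mandrel
        ("beta class", beta),                               # where a bottom mandrel stood
        ("gamma class", gamma - 2 * beta - 2 * sidewall),   # center of the former mandrel space
        ("beta class", beta),                               # the other bottom-mandrel position
    ]

# A 128-nm top-mandrel pitch (48-nm line, 80-nm space) with 16-nm spacers
# yields a uniform 16-nm line/space grating, i.e., fourfold pitch division:
print(saqp_wire_widths(alpha=48, gamma=80, beta=16, sidewall=16))
# -> [('alpha class', 16), ('beta class', 16), ('gamma class', 16), ('beta class', 16)]
```

Because β and the final sidewall are set by deposition, only the α- and γ-class wires can be tuned by the designer through the printed top-mandrel dimensions.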

Both the bottom mandrel and the final sidewalls are deposited rather than printed, and thus the only variables in adjusting the final metal width (Fig. 1.30 again assumes the deposited sidewalls to form the dielectric space between wires) are the width and space of the top mandrel, which leads to three classes of wires: those with a width depending on the top mandrel width α, those with a width depending on the bottom mandrel width β, and those with a width depending on the top mandrel space γ. In terms of color decomposition, rather than a 1-2-1-2 dependency, as seen in LELE and SADP, SAQP forces a 2-1-2-3-2-1-2-3-2 pattern. An alternative to SAQP is a patterning technique that combines various lithography, polymer chemistry, thin film, etch, and surface engineering disciplines in what is called directed self-assembly (DSA). By choosing block copolymers of appropriate weight ratios and etch selectivity, films can be engineered that self-assemble during annealing into patterns of distinct pitch.17 By choosing a proper film thickness and providing chemical or topological guiding patterns, this phenomenon can be used for higher-order pitch division, as shown in Fig. 1.31. Even though some researchers18 are working on bidirectional SAQP layout decomposition solutions, the complexity of the associated design rules and the challenging dimensional and overlay control of the mandrel and block masks in SAQP or the guiding pattern in DSA drive most early adopters of these higher-order frequency multiplication techniques into the domain of grating-based designs. In 2011–2013, IBM and Carnegie Mellon University collaborated on a DARPA-sponsored project, “gratings of regular arrays and trim exposures” (GRATE),19 to explore the efficacy of grating-based patterning both as an extension into the next higher-order frequency multiplication domain and as a means of obtaining more-favorable cost-per-function scaling in the sub-resolution domains.

Figure 1.31 Directed self-assembly (DSA), often referred to as “frequency multiplication in a bottle,” can provide higher-order frequency multiplication for grating-like layouts. In the example shown, a 100-nm-pitch resist pattern (30-nm lines) printed with 193-nm lithography guides a block copolymer, applied over a neutral surface and then etched, into a 25-nm-pitch grating.

The primary findings of this project can be summarized as follows:
• Previous (i.e., pre-N14 node) attempts at ultra-regularized layouts suffered from the inevitable introduction of more vias.
• Splitting a single bidirectional metal into two unidirectional metals added substantially more vias.
• Adding more vias to the design hurt the yield (vias are very susceptible to defects) and performance (vias add significant resistance).
• The introduction of a local interconnect makes it possible to engineer cell architectures that effectively overcome this challenge by using these new interconnects to achieve unidirectional layout styles without adding extra metal or vias.


• The GRATE project demonstrated the structural and functional integrity of such grating-compatible designs using local interconnect levels.

Forcing the cell-level wiring onto a fixed-pitch grating, while possible and demonstrable, results in significant power and performance loss caused primarily by high-resistance power rails. A more-optimal design point uses unidirectional gratings with repeating units of varying pitches, referred to at the time as structured gratings. This finding had profound implications for the patterning techniques that are applicable to grating-based patterning: techniques, such as interference-based lithography, that can only produce a single-pitch grating were deemed inferior to techniques that can provide high-frequency gratings with some degree of pitch modulation, such as SAQP or DSA. Although grating-compatible designs were shown to be comparable to conventional designs in a commercial N14 technology node, and the design-patterning co-optimization learning from the GRATE project gave hope that higher-order frequency multiplication can provide a viable scaling path, there was no evidence to suggest overall cost savings from forcing grating-compatible designs onto mature technology nodes using conventional design practices. It was found that maintaining bidirectional cell-level wiring, even with the introduction of a local interconnect, for as long as the patterning process can afford it provides designers with layout optimization choices that yield quantifiable performance and leakage benefits.20 The largest opportunity for cost savings in advanced-technology nodes was seen in a grating-enhanced design infrastructure that exploited layout regularity as a feature rather than treating it as an impediment. Significant improvements in design efficiency and design quality were demonstrated in this project.

Figure 1.32 Comparing three layout environments for N14 memory cells: (a) memory adjacent to memory (provides good patterning results), (b) memory adjacent to random bidirectional logic (provides poor patterning results), and (c) memory adjacent to highly regularized logic (provides good patterning results).


These design benefits can be exploited to curb the escalating technology scaling cost at any advanced-technology node, not only when a lack of patterning resolution forces these extreme design restrictions. As an example, Fig. 1.32 shows the patterning improvement of regularized logic in close proximity to memory arrays. Reducing the topological difference between memory and logic creates the possibility of densely packing these different circuit types and allows DTCO to go beyond layout optimization and extend into micro-architecture optimization.21,22

Chapter 2

Multiple-Exposure-Patterning-Enhanced Digital Logic Design

To provide a very rudimentary level of design insight as part of the overall DTCO discussion, this section reviews the primary elements of a digital standard-cell design flow. Of course, a complete system-on-chip (SoC) design contains many more design elements in addition to auto-routed digital logic blocks: memory blocks, analog circuits, I/O designs, and e-fuses, to name a few. Each undergoes DTCO negotiations in a very similar fashion to the elements of the digital-logic-design flow and will be discussed in more detail in Section 3.6.1.

2.1 Introduction to Physical Design

The primary elements of a digital design flow are shown in Fig. 2.1. In this simplistic flow, the semiconductor industry is cleanly separated into “design” and “technology.” The process side communicates with the design side via a process design kit (PDK), which contains the fundamental technology description along with instructions about how to design for the technology and code to check the validity of the designs prior to release to the wafer manufacturer. The design rules are communicated in a design rule manual (DRM) and enforced with a design-rule-checking (DRC) deck coded on the design team’s electronic design automation (EDA) platform of choice. To ensure the logical correctness of the design, the PDK contains layout-versus-schematic (LVS) checking decks that ensure that the drawn shapes properly execute the desired logic functions. To close the timing on the design, the PDK contains device models and physical extraction (PEX) decks that help designers estimate the electrical performance of their designs. In return, the designers provide a design tape-out containing the final layout rendered as polygons in a variety of possible formats (GDSII, OASIS, GL1). As a point of clarification for younger readers, the term “tape-out” is commonly used to describe the hand-off of a design from the designers to the manufacturing facility and dates back to the days when this hand-off occurred with data stored on reels of magnetic tape.

Figure 2.1 Simplified diagram of a typical standard cell design flow.
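As a concrete, if greatly simplified, illustration of what a DRC deck enforces, the sketch below checks drawn rectangles against minimum-width and minimum-spacing values of the kind published in a DRM. The level name, the rule values, and the rectangle representation are illustrative assumptions, not an actual PDK.

```python
from itertools import combinations

RULES = {"M1": {"min_width": 32, "min_space": 32}}   # illustrative values in nm

def drc(level, rects, rules=RULES):
    """Check axis-aligned rectangles (x1, y1, x2, y2) on one level against
    minimum-width and minimum-space rules; return a list of violations."""
    r = rules[level]
    errors = []
    for rect in rects:
        x1, y1, x2, y2 = rect
        if min(x2 - x1, y2 - y1) < r["min_width"]:
            errors.append(("min_width", rect))
    for a, b in combinations(rects, 2):
        dx = max(a[0] - b[2], b[0] - a[2], 0)   # horizontal edge-to-edge gap
        dy = max(a[1] - b[3], b[1] - a[3], 0)   # vertical edge-to-edge gap
        gap = (dx * dx + dy * dy) ** 0.5
        if 0 < gap < r["min_space"]:            # touching shapes merge; skip them
            errors.append(("min_space", a, b))
    return errors

# A 24-nm-wide sliver and a pair of wires spaced 20 nm apart both get flagged.
print(drc("M1", [(0, 0, 24, 200), (44, 0, 76, 200)]))
```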

The design space is separated into two areas: logic design and physical design. In the logic design, the functional requirements of a circuit are captured in a register transfer level (RTL) file that is essentially a text string describing the desired logic states in every clock cycle of the circuit. This RTL file then becomes the input to the physical design, where the algorithmic logic description is converted into a polygon rendering by pulling the appropriate physical representations of desired logic functions out of a standard cell library, placing these standard cells into a logic block, and wiring them together to achieve the desired logic operations with an automatic wiring tool called a router. Throughout the physical design, the manufacturability of the rendered polygons is verified by DRC, the logical correctness of the design is checked with LVS, and the proper arrival times of signals at the desired logic nodes are verified by PEX and timing models. This book is not intended to provide a primer in synthesis, place, and route, but simply to give a high-level insight into the basic concepts of physical design to help readers better appreciate the impact of patterning constraints and see the value of DTCO.


Figure 2.2 Basic schematic of a simple CMOS inverter.

To that end, the following sections will focus on standard cell layout, placement, and routing, primarily in the context of double-patterning-driven design constraints.

2.2 The Evolution of Standard Cell Layouts

Before discussing the design impact of sub-resolution patterning in general and double patterning in specific, this section reviews some basic topological design objectives through a discussion of a few key attributes of standard cell logic layouts. As a quick refresher of the basic operation of CMOS circuits, Fig. 2.2 shows a schematic of a simple inverter. With the PMOS source connected to Vdd (i.e., high voltage) and the NMOS source connected to ground, a low input to the gates connects the output to Vdd through the PMOS transistor, and a high input connects the output to ground through the NMOS transistor, yielding a 0 output for a 1 input, and vice versa. This simple inverter highlights the basic connectivity that must be established in the logic cell: power connections to the sources, signal connections to the gates, and output connections to the drains.

2.2.1 Attributes of standard logic cells

Adding one more level of logic complexity, Fig. 2.3 shows the logic truth table, the schematic, a stick-figure pseudo-layout, and a more-realistic layout for a two-input “not and” (NAND) gate. It is easy to see how the drawn schematic executes the logic shown in the table, and it is also quite clear how the stick figure is a reasonable and compact physical rendering of the schematic (two transistors in parallel on the PMOS side, and two transistors in series on the NMOS side). The primary difference between the layouts and the schematic is that the NMOS and PMOS gates share common input connections in the center of the cell rather than being contacted individually, as shown in the schematic. The more-realistic layout rendering simply maps the connectivity established in the stick-figure rendering of the pseudo-layout onto manufacturable polygons as they would be prescribed by the design rules.


Figure 2.3 Four unique renderings of a NAND (i.e., not and) logic gate: (a) the truth table, (b) the schematic, (c) a stick-figure pseudo-layout, and (d) a fictitious but realistic layout.
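For readers who prefer code to schematics, the link between the transistor networks of Fig. 2.3(b) and the truth table of Fig. 2.3(a) can be stated in a few lines. This is a purely behavioral sketch of the pull-up/pull-down reasoning above, not a device-level model.

```python
def cmos_nand(a, b):
    """Two-input NAND evaluated from its CMOS networks: the pull-up is two
    PMOS devices in parallel, the pull-down is two NMOS devices in series."""
    pull_up_on = (a == 0) or (b == 0)      # either low input connects the output to Vdd
    pull_down_on = (a == 1) and (b == 1)   # both inputs high connect the output to GND
    assert pull_up_on != pull_down_on      # static CMOS: exactly one network conducts
    return 1 if pull_up_on else 0

# Reproduces the truth table of Fig. 2.3(a).
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", cmos_nand(a, b))
```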

A quick orientation to the layouts discussed in this chapter: only four design levels are shown in the layout in Fig. 2.3: the active area (light grey), which forms the diffusion regions that provide the source and drain for the transistor; the poly conductor (diagonal hatch, referred to herein as poly), which forms the gate of the transistor; the first level of metal (dark grey), which wires the transistors together; and contacts (white), which connect the metal to the active area and poly. Vdd and ground power are provided to the cell in power rails that run as continuous lines across the full width of the logic macro (the power rail shapes extend to the edges of the cell so that abutting cells will form continuous power rails). The “ruler,” shown to the left of the cell layout, provides a dimensionless scale normalized to the wiring pitch. This particular cell image would be referred to as a ten-track (10T) image because it is ten wiring pitches tall (the cell image may not actually contain ten wiring tracks because some wires, like the power rail, may be drawn wider and use more tracks). The power connections (also referred to as power taps) for this particular cell image are created on the diffusion shape.


Figure 2.4 Two NAND renderings illustrate the difference between good and poor pin access (light grey: diffusion; // hatch: poly; dark grey: first metal; \\ hatch: second-level metal routing tracks).

The input and output connections to the poly and diffusion are drawn larger than necessary in order to provide more connectivity options for the wiring operation that follows the placement of these logic cells. Rather than connecting to a single point on the poly, the router can drop a connection anywhere on the input pin (also referred to as a port); more connectivity options allow the router to find a legal wiring solution faster. The discussion on input connections leads to the first quality attribute of a logic cell, as shown in Fig. 2.4. Two cell images for the NAND gate are shown with diagonal-hatched bars running across them. These bars represent the tracks available to the router to connect to the input pins. In this example, the A input of the left cell has four potential connection points (i.e., intersections of the routing tracks and the input pin), and the B input has three. Figure 2.3(d) (which, to illustrate this point, uses metal power taps rather than diffusion taps) only has two access points per pin because wiring space is lost to the power taps. There are many factors in a real design that would influence a design choice; the point of Fig. 2.4 is simply that designers will try to maximize pin access, i.e., they will try to design input and output pins that intersect as many routing tracks as possible to provide the router with a large number of connection possibilities. Because the power rails supply power to an entire row of logic cells, they need to be designed to carry sufficient current without causing performance or reliability issues. To that end, Fig. 2.5 compares two cell images with different power rail styles. Figure 2.5(a) has a narrow power rail with single connection points to the diffusion; Fig. 2.5(b) uses a wider power rail to reduce resistance and improve resilience to electromigration effects. Furthermore, it uses a long local-interconnect shape that runs underneath and parallel to the power rail to allow multiple redundant contacts to establish a very fault-tolerant power connection (see the DFM discussion in Chapter 3 for more on redundant vias).
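Returning to the pin-access metric of Fig. 2.4, the quantity the designer is maximizing is simply the number of routing tracks that cross a pin. A toy version of that count, with made-up coordinates and track positions, might look as follows:

```python
def pin_access_points(pin_rects, track_ys):
    """Count the horizontal routing tracks (given by their y coordinates)
    that intersect any of a pin's rectangles (x1, y1, x2, y2); each
    intersection is a potential via landing site for the router."""
    return sum(1 for y in track_ys
               if any(y1 <= y <= y2 for _, y1, _, y2 in pin_rects))

# A tall input pin spanning four of the ten routing tracks of a 10T cell.
tracks = [64 * i for i in range(1, 11)]          # track pitch of 64 (arbitrary units)
pin_a = [(100, 100, 132, 350)]                   # one vertical M1 bar
print(pin_access_points(pin_a, tracks))          # -> 4
```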


Figure 2.5 Two NAND renderings comparing a simple power rail to a robust power rail (brick pattern: local interconnect).

In addition to incurring a higher process cost due to the need for the local interconnect, the wider power rail also takes up wiring space and leaves less room for pin access. While designers would always like to design the most robust power rail possible, different designs favor different trade-offs. Although robust power and signal connections in the logic cell are important, they are ultimately just overhead necessary to wire transistors into logic functions. The designer’s goal is to use as much of the available cell area (more specifically, the cell height) as possible to build active transistors. A major measure of the transistor’s performance is the channel width, i.e., the width of the diffusion carrying current through the gate. A wider diffusion carries more current, which, in turn, charges subsequent logic gates faster and leads to higher performance. The degree to which a cell image uses the available cell height for active diffusion width is often referred to as diffusion efficiency. Figure 2.6 contrasts the diffusion efficiency of two cell images. Figure 2.6(a), as gauged by the unitless ruler, uses 7 of the available 10 tracks in cell height for active diffusion width, yielding a diffusion efficiency of 0.7. The poly contacts were staggered in Fig. 2.6(b), perhaps due to resolution challenges, and the diffusion efficiency drops to 5 out of 10, or 0.5. For many technology generations, the goal of ASIC designers was to maintain two-thirds diffusion efficiency. In advanced-technology nodes this goal became increasingly difficult to achieve because patterning, manufacturability, and electrical constraints caused the overhead consumed by transistor wiring to take up larger portions of the cell height.


Figure 2.6 Two NAND renderings that compare good and poor diffusion efficiency (the fraction of cell height that is available to form active channels).

A major relief came in the form of FinFET, as illustrated in Fig. 2.7. In a planar device the diffusion width is simply the length of the intersection of the poly gate and the active area. By adding a third dimension to the transistor profile, the device current can now flow on the sidewalls and the top of each fin, making the diffusion width (2× the fin height plus the fin width) multiplied by the number of fins in the device. Depending on the dimensions achieved in a particular technology node, the diffusion efficiency can easily increase from 0.5 with planar devices to 1.5 with FinFET. While it is important to optimize diffusion efficiency to allow dense packing of active transistors, it is equally important to provide a range of device widths within the chosen cell image. A very wide device can generate a lot of drive current that can overcome capacitive delays of long signal runs very quickly, but in doing so it uses a lot of power. Some instances in a design achieve acceptable performance at significantly reduced power by using narrower devices. Design efficiency benefits when narrow and wide device versions of a given cell can be easily generated by an automatic synthesis tool without having to move wires and contacts, so this becomes another layout attribute of the cell. On the other hand, even the full device width of a very diffusion-efficient cell image will not be able to generate enough drive current for the most performance-critical instances in the design. To create even stronger cells, the designers resort to multi-finger devices where multiple transistors switch in parallel to allow a large drive current to pass through the gate.
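The effective-width arithmetic quoted above is easiest to appreciate with numbers. The values below are purely illustrative (no specific node is implied), but they show how a stack of fins recovers the diffusion efficiency that planar devices lose to wiring overhead:

```python
def planar_width(diffusion_height):
    """Planar device: effective width equals the drawn diffusion height."""
    return diffusion_height

def finfet_width(n_fins, fin_height, fin_width):
    """FinFET: current flows up one sidewall, across the top, and down the
    other sidewall of every fin, so W_eff = n_fins * (2*fin_height + fin_width)."""
    return n_fins * (2 * fin_height + fin_width)

cell_height = 10 * 64                          # a 10-track cell at a 64-unit track pitch
planar = planar_width(0.5 * cell_height)       # half the cell height left for diffusion
finfet = finfet_width(n_fins=8, fin_height=40, fin_width=8)
print(planar / cell_height, finfet / cell_height)   # -> 0.5 1.1  ("diffusion efficiency")
```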


Figure 2.7 (a) Planar and (b) FinFET rendering of the NAND cell. Three-dimensional devices significantly improve the achievable diffusion efficiency in a given cell image.

Figure 2.8 Three drive-strength renderings of the NAND: (a) a narrow device with low drive strength, (b) a wide device with medium drive strength, and (c) a multi-finger device with high drive strength.

A standard cell library can have more than 50 unique renderings of a NAND gate in a particular cell image to optimize power-versus-performance trade-offs; three such renderings are illustrated in Fig. 2.8. Additionally, the technology definition must support multiple cell heights to accommodate power-sensitive logic macros (which predominantly use low-drive-strength cells) as well as performance-sensitive logic macros (which predominantly use high-drive-strength cells).


2.2.2 Impact of patterning-driven design constraints

Following the explanations of the various resolution barriers that were penetrated in the semiconductor industry’s relentless pace of scaling, and the review of some of the topological design objectives that concern standard cell logic designers, Fig. 2.9 illustrates the net impact on the NAND layout from five nodes of scaling across three unique resolution domains, as previously shown in Fig. 1.1; N65 and N32 are not shown because they reside in the same resolution domain as N45. A few qualitative differences between the cell images in Fig. 2.9 stand out:
• In response to the increasing impact of corner rounding, the poly and diffusion shapes have become rectangular (no notches). This change initially forced the power connections from the diffusion (where they could be shared across the vertical cell boundaries) onto the metal and finally onto the local-interconnect levels. Similarly, signal connections to the poly have been moved from contacts that provide simple vertical connectivity to local interconnects that can also provide some degree of lateral connectivity.
• In response to increasing routability challenges stemming from more constraints on upper-level metal layers, the designers put more emphasis on pin access, resulting in larger and more spread-out pins.

Figure 2.9 NAND examples from N90, N45, N22, and N14 that show the net effect of designer responses to various scaling challenges.


• In response to the increasing impact of proximity effects, the poly was forced onto a fixed pitch, i.e., full-size dummy neighbors are designed into the layout (the only attribute distinguishing them from actual poly gate shapes is that they are not connected and do not form a functional transistor). The proximity effects gradually extended from lithography to other processes, such as etch, and then, with the advent of strained silicon as a performance boost, into device engineering. Selectively adding compressive and tensile stress layers to the transistor allowed device engineers to improve mobility in the channel and provide a performance boost when it became challenging to scale the channel length as the main means of improving device performance. Engineering and modeling these stress layers in strained silicon devices became very difficult without control over the exact dimensions of the source and drain regions. Having already inserted dummy poly for lithographic reasons, device engineers implemented “diffusion tuck under,” i.e., all source/drain regions had to terminate under a dummy poly at a fixed space to the active poly. (Section 4.1 explains how this diffusion tuck-under, in combination with a “double diffusion break,” i.e., an empty poly track to the next active transistor’s dummy poly, increased the cell width by one poly pitch.) However, this increase in width was the only scaling detractor for the NAND over the entire N130–N10 range.
• As discussed earlier, in response to diminishing diffusion efficiency, FinFET was introduced to provide a substantial boost in that scaling parameter.

While the design styles of the NAND cell shown in Fig. 2.9 clearly evolve towards increasingly restricted geometries, the primary impact on cell area was an increase in the cell width to accommodate the diffusion tuck-under and double-diffusion break. Other than this one poly pitch growth in width, the cell area scaled at the same ratio as the critical pitches. Therefore, one might assume from this simple comparison that logic scaling was largely unaffected by all of the patterning constraints over these five technology nodes. However, it is important to consider that, for example, it took a single contact layer in N90 to connect the first metal to the devices; in N14 it takes seven mask levels on four different process layers to make that connection. Additionally, the process-complexity increase in moving from planar devices with poly gates to FinFET with high-k metal gates is staggering. Ultimately, the scaling impact must be measured not only by the loss of transistor density but also by the process cost and complexity. The NAND used in these discussions is part of a class of logic cells referred to as combinatorial logic cells that execute Boolean logic functions such as AND, OR, NOR, and AOI (i.e., “and or invert”). As illustrated in Fig. 2.10, during the operation of a logic circuit a signal is sent through a series of combinatorial logic cells before it reaches a latch in which the state of the logic signal is memorized before it is launched through another set of combinatorial logic cells in the next clock cycle.


Figure 2.10 A large portion of a logic block is occupied by latches. The challenge of scaling these complex sequential logic cells is illustrated by comparing a N130 and N10 multiplexer (MUX) layout.

Because these latches contain memory as well as logic functions, they are referred to as sequential logic cells; more importantly, they tend to be the most complex layouts in a logic block. A typical logic block consists of 30–40% latches by area, so it is very important to scale these logic cells very efficiently. The bottom half of Fig. 2.10 compares the N130 and N10 node renderings of a multiplexer (MUX), a critical design element in a latch. Although a patterning engineer will marvel at the regularity exhibited by the N10 MUX layout, the scaling challenges encountered in the complex logic cells are undeniable. Figure 2.11 breaks down the cumulative scaling impact into individual steps based on the restrictions incurred by the most fundamental construct in the MUX, the poly-over-diffusion intersection that forms the transistor. Figure 2.11(a) shows three transistors as they might have been used somewhere in a complex logic cell. Dense packing is achieved by staggering the transistors vertically, which is made possible by a complex diffusion shape that provides some local wiring capability in addition to forming source/drain regions and by the freedom to put poly on a range of pitches. After corner rounding became an issue (due to the loss of diffracted orders at low k1), diffusion corners had to be moved far away from active gates, and wiring on the diffusion became very inefficient; the resulting loss of stacked devices is shown in Fig. 2.11(b).


Figure 2.11 Increasing the constraints on transistor layouts in complex logic cells: (a) densely packed transistors using diffusion routing, (b) loss of stacked devices, (c) dummy poly with diffusion tuck-under, (d) double diffusion break, (e) loss of tapered devices, and (f) introduction of fins.

As discussed earlier, restricting poly to a fixed pitch (to achieve better linewidth control at low k1) was quickly exploited by device engineers in the era of strained silicon, and diffusion tuck-under was introduced, as shown in Fig. 2.11(c). The formation of a robust isolation between separate diffusion shapes tucked under the same poly became a yield and reliability concern and led to the introduction of the double diffusion break, as seen in Fig. 2.11(d). Diffusion-shape corner rounding, especially in combination with the process complexity introduced by FinFET, eliminated the possibility of having two devices of different width share the same diffusion shape, leading to the loss of tapered devices in Fig. 2.11(e). Finally, the price to pay for the high diffusion efficiency provided by FinFET is the coarse granularity with which the device width can be controlled (to balance power/performance for a given circuit). While it was previously possible to adjust the device width in single design-grid steps (often as small as 1 nm), FinFET allows the effective device width to change only in integer increments of fins. Cumulatively, the design restrictions outlined in Fig. 2.11 cause 20–40% less area scaling on complex logic23 than would be achievable based on the pure linear scaling of critical dimensions. Although there is a clear correlation between patterning, device, and integration challenges and the resulting loss of scaling, the exact technology node at which scaling penalties were incurred in the path from N130 to N10 varies by design. Product designs focused on high performance, early yield on large chips, and extreme reliability tend to adopt design restrictions earlier than product designs competing on the basis of cost, density, and low power consumption.


Furthermore, integrated device manufacturers (IDMs) tend to negotiate design restrictions with their internal design teams more effectively than foundries competing for fabless design customers. Many leading-edge fabless design companies have acquired deeper process expertise to not only negotiate more-aggressive design rules but also assess the risk of not adopting design restrictions in time. More discussion of construct-based technology scaling is covered in Section 4.1 as part of the DTCO overview. The scaling impact assessment in this section focused primarily on the loss of area scaling and the increase in process complexity. The following sections provide a qualitative view of the increase in design complexity encountered with the introduction of double patterning.

2.3 Standard Cell Layout in the Era of Double Patterning

With the introduction of double patterning in the N14 node, designers were confronted with two-color decomposability (some felt a sense of déjà vu from the days when the semiconductor industry experimented with altPSMs). As outlined in Section 1.5, LELE double patterning requires the layout to be cleanly separable, or decomposable, into two masks. As Fig. 2.12 shows, even for a moderately complicated layout it is difficult to judge whether a particular collection of shapes is decomposable or not. After the layout is colored following simple same-color versus different-color spacing rules, as shown at the bottom of Fig. 2.12, it is easy to identify undecomposable layouts by the presence of shapes that violate the same-color-space rule. Although many academic and industry papers have been written on efficient decomposition algorithms, it is more important to create a set of design rules, checking tools, and methodologies to prevent decomposition errors. The topic of how much color information designers need to see in LELE became a rich source of material for design and patterning conference evening panel discussions. The different foundries’ marketing teams became divided between advocates of “colored” and “colorless” design flows even though the fundamental differences in these flows were minor compared to the overall complexity introduced by double patterning. Figure 2.13 shows three variants of the LELE-enhanced cell-level design flow. Providing the designers with split-level design rules, i.e., simply stating that the space between shapes on different masks is n while the space between shapes on the same mask is 2n, allows designers to create colored designs that pass DRC without any further complications, as shown in Fig. 2.13(a). To assist this split-level design methodology, EDA tool suppliers developed interactive tools that resemble real-time spell checkers and automatically color shapes as they are placed into the context of other colored shapes. In contrast to this explicitly colored design methodology, the color-aware spacing rules can be provided to an automatic decomposition tool that runs under the covers just prior to DRC; this “colorless” design flow is illustrated in Fig. 2.13(b).


Figure 2.12 Two layouts: one LELE compliant, the other not. Decomposability is nearly impossible to establish with conventional design rules.
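The split-level rules themselves are easy to state in executable form; the difficulty lies in guaranteeing that a legal color assignment exists at all. The sketch below checks a single row of parallel wires against the two spacing values of a colored LELE flow; the numeric values are placeholders, not real design rules.

```python
N_SPACE = 32        # minimum space between shapes on different masks (illustrative)
SAME_SPACE = 64     # minimum space between shapes on the same mask (2x)

def check_colored_row(wires):
    """wires: list of (x_left, width, color) for parallel wires sorted by x.
    Returns the neighbor pairs that violate the color-dependent spacing."""
    errors = []
    for (x1, w1, c1), (x2, w2, c2) in zip(wires, wires[1:]):
        space = x2 - (x1 + w1)
        limit = SAME_SPACE if c1 == c2 else N_SPACE
        if space < limit:
            errors.append(((x1, c1), (x2, c2), space))
    return errors

# Three wires at a 64-unit pitch (32-unit spaces): the 0-1-0 assignment is
# clean, but forcing the last two wires onto the same mask fails.
print(check_colored_row([(0, 32, 0), (64, 32, 1), (128, 32, 0)]))   # -> []
print(check_colored_row([(0, 32, 0), (64, 32, 1), (128, 32, 1)]))   # -> [((64, 1), (128, 1), 32)]
```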

The feedback of the automatic decomposition engine to the designers can take several forms. If the color assignment is shown to the designer and only same-color space violations or shapes that cause color conflicts are identified, the designer may be misled into believing that the particular shape or the specific space identified in the simplistic color-aware space check is the one that needs to be corrected, when in reality it is always a group of shapes that forms the conflict. Known as odd cycles, these groups of color-related shapes must be explored in their entirety to find an acceptable means of layout legalization. Whether or not to look at the colored layout is largely a designer’s personal preference, but presenting clear and actionable odd-cycle information to the designer is a major design-efficiency requirement. Similarly, whether to keep the colored shapes or revert to an uncolored design after a legal layout solution has been confirmed largely depends on the chosen color-aware placement methodology, which is discussed in Section 2.4.1. Some design flows favor certain shapes to receive a specific color, either to support a specific placement methodology or to ensure good matching between feature sizes after patterning.


Figure 2.13 Three LELE legalization design flows that ensure decomposable logic cell layouts: (a) explicit coloring by the designer using split-level design rules, (b) automatic decomposition and odd-cycle reporting, and (c) anchored decomposition on partially colored layouts.

In what some refer to as “colorless design with pre-coloring,” Fig. 2.13(c) shows the anchored automatic coloring variant of the design flow. Because the color of the anchor shapes can overconstrain the decomposition problem, the anchors must be taken into account in the two-color mapping solution and appropriately reported to the designers. Figure 2.13(c) also illustrates that, in some cases, a color conflict is resolved by stitching rather than by increasing the space between shapes. If supported by the foundry for a given design level, stitching provides a useful means of conflict resolution, but because stitched regions of the final pattern can experience higher dimensional variation, the additional constraints that come with stitching (such as restrictions on contact placement near stitches or on stitches near corners) often add significant complexity to the split-level design rules. Moving from LELE in N14 to LE3 (i.e., adding a third color in design and a corresponding third repetition of the lithography and etch process sequence) in N10 added one more level of complexity. As illustrated in Fig. 2.14, it is important to remember that the goal is not to change a 1-2-1-2 coloring scheme into a 1-2-3-1-2-3 scheme but rather to provide a means of conflict resolution, because a relative increase in the same-color space causes more odd cycles.


Figure 2.14 Because the feature pitch scales between N14 and N10 and yet the single mask resolution stays constant, the minimum same-color space effectively increases, which creates more shape interactions and more odd cycles.

Neighboring features still have to be placed on different masks, whereas the next-but-one feature can fall on the same color again. The main benefit of the third color comes into play in the tip-to-side space: because the single-mask resolution does not change from N14 to N10 and yet the designed feature pitch shrinks, the distance at which a tip-to-side space can be formed between shapes of the same color effectively grows, as shown in Fig. 2.14. All of the design flow solutions described earlier apply equally to three-color-based flows, but the added complexity of efficiently reporting non-three-color-decomposable clusters of shapes to designers, as well as the placement-efficiency benefits that can be gained from pre-colored cells in LE3 (as discussed in Section 2.4.1), favor explicitly colored design flows, as is reflected in many foundries’ N10 design rule manuals.

2.4 Two Generations of Double-Patterning-Enhanced P&R

Although place and route (P&R) is often referred to as one operation, it is actually two distinct yet co-optimized operations that involve placing the cells followed by routing to wire them together. The logic cells are first stacked together in an optimized fashion, i.e., to minimize connection distances, avoid wire congestion, and optimize performance, and then the cells are wired together by the router.


2.4.1 Color-aware placement

To ensure robust power distribution and allow sharing of wide power rails between cells, a common practice is to abut rows of cells in alternating orientations. As illustrated in Fig. 2.15, one row of cells is placed with the VDD power rail on the top edge, and the next row is placed with VDD at the bottom. A fundamental requirement to enable placement is that all cells in a logic block have the same height. However, because the width of logic cells cannot be fixed (different logic functions and different multi-finger implementations of these cells require different numbers of poly tracks), cells must be arbitrarily placeable along the horizontal cell boundaries. Finally, it is desirable but not absolutely necessary to be able to flip cells about the vertical axis as an additional knob in cell-placement optimization to shorten connection distances or relieve pin congestion. To illustrate what a digital design looks like after P&R, Fig. 2.16 shows a screen clip of a few cells’ M1 wired with two levels of router-generated wiring (M2 runs horizontally, and M3 runs vertically). The following discussion on color-aware placement, made necessary because of the introduction of LELE double patterning in the N14 technology node, is limited to the first metal shapes (i.e., M1). Other shapes in the cell layout below M1 would potentially experience similar constraints, and the boundary conditions and placement constraints discussed herein would apply equally to these levels. The simple LELE objectives of ensuring color consistency across the power rails (to allow arbitrary cell abutment on the horizontal edge) and color-clean spacing across the vertical boundaries presented nontrivial challenges to the placement methodology and introduced the need for new features in the placement tool.


Figure 2.15 Placement constraints: cells must have the same height but should be arbitrarily stackable into abutting rows.


Figure 2.16 Screen clip of the logic layout after placement and routing (M1: black, M2: //, and M3: shaded).

Figure 2.17 shows a spectrum of placement methodology options, ranging from more-restrictive boundary conditions (top of the table) to more-advanced placement tool capability (bottom of the table).24 The first solution in Fig. 2.17 relies exclusively on the placement tool to prevent post-placement color conflicts, which, due to the abutment requirements outlined earlier, is not feasible for most commonly used digital-logic design flows. The second option enforces boundary conditions at the cell level that allow post-placement color conflicts to be resolved by swapping the color assignment of shapes interior to the cell (i.e., by reversing the color assignments of pins and signal wires while preserving the color of the power rails). This placement methodology is very costly in small cells because an entire wiring track is lost to ensure color-clean spacing to the power rail regardless of the color assignment of shapes in the cell. However, this approach has some merit for taller cells and becomes more attractive in LE3, as will be shown later in this section. The third solution relies on color-aware placement, i.e., the placement tool must recognize a color conflict across the cell boundary and then either flip the cell, potentially giving up on some other optimization parameter such as pin density reduction, or give up on some placement density by inserting a fill cell in between the offending cells.

Boundary conditions | Placement challenge | Impact
Horizontal: unconstrained; Vertical: unconstrained | Add space to avoid color conflicts across both boundaries. | Placement complexity not supportable.
Horizontal: fixed color; Vertical: unconstrained, ‘same-color’ to horizontal | Arbitrary placement is possible with post-placement color flipping. | Loss of horizontal wiring tracks due to ‘same-color’ is bad for dense cells.
Horizontal: fixed color; Vertical: unconstrained, ‘anchor’ to horizontal | Avoid conflicts across vertical edges by mirroring cells or adding space. | ‘Color-aware placer’ needed; area increase ~2%.
Horizontal: fixed color; Vertical: ½ ‘same color’ | Arbitrary placement possible. | Density impact has to be managed through smart design rule choices.

Figure 2.17 A range of methodology options must be considered to manage color-clean placement in LELE.

This capability has been developed and demonstrated by the commercial EDA tool providers, but some design teams use internally developed tools with limited R&D support and may not have access to this color-aware placement capability. Highlighting the need for holistic design flow optimization, the final solution in Fig. 2.17 takes advantage of the double diffusion break (i.e., an empty poly track at the cell boundaries) adopted by some technology offerings in the N14 technology node. Even though the double diffusion break adds an additional poly track to the cell boundary to improve device performance, that extra space can also be used to manage color-clean cell placement on M1. By enforcing appropriate boundary conditions on M1, the additional space of the added poly track can be used to ensure color-agnostic placement without additional density loss. Although Fig. 2.17 offers a qualitative ranking of the proposed solutions, the optimal solution clearly depends on details of the specific design’s objectives, the design team’s access to leading-edge EDA capability, and the specific foundries’ constraints. To help visualize the reason for the continued pressure that sub-resolution scaling put on cell-level design and color-aware placement, Fig. 2.18 shows the correlation between the minimum M1 feature pitch and the single-exposure resolution limit for the N22, N14, and N10 technology nodes. The N22 node is very deliberately positioned such that the minimum feature pitch exactly matches the practical resolution limit that can be achieved with reliable yield in a single exposure.


Figure 2.18 Single-exposure resolution vs. minimum feature pitch for N22 (resolution = 1.0× the feature pitch), N14 (1.3×), and N10 (1.7×), illustrating the need for double patterning in N14 and N10 as well as the expanding color-interaction distance (relative to the minimum feature pitch) from N14 to N10.

The N14 node pushes the M1 pitch below the single-exposure resolution limit and forces the use of two interdigitated exposures in LELE. The N10 technology node further pushes the feature pitch below the resolution limit, effectively increasing the space over which two shapes have to be placed on different masks. Recall the numerical resolution limits outlined in Section 1.5: even though the N10 minimum pitch is dangerously close to the point where misalignment in LELE becomes the resolution limit, the design benefits of being able to maintain bidirectional layout styles on M1 (such as the simple combination of vertical input pins, bidirectional output pins, and horizontal power lines shown in Fig. 2.19) warrant the added process risk of keeping a litho–etch multiple-exposure-patterning solution versus moving to a more-restrictive self-aligned double patterning solution.


Figure 2.19 Two abutting logic cells in N22, N14, and N10. The increasing color-interaction distance forces third-color and cell-to-cell color interaction.

Note that LELE, i.e., two-color mapping, is still sufficient to resolve the minimum feature pitch in the N10 node. The effect of the increase in the distance over which color transitions between shapes must be ensured is shown in Fig. 2.19. In the N22 node, the last single-exposure node, shapes have to follow standard design rules (i.e., minimum width, space, area, etc.), but there are no color-dependent spaces. In the N14 node, two-color mapping is introduced and designers have to avoid odd cycles in the cell designs.


However, the relatively short color interaction range (only 1.3× the minimum feature pitch), along with the selective introduction of a double diffusion break, allowed many designs to avoid cell-to-cell color interactions. In the N10 node, the color interaction range increased to 1.7× the minimum feature pitch, causing a rapid increase in the number of odd cycles in aggressively scaled designs with bidirectional M1 and, more importantly for this particular discussion, making it impossible to avoid cell-to-cell color interactions. The introduction of a third color with the switch to LE3 provides cell designers with a means of eliminating odd cycles, as explained in the previous paragraph, and it opens up more options for color-aware placement, as shown in Fig. 2.20. Flipping a cell or adding extra space between cells by inserting a nonfunctional fill cell to avoid conflicts are still options, but with the introduction of a third color, swapping the two colors that are not used in the power rail can be used to eliminate cell-to-cell color conflicts without area growth. It is important to distinguish between cell-level two-color swapping and full three-color remapping. Full recoloring can potentially change the relative color assignment of neighboring features. In LE3, a feature’s two neighbors can either share a color or be on different colors. The former means that an overlay error always moves one shape closer at the same rate the other shape moves farther away. The latter means that both neighbors move independently.

Figure 2.20 LE3-aware placement: a cell-to-cell color conflict can be resolved by swapping the offending color, flipping the offending cell, or separating the offending neighbors.
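A placement legalizer that follows the priority order of Fig. 2.20 might be organized as sketched below. The cell representation, the boundary-conflict test, and the color encoding (0 reserved for the power rail, 1 and 2 for signal shapes) are hypothetical simplifications introduced only for this illustration; a production placer operates on real geometry and timing data.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Cell:
    name: str
    left_colors: tuple    # mask colors of the M1 shapes touching the left cell edge
    right_colors: tuple   # mask colors of the M1 shapes touching the right cell edge

    def swap_signal_colors(self):
        # Exchange the two non-rail colors (1 and 2); color 0 is reserved
        # for the power rails and must keep its assignment.
        sw = {0: 0, 1: 2, 2: 1}
        return replace(self,
                       left_colors=tuple(sw[c] for c in self.left_colors),
                       right_colors=tuple(sw[c] for c in self.right_colors))

    def mirror_y(self):
        # Flip the cell about its vertical axis.
        return replace(self, left_colors=self.right_colors,
                       right_colors=self.left_colors)

FILLER = Cell("FILL", (), ())

def edge_conflict(left, right):
    """Illustrative boundary check: abutting cell edges may not face each
    other with the same non-rail color."""
    return any(c in right.left_colors for c in left.right_colors if c != 0)

def legalize_row(cells):
    """Resolve cell-to-cell color conflicts in one placed row, in the order
    suggested by Fig. 2.20: color swap, then flip, then a filler cell."""
    out = [cells[0]]
    for cell in cells[1:]:
        if edge_conflict(out[-1], cell):
            for candidate in (cell.swap_signal_colors(), cell.mirror_y()):
                if not edge_conflict(out[-1], candidate):
                    cell = candidate
                    break
            else:
                out.append(FILLER)   # no local fix: trade one placement site for legality
        out.append(cell)
    return out

# Two NAND-like cells that clash with color 1 across the shared boundary:
row = [Cell("NAND_A", (0, 1), (0, 1)), Cell("NAND_B", (0, 1), (0, 2))]
print([c.name for c in legalize_row(row)])   # -> ['NAND_A', 'NAND_B'] (fixed by a color swap)
```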


Because this subtle difference can affect the timing corners of the design, the relative color dependence of shapes should be preserved during the post-placement color-conflict removal. Of course, to enable color-aware placement in LE3, a colored design flow has to be enabled, and the designers and the design tools must be directly exposed to the split-level design rules.

2.4.2 Color-aware routing

The development of double-patterning-enhanced routing solutions required in-depth collaboration between wafer manufacturers and the EDA industry. After the fundamental capability to comprehend and enforce color-dependent spaces or odd cycles was enabled in the router, the concern shifted to quantifying how much design impact these additional constraints would have. This section shares some LELE-aware routing results obtained early in the N14 technology node as part of an ongoing IBM–Cadence advanced-routing collaboration. To quantify the impact of LELE on a block-level design, a 10T IBM logic library was used to render an IBM floating-point unit (FPU); it was auto-routed using the Cadence Encounter digital implementation system up to M5, but only the M2 and M3 levels were routed at minimum pitch and were subjected to LELE coloring constraints. A primary measure of scaling impact is the placement density or area utilization that can be achieved, i.e., the final block-level area scaling is a function of the raw wiring pitch scaling and the amount of white space the placement operation had to leave in the logic block to enable the router to find a legal wiring solution. The most obvious experiment would be an assessment of achievable routing density with and without LELE constraints. However, the sequential nature of the placement and routing operations and the inherent complexity of routing itself require experiments to be run as sequential trials with a particular target density as an input and the achievable routing quality as an output for each trial. In this case, experiments were run at 60% to 90% placement density in 5% density increments, letting each routing trial iterate until a DRC-clean solution was achieved. To help visualize how these different densities correlate to final logic block area, Fig. 2.21 shows three of the routed blocks. Even though individual wires cannot be recognized in Fig. 2.21, the greyscale of the block clearly gets darker as the block gets smaller, indicating higher wire density. These routing trials were repeated for three levels of constraints, as shown in Fig. 2.22:
• unconstrained bidirectional routing that was DRC-clean but not LELE compatible;
• bidirectional routing constrained to be two-color mappable according to the LELE rules; and


Figure 2.21 Incrementally denser logic blocks produced by the LELE routing experiments.


Figure 2.22 Three design styles explored in the LELE routing experiments. Both the orientation and color restrictions were varied.

These routing trials were repeated for three levels of constraints, as shown in Fig. 2.22:

• unconstrained bidirectional routing that was DRC-clean but not LELE-compatible;
• bidirectional routing constrained to be two-color mappable according to the LELE rules; and
• routing constrained to be unidirectional and essentially two-color mappable by design.

Figure 2.22 also points out the different jog heights encountered in the bidirectional routing solutions. The quality metrics compared between these runs were the total wire length and the via count needed to close each individual routing solution. The results of these experiments, in terms of the wire-length and via-count increase of the LELE-constrained routing relative to the unconstrained routing, are shown in Fig. 2.23.


Figure 2.23 Data of the early LELE routing demonstrations conducted in a Cadence–IBM collaboration: (a) wire-length increase of colored layouts relative to uncolored bidirectional routing (M2–M5, FPU) versus placement density (0.6–0.85) for the unidirectional and bidirectional styles; (b) via-count increase of colored layouts relative to uncolored bidirectional routing (V1–V4, FPU) versus placement density; (c) histogram of jog heights (one track to more than six tracks) at 80% density for the uncolored and colored solutions.

An insignificant increase in the wire length appears in Fig. 2.23(a), and a substantial increase in the via count is shown in Fig. 2.23(b). Although it seems obvious why the unidirectional routing solution forces such a substantial increase in the via count (all jogs in the wire must be replaced with a step up through a via, a short length of wire in the perpendicular direction, and a jump back down through a via), it is less clear why the bidirectional colored routing solution also shows a large increase in the via count. Figure 2.23(c) sheds some light on this matter by showing a histogram of jog heights for the colored versus uncolored bidirectional routing solutions.


The coloring constraints, along with the absence of a stitch solution (see Fig. 1.26 for single-track-jog stitch concerns), eliminate the possibility of odd-track jogs (specifically, the single-track jog, which is heavily used in the unconstrained routing solution and is not available in the colored routing solution), forcing the introduction of more vias. Because vias play a major role in yield loss (see Section 3.1, part of the DFM discussion), the noticeable reduction in the via count relative to the unidirectional solution justifies the more-difficult-to-implement bidirectional colored routing solution. Furthermore, the availability of wrong-way wiring allows for the use of more-robust via bars (i.e., 2:1 via rectangles that replace the conventional square vias), which helps mitigate the impact of the via-count increase. This section highlights how LELE-aware routing has been reduced to practice, but it does not come without cost. Once again, the impact of patterning-induced scaling cannot be measured by the reduced area scaling alone; secondary effects, such as the via-count increase, are equally important.

Even as the LELE-enhanced routing solutions continued to be refined, the relentless pace of scaling forced the EDA industry to immediately tackle the next set of formidable challenges with SADP-enhanced routing for the N10 node. The patterning sequence of SADP for an M2 level is shown in Fig. 2.24. The original layout must be decomposed into wires to be formed by mandrel and wires to be formed by virtue of their neighbors being patterned by mandrel.

Figure 2.24 Illustration of the SADP patterning flow: the original layout is decomposed into two colors representing "mandrel" (the mandrel mask, aka "core") and "not mandrel." Mandrel shapes form the basis for sidewall deposition, which forms the dielectric spacers between wires at 2× the patterned resolution. A block mask (aka "cover") adds 2D details by defining more dielectric regions to reconstruct the original wiring pattern: Metal = All − (Sidewall + Block).


The wiring shapes mapped onto mandrel, along with dummy features added to support isolated, not-mandrel wires, form the basis for the actual mandrel pattern to be built on the wafer. Sidewalls deposited onto these mandrel shapes provide the 2× resolution enhancement and form the basis for the isolation patterns between the wires. A block mask adds more dielectric isolation to form line-end gaps and empty wiring tracks. The final wire pattern can be reconstructed as NOT(((mandrel expanded by the metal space) − mandrel) + block).

This complicated patterning sequence presents two unique design challenges. The first involves the complex and long-range coloring rules (a space S = 1× forces a different color, 1× < S < 3× is forbidden, and 3× < S < 5× forces the same color); fortunately, these rules are automatically enforced in track-based coloring. As shown next to the two-color-mapped layout in Fig. 2.24, if wires are constrained to inherit the color of alternating-color tracks, they automatically obey the SADP color constraints: neighboring tracks are different colors, and if a track is empty the space grows to 3× and the adjoining tracks are the same color. The downside of this track-based design solution is that arbitrary wire widths can only be enabled for an entire track, i.e., a wide power rail can be built into the track plan because it extends over the full width of the logic block. Inserting partial runs of wider wires, for example, to reduce the resistance on high-drive-strength cells, is only possible by merging three tracks into one wire, resulting in a 5×-wide wire. Although a 5×-wide wire is wider than necessary and the occupation of three wiring tracks blocks precious pin access (which contributes to wire congestion), all other wire widths would cause color conflicts with the underlying tracks.

The second challenge lies in the block mask. As highlighted in Fig. 2.24, patterning and manufacturability constraints on the block mask, such as minimum width and space constraints, must be translated into wire line-end stagger rules to ensure clean decomposability of the wiring pattern not only into "mandrel" but also into "block." The resulting neighboring-track line-end stagger rules are illustrated in Fig. 2.25. Enforcing line-end spaces on nonprojecting line-ends is hard enough, but these particular rules add the complexity of being non-monotonic: for line-ends facing each other, a large space ensures sufficient room on the block mask, and a large overlap ensures sufficient block-mask width; however, there is a range of line-end staggers between these two bounding conditions that must be prevented. Again, extensive collaboration with the EDA industry was required to implement these rules.

After these complex line-end stagger rules were implemented, another challenge in sub-resolution patterning was discovered. As shown in Fig. 2.26, the block mask is subjected to such severe corner rounding that the rectilinear approximation used in the pattern reconstruction of Fig. 2.24 is simply inadequate at the dimensions where SADP is useful.
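As a concrete illustration of the Boolean reconstruction described above, the following sketch uses the shapely polygon library (an assumption; any Boolean geometry engine would serve) with illustrative dimensions that do not represent a real track plan.

```python
from shapely.geometry import box
from shapely.ops import unary_union

pitch, width = 64, 32            # nm; metal space = pitch - width (illustrative)
space = pitch - width

design_area = box(0, 0, 10 * pitch, 4 * pitch)

# Two mandrel ("core") wires; the wires on the in-between tracks are not drawn:
# they emerge later from the sidewall spacers.
mandrel = unary_union([box(0, 1 * pitch, 10 * pitch, 1 * pitch + width),
                       box(0, 3 * pitch, 10 * pitch, 3 * pitch + width)])

# Block ("cover") shape adds dielectric to cut line-ends / empty an area.
block = box(6 * pitch, 0, 7 * pitch, 4 * pitch)

# Sidewall = (mandrel expanded by the metal space) - mandrel.
# Default rounded buffer ends are acceptable for this schematic sketch.
sidewall = mandrel.buffer(space).difference(mandrel)

# Metal = All - (Sidewall + Block), i.e., NOT((expanded mandrel - mandrel) + block).
metal = design_area.difference(unary_union([sidewall, block]))
print(f"reconstructed metal area: {metal.area:.0f} nm^2")
```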


Figure 2.25 Block-mask minimum-width and minimum-space violations must be prevented with non-trivial line-end stagger and line-end overlap rules (block: light grey; two-color-mapped wires: darker shades of grey).

Figure 2.26 Simulated block-mask image and resulting wiring shapes overlaid onto the designed wires, highlighting block patterning defects caused by the corner rounding on the block mask.


The Boolean pattern synthesis of Fig. 2.24, used with a simulated image of the severely rounded block mask, introduces yield concerns due to sharp line-end tips, end-to-end shorts, and partially printing dummy mandrel shapes, as shown in Fig. 2.26. These problems can be overcome with a double-patterned block mask25 at the cost of an additional mask and increased patterning complexity, or they can be overcome with a fundamental design-style change. By disallowing any empty wiring tracks, the pattern of Fig. 2.26 can be changed to the pattern in Fig. 2.27. The actual contacted wires are identical between the two, but the latter forces all empty spaces to be filled with non-functional dummy wires. The patterning process is fundamentally the same, including the function of the second patterning operation, which continues to add dielectric to form line-end gaps. For the sake of design and rendering, the polarity of the second mask is reversed from a block (i.e., protecting copper wires) to a cut (i.e., forming line-ends by adding dielectric). The improved image quality achieved with this "sea of wires" approach is shown in Fig. 2.27, but the additional design complexity of forcing the router to generate line-end patterns and fill wires that yield printable cut-mask patterns is formidable and a subject of intense collaboration at the time of this writing.26

The increase in patterning-induced router complexity between the N14 and N10 technology nodes, unfortunately, does not end there.

Figure 2.27 "Sea of wires" with line-end cuts: simplifying the cut-mask patterning by filling all empty tracks with dummy wires improves patterning quality but increases design complexity.

Figure 2.28 Illustration of the impact that the increased same-color space has on via placement (N14: via resolution ≈ 1.7× the feature pitch; N10: via resolution ≈ 2.3× the feature pitch). Given a via in the center position (black), many more neighboring via positions require a different color in N10 than they did in N14.

Just as the relative increase in the minimum same-color space from N14 to N10 caused more odd cycles on M1, it had a similar effect on vias connecting critical wiring levels. Figure 2.28 shows the minimum same-color via pitch relative to the wiring pitch for the N14 and N10 technology nodes. In N14, the smallest resolvable single-exposure via pitch is barely smaller than two wiring pitches, which means that two vias can be placed on wires two tracks apart without any color constraints. Only vias on directly neighboring tracks (typically not allowed by basic design rules) and vias on the diagonal, one track over and one track up, must be forced to a different color. This color constraint was easily enforceable by the router, essentially by letting the vias inherit the color of the wiring tracks to which they were connected. In the N10 node, the minimum same-color via pitch increased to slightly more than two metal pitches (again, the resolvable single-exposure via pitch stayed more or less the same as the metal pitch decreased node to node). This slight increase in the minimum same-color via pitch forced color constraints onto many more positions around any given via, as illustrated in Fig. 2.28. Forcing vias to be a different color over larger ranges increases the number of odd cycles that are created in an unconstrained layout.

The increased risk of generating via odd cycles due to the longer color-interaction range meant that the router had to learn to explicitly prevent via odd cycles. In addition to adding one more level of complexity to the router, as shown in Fig. 2.29, this additional constraint further limits the number of possible pin-access configurations, making it inherently more difficult to efficiently find a legal routing solution.
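The geometric effect described above can be reproduced with a few lines of code; the sketch below simply counts the track positions around a given via whose center-to-center distance falls below the quoted same-color via resolutions (roughly 1.7× and 2.3× the feature pitch), ignoring additional design-rule restrictions such as the ban on directly neighboring vias.

```python
import math

def constrained_offsets(resolution_in_pitches, window=3):
    """Track offsets (dx, dy) whose distance to the center via is below resolution."""
    hits = []
    for dx in range(-window, window + 1):
        for dy in range(-window, window + 1):
            if (dx, dy) == (0, 0):
                continue
            if math.hypot(dx, dy) < resolution_in_pitches:
                hits.append((dx, dy))
    return hits

for node, resolution in [("N14", 1.7), ("N10", 2.3)]:
    positions = constrained_offsets(resolution)
    print(f"{node}: {len(positions)} neighboring positions need the opposite color")
# N14 -> 8 positions (adjacent and first diagonal); N10 -> 20 positions
```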

Figure 2.29 Two pin-access scenarios (via odd cycle versus clean pin access) on the same logic cell, represented by three M1 input pins, one M1 output pin, and M2 wiring. The wiring on the left is design-rule clean, but the vias cannot be decomposed for LELE.

In short, the N14 technology node forced routers to solve a two-color-mapping problem on critical wires, a problem that, at the time, seemed challenging. The N10 node then forced color-aware placement, a colored sea-of-wires with complex cut-mask constraints, and explicit via odd-cycle removal. To say that sub-resolution patterning continues to drive the need for rapid and fundamental innovation in physical design tools is a gross understatement.

2.5 Beyond P&R: The Impact of Double Patterning on Floor Planning

To complete the discussion of patterning-inflicted design complexity, two constraints (the full effect of which, at the time of this writing, is not yet fully understood) deserve a brief mention. The increasing interaction range of proximity effects, not only from lithography but also from the etch and deposition operations that are part of the patterning process, forces the introduction of array-termination features. Through no fault of the OPC model or its convergence accuracy, there is simply no target dimension at which an adequate patterning-limited yield can be achieved across abrupt proximity and pattern-density transitions in extremely resolution-challenged technology nodes. The only effective means of protecting circuitry against catastrophic yield issues is to add a series of dummy features that ease the transition to the open areas surrounding a block of dense layout features. In the domain of higher-order frequency multiplication, these array-termination features must also replicate the track plan or pitch sequence of the main features in the array. Well-separated rectangular layout blocks easily meet this requirement by adding boundary cells around the edges of the array, as shown in Fig. 2.30, which closely resembles a practice that has been common in memory bit-cell arrays for several technology nodes.


Figure 2.30 The need for pitch-matched array-termination cells around different blocks of a design (track-plan mismatch, dense packing of macros) extends the impact of sub-resolution patterning into the floor-planning stage of chip design.

Table 2.1 A hierarchical wiring stack reuses the previous node's wiring pitches to leverage existing design rules and process learning. Scaling pushes complexity further up in the wiring stack, making it harder to escape.

Stack menu using inherited metal levels:

Node Origin   Name   Pitch     193i RET
7 nm          1      32 nm     SAQP
10 nm         1.5    48 nm     SADP
14 nm         2      64 nm     LELE
22 nm         2.5    80 nm     asymmetric OAI
32 nm         3      100 nm    OAI
45 nm         4      140 nm    OAI

Because chip real estate is very valuable and long wires slow down performance, chip floor plans pack functional units of various shapes and sizes, each potentially featuring a unique pitch or orientation, in close proximity. Matching functional blocks with compatible track plans, or enforcing a global track plan to ensure patterning-friendly array transitions, adds yet another level of complexity to sub-resolution design.

Most of this section has focused on the increase in complexity caused by pushing the critical design features deeper into the sub-resolution domain. However, the wiring stack in an integrated circuit progresses hierarchically from the minimum technology pitch, through gradually increasing pitches, to finally arrive at terminal vias that are large enough to connect to the chip package.


As scaling continues, not only is more complexity being added to the design levels at minimum pitch, but the previous node's complexity is also being pushed further up in the wiring stack, as shown in Table 2.1. Although the first technology nodes that faced the design restrictions associated with sub-resolution patterning were able to escape this complexity by moving wires further up the stack, more-advanced technology nodes face several sequential layers of complex constraints, the cumulative impact of which remains to be quantified.

Chapter 3

Design for Manufacturability

Design for manufacturability (DFM) is a term that collectively describes the rules, tools, and methodologies that act on the physical design of a chip to manipulate the on-wafer rendering of IC shapes and thereby optimize the profitability of the product: it aims to facilitate a rapid yield ramp at high performance, with cost-effective and predictable manufacturing processes, while maintaining a competitive design point. DFM addresses many diverse and technically complex facets of the design flow, as illustrated in Fig. 3.1. A few of the most important aspects of DFM are reviewed here, primarily to distinguish DFM from DTCO.

3.1 Critical Area Optimization

Critical area analysis (CAA) quantifies the sensitivity of a given layout to specific random defect signatures. In very general terms, the probability of yield loss due to random defects in a particular process step is proportional to the product of

• the size of the defect (i.e., the dimension of a particle or other contamination),
• the defect-density distribution as a function of the defect size,
• the functional impact (i.e., the failure mechanism) of the defect on a particular process step, and
• the location of the defect in the layout.

The location component of this yield assessment is captured by the concept of critical area. The critical area of a layout, for a particular failure mechanism at any given defect size, is the area in which the center of a defect can land and cause a failure. This simple concept is illustrated in Fig. 3.2 for a defect causing two neighboring wires to short together. Various commercial software solutions exist to help analyze the critical area of a chip and provide an accurate yield estimate for a given chip design.


Figure 3.1 In the broadest sense, DFM covers all aspects of "good design." The most common components typically discussed specifically as DFM are critical area optimization, recommended rules, chemical–mechanical polishing models, and lithography-friendly design.

CAA was originally developed to properly price a chip in a foundry engagement and to anticipate the wafer starts required to fill a certain demand. However, CAA can also be used to help process engineers improve a product's yield by targeting the most-critical defect sizes and failure mechanisms in their process. On the design side, the relative importance of different failure mechanisms, i.e., knowledge of whether a particular process step is more likely to cause fails through electrical shorts between wires or through electrical opens in the wires, can be communicated to the designers. The designers can then take appropriate corrective action to reduce the critical area for that failure mechanism, at the cost of increasing the critical area for a less-prevalent failure mechanism. As shown in Fig. 3.3, critical area optimization essentially allows the designer to rebalance slack in the design, i.e., within the constraints of meeting the minimum design rules, CAA helps the designer use any available margin between the designed dimensions and the minimum allowed dimensions to improve the yield. Unlike errors reported by DRC, critical area can never be eliminated entirely; it can only be optimized or minimized for a specific defect distribution. Finally, although substantial layout improvement is possible based solely on knowledge of the failure ratios (e.g., the importance of shorts versus opens versus blocked vias), the incentive for critical area optimization, i.e., the motivation to spend more time or money on the design, requires absolute yield prediction, which in turn requires detailed calibration of the defect distribution for a given process.
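As a minimal numerical sketch of how critical area feeds a yield estimate, the following fragment integrates the critical area for shorts between two long parallel wires over an assumed 1/x³ defect-size distribution and converts the result into a Poisson yield; every number is illustrative and would have to be replaced by calibrated process data.

```python
import numpy as np

wire_length_um = 1.0e4      # total parallel run length in the layout (assumed)
space_um = 0.05             # spacing between the neighboring wires (assumed)
defect_density_cm2 = 0.1    # total density of shorting defects per cm^2 (assumed)
x = np.linspace(0.03, 0.5, 2000)   # defect diameters in um (assumed observable range)
dx = x[1] - x[0]

# Normalize the assumed 1/x^3 defect-size distribution over the observable range.
pdf = x ** -3
pdf /= (pdf * dx).sum()

# A defect of diameter x shorts the wire pair if its center lands in a band of
# width (x - space) between the wires; smaller defects cannot bridge the space.
crit_area_um2 = wire_length_um * np.clip(x - space_um, 0.0, None)

avg_crit_cm2 = (crit_area_um2 * pdf * dx).sum() * 1.0e-8   # um^2 -> cm^2
mean_faults = defect_density_cm2 * avg_crit_cm2
print(f"mean faults = {mean_faults:.6f}, "
      f"short-limited yield = {np.exp(-mean_faults):.6f}")
```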


Figure 3.2 Illustration of the critical area for shorts: a defect of a given size (represented by the crosshair circle) will cause two wires (//) to short together if its center lands in the cross-hatched region. A defect further to the left or right (grey "defects") will cause an image perturbation but not a hard electrical failure.

Figure 3.3 Critical area optimization (before and after): a designer who knows the relative importance of different failure mechanisms can retune the design to reduce susceptibility to the more-prevalent failure mechanisms.


Some concerns arise with critical area optimization in advanced-technology nodes:

• there is usually not a lot of slack in the design, i.e., aggressive scaling pushes large portions of the design to the limits of the minimum allowed design rules; and
• restrictive design rules deliberately make it difficult to adjust layout shapes in small increments.

Furthermore, early in a technology node, when leading-edge designs are being finalized, the defect distributions and the relative importance of different failure mechanisms continue to fluctuate, which favors DFM approaches that address persistent, node-agnostic yield issues. One persistent issue is that vias have a finite fail probability P; replacing single-cut vias with pairs of redundant vias for any given connection reduces the fail probability to P². In addition to this significant yield enhancement against random via fails, sacrificial vias can also trap voids generated by electromigration effects. In older technology nodes, dating back to when bidirectional wiring was commonplace, post-routing via-insertion tools could achieve 70–80% via redundancy by opportunistically inserting small wire taps and extra vias, as shown in Fig. 3.4.27 In advanced-technology nodes, even if bidirectional wiring is allowed, the minimum via pitch and minimum metal-past-via enclosure rules, in addition to constraints on metal-end jogs, make it exceedingly difficult to find room to insert redundant vias or even via bars (a slightly less powerful but more compact variant of the same idea).

Another persistent DFM truth states that, regardless of the process details, a uniform distribution of wires improves yield by

• reducing the critical area,
• opening up space for redundant via insertion, and
• reducing susceptibility to process fails linked to density gradients (e.g., etch loading or polishing effects).

Figure 3.4 Redundant via insertion: after a functional routing solution is found (solid wires), tabs can be added to the ends of lines to opportunistically insert redundant vias (shown in grey, //).
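To put the single-via versus redundant-via failure probabilities into perspective, here is a back-of-the-envelope sketch assuming independent random via fails and a Poisson yield model; the numbers are purely illustrative.

```python
import math

p_fail = 1e-9     # assumed failure probability of a single via
n_vias = 1e9      # assumed number of via connections on the chip

yield_single = math.exp(-n_vias * p_fail)          # every connection is one via
yield_redundant = math.exp(-n_vias * p_fail ** 2)  # a pair fails only if both vias fail

print(f"single-cut vias:       {yield_single:.3f}")    # ~0.37
print(f"fully redundant vias:  {yield_redundant:.6f}") # ~1.000
```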


This insight motivated the development of two IBM-internal DFM applications, shown in Fig. 3.5.28 The "wire spreader" is a piece of code that optimizes the distribution of wires across all available wiring levels. Of course, eliminating an unnecessary wiring level is even better for cost and yield, but after it is established that a certain number of wiring levels is required, this tool simply ensures that all levels are equally utilized. The "wire bender" tool then evenly distributes wires across a given wiring level, essentially spreading out bunched-up wires. As with other DFM techniques, these wiring-optimization approaches also suffer from the more-restricted design environment encountered in advanced-technology nodes.

For the overall product yield, systematic yield limiters must be removed before random defect limiters become an issue. However, many of the techniques used to address systematic yield limiters, such as lithography-friendly designs in advanced-technology nodes, directly counteract approaches used to improve the random-defect-limited yield. Designers are becoming increasingly constrained in what they can do to improve a design's random defect robustness, which necessitates ever more stringent random-defect control in the advanced processes.

Figure 3.5 The wire spreader (optimized routing, before/after, M2–M5) evenly distributes wires across all available wiring levels, shown in cross-section, while the wire bender (post-routing, before/after) uniformly spreads wires across a given level to avoid wires bunching up, shown from a top-down view.


3.2 Recommended Design Rules

Recommended design rules (R-rules) are perhaps the most effective means of communicating general process know-how to the designers. In essence, R-rules simply tell designers that after they have met the minimum design requirements, as verified by DRC, they should look for opportunities to

• use a larger-than-minimum width, space, area, and overlap;
• improve the density uniformity;
• increase redundancy; and
• spread wiring evenly across mask levels.

R-rules address a variety of physical effects, some of which are poorly modeled. Although predictive models for patterning, chemical–mechanical polishing (CMP), and (to a degree) etch can achieve reasonable accuracy after experimental calibration, other yield-limiting effects, such as rapid-thermal-anneal uniformity, silicide sheet-resistance variations, variability due to angled implants, stress effects in strained silicon, feature-specific metallization effects, or charging effects, to name a few, do not have computationally efficient models that would allow circuit-level yield predictions. R-rules do not simply tell the designers to use more space if it is available; they communicate the foundry's complex process insights to the designers. Similar to CAA, there are two components to the R-rules: the relative importance of different failure mechanisms is captured in the priority ranking between different R-rules, and the interaction range of process effects is captured in the actual "recommended" rule values, i.e., the value of an R-rule is set to the point at which further rule relaxation no longer affects the yield. As illustrated in Fig. 3.6, an important concept in R-rules is that the recommended value for a rule is not a hard target that a layout must achieve; any relaxation above the minimum-design-rule value improves the yield until a point of diminishing returns is reached.

Even though R-rules started as a simple concept, their initial successes quickly led to a rapidly expanding rule set that required detailed prioritization between rules that ended up constraining each other (i.e., many different recommendations competed for the same limited layout space). In IBM's 32-nm design rule manual, the recommended-design-rule section swelled to 13 priority-1 rules, 11 priority-2 rules, 8 priority-3 rules, and 4 priority-4 rules, which prompted the implementation of internal and commercial automated layout-polishing tools based on compaction technology.29 These automated implementations of R-rules were demonstrated to achieve several percentage points of yield improvement that conventional rules neglected.30 To demonstrate that recommended rules can force layout optimization far more complex than a simple width or space increase, Fig. 3.7 shows a screenshot of a poly, diffusion, and source/drain contact optimization. All of the design shapes are interrelated (i.e., a contact cannot be moved without moving the corresponding metal), which makes this level of detailed layout optimization very difficult to achieve without automation.
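One way to picture how a prioritized R-rule set could be turned into a single figure of merit is sketched below; the rule names, values, and priority weights are hypothetical, and the credit for each rule saturates at the recommended value to reflect the diminishing returns illustrated in Fig. 3.6.

```python
RULES = {
    # name: (minimum, recommended, priority) -- values hypothetical; priority 1 is highest
    "M1.space":      (50.0, 70.0, 1),   # nm
    "V1.enclosure":  (10.0, 20.0, 2),   # nm
    "CA.redundancy": (0.0, 1.0, 1),     # fraction of vias that are redundant
}
PRIORITY_WEIGHT = {1: 4.0, 2: 2.0, 3: 1.0, 4: 0.5}   # assumed weighting

def rrule_score(measured):
    """Weighted, saturating credit for relaxation above the minimum rule values."""
    score = 0.0
    for name, (minimum, recommended, priority) in RULES.items():
        value = measured[name]
        credit = (min(value, recommended) - minimum) / (recommended - minimum)
        score += PRIORITY_WEIGHT[priority] * max(0.0, credit)
    return score

print(rrule_score({"M1.space": 60.0, "V1.enclosure": 20.0, "CA.redundancy": 0.7}))
```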


Figure 3.6 Illustration of the concept of R-rules: yield versus design-rule value (width, space, area, ...), with the minimum and recommended values marked between the acceptable, achievable, and optimal yield levels. Any design relaxation between the minimum and the recommended value will improve yield. Even if the full recommended value is not achievable, a yield improvement can be derived from a partial relaxation of high-priority R-rules.

Figure 3.7 Complex layout optimization driven by a comprehensive set of R-rules (original poly: horizontal hatch; original diffusion: diagonal hatch; original contact: dot fill; open shapes and square fill: post-optimization shapes).

Once again, the coarse-grid prescriptive design rules used in advanced-technology nodes make detailed layout tweaking nearly impossible.

3.3 Chemical Mechanical Polishing

Chemical mechanical polishing (CMP) is primarily a planarization process: it reduces the wafer topography that builds up between sequences of etch and deposition steps and provides a flat surface onto which subsequent levels can be patterned.


Figure 3.8 Hardness differences between the copper wires and the oxide insulator cause systematic thickness variations (dishing and erosion/recess) due to layout-dependent overpolishing in CMP.

As illustrated in Fig. 3.8, pattern-density effects at various length scales cause inter- and intra-shape thickness variations. In addition to introducing the risk of catastrophic yield loss due to copper pooling, the thickness variation in wires causes resistance and capacitance changes that can create significant delay variation. Although CMP has many more chemical and physical variables than lithography, its behavior can be captured in semi-empirical models. These models can then be used to

• detect "hotspots," i.e., layout configurations at risk of causing yield loss;
• optimize fill patterns in the chip to provide more-uniform CMP behavior;
• reduce pessimism in the timing rules by not protecting against unrealistically conservative delay variation; or
• interactively guide routing optimization by adding CMP planarity as an additional cost function alongside more-common optimization parameters such as wire-length and via-count reduction.

Exploiting CMP modeling for physical and electrical design optimization requires seamless integration across multiple analysis and optimization platforms, as shown in Fig. 3.9.31 The dependence on calibration-intensive models makes this approach difficult to implement early in a technology node, when processes are still being optimized. An alternative to sophisticated model-based layout optimization is a set of phenomenological checks that capture known process challenges in multi-level rule checks. For example, Fig. 3.10 illustrates a failure mode in which the dishing of a wide copper wire in the level below leads to a topography variation that causes minimum-pitch wires to short together in the metal layer above. After this failure mechanism is identified, design rules or recommended rules (depending on the severity of the yield challenge) can be written to avoid these problematic layout configurations. Alternately, after a problematic design–process interaction is identified and the relevant layout situation is found in a product design, process engineers can be alerted to its presence so that they can monitor and control the affected process steps accordingly.32
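A calibrated CMP model is beyond the scope of a short example, but the pattern-density proxy that underlies many of these checks can be sketched in a few lines; the window size and density limits below are assumptions, and a rasterized metal bitmap stands in for a real layout.

```python
import numpy as np

rng = np.random.default_rng(0)
layout = (rng.random((512, 512)) < 0.4).astype(float)   # hypothetical metal bitmap

window = 64   # raster cells per density window (assumed)
density = layout.reshape(512 // window, window,
                         512 // window, window).mean(axis=(1, 3))

dens_min, dens_max = 0.2, 0.8   # assumed density limits for dishing/erosion risk
grad = max(np.abs(np.diff(density, axis=0)).max(),
           np.abs(np.diff(density, axis=1)).max())      # largest window-to-window step

hotspots = np.argwhere((density < dens_min) | (density > dens_max))
print(f"density range {density.min():.2f}-{density.max():.2f}, "
      f"max gradient {grad:.2f}, {len(hotspots)} out-of-range windows")
```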


Figure 3.9 Virtual manufacturing prediction, i.e., semi-empirical predictive CMP models, can be used to analyze and optimize the physical and electrical behavior of a circuit layout.

Figure 3.10 Based on empirical data from the product or from dedicated test vehicles, a rules-based checking methodology can be used to identify complex yield-limiting process–design interactions, such as the bridging risk between minimum-pitch wires crossing a very wide wire in the previous design level.

3.4 Lithography-Friendly Design

Chapters 1 and 2 focused primarily on the design impact of sub-resolution patterning, potentially implying that the design community bore the brunt of the burden of sub-resolution scaling. Figure 3.11 shows that designers were not the only ones suffering as the resolution gap widened.


Figure 3.11 Quantifying the investments made in physical and computational lithography to maintain adequate image quality in the sub-resolution domain: N90 (k1 = 0.5, ~$20M/tool, 50 kCPUh/mask set), N65 (k1 = 0.44, ~$25M/tool, 100 kCPUh/mask set), N45 (k1 = 0.44, ~$35M/tool, 500 kCPUh/mask set), and N32 (k1 = 0.35, ~$45M/tool, >750 kCPUh/mask set).

Maintaining virtually equivalent image fidelity through the N90 to N32 technology nodes, as shown in the simulated image contours in Fig. 3.11, was made possible by rapidly escalating investment in exposure tools and computational lithography. Additionally, all aspects of the patterning and metrology process had to improve rapidly as the tolerable levels of process variation tightened node to node at a sustained rate of

• focus: 30% node to node,
• dose: 20% node to node,
• mask error: 20% node to node, and
• overlay: 40% node to node.

To curb the escalating cost and complexity of sub-resolution patterning, lithography-friendly design (LFD) became an official component of the resolution-enhancement roadmap in the 2003 ITRS.33 The availability of computationally efficient lithography models, driven by the necessity for model-based OPC, made it possible for designers to pre-screen their layouts for possible hotspots, i.e., local layout configurations that pass DRC but exhibit a poor lithographic process window. Although LFD can be taken to describe all aspects of patterning-driven design constraints, this section specifically reviews the opportunities and challenges associated with model-based LFD.

As illustrated in Fig. 3.12, LFD is essentially an extension of design rule checking in which geometric fail limits are checked not on the features drawn by the designer but on the predicted wafer patterns generated by the lithography modeling engine. The hotspots identified in this manner can then be reported to designers in a fashion similar to DRC violations, which is why this operation is often referred to as an optical rules check (ORC).
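The geometric core of such an optical rules check can be sketched with polygon Booleans; the example below (again assuming the shapely library, with hand-made contours and assumed fail limits) flags pinching via a morphological opening and bridging via a simple spacing test.

```python
from shapely.geometry import Polygon

min_width, min_space = 30.0, 40.0   # nm fail limits (assumed)

# Hand-made "simulated contours": contour_a necks down to a 20-nm-tall strip.
contour_a = Polygon([(0, 0), (200, 0), (200, 60), (120, 60), (120, 20),
                     (80, 20), (80, 60), (0, 60)])
contour_b = Polygon([(0, 90), (200, 90), (200, 150), (0, 150)])

# Morphological opening removes regions narrower than the width limit
# (and slightly rounds sharp corners); whatever disappears is a pinching site.
opened = contour_a.buffer(-min_width / 2).buffer(min_width / 2)
pinch_sites = contour_a.difference(opened)

# Bridging risk: minimum distance between neighboring contours below the limit.
gap = contour_a.distance(contour_b)

print(f"area removed by the {min_width}-nm width check: {pinch_sites.area:.0f} nm^2")
print(f"space to neighbor = {gap:.0f} nm, bridging hotspot: {gap < min_space}")
```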


Figure 3.12 A possible LFD implementation: the layout and a lithography model feed contour generation, the modeled pattern contours are checked against established fail limits in an optical rules check, and violating layout configurations are reported as hotspots (to a human designer, an auto-layout optimizer, or the router/post-route cleanup), prompting corrective action ranging from a surgical fix (edge shift) to local redesign (topology change); clean layouts pass through unchanged.

The nontrivial cause-and-effect relationship between patterning hotspots and the necessary layout adjustments prompted the integration of LFD checks with automatic layout-manipulation or routing-optimization solutions. Especially if a layout has already undergone R-rule-based optimization, it is unlikely that a simple surgical fix for the hotspot can be found (i.e., it is unlikely that a bridging error can be corrected by simply increasing the space in the layout; if there were room to increase the space, the designer would have taken advantage of it already). The corrective action often involves removing constraints on the particular RET solution, which could mean removing short jogs to facilitate better SRAF coverage or aligning feature edges to remove constraints on OPC. Either through trial and error or through in-depth study of the underlying patterning phenomena, the designers must acquire sufficient expertise to react efficiently to the identified hotspots.

Because the goal of LFD is to not only identify layouts that fail under nominal exposure conditions but also flag issues that occur at the corners of the allowable process window, process variation must be incorporated into the lithography modeling, as shown in Fig. 3.13. While exposure dose, defocus, and mask-size errors are not deliberately introduced in product exposures, they have to be captured in these simulations to account for the inevitable variation in these parameters even under extremely stringent process control. To maintain a statistically realistic correlation between the simulated process excursions and actual wafer yield, the different degrees of variation in these three variables must be set to represent iso-probability process excursions. For three variables, each having a positive and negative control limit, this results in 26 contours, each representing a possible and equally likely variant of the anticipated on-wafer shape. In the interest of efficiency, rather than checking each of these 26 contours for width, space, area, or intersection fails, most LFD applications use a simpler yet still reasonably accurate approximation in the form of process-variability bands (PV-bands).
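The 26 iso-probability corners can be enumerated directly: for n parameters varied simultaneously, each is offset by 1/√n of its single-parameter control limit so that every corner is equally likely, which reproduces the ±1.00, ±0.71, and ±0.58 factors of Fig. 3.13. A minimal sketch:

```python
import math
from itertools import combinations, product

PARAMS = ["focus", "dose", "mask_size"]

corners = []
for n in (1, 2, 3):
    scale = 1.0 / math.sqrt(n)                 # 1.00, ~0.71, ~0.58
    for subset in combinations(PARAMS, n):     # which parameters are excursed
        for signs in product((+1, -1), repeat=n):
            corners.append({p: s * scale for p, s in zip(subset, signs)})

print(len(corners))            # 6 + 12 + 8 = 26 equally likely corners
print(corners[0], corners[-1])
```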


Figure 3.13 PV-bands identify the dimensional extremes that a shape will assume as focus, mask size, and exposure dose fluctuate within their committed control limits. The 26 individual contours combine the three parameters at iso-probability offsets: ±1.00 of the control limit for single-parameter excursions (trials 1–6), ±0.71 for two-parameter combinations (trials 7–18), and ±0.58 for three-parameter combinations (trials 19–26); the PV-bands show the net effect of all variations.

PV-bands represent the inner and outer extremes of the feature size that a layout shape can assume over the range of committed process control. Because the PV-bands are a compilation of several simulated contours, they do not predict an actual feature shape that might occur under a specific set of conditions; they merely predict that the actual shape's contours will always fall somewhere within the PV-band. A quantitative link between a hotspot and the anticipated yield impact can be established by assessing the image quality for iso-probability PV-bands at different probabilities, as shown in Fig. 3.14. It is worth noting that, unlike critical area optimization, which provides continuous yield improvement, reacting to LFD hotspots that occur beyond the rework limit, i.e., at process deviations outside the foundry's established control limits, has no positive yield impact. The PV-bands in Fig. 3.14 also nicely illustrate the complexity of LFD: two seemingly symmetrical layout situations show significantly different variability as a result of the broader pattern context. In addition to accurate lithography models and process-control limits, the fail limits against which the PV-bands are checked have a large impact on the number of hotspots that are detected. Figure 3.15 demonstrates how a small change in the fail limit, in this case a 1-nm increase in the minimum allowable dimension of poly over diffusion, causes a large change in the number of transistor gates that are reported as hotspots.


Figure 3.14 Pinching hotspot at 3σ process variation (contours shown from −3σ to +3σ): if the statistical distribution of lithographic control parameters is known, a quantitative yield-impact assessment can be derived from knowing which iso-probability band results in a hotspot. At nominal conditions (i.e., no process variation) the contours look good, and at 3σ process variation the bridging in the contours indicates a catastrophic yield problem; however, the probability of incurring the latter, as illustrated by the normal distribution, must be taken into account when assessing the yield impact.

To limit the dependence on accurately calibrated fail limits, LFD can be used to sweep over a range of fail values and provide significant learning simply from the signature of the fail-count increase. Figure 3.16 shows such a sweep for the same failure mechanism on two different chips. Even though both designs would yield equally well, i.e., they both start to fail at the same fail limit, the shapes of the fail-count histograms differ significantly. A large step function in the failure count (black bars) simply identifies the point at which large portions of the chip will fail under the given process-control limits, whereas a tail in the failure-count distribution (grey bars) identifies hotspots, i.e., an opportunity to significantly improve yield or resilience to process variations by addressing a manageable number of rogue layout configurations.
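The sweep itself is trivial to script once the minimum PV-band dimension has been extracted at every checked site; the sketch below uses synthetic data in which a handful of rogue sites sits below a well-behaved population, producing exactly the long, shallow tail described above.

```python
import numpy as np

rng = np.random.default_rng(1)
bulk = rng.normal(loc=60.0, scale=1.5, size=100_000)   # well-behaved sites (nm)
rogues = rng.normal(loc=52.0, scale=1.0, size=40)      # a few outlier layouts (nm)
dimensions = np.concatenate([bulk, rogues])            # hypothetical measurements

for limit in range(48, 62, 2):                         # sweep the fail limit
    fails = int((dimensions < limit).sum())
    print(f"fail limit {limit} nm: {fails} failing sites")
```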

Figure 3.15 PV-bands for poly being checked for short-channel variation, i.e., the minimum width of a poly PV-band over the diffusion (not shown). Hotspots are identified by the solid markers on the gate regions. A 1-nm change in the fail limit causes a 3× increase in the number of errors reported.

Figure 3.16 Number of failures versus fail limit (arbitrary units): a long, shallow tail in the failure-count-versus-fail-limit histogram identifies hotspots, whereas a large step function simply quantifies the anticipated yield of a particular patterning process.

The fail-limit sweeping approach in LFD can also be used to compare the lithography-limited yield of two different designs in the same technology node. Figure 3.17 shows two layout clips, one from a microprocessor (MP) design and the other from an application-specific integrated circuit (ASIC) design, along with a plot of poly bridging hotspots as a function of the fail limit on the minimum poly space. Surprisingly, the data indicate that the MP design, intended to be very robust for early yield, is far more susceptible to this particular lithography hotspot than the ASIC design. The answer to the mysterious increase in poly bridging hotspots in the MP design is revealed by another interesting LFD experiment, shown in Fig. 3.18. The MP designers, very focused on robust design and following the DFM guideline that two vias are better than one, went out of their way to add redundant vias on virtually all connection points. Although redundant vias are tremendously beneficial for the via yield, they also significantly complicate the poly design by adding landing pads and line-end jogs to the poly layout. It is around these landing pads that all of the bridging hotspots are detected; removing all of the redundant vias and the associated landing pads significantly improves the hotspot-versus-fail-limit distribution. Whether the yield loss due to lithography hotspots would actually offset the yield gain afforded by the redundant vias depends on the exact details of each yield-loss mechanism and is therefore likely to change over the lifetime of a product. Similar nontrivial tradeoffs exist between LFD and CAA.34

Because detailed lithography-limited yield assessments are very data intensive, LFD is more commonly used to eliminate catastrophic hotspots, or hard fails. While less dependent on accurate yield correlation, this LFD mode still has its own challenges.


Figure 3.17 Comparing poly bridging failures in the MP and ASIC designs (number of errors reported versus space fail limit, 110–130 nm). The MP design, although intended to be very robust, shows a significantly higher number of failures even at much riskier fail limits.

Figure 3.19 shows a catastrophic pinching error in a wafer image. The fact that this type of pattern failed early in a technology-development cycle is not too surprising in itself; what makes this failure interesting is that the design had been run through detailed ORC and should have been flagged and corrected by LFD. Further analysis showed that this particular layout configuration exhibited image characteristics that were outside the range for which the patterning model was calibrated.35 Even though the optical component of the patterning model is based on first-principles calculations, the resist and etch behavior are captured by empirical models, most commonly using some form of a variable-threshold model. Offsets between the final wafer pattern and the rigorous optical simulation are compensated for by effectively varying the threshold at which the simulated intensity profile is assumed to print. The threshold offsets are calibrated against image characteristics such as maximum intensity, minimum intensity, image slope, and image curvature. In this particular case, the test patterns from which the OPC-model calibration data were collected did not adequately capture the unique image characteristics of all of the features in the actual layout. If the model predicts the wrong image behavior, not only will the OPC solution fail, but any LFD checks using the same model will also miss that hotspot.
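A simple guard against this failure mode is to test whether each checked site's image parameters fall within the coverage of the calibration set; the sketch below uses a cheap per-axis bounding-box test on synthetic data (a convex-hull test would be stricter), and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical calibration set: rows are calibration patterns, columns are the four
# image parameters (max intensity, min intensity, image slope, image curvature).
calibration = rng.normal(size=(500, 4))
margin = 0.05 * (calibration.max(axis=0) - calibration.min(axis=0))
lo = calibration.min(axis=0) - margin
hi = calibration.max(axis=0) + margin

# Hypothetical layout sites to be checked; a wider spread guarantees some outliers.
sites = rng.normal(scale=1.6, size=(1000, 4))
outside = np.any((sites < lo) | (sites > hi), axis=1)

print(f"{int(outside.sum())} of {len(sites)} sites fall outside the calibrated space "
      "and should be flagged for expert review")
```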

Figure 3.18 Comparing poly bridging fails (number of errors reported versus space fail limit, 95–115 nm) in the MP design with redundant vias (top left; black bars in the chart) and without redundant vias (top right, "deredundified"; grey bars in the chart). Redundant vias, while good for robust connectivity, substantially increase the risk of poly bridging.

Figure 3.19 An on-wafer hotspot (resist pinching in the SEM image) that slipped through ORC and eluded LFD because the calibration dataset inadequately covered the required image-parameter space (in this case: maximum intensity, minimum intensity, image slope, and image curvature); the failure point lies outside the cloud of calibration points in the model-calibration parameter space.


To safeguard against this false-negative error, all layout configurations that fall outside the calibration space of the lithography model should be flagged as errors by default (which, of course, carries the risk of flooding the user with false-positive hotspots).

For LFD to be successful as a DFM technique, in addition to accurately flagging hotspots without overwhelming the designers with too many false errors, it must also properly identify ownership of a particular failure. Eventually, the foundry should address all systematic problems, either through process improvements or through design-rule adjustments. Until model-based layout verification becomes part of the official sign-off process and is made a mandatory component of layout verification along with DRC, LVS, and extraction, the role of LFD is to identify systematic failures that can be more efficiently addressed in the design space than in the process space. Figure 3.20 shows a hard bridging error on a poly landing pad that was identified on the wafer using a process-window qualification (PWQ) procedure.36 In PWQ, a new mask is qualified by exposing a wafer with a matrix of varying dose and focus and then using die-to-die inspection, i.e., a chip-to-chip comparison, to detect hotspots that exhibit a substantially reduced process window. This particular hotspot was traced back to a fragmentation error in the OPC; essentially, the OPC setup did not evaluate the patterning error at the worst-case pinch point and ended up undercorrecting the mask image. It would have been undesirable for a designer to identify this type of issue through LFD, delay the tape-out, and possibly give up density or via redundancy to find a clean layout solution, while the foundry fixed the OPC setup in parallel.

Figure 3.20 A poly bridging hotspot found on the wafer during the process-window qualification of a new mask (the bridging defect is shown at nominal conditions and at offsets of −0.05 to −0.20 μm from nominal). The hotspot was ultimately traced back to a minor tweak of the OPC parameters (an OPC adjustment from the old 90-nm spacing to a new 112-nm spacing) and did not require any designer intervention.

Figure 3.21 Eliminating hotspots by adding complexity to the layout shapes (a layout with a hotspot is turned into improved contours but a degraded layout that is not the designer's intent) may ultimately backfire because it obscures the designer's intent from the process engineers.

Therefore, a detailed cause-and-effect analysis of a hotspot is needed, at least early in a technology node, to determine whether it should be addressed in the process or in the design.

Finally, model-based layout legalization can obscure the original design intent, making it more difficult for the foundry to find an optimal patterning solution. In the simple example shown in Fig. 3.21, LFD flagged a bridging error between two diffusion shapes, and the designer adjusted the layout pattern to yield acceptable PV-bands; in the process, however, the designer introduced notches into the layout, and the foundry has no way of knowing whether these notches serve a purpose or are, in effect, manual OPC pre-corrections. A preferable approach would have the designer specify target bands, i.e., outline shapes that identify the true fail limits of each shape based on its functional requirements.

In summary, LFD can help eliminate gross problems, i.e., problematic layouts that tend to recur in every node, early in the technology node. To be effective, all of the details of the computational lithography solution (OPC and RET) must be included, but they can usually be encrypted in the LFD deck. Because OPC models are intended to guide detailed pattern correction near the nominal process points, they are typically not very accurate at the fail limits, so LFD hotspots require "conservative disposition" or expert review. Even though the foundry will eventually fix any systematic (i.e., layout-dependent) failures, eliminating problematic layouts (to the degree possible) in the design is usually faster than process or OPC/RET updates and safer than fab-based layout manipulations. The cause-and-effect relationship in LFD is far more complex than in DRC, and thus too much reliance on LFD could hurt design efficiency; it is usually best to establish local centers of competence to help the broader design community resolve complex hotspots. Having designers alter the layout to improve simulated contours can further obscure the "design intent" (i.e., it can create more-complex layout shapes with a lot of unnecessary detail).


Although LFD, and DFM in general, feeds process information from the wafer manufacturer to the designer, the information flow in the opposite direction, often referred to as design intent, continues to be lacking. Providing more two-way communication is a primary goal of DTCO.

3.5 Prescriptive Design Rules

The desirable attributes of good DFM can be summarized as follows:

• maintain or improve design productivity and quality,
• maintain a stable design environment by shielding designers from process changes,
• minimize complexity and avoid errors by striving for the simplest possible set of design rules,
• incentivize corrective action by preserving clear boundaries between design and process responsibilities,
• eliminate excessive over-design by improving electrical model-to-silicon correlation, and
• enable comprehensive characterization of process–design interactions by eliminating rogue layouts.

One way to avoid the complications of model-based DFM techniques, which inherently require a stable, well-characterized, and accurately modeled process and add significant design complexity when not well integrated into existing design tools, is to adopt prescriptive design rules (PDRs). The idea of telling designers "what is allowed" rather than "what is forbidden" was originally proposed under the name radical design restrictions (RDRs).37 Without changing the fundamental idea, the RDR acronym was redefined to mean restrictive design rules before arriving at its more-accurate final name, PDR. As technology nodes dive deeper into the sub-resolution domain, patterning techniques become so limited in the range of features for which they provide adequate resolution that it simply becomes more efficient to use PDRs to tell the designers the few layout configurations that are allowed rather than using conventional design rules to communicate what is forbidden. However, the desirable attributes of DFM outlined earlier can be achieved for certain designs by adopting PDRs ahead of the point at which higher-order frequency multiplication forces such restrictions.38

Figure 3.22 illustrates the concept of PDRs by comparing the conventional width-dependent spacing rules introduced in N45 to their PDR counterpart. By "negotiating" a small set of high-value width–space combinations, the rules for these discrete layout situations can be tightened, because more comprehensive process optimization becomes possible for all of the layout configurations made from these rules. It is important to realize that while PDRs are often misrepresented as overburdening designers to ensure robust manufacturing, the main motivation behind PDRs is to enable competitive and efficient designs without increasing


Minimum Space (N45)