[Journal] IEEE Transactions on Microwave Theory and Techniques. Vol. 64. No 7A


JULY 2016

VOLUME 64

NUMBER 7

IETMAB

(ISSN 0018-9480)

PART I OF TWO PARTS

PAPERS

EM Theory and Analysis Techniques
A J-E Collocated WLP-FDTD Model of Wave Propagation in Isotropic Cold Plasma. Y. Fang, X.-L. Xi, J.-M. Wu, J.-F. Liu, and Y.-R. Pu
Accurate Analysis of Finite-Volume Lumped Elements in Metamaterial Absorber Design. J. W. You, J. F. Zhang, W. X. Jiang, H. F. Ma, W. Z. Cui, and T. J. Cui
Designing Optimal Surface Currents for Efficient On-Chip mm-Wave Radiators With Active Circuitry. K. Sengupta and A. Hajimiri

Passive Circuits
A Miniaturized Evanescent Mode Waveguide Filter Using RRRs. J. Y. Jin, X. Q. Lin, and Q. Xue
Balanced Wideband/Dual-Band BPFs on a Hybrid Multimode Resonator With Intrinsic Common-Mode Rejection. X. Guo, L. Zhu, and W. Wu
Compact Highly Integrated Planar Duplex Antenna for Wireless Communications. C.-X. Mao, S. Gao, Y. Wang, F. Qin, and Q.-X. Chu

Hybrid and Monolithic RF Integrated Circuits
A Broadband High-Efficiency Doherty Power Amplifier With Integrated Compensating Reactance. J. Xia, M. Yang, Y. Guo, and A. Zhu
High-Efficiency Microwave and mm-Wave Stacked Cell CMOS SOI Power Amplifiers. S. R. Helmi, J.-H. Chen, and S. Mohammadi
A 70–80-GHz SiGe Amplifier With Peak Output Power of 27.3 dBm. H.-C. Lin and G. M. Rebeiz
A SiGe Multiplier Array With Output Power of 5–8 dBm at 200–230 GHz. H.-C. Lin and G. M. Rebeiz
Octave Band Linear MMIC Amplifier With +40-dBm OIP3 for High-Reliability Space Applications. O. Silva, I. Angelov, and H. Zirath
A High IIP2 Broadband CMOS Low-Noise Amplifier With a Dual-Loop Feedback. D. Im and I.-Y. Lee
A Transformer Feedback Gm-Boosting Technique for Gain Improvement and Noise Reduction in mm-Wave Cascode LNAs. S. Guo, T. Xi, P. Gui, D. Huang, Y. Fan, and M. Morgan
Low-Phase-Noise 54-GHz Transformer-Coupled Quadrature VCO and 76-/90-GHz VCOs in 65-nm CMOS. T. Xi, S. Guo, P. Gui, D. Huang, Y. Fan, and M. Morgan
Ultra-Wideband Quasi-Circulator Implemented by Cascading Distributed Balun With Phase Cancelation Technique. S.-D. Tang, C.-M. Lin, S.-H. Hung, K.-W. Cheng, and Y.-H. Wang
Variable 360° Vector-Sum Phase Shifter With Coarse and Fine Vector Scaling. M.-M. Mohsenpour and C. E. Saavedra

Instrumentation and Measurement Techniques
A Tensor-Based Extension for the Multi-Line TRL Calibration. Y. Rolain, M. Ishteva, E. Van Nechel, and F. Ferranti
Sub-10-pW/Hz^0.5 Uncooled Micro-Bolometer With a Vacuum Micro-Package. H.-H. Yang and G. M. Rebeiz

RF Systems and Applications
A Preoptimized Peak to Average Power Ratio Pulse Shaping Filter and Its Effect on System Specifications. J. J. Sochacki
Physical Mechanism and Theoretical Foundation of Ambient RF Power Harvesting Using Zero-Bias Diodes. C. H. P. Lorenz, S. Hemour, and K. Wu
Experiments of Time-Reversed Pulse Waves for Wireless Power Transmission in an Indoor Environment. R. Ibrahim, D. Voyer, A. Bréard, J. Huillery, C. Vollaire, B. Allard, and Y. Zaatar
A 3-Gb/s Radar Signal Processor Using an IF-Correlation Technique in 90-nm CMOS. J. Li, T. Kijsanayotin, and J. F. Buckwalter
Magnetic Nanoparticle-Assisted Microwave Hyperthermia Using an Active Integrated Heat Applicator. K. Kim, T. Seo, K. Sim, and Y. Kwon
Nonlinear Optical Angle Modulation for Suppression of RF Interference. V. J. Urick, J. F. Diehl, J. D. McKinney, J. M. Singley, and C. E. Sunderman

IEEE MICROWAVE THEORY AND TECHNIQUES SOCIETY

The Microwave Theory and Techniques Society is an organization, within the framework of the IEEE, of members with principal professional interests in the field of microwave theory and techniques. All members of the IEEE are eligible for membership in the Society upon payment of the annual Society membership fee of $17.00, plus an annual subscription fee of $28.00 per year for electronic media only or $50.00 per year for electronic and print media. For information on joining, write to the IEEE at the address below. Member copies of Transactions/Journals are for personal use only.

Editors-in-Chief: DOMINIQUE SCHREURS, KU Leuven, B-3001 Leuven, Belgium; JENSHAN LIN, Univ. of Florida, Gainesville, FL 32611-6130 USA

IEEE Periodicals Transactions/Journals Department

Senior Director, Publishing Operations: FRAN ZAPPULLA
Director, Editorial Services: DAWN MELLEY
Director, Production Services: PETER M. TUOHY
Associate Director, Editorial Services: WILLIAM A. COLACCHIO
Associate Director, Information Conversion and Editorial Support: KEVIN LISANKIE
Managing Editor: MONA MITTRA
Senior Editor: CHRISTINA M. REZES

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES (ISSN 0018-9480) is published monthly by the Institute of Electrical and Electronics Engineers, Inc. Responsibility for the contents rests upon the authors and not upon the IEEE, the Society/Council, or its members. IEEE Corporate Office: 3 Park Avenue, 17th Floor, New York, NY 10016-5997. IEEE Operations Center: 445 Hoes Lane, Piscataway, NJ 08854-4141. NJ Telephone: +1 732 981 0060. Price/Publication Information: Individual copies: IEEE Members $20.00 (first copy only), nonmember $167.00 per copy. (Note: Postage and handling charge not included.) Member and nonmember subscription prices available upon request. Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee of $31.00 is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For all other copying, reprint, or republication permission, write to Copyrights and Permissions Department, IEEE Publications Administration, 445 Hoes Lane, Piscataway, NJ 08854-4141. Copyright © 2016 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Periodicals Postage Paid at New York, NY and at additional mailing offices. Postmaster: Send address changes to IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, IEEE, 445 Hoes Lane, Piscataway, NJ 08854-4141. GST Registration No. 125634188. CPC Sales Agreement #40013087. Return undeliverable Canada addresses to: Pitney Bowes IMEX, P.O. Box 4332, Stanton Rd., Toronto, ON M5W 3J4, Canada.
IEEE prohibits discrimination, harassment and bullying. For more information visit http://www.ieee.org/nondiscrimination. Printed in U.S.A.

Digital Object Identifier 10.1109/TMTT.2016.2576159

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES

A J-E Collocated WLP-FDTD Model of Wave Propagation in Isotropic Cold Plasma

Yun Fang, Xiao-Li Xi, Member, IEEE, Ji-Min Wu, Jiang-Fan Liu, and Yu-Rong Pu

Abstract— We present an unconditionally stable weighted Laguerre polynomials (WLPs)-based finite-difference time-domain (FDTD) algorithm for modeling wave propagation in isotropic cold plasma. The plasma effects contributed by electrons and collisions are modeled by current density vectors collocated with the electric field components. The factorization-splitting scheme is employed to translate the large sparse matrix equation in the conventional WLP-FDTD algorithm into two and six tri-diagonal ones for 2-D and 3-D problems, respectively. This procedure significantly improves the efficiency of the WLP-FDTD algorithm in terms of computational expenses. The stretched-coordinate perfectly matched layer with a complex-frequency-shifted factor is implemented as the absorbing boundary condition. The accuracy and efficiency of the proposed algorithm are validated by numerical examples.

Index Terms— Finite-difference time domain (FDTD), isotropic cold plasma, perfectly matched layer (PML), weighted Laguerre polynomials (WLPs).

Manuscript received October 2, 2015; revised December 31, 2015 and April 28, 2016; accepted May 18, 2016. This work was supported in part by the National Science Foundation of China under Grant 61271091, in part by the Scientific Research Program through the Shaanxi Provincial Education Department under Grant 15JK1514, and in part by the China Post-Doctoral Science Foundation under Grant 2014M562521XB and Grant 2015M582687. (Corresponding author: Xiao-Li Xi.) Y. Fang, X.-L. Xi, J.-F. Liu, and Y.-R. Pu are with the Faculty of Automation and Information Engineering, Xi’an University of Technology, Xi’an 710048, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]). J.-M. Wu is with the Department of Optical Information Engineering, Wuhan University, Wuhan 430071, China (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2572178

I. INTRODUCTION

The finite-difference time-domain (FDTD) method [1], [2] has been successfully used in the modeling of wave propagation in plasmas. The frequency-dependent FDTD, i.e., (FD)²TD, algorithm was proposed by Luebbers et al. [3] based on the recursive convolution (RC) technique [4]. Later on, many different approaches were proposed to handle the frequency dependency of the material property in the time-domain method [5]–[11]. Extensions of FDTD techniques to magnetized plasma modeling can be found in [12] and [13].

All the above-mentioned FDTD methods are explicit ones governed by the Courant–Friedrichs–Lewy (CFL) stability condition [2]. In plasma, the CFL constraint can be stricter than in free space, and the time-step size depends not only on the plasma parameters, but also on the discretization scheme [9]. Over the years, researchers have developed a variety of unconditionally stable (US) FDTD algorithms, including but not limited to the alternating direction implicit (ADI) FDTD [14], [15], the locally one-dimensional FDTD [16], and the weighted Laguerre polynomials (WLPs) FDTD [17] methods. Among them, the ADI scheme suffers from its non-divergence-free nature: its numerical dispersion error increases with increasing time step [18]. In this respect, the WLP approach is superior. However, the original WLP-FDTD method involves the solution of a large sparse matrix equation, which is numerically expensive. Recently, a factorization-splitting (FS) WLP scheme has been proposed which turns the sparse matrix into two tri-diagonal ones by adding perturbation terms [19]. This effort significantly improves the efficiency of the WLP-FDTD algorithm.

As in all other implicit algorithms, the perfectly matched layer (PML) [20], [21] implementation in the WLP-FDTD is complicated. In [22] and [23], Berenger’s split-field PML and the uniaxial anisotropic PML, respectively, were introduced to the WLP-FDTD method to terminate air boundaries. The extension of the original WLP-FDTD to plasma modeling can be found in [24], where the frequency-dependent constitutive relation was handled by the ADE technique. Mur’s first-order absorbing boundary condition was employed there. Later on, the nearly perfectly matched layer (NPML) was implemented in the ADE-WLP-FDTD algorithm by Chen et al. [25]. The maximum reflection error achieved was around −40 dB with 16 NPMLs in a 2-D homogeneous Debye medium. Recently, we applied the stretched-coordinate (SC)-based complex-frequency-shifted (CFS) PML to a 2-D ADE-WLP-FDTD algorithm for dispersive media [26], [27]. It achieved better absorption, with a maximum reflection error of −65 dB with five PMLs for a plasma medium.

In this paper, we present a new J-E collocated (JEC) WLP-FDTD algorithm for isotropic cold plasma modeling. Based on the FS technique [19], the proposed algorithm solves tri-diagonal matrix equations, thereby greatly reducing the computation cost of the WLP scheme. Formulated in an SC system [28], the PML can be easily combined with a CFS factor to improve the performance for low-frequency and evanescent wave absorption [29]–[31]. The inclusion of the CFS factor is very meaningful for the WLP scheme because the low-frequency range is exactly where an implicit algorithm can have an advantage over the conventional FDTD due to its larger time-step value.

The organization of this paper is as follows. The formulation of the proposed JEC-WLP-FDTD is derived in the SC framework for 2-D and 3-D in Sections II and III, respectively. In Section IV, a plasma slab is taken as an example to validate the correctness of the algorithm. The transmission and


reflection coefficients are compared with those of the analytical, ADE-WLP-FDTD, and conventional PLRC-FDTD results. In Section V, the SC-PML performance is investigated through a 2-D homogeneous plasma model. The improvement from the CFS factor is verified. In Section VI, the electromagnetic wave scattering by a cubic plasma object in free space is investigated.

II. 2-D FORMULATIONS

The governing equations for wave propagation in isotropic cold plasma with the SC-PML are given by

$$\tilde{\nabla} \times \mathbf{H} = \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} + \mathbf{J} \tag{1}$$

$$\tilde{\nabla} \times \mathbf{E} = -\mu_0 \frac{\partial \mathbf{H}}{\partial t} \tag{2}$$

$$\frac{d\mathbf{J}}{dt} + \upsilon \mathbf{J} = \varepsilon_0 \omega_p^2 \mathbf{E} \tag{3}$$

where $\mathbf{J}$ is the polarization current density, $\omega_p$ is the plasma frequency, and $\upsilon$ is the collision frequency. $\tilde{\nabla}$ is the curl operator in the SC system [28], given by

$$\tilde{\nabla} = \hat{x}\,\frac{1}{s_x}\frac{\partial}{\partial x} + \hat{y}\,\frac{1}{s_y}\frac{\partial}{\partial y} + \hat{z}\,\frac{1}{s_z}\frac{\partial}{\partial z}. \tag{4}$$

$s_\zeta$, $\zeta = x, y, z$ are the coordinate-stretching variables defined as

$$s_\zeta = 1 + \frac{\sigma_\zeta}{j\omega\varepsilon_0} \tag{5}$$

With the CFS factor, $s_\zeta$ is modified to [29]–[31]

$$s_\zeta = \kappa_\zeta + \frac{\sigma_\zeta}{\alpha_\zeta + j\omega\varepsilon_0}. \tag{6}$$

We introduce the following auxiliary variables:

$$\tilde{F}_{\eta\zeta} = \frac{1}{s_\zeta}\frac{\partial F_\eta}{\partial \zeta} \tag{7}$$

where $F_\eta$ are the electric and magnetic field components in the $\eta$ direction, i.e., $F = E, H$ and $\eta = x, y, z$. $\tilde{F}_{\eta\zeta}$, $\zeta \neq \eta$ are the stretched $F_\eta$ in the $\zeta$ direction. Equation (7) can be transferred to the time domain by replacing $j\omega$ with a differential operator $\partial/\partial t$, leading to

$$(\kappa_\zeta\alpha_\zeta + \sigma_\zeta)\tilde{F}_{\eta\zeta} + \kappa_\zeta\varepsilon_0\frac{\partial \tilde{F}_{\eta\zeta}}{\partial t} = \alpha_\zeta\frac{\partial F_\eta}{\partial \zeta} + \varepsilon_0\frac{\partial}{\partial t}\frac{\partial F_\eta}{\partial \zeta}. \tag{8}$$

Following Chung’s field expansion and Galerkin’s testing procedures [17], (8) is transferred to the Laguerre polynomial domain as

$$\tilde{F}_{\eta\zeta}^{q} = (\alpha_\zeta + 0.5\varepsilon_0 s)\,c_{1\zeta} D_\zeta F_\eta^{q} + \varepsilon_0 s\,c_{1\zeta}\,\bar{h}^{q-1}(F_{\eta\zeta}) \tag{9}$$

where $q$ is the order of WLPs, $s > 0$ is a time-scaling factor, and $c_{1\zeta}$ is a PML parameter-related coefficient, given by

$$c_{1\zeta} = \frac{1}{\kappa_\zeta\alpha_\zeta + \sigma_\zeta + 0.5\kappa_\zeta\varepsilon_0 s}. \tag{10}$$

$D_\zeta$ is the differential operator along the $\zeta$ direction, and $\bar{h}^{q-1}(F_{\eta\zeta})$ is the lower order sum of the fields and auxiliary variables in the WLP domain, given by

$$\bar{h}^{q-1}(F_{\eta\zeta}) = D_\zeta \sum_{k=0,\,q>0}^{q-1} F_\eta^{k} - \kappa_\zeta \sum_{k=0,\,q>0}^{q-1} \tilde{F}_{\eta\zeta}^{k}. \tag{11}$$

First, we take the 2-D TEz case to describe the procedure for deriving the WLP-FDTD algorithm. The WLP domain representation of the $E_x$ component in (1) reads

$$\varepsilon_0\left(0.5 s E_x^{q} + s\sum_{k=0,\,q>0}^{q-1} E_x^{k}\right) + J_x^{q} = \tilde{H}_{zy}^{q}. \tag{12}$$

Using (9) in (12), we get

$$E_x^{q} - c_{2y} D_y H_z^{q} = -2\sum_{k=0,\,q>0}^{q-1} E_x^{k} + 2c_{1y}\bar{h}^{q-1}(H_{zy}) - c_4 J_x^{q}. \tag{13}$$

Similarly, we have

$$E_y^{q} + c_{2x} D_x H_z^{q} = -2\sum_{k=0,\,q>0}^{q-1} E_y^{k} - 2c_{1x}\bar{h}^{q-1}(H_{zx}) - c_4 J_y^{q} \tag{14}$$

$$H_z^{q} - c_{2y}c_3 D_y E_x^{q} + c_{2x}c_3 D_x E_y^{q} = -2\sum_{k=0,\,q>0}^{q-1} H_z^{k} + 2c_{1y}c_3\bar{h}^{q-1}(E_{xy}) - 2c_{1x}c_3\bar{h}^{q-1}(E_{yx}) \tag{15}$$

where $c_{2\zeta} = c_{1\zeta}(2\alpha_\zeta/(\varepsilon_0 s) + 1)$, $c_3 = \varepsilon_0/\mu_0$, and $c_4 = 2/(\varepsilon_0 s)$. The current density components collocated with the electric fields are also written in the WLP domain as

$$\begin{aligned} J_x^{q} &= c_5 E_x^{q} - c_6 \sum_{k=0,\,q>0}^{q-1} J_x^{k} \\ J_y^{q} &= c_5 E_y^{q} - c_6 \sum_{k=0,\,q>0}^{q-1} J_y^{k} \end{aligned} \tag{16}$$

where $c_5 = \varepsilon_0\omega_p^2/(0.5s + \upsilon)$ and $c_6 = s/(0.5s + \upsilon)$ are plasma-related coefficients. Using (16) in (13) and (14), we have

$$E_x^{q} - c_{2y}c_7 D_y H_z^{q} = c_7\left(-2\sum_{k=0,\,q>0}^{q-1} E_x^{k} + 2c_{1y}\bar{h}^{q-1}(H_{zy}) + c_4 c_6 \sum_{k=0,\,q>0}^{q-1} J_x^{k}\right) \tag{17}$$

$$E_y^{q} + c_{2x}c_7 D_x H_z^{q} = c_7\left(-2\sum_{k=0,\,q>0}^{q-1} E_y^{k} - 2c_{1x}\bar{h}^{q-1}(H_{zx}) + c_4 c_6 \sum_{k=0,\,q>0}^{q-1} J_y^{k}\right) \tag{18}$$

where $c_7 = 1/(1 + c_4 c_5)$. Equations (15), (17), and (18) form the set of JEC field equations of the WLP-FDTD algorithm, which can be written in a more compact form as

$$(\mathbf{I} - \mathbf{D})\mathbf{X}^{q} = \mathbf{V}^{q-1} \tag{19}$$

where $\mathbf{X}^{q} = [E_x^{q}\ E_y^{q}\ H_z^{q}]^T$ are the unknowns on the left-hand side and $\mathbf{V}^{q-1} = [V_{Ex}^{q-1}\ V_{Ey}^{q-1}\ V_{Hz}^{q-1}]^T$ on the right-hand side are known values contributed by lower order fields. $\mathbf{I}$ is a 3 × 3 identity matrix, and $\mathbf{D}$ is the differential operator matrix, given by

$$\mathbf{D} = \begin{bmatrix} 0 & 0 & c_{2y}c_7 D_y \\ 0 & 0 & -c_{2x}c_7 D_x \\ c_{2y}c_3 D_y & -c_{2x}c_3 D_x & 0 \end{bmatrix}. \tag{20}$$
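As a quick consistency check on the coefficient definitions that accompany (10), (15), (16), and (18), the following Python sketch evaluates $c_1$ to $c_7$ and confirms numerically that substituting (16) into (13) gives (17). It is only an illustration under stated assumptions, not the authors' implementation: the names `jec_coefficients`, `ex_from_17`, and `residual_13` are hypothetical, the operator term $D_y H_z^q$ and the lower order sums are replaced by arbitrary scalars, the defaults $\kappa = 1$, $\sigma = 0$, $\alpha = 0$ describe a cell outside the PML, and the plasma and scaling values are the ones quoted in the numerical study.

```python
from dataclasses import dataclass

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
MU0 = 1.25663706212e-6   # vacuum permeability (H/m)


@dataclass(frozen=True)
class JecCoefficients:
    c1: float
    c2: float
    c3: float
    c4: float
    c5: float
    c6: float
    c7: float


def jec_coefficients(s, omega_p, nu, kappa=1.0, sigma=0.0, alpha=0.0):
    """Coefficients c1..c7 for one stretching direction.

    The defaults kappa=1, sigma=0, alpha=0 model a cell outside the PML;
    that choice is an assumption of this sketch.
    """
    c1 = 1.0 / (kappa * alpha + sigma + 0.5 * kappa * EPS0 * s)  # (10)
    c2 = c1 * (2.0 * alpha / (EPS0 * s) + 1.0)
    c3 = EPS0 / MU0
    c4 = 2.0 / (EPS0 * s)
    c5 = EPS0 * omega_p ** 2 / (0.5 * s + nu)
    c6 = s / (0.5 * s + nu)
    c7 = 1.0 / (1.0 + c4 * c5)
    return JecCoefficients(c1, c2, c3, c4, c5, c6, c7)


def ex_from_17(c, dy_hz, sum_ex, hbar_hzy, sum_jx):
    """Solve (17) for E_x^q, treating D_y H_z^q as a known scalar."""
    return c.c2 * c.c7 * dy_hz + c.c7 * (
        -2.0 * sum_ex + 2.0 * c.c1 * hbar_hzy + c.c4 * c.c6 * sum_jx
    )


def residual_13(c, ex, dy_hz, sum_ex, hbar_hzy, sum_jx):
    """Left side minus right side of (13), with J_x^q taken from (16)."""
    jx = c.c5 * ex - c.c6 * sum_jx
    rhs = -2.0 * sum_ex + 2.0 * c.c1 * hbar_hzy - c.c4 * jx
    return ex - c.c2 * dy_hz - rhs


# Plasma and time-scale values quoted in the numerical study;
# the scalar field values below are arbitrary test inputs.
coeffs = jec_coefficients(s=5.225e11, omega_p=1.80327e11, nu=2.0e11)
ex = ex_from_17(coeffs, dy_hz=0.3, sum_ex=-0.7, hbar_hzy=0.2, sum_jx=1.1)
print(abs(residual_13(coeffs, ex, 0.3, -0.7, 0.2, 1.1)) < 1e-12)
```

Outside the PML the sketch gives $c_2 = c_1 = c_4$, which is a convenient sanity check before adding graded PML parameters.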

This equation set must be solved implicitly. It is clear that the left-hand sides comprise a large sparse matrix. The traditional WLP-FDTD algorithm solves the large sparse matrix equation directly, which is computationally intensive in terms of both time and memory. To overcome this problem, we follow the FS approach in [19] to convert the sparse matrix equation into two tri-diagonal ones that can be solved efficiently using a chasing algorithm. Splitting $\mathbf{D}$ according to the directions of the differential operators, the resulting matrices $\mathbf{A}$ and $\mathbf{B}$ have two nonzero elements each, given by

$$a_{23} = -c_{2x}c_7 D_x, \quad a_{32} = -c_{2x}c_3 D_x \tag{21}$$

$$b_{13} = c_{2y}c_7 D_y, \quad b_{31} = c_{2y}c_3 D_y. \tag{22}$$

Equation (19) becomes

$$(\mathbf{I} - \mathbf{A} - \mathbf{B})\mathbf{X}^{q} = \mathbf{V}^{q-1}. \tag{23}$$

Adding a perturbation term $\mathbf{A}\mathbf{B}(\mathbf{X}^{q} - \mathbf{V}^{q-1})$, and introducing an intermediate variable $\mathbf{Y}^{q} = (\mathbf{I} - \mathbf{B})\mathbf{X}^{q} + \mathbf{B}\mathbf{V}^{q-1}$, (23) can be split into two equations as follows:

$$\begin{cases} (\mathbf{I} - \mathbf{A})\mathbf{Y}^{q} = (\mathbf{I} + \mathbf{B})\mathbf{V}^{q-1} \\ (\mathbf{I} - \mathbf{B})\mathbf{X}^{q} = \mathbf{Y}^{q} - \mathbf{B}\mathbf{V}^{q-1} \end{cases} \tag{24}$$

where $\mathbf{Y}^{q} = [E_x^{*q}\ E_y^{*q}\ H_z^{*q}]^T$. Expanding (24) leads to

$$\begin{aligned} E_x^{*q} &= V_{Ex}^{q-1} + b_{13}V_{Hz}^{q-1} \\ E_y^{*q} - a_{23}H_z^{*q} &= V_{Ey}^{q-1} \\ -a_{32}E_y^{*q} + H_z^{*q} &= b_{31}V_{Ex}^{q-1} + V_{Hz}^{q-1} \\ E_x^{q} - b_{13}H_z^{q} &= E_x^{*q} - b_{13}V_{Hz}^{q-1} \\ E_y^{q} &= E_y^{*q} \\ -b_{31}E_x^{q} + H_z^{q} &= H_z^{*q} - b_{31}V_{Ex}^{q-1}. \end{aligned} \tag{25}$$

Eliminating the intermediate variables, we get the final WLP field equations

$$(1 - a_{23}a_{32})E_y^{q} = V_{Ey}^{q-1} + a_{23}b_{31}V_{Ex}^{q-1} + a_{23}V_{Hz}^{q-1} \tag{26}$$

$$(1 - b_{13}b_{31})E_x^{q} = V_{Ex}^{q-1} + b_{13}V_{Hz}^{q-1} + b_{13}a_{32}E_y^{q} \tag{27}$$

$$H_z^{q} = a_{32}E_y^{q} + b_{31}E_x^{q} + V_{Hz}^{q-1}. \tag{28}$$

Equations (26)–(28) can be discretized using Yee’s central difference scheme. The left-hand side coefficients of (26) and (27) will then form tri-diagonal matrices because of the second-order finite differencing operations in the spatial domain. The right-hand side coefficients are known values if we solve (26)–(28) sequentially. With known $E_y^{q}$ and $E_x^{q}$, the collocated $J_y^{q}$ and $J_x^{q}$ can be calculated explicitly using (16). The time-domain fields can finally be reconstructed following [17] with a certain order of WLP fields.

III. 3-D FORMULATIONS

In this section, we extend the WLP-FDTD algorithm to a full 3-D model. Following a similar process as in the 2-D case, the WLP expressions of the electromagnetic components can be derived. For example, the implicit representation of the $E_x$ component is given by

$$E_x^{q} + c_{2z}D_z H_y^{q} - c_{2y}D_y H_z^{q} = -2\sum_{k=0,\,q>0}^{q-1} E_x^{k} + 2c_{1y}\bar{h}^{q-1}(H_{zy}) - 2c_{1z}\bar{h}^{q-1}(H_{yz}) - c_4 J_x^{q}. \tag{29}$$

Using the current density $J_x^{q}$ in (16) in (29), we have

$$E_x^{q} + c_{2z}c_7 D_z H_y^{q} - c_{2y}c_7 D_y H_z^{q} = c_7\left(-2\sum_{k=0,\,q>0}^{q-1} E_x^{k} + 2c_{1y}\bar{h}^{q-1}(H_{zy}) - 2c_{1z}\bar{h}^{q-1}(H_{yz}) + c_4 c_6\sum_{k=0,\,q>0}^{q-1} J_x^{k}\right). \tag{30}$$

The other five field components have similar forms and are omitted for the sake of brevity. The complete set of the JEC field equations of the WLP-FDTD algorithm can be written more compactly as

$$(\mathbf{I} - \mathbf{D})\mathbf{X}^{q} = \mathbf{V}^{q-1} \tag{31}$$

where $\mathbf{X}^{q} = [E_x^{q}\ E_y^{q}\ E_z^{q}\ H_x^{q}\ H_y^{q}\ H_z^{q}]^T$ are the unknowns and $\mathbf{V}^{q-1} = [V_{Ex}^{q-1}\ V_{Ey}^{q-1}\ V_{Ez}^{q-1}\ V_{Hx}^{q-1}\ V_{Hy}^{q-1}\ V_{Hz}^{q-1}]^T$ on the right-hand side are known values contributed by lower order fields. $\mathbf{I}$ is a 6 × 6 identity matrix, and $\mathbf{D}$ is the differential operator matrix, given by

$$\mathbf{D} = \begin{bmatrix} 0 & 0 & 0 & 0 & a_{15} & b_{16} \\ 0 & 0 & 0 & b_{24} & 0 & a_{26} \\ 0 & 0 & 0 & a_{34} & b_{35} & 0 \\ 0 & b_{42} & a_{43} & 0 & 0 & 0 \\ a_{51} & 0 & b_{53} & 0 & 0 & 0 \\ b_{61} & a_{62} & 0 & 0 & 0 & 0 \end{bmatrix} \tag{32}$$

where

$$\begin{aligned} a_{15} &= -c_{2z}c_7 D_z, & a_{26} &= -c_{2x}c_7 D_x, & a_{34} &= -c_{2y}c_7 D_y \\ a_{43} &= -c_3 c_{2y} D_y, & a_{51} &= -c_3 c_{2z} D_z, & a_{62} &= -c_3 c_{2x} D_x \\ b_{16} &= c_{2y}c_7 D_y, & b_{24} &= c_{2z}c_7 D_z, & b_{35} &= c_{2x}c_7 D_x \\ b_{42} &= c_3 c_{2z} D_z, & b_{53} &= c_3 c_{2x} D_x, & b_{61} &= c_3 c_{2y} D_y. \end{aligned} \tag{33}$$
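The algebra behind the 2-D splitting in (23)–(28) can be checked with a small numerical experiment. The Python sketch below is an illustration only: it assumes the operator entries $a_{23}$, $a_{32}$, $b_{13}$, $b_{31}$ are replaced by plain scalars (which commute, unlike the discretized operators), and the helper names are hypothetical. It solves (26)–(28) in sequence and then verifies two identities: the result satisfies the factored system $(\mathbf{I}-\mathbf{A})(\mathbf{I}-\mathbf{B})\mathbf{X}^q = (\mathbf{I}+\mathbf{A}\mathbf{B})\mathbf{V}^{q-1}$, and its residual in the unsplit equation (23) equals the perturbation term $-\mathbf{A}\mathbf{B}(\mathbf{X}^q - \mathbf{V}^{q-1})$.

```python
def matvec(m, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]


def matmul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def split_matrices(a23, a32, b13, b31):
    """A holds the x-directed entries of D in (20), B the y-directed ones."""
    a = [[0.0, 0.0, 0.0], [0.0, 0.0, a23], [0.0, a32, 0.0]]
    b = [[0.0, 0.0, b13], [0.0, 0.0, 0.0], [b31, 0.0, 0.0]]
    return a, b


def fs_solve_2d(a23, a32, b13, b31, v):
    """Sequential solution of (26), (27), (28) for scalar stand-ins."""
    v_ex, v_ey, v_hz = v
    ey = (v_ey + a23 * b31 * v_ex + a23 * v_hz) / (1.0 - a23 * a32)  # (26)
    ex = (v_ex + b13 * v_hz + b13 * a32 * ey) / (1.0 - b13 * b31)    # (27)
    hz = a32 * ey + b31 * ex + v_hz                                  # (28)
    return [ex, ey, hz]


def fs_residuals(a23, a32, b13, b31, v):
    """Return (factored-system error, perturbation-identity error)."""
    a, b = split_matrices(a23, a32, b13, b31)
    x = fs_solve_2d(a23, a32, b13, b31, v)
    ab = matmul(a, b)
    eye = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]
    i_minus_a = [[eye[r][c] - a[r][c] for c in range(3)] for r in range(3)]
    i_minus_b = [[eye[r][c] - b[r][c] for c in range(3)] for r in range(3)]
    lhs = matvec(matmul(i_minus_a, i_minus_b), x)
    abv = matvec(ab, v)
    factored = max(abs(lhs[r] - (v[r] + abv[r])) for r in range(3))
    ax, bx = matvec(a, x), matvec(b, x)
    unsplit = [x[r] - ax[r] - bx[r] - v[r] for r in range(3)]
    pert = matvec(ab, [x[r] - v[r] for r in range(3)])
    identity_err = max(abs(unsplit[r] + pert[r]) for r in range(3))
    return factored, identity_err


# Arbitrary scalar stand-ins for the operators and right-hand side.
print(fs_residuals(-0.3, -0.2, 0.4, 0.25, [1.0, -2.0, 0.5]))
```

For these 3 × 3 matrices the product $\mathbf{A}\mathbf{B}$ has a single nonzero entry, $a_{23}b_{31}$ in the second row and first column, so in this scalar picture the perturbation enters only the $E_y$ equation.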

Fig. 1. Simulation model for plane wave traveling in plasma.

Fig. 2. Magnitude of the transmission and reflection coefficients.

Fig. 3. Phase of the transmission and reflection coefficients.

Using the same manipulation as in the 2-D case, we get the following updated equations:

$$(1 - a_{15}a_{51})E_x^{*q} = V_{Ex}^{q-1} + a_{15}b_{53}V_{Ez}^{q-1} + a_{15}V_{Hy}^{q-1} + b_{16}V_{Hz}^{q-1} \tag{34}$$

$$(1 - a_{26}a_{62})E_y^{*q} = V_{Ey}^{q-1} + a_{26}b_{61}V_{Ex}^{q-1} + a_{26}V_{Hz}^{q-1} + b_{24}V_{Hx}^{q-1} \tag{35}$$

$$(1 - a_{34}a_{43})E_z^{*q} = V_{Ez}^{q-1} + a_{34}b_{42}V_{Ey}^{q-1} + a_{34}V_{Hx}^{q-1} + b_{35}V_{Hy}^{q-1} \tag{36}$$

$$(1 - b_{16}b_{61})E_x^{q} = E_x^{*q} + b_{16}a_{62}E_y^{*q} \tag{37}$$

$$(1 - b_{24}b_{42})E_y^{q} = E_y^{*q} + b_{24}a_{43}E_z^{*q} \tag{38}$$

$$(1 - b_{35}b_{53})E_z^{q} = E_z^{*q} + b_{35}a_{51}E_x^{*q} \tag{39}$$

$$H_x^{q} = c_3 c_{2z} D_z E_y^{q} - c_3 c_{2y} D_y E_z^{*q} + V_{Hx}^{q-1} \tag{40}$$

$$H_y^{q} = c_3 c_{2x} D_x E_z^{q} - c_3 c_{2z} D_z E_x^{*q} + V_{Hy}^{q-1} \tag{41}$$

$$H_z^{q} = c_3 c_{2y} D_y E_x^{q} - c_3 c_{2x} D_x E_y^{*q} + V_{Hz}^{q-1}. \tag{42}$$
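Each of (34)–(39) has the form $(1 - ab)u = r$, where $ab$ is a product of two first-order difference operators along the same axis, so every grid line yields a tri-diagonal system. The Python sketch below shows one way such a system could be solved with the Thomas (chasing) algorithm. It is a simplified stand-in, not the paper's discretization: it assumes a uniform coefficient `gamma` (playing the role of a product such as $c_{2x}^2 c_3 c_7/\Delta x^2$), the standard three-point second difference, and zero field values beyond both ends of the line. The function names are hypothetical.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tri-diagonal system by forward elimination and back substitution.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def fs_tridiagonal(n, gamma):
    """Bands of (1 - gamma * D2), with D2 the [1, -2, 1] second difference."""
    lower = [0.0] + [-gamma] * (n - 1)
    diag = [1.0 + 2.0 * gamma] * n
    upper = [-gamma] * (n - 1) + [0.0]
    return lower, diag, upper


def apply_tridiagonal(lower, diag, upper, x):
    """Multiply the tri-diagonal matrix by x (used to check the solve)."""
    n = len(x)
    out = []
    for i in range(n):
        value = diag[i] * x[i]
        if i > 0:
            value += lower[i] * x[i - 1]
        if i < n - 1:
            value += upper[i] * x[i + 1]
        out.append(value)
    return out


# Illustrative line of six unknowns with an arbitrary right-hand side.
lower, diag, upper = fs_tridiagonal(6, gamma=2.5)
rhs = [1.0, 0.0, 0.0, 2.0, 0.0, -1.0]
solution = thomas_solve(lower, diag, upper, rhs)
print(solution)
```

Because the diagonal entry $1 + 2\gamma$ exceeds the sum of the off-diagonal magnitudes for any $\gamma > 0$, the elimination needs no pivoting, and the cost per line grows linearly with the number of unknowns.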

The first six equations can be translated into six tri-diagonal matrix equations following Yee’s discretization scheme, which includes solving for three nonphysical intermediate variables, ∗q ∗q ∗q i.e., E x , E y , and E z . The last three equations of magnetic field components are solved explicitly after these intermediate variables and the electric fields are known. In the following sections, three numerical examples are taken to verify the accuracy and efficiency of the proposed JEC-WLP-FDTD method. The simulations are performed on a desktop with Intel(R) i7-4790 Core of 3.60 GHz and 8-GB RAM. IV. N UMERICAL S TUDY First, we calculate the transmission and reflection coefficients of a plasma slab and validate the results through comparison with the analytical solution using the scattering matrix method (SMM) [32]. It is worth noting that the SMM takes into account the thickness of the plasma slab and the multiple reflections at the two interfaces of the slab are captured. Two numerical methods, i.e., the conventional FDTD method with PLRC technique and the ADE-WLP-FDTD method, are also taken as references to compare the computation cost and accuracy. The simulation model is shown in Fig. 1. The computational domain is discretized into 100×50 lattices. A 6-mm-thick plasma slab occupies the 30th–69th grids along the x direction. The plasma parameters are

$\omega_p = 1.80327 \times 10^{11}$ rad/s and $\upsilon = 2 \times 10^{11}$ rad/s. A plane wave excited from the left side of the plasma slab has the shape of a differential Gaussian pulse given by

$$I_y(t) = \frac{t - t_0}{\tau}\exp\!\left(-\frac{(t - t_0)^2}{\tau^2}\right) \tag{43}$$

where $t_0 = 0.05$ ns and $\tau = 0.01$ ns. Each of the four boundaries of the computational domain is terminated with ten PML cells. The PML parameters are scaled as follows [31]:

\begin{align}
\sigma_\zeta &= \sigma_{\zeta\max}\,|\zeta - \zeta_0|^m / d^m \tag{44}\\
\sigma_{\text{opt}} &= (m + 1)/(150\pi\,\Delta\zeta) \tag{45}\\
\kappa_\zeta &= 1 + (\kappa_{\zeta\max} - 1)\,|\zeta - \zeta_0|^m / d^m \tag{46}
\end{align}

where $\zeta = x, y$, $\zeta_0$ represents the interface between the FDTD and PML grids, $d$ is the thickness of the PML, and $m = 4$ is a constant. We choose the order of the WLPs to be $q = 90$ (guidance on selecting this parameter can be found in [33]) and a timescale factor of $s = 5.225 \times 10^{11}$. To balance computational burden against precision, we choose the spatial step $\Delta x = \Delta y = 150$ μm. The time step of the JEC-WLP algorithm is set at $\Delta t = 3.54$ ps, corresponding to a CFL number of 10, to show the unconditional stability of the proposed method. The total time duration is $T_f = 0.425$ ns. Two observation points are located on the left and right boundaries of the plasma slab. The electric field values are transformed to the frequency domain to calculate the reflection and transmission coefficients. Figs. 2 and 3 show the magnitude and phase of the coefficients, respectively. The analytical, ADE-WLP-FDTD,


Fig. 4. Magnitude error between the analytical and numerical reflection coefficients.

Fig. 5. Magnitude error between the analytical and numerical transmission coefficients.

Fig. 6. Transient electric fields of the reflection wave.

Fig. 7. Transient electric fields of the transmission wave.

Fig. 8. Splitting error of the reflection wave.

TABLE I. COMPUTATION COST COMPARISON FOR THE FDTD ALGORITHMS

and PLRC-FDTD results are included for comparison. Excellent agreement is observed in both figures. Fig. 4 plots the magnitude errors between the analytical and the two numerical reflection coefficients, while Fig. 5 plots those of the transmission coefficients. The magnitude errors of the reflection coefficients of the two methods are very close; however, the JEC-WLP method achieves slightly better accuracy in the magnitude of the transmission coefficient than the PLRC-FDTD method, especially in the high-frequency range. Note that the PLRC simulation takes a much smaller time step of 0.25 ps in order to obtain a stable solution, and it runs for 1698 steps. The computation costs of the three numerical algorithms are listed in Table I. The total CPU time of the JEC-WLP algorithm is greatly reduced compared with that of the PLRC algorithm. The ADE-WLP algorithm is computationally more expensive than the JEC-WLP algorithm because it requires solving a huge sparse matrix equation. We can also see that the CPU time is almost the same for the JEC-WLP algorithm with different time step sizes. This is expected because the time step in a WLP algorithm plays the role of a time-domain sampling interval; it is not linked to the number of iterations. The memory cost of the JEC-WLP algorithm is higher than that of the PLRC algorithm because of the three extra variables, i.e., $V_{Ey}^{q-1}$, $V_{Ex}^{q-1}$, and $V_{Hz}^{q-1}$, required to store the lower order sums at each grid point.

Now we analyze the error introduced by the FS process. Figs. 6 and 7 show the y components of the transient electric fields recorded on the left and right boundaries of the plasma slab with different FDTD methods. It can be seen that the results agree very well. Figs. 8 and 9 display the splitting errors of the reflection and transmission waves, defined as


Fig. 9. Splitting error of the transmission wave.

Fig. 10. Simulation model for a point source radiation in homogeneous plasma.

Fig. 11. Transient magnetic fields for the two FDTD methods.

the differences between the conventional WLP-FDTD and the proposed FS JEC-WLP-FDTD algorithms. It is seen that the splitting errors at the observation points are on the order of $10^{-14}$ with respect to the field strength.

V. PML PERFORMANCE STUDY

As a second example, we analyze the SC-PML performance with the CFS factor. We simulate a homogeneous plasma model with $\omega_p = 5.64 \times 10^{10}$ rad/s and $\upsilon = 2 \times 10^{10}$ rad/s. As shown in Fig. 10, the computational domain, including ten layers of PML, is divided into 50 × 50 cells with a uniform size of $\Delta x = \Delta y = 0.95238$ mm. The time step is set at 1.587 ps to facilitate comparison with the conventional FDTD method. The total time duration is 0.7935 ns. A sinusoidally modulated Gaussian pulse given by

$$I_z(t) = \sin\!\big(2\pi f (t - t_0)\big)\exp\!\left(-\frac{(t - t_0)^2}{\tau^2}\right) \tag{47}$$

is excited at the center of the simulation domain with $t_0 = 144$ ps, $\tau = 48$ ps, and $f = 10.5$ GHz. The order of the WLPs is 250, and the timescale factor is set to $s = 1.15 \times 10^{12}$. Fig. 11 shows the transient $H_z$ component at the observation point located at the corner of the plasma area, compared with the conventional PLRC-FDTD solution. The good agreement between the two methods verifies the proposed solution. We calculate the PML reflection error [22] using

$$R_{\text{dB}}(t) = 20 \log_{10}\!\left(\frac{\left|H_z^{\text{ref}}(t) - H_z(t)\right|}{\max\left|H_z^{\text{ref}}(t)\right|}\right) \tag{48}$$

where $H_z(t)$ is the time-domain magnetic field recorded at the observation point and $H_z^{\text{ref}}(t)$ is the reference solution from

Fig. 12. Maximum reflection error in dB as a function of PML parameters. (a) κmax = 1, αmax = 0. (b) κmax = 6, αmax = 0.62. (c) m = 7, αmax = 0.62. (d) m = 7, κmax = 6.

an extended model (400 × 400 cells) in which no reflection is captured during the simulation period. First, we perform parametric studies to search for the optimal PML parameters. Note that $\alpha$ in the CFS expression (6) is scaled as follows:

$$\alpha_\zeta = \alpha_{\zeta\max}\,(d - |\zeta - \zeta_0|)/d. \tag{49}$$
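The scaling rules (44)–(46) and (49) can be evaluated directly. The sketch below samples the three profiles at the cell edges of one PML layer. It is an illustration only: the function name `cfs_pml_profiles`, its parameters, and the choice of sampling points are assumptions, and the depth $|\zeta - \zeta_0|$ is measured from the FDTD/PML interface.

```python
import math


def cfs_pml_profiles(n_cells, delta, m, sigma_ratio, kappa_max, alpha_max):
    """Return (sigma, kappa, alpha) sampled at the cell edges of a PML.

    n_cells     : number of PML cells (assumed sampling: n_cells + 1 edges)
    delta       : cell size in metres
    m           : polynomial grading order
    sigma_ratio : sigma_max / sigma_opt
    The depth rho = |zeta - zeta0| runs from 0 at the FDTD/PML interface
    to d = n_cells * delta at the outer boundary.
    """
    d = n_cells * delta
    sigma_opt = (m + 1) / (150.0 * math.pi * delta)    # Eq. (45)
    sigma_max = sigma_ratio * sigma_opt

    sigma, kappa, alpha = [], [], []
    for i in range(n_cells + 1):
        rho = i * delta
        grading = (rho / d) ** m
        sigma.append(sigma_max * grading)                 # Eq. (44)
        kappa.append(1.0 + (kappa_max - 1.0) * grading)   # Eq. (46)
        alpha.append(alpha_max * (d - rho) / d)           # Eq. (49)
    return sigma, kappa, alpha


# Ten PML cells of 0.95238 mm with the CFS parameter set
# sigma_max/sigma_opt = 0.8, m = 7, kappa_max = 6, alpha_max = 0.62.
sigma, kappa, alpha = cfs_pml_profiles(10, 0.95238e-3, 7, 0.8, 6.0, 0.62)
```

In this form $\sigma_\zeta$ and $\kappa_\zeta$ grow from the interface toward the outer boundary, while $\alpha_\zeta$ is largest at the interface and falls linearly to zero at the outer boundary.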

Fig. 12(a) plots the peak reflection errors as a function of $\sigma_{\max}/\sigma_{\text{opt}}$ and $m$ with $\kappa_{\max} = 1$ and $\alpha_{\max} = 0$, corresponding to the case without CFS. In theory, this situation is equivalent to the split-field PML. The lowest reflection, around −64 dB, occurs at $\sigma_{\max}/\sigma_{\text{opt}} = 0.8$ and $m = 5$. Fig. 12(b)–(d) shows the peak reflection errors as a function of the CFS parameters at different section planes. It is seen that the optimal CFS parameter set, i.e., $\sigma_{\max}/\sigma_{\text{opt}} = 0.8$, $m = 7$, $\kappa_{\max} = 6$, and $\alpha_{\max} = 0.62$, achieves a lower peak reflection of −74 dB. Fig. 13 compares the time-domain relative reflection errors of the SC-PML with and without CFS; in both cases, the optimal


Fig. 13. Relative reflection error as a function of time.

Fig. 14. Relative reflection error as a function of frequency.

Fig. 15. Simulation model for electromagnetic wave scattering by a cubic plasma.

Fig. 16. Transient electric fields of the x component at p1.

Fig. 17. Transient electric fields of the x component at p2.

parameter sets are taken. The improvement resulting from the CFS factor is significant. Further, we examine the absorption of the two PMLs in the frequency domain, calculated using

$$R_{\text{dB}}(f) = 20 \log_{10}\!\left(\frac{\left|\,|H_z^{\text{ref}}(f)| - |H_z(f)|\,\right|}{|H_z^{\text{ref}}(f)|}\right) \tag{50}$$

where $H_z(f)$ and $H_z^{\text{ref}}(f)$ are the frequency-domain magnetic fields obtained by applying the discrete Fourier transform to $H_z(t)$ and $H_z^{\text{ref}}(t)$, respectively. Fig. 14 shows the frequency-domain relative reflection errors of the different absorbing boundaries. It is found that the SC-PML with CFS shows better absorption than the SC-PML without CFS, especially when $\omega < \omega_p$.

VI. SCATTERING BY A CUBIC PLASMA

As a 3-D example, we investigate the electromagnetic wave scattering by a cubic plasma in free space. As shown in Fig. 15, the computational domain is divided into 100 × 100 × 100 cells with a uniform grid size of $\Delta x = \Delta y = \Delta z = 150$ μm, including ten layers of PML around the computational domain. A 20 × 20 × 20 cell cubic plasma occupies the 40th–59th grids along the x, y, and z directions. An x–z oriented plane wave located at the grid (y = 20) is incident on the cubic plasma. It has the shape of a differential Gaussian pulse

$$I_x(t) = \frac{t - t_0}{\tau}\exp\!\left(-\frac{(t - t_0)^2}{\tau^2}\right) \tag{51}$$

where $t_0 = 0.05$ ns and $\tau = 0.01$ ns. The order of the WLPs is 288, and $s = 4.7 \times 10^{12}$. The JEC-WLP-FDTD method takes a time step of $\Delta t = 1.4434$ ps at CFLN = 5. The total time duration is $T_f = 0.289$ ns. The homogeneous cubic plasma parameters are $\omega_p = 1.80327 \times 10^{11}$ rad/s and $\upsilon = 2 \times 10^{10}$ rad/s. The PML parameters are set to $\sigma_{\max}/\sigma_{\text{opt}} = 0.8$, $m = 4$, $\kappa_{\max} = 6$, and $\alpha_{\max} = 0.62$. We take three observation points at p1 (50, 35, 50), p2 (26, 50, 50), and p3 (50, 77, 50) to investigate the electromagnetic wave scattering. Figs. 16–18 show the transient electric fields of the $E_x$ component at these observation points compared with the conventional PLRC-FDTD method. It can be seen that the results agree very well. It is worth mentioning that the time step of the PLRC-FDTD method is set at $\Delta t = 0.14434$ ps


Fig. 18. Transient electric fields of the x component at p3.

to obtain a stable solution, and its CPU time is 1532 s to run the total simulation period of $T_f = 0.289$ ns. For the proposed JEC-WLP-FDTD method, the CPU time is 852 s.

VII. CONCLUSION

We presented a US JEC-WLP-FDTD algorithm for the simulation of wave propagation in plasma. The accuracy of the proposed algorithm was validated through comparisons with the analytical, ADE-WLP-FDTD, and conventional PLRC-FDTD solutions. The proposed algorithm significantly reduced the computational time with improved accuracy compared with the PLRC method. Formulated in an SC system, the PML with the CFS factor showed very good absorption performance. A full 3-D numerical example was given for the simulation of electromagnetic wave scattering by a cubic plasma in free space.

REFERENCES

[1] K. Yee, “Numerical solution of initial boundary value problems involving Maxwell’s equations in isotropic media,” IEEE Trans. Antennas Propag., vol. 14, no. 3, pp. 302–307, May 1966.
[2] A. Taflove and S. C. Hagness, Computational Electrodynamics: The Finite-Difference Time-Domain Method, 3rd ed. Norwood, MA, USA: Artech House, 2005.
[3] R. J. Luebbers, F. Hunsberger, and K. S. Kunz, “A frequency-dependent finite-difference time-domain formulation for transient propagation in plasma,” IEEE Trans. Antennas Propag., vol. 39, no. 1, pp. 29–34, Jan. 1991.
[4] J. L. Young, A. Kittichartphayak, Y. M. Kwok, and D. Sullivan, “On the dispersion errors related to (FD)²TD type schemes,” IEEE Trans. Microw. Theory Techn., vol. 43, no. 8, pp. 1902–1909, Aug. 1995.
[5] D. M. Sullivan, “Z-transform theory and the FDTD method,” IEEE Trans. Antennas Propag., vol. 44, no. 1, pp. 28–34, Jan. 1996.
[6] D. F. Kelley and R. J. Luebbers, “Piecewise linear recursive convolution for dispersive media using FDTD,” IEEE Trans. Antennas Propag., vol. 44, no. 6, pp. 792–797, Jun. 1996.
[7] J. L. Young, “A full finite difference time domain implementation for radio wave propagation in a plasma,” Radio Sci., vol. 29, no. 6, pp. 1513–1522, 1994.
[8] J. L. Young, “A higher order FDTD method for EM propagation in a collisionless cold plasma,” IEEE Trans. Antennas Propag., vol. 44, no. 9, pp. 1283–1289, Sep. 1996.
[9] S. A. Cummer, “An analysis of new and existing FDTD methods for isotropic cold plasma and a method for improving their accuracy,” IEEE Trans. Antennas Propag., vol. 45, no. 3, pp. 392–400, Mar. 1997.
[10] Q. Chen, M. Katsurai, and P. H. Aoyagi, “An FDTD formulation for dispersive media using a current density,” IEEE Trans. Antennas Propag., vol. 46, no. 11, pp. 1739–1746, Nov. 1998.
[11] M. A. Alsunaidi and A. A. Al-Jabr, “A general ADE-FDTD algorithm for the simulation of dispersive structures,” IEEE Photon. Technol. Lett., vol. 21, no. 12, pp. 817–819, Jun. 15, 2009.
[12] Y. Yu and J. J. Simpson, “An E-J collocated 3-D FDTD model of electromagnetic wave propagation in magnetized cold plasma,” IEEE Trans. Antennas Propag., vol. 58, no. 2, pp. 469–478, Feb. 2010.
[13] W. Tierens and D. De Zutter, “An unconditionally stable time-domain discretization on Cartesian meshes for the simulation of nonuniform magnetized cold plasma,” J. Comput. Phys., vol. 231, no. 15, pp. 5144–5156, 2012.
[14] F. Zhen, Z. Chen, and J. Zhang, “Toward the development of a three-dimensional unconditionally stable finite-difference time-domain method,” IEEE Trans. Microw. Theory Techn., vol. 48, no. 9, pp. 1550–1558, Sep. 2000.
[15] A. P. Zhao, “Two special notes on the implementation of the unconditionally stable ADI-FDTD method,” Microw. Opt. Technol. Lett., vol. 33, no. 4, pp. 273–277, 2002.
[16] J. Shibayama, M. Muraki, J. Yamauchi, and H. Nakano, “Efficient implicit FDTD algorithm based on locally one-dimensional scheme,” Electron. Lett., vol. 41, no. 19, pp. 1046–1047, Sep. 2005.
[17] Y.-K. Chung, T. K. Sarkar, H. J. Baek, and M. Salazar-Palma, “An unconditionally stable scheme for the finite-difference time-domain method,” IEEE Trans. Microw. Theory Techn., vol. 51, no. 3, pp. 697–704, Mar. 2003.
[18] S. G. Garcia, R. G. Rubio, A. R. Bretones, and R. G. Martín, “On the dispersion relation of ADI-FDTD,” IEEE Microw. Wireless Compon. Lett., vol. 16, no. 6, pp. 354–356, Jun. 2006.
[19] Z. Chen, Y.-T. Duan, Y.-R. Zhang, and Y. Yi, “A new efficient algorithm for the unconditionally stable 2-D WLP-FDTD method,” IEEE Trans. Antennas Propag., vol. 61, no. 7, pp. 3712–3720, Jul. 2013.
[20] J.-P. Berenger, “A perfectly matched layer for the absorption of electromagnetic waves,” J. Comput. Phys., vol. 114, no. 2, pp. 185–200, 1994.
[21] S. D. Gedney, “An anisotropic perfectly matched layer-absorbing medium for the truncation of FDTD lattices,” IEEE Trans. Antennas Propag., vol. 44, no. 12, pp. 1630–1639, Dec. 1996.
[22] Z. Chen, Y.-T. Duan, Y.-R. Zhang, H.-L. Chen, and Y. Yi, “PML implementation for a new and efficient 2-D Laguerre-based FDTD method,” IEEE Antennas Wireless Propag. Lett., vol. 12, pp. 1339–1342, Oct. 2013.
[23] Y. T. Duan, B. Chen, H. L. Chen, and Y. Yi, “Anisotropic-medium PML for efficient Laguerre-based FDTD method,” Electron. Lett., vol. 46, no. 5, pp. 318–319, Mar. 2010.
[24] W.-J. Chen, W. Shao, and B.-Z. Wang, “ADE-Laguerre-FDTD method for wave propagation in general dispersive materials,” IEEE Microw. Wireless Compon. Lett., vol. 23, no. 5, pp. 228–230, May 2013.
[25] W.-J. Chen, W. Shao, H. Chen, and B.-Z. Wang, “Nearly PML for ADE-WLP-FDTD modeling in two-dimensional dispersive media,” IEEE Microw. Wireless Compon. Lett., vol. 24, no. 2, pp. 75–77, Feb. 2014.
[26] X. Xi, Y. Fang, J. Liu, and Z. Zhu, “An effective CFS-PML implementation for 2-D WLP-FDTD method,” IEICE Electron. Exp., vol. 12, no. 7, p. 20150191, 2015.
[27] J. F. Liu, Y. Fang, Z. B. Zhu, and X. Xi, “WLP-FDTD implementation of CFS-PML for plasma media,” in Proc. 31st Int. Rev. Prog. Appl. Comput. Electromagn. (ACES), Mar. 2015, pp. 1–2.
[28] W. C. Chew and W. H. Weedon, “A 3D perfectly matched medium from modified Maxwell’s equations with stretched coordinates,” Microw. Opt. Technol. Lett., vol. 7, pp. 599–604, Sep. 1994.
[29] J. A. Roden and S. D. Gedney, “Convolution PML (CPML): An efficient FDTD implementation of the CFS–PML for arbitrary media,” Microw. Opt. Technol. Lett., vol. 27, no. 5, pp. 334–339, 2000.
[30] J.-P. Berenger, “Numerical reflection from FDTD-PMLs: A comparison of the split PML with the unsplit and CFS PMLs,” IEEE Trans. Antennas Propag., vol. 50, no. 3, pp. 258–265, Mar. 2002.
[31] S. D. Gedney, “Scaled CFS-PML: It is more robust, more accurate, more efficient, and simple to implement. Why aren’t you using it?” in Proc. IEEE Antennas Propag. Soc. Int. Symp., Jul. 2005, pp. 364–367.
[32] B. J. Hu, G. Wei, and S. L. Lai, “SMM analysis of reflection, absorption, and transmission from nonuniform magnetized plasma slab,” IEEE Trans. Plasma Sci., vol. 27, no. 4, pp. 1131–1136, Aug. 1999.
[33] K. Srinivasan, P. Yadav, E. Engin, and M. Swaminathan, “Choosing the right number of basis functions in multiscale transient simulation using Laguerre polynomials,” in Proc. Elect. Perform. Electron. Packag., Atlanta, GA, USA, Oct. 2007, pp. 291–294.


Yun Fang received the B.S. and M.S. degrees in electronic engineering from the Xi’an University of Technology (XUT), Xi’an, China, in 2013 and 2016, respectively. He is currently pursuing the Ph.D. degree in electronic engineering at XUT. His current research interests include computational electromagnetics and antenna design.

Xiao-Li Xi (M’10) received the B.S. degree in applied physics from the University of Defense Technology, Changsha, China, in 1990, the M.S. degree in biomedical engineering from Fourth Military Medical University, Xi’an, China, in 1998, and the Ph.D. degree in electronic engineering from Xi’an Jiaotong University, Xi’an, China, in 2004. She is currently a Professor with the Department of Electronic Engineering, Xi’an University of Technology, Xi’an, China. Her current research interests include wave propagation, antenna design, and communication signal processing.

Ji-Min Wu, photograph and biography not available at the time of publication.


Jiang-Fan Liu received the B.S. and M.S. degrees in electronic engineering from the Xi’an University of Technology (XUT), Xi’an, China, in 2006 and 2009, respectively, and the Ph.D. degree in electronic engineering from Northwestern Polytechnical University, Xi’an, China, in 2013. He is currently a Lecturer with the Department of Electronic Engineering, XUT. His current research interests include computational electromagnetics and antenna design.

Yu-Rong Pu received the B.S., M.S., and Ph.D. degrees in electronic engineering from the Xi’an University of Technology (XUT), Xi’an, China, in 2004, 2007, and 2013, respectively. She is currently a Lecturer with the Department of Electronic Engineering, XUT. Her current research interests include computational electromagnetics and wave propagation.


Accurate Analysis of Finite-Volume Lumped Elements in Metamaterial Absorber Design

Jian Wei You, Member, IEEE, Jian Feng Zhang, Wei Xiang Jiang, Hui Feng Ma, Wan Zhao Cui, and Tie Jun Cui, Fellow, IEEE

Abstract—To simulate lumped elements (LEs) more accurately in metamaterial absorber (MA) designs, a finite-volume LE (FVLE) model based on the time-domain finite-integral theorem (TDFIT) is proposed in this paper. In MA designs, lumped resistors and capacitors play an important role in dissipating electromagnetic (EM) fields, controlling resonant frequencies, and achieving an impedance matching. Through a rigorous mathematical derivation, we successfully prove that an arbitrary lumped resistor or capacitor can be modeled more accurately by modifying the conductivity or permittivity of a finite-volume model in TDFIT method. Compared with existing traditional zero-volume LE (ZVLE) solutions, the coupling effect between the circuit part and the EM part can be considered much better in the proposed FVLE model. For this reason, the numerical results of the FVLE model match the measured results much better. In addition, a result comparison is given to validate that the challenges of grid dispersion error and frequency response error in the classic ZVLE solution can be greatly overcome by the proposed FVLE model.

Index Terms—Finite-volume lumped element (FVLE), frequency response error, finite-difference time domain (FDTD), grid dispersion error, metamaterial absorber (MA), time-domain finite-integral theorem (TDFIT), zero-volume lumped element (ZVLE).

I. INTRODUCTION

A metamaterial absorber (MA) is usually mounted with some periodic unit cells, which are composed of a metal pattern and a grounded dielectric substrate. With the help of metamaterial [1], we can easily tune the constitutive parameters of an absorber to achieve a better impedance matching on the interface between the absorber and air. Theoretically, the incident electromagnetic (EM) waves can perfectly transmit in an MA without any interfacial reflection. Meanwhile, the energy of transmitted EM wave rapidly dissipates because of the conductive and dielectric losses at the resonant frequency. Based on these principles, we can design a perfect MA [2] at the frequencies we are interested in. In recent years, the benefit of MA has aroused great

Manuscript received July 27, 2015; revised October 30, 2015 and May 7, 2016; accepted May 18, 2016. This work was supported by the National Natural Science Funds of China under Grant 61401096. J. W. You, J. F. Zhang, W. X. Jiang, H. F. Ma, and T. J. Cui are with the State Key Laboratory of Millimeter Waves, Southeast University, Nanjing 210096, China (e-mail: [email protected]; [email protected]; wxjiang@ emfield.org; [email protected]; [email protected]). W. Z. Cui is with the Science and Technology on Space Microwave Laboratory, China Academy of Space Technology, Xi’an 710100, China (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2572180

attention, and hundreds of relevant papers have been published. The working frequency of MA applications now spans from microwave [3]–[5] to terahertz [6]–[8], and sometimes extends even into the optical regime [9]–[11]. Motivated by these prospects, researchers have been pursuing much higher MA performance, including lower reflection, wider bandwidth, larger angle of absorption, and better polarization insensitivity. In practical designs, polarization insensitivity is usually realized by multilayer MAs or more symmetrical metal patterns. Recently, Costa et al. [12], Cheng et al. [13], and Yoo and Lim [14] found that a lumped capacitor can be used to optimize the impedance matching of an MA and obtain lower interfacial reflection. In addition, a lumped resistor can be employed to increase the joule loss and thereby achieve a wider bandwidth. Because of these two characteristics, lumped elements (LEs) have been widely used in MA designs. This trend makes it very important to accurately analyze the interaction between the circuit part and the EM part. However, many reports have pointed out that the error between experiments and simulations becomes much more serious as the working frequency of the MA increases.

Among existing time-domain numerical methods, the transmission line matrix (TLM) method [15], [16] was the first to be applied to simulating a hybrid circuit (a distributed circuit mounted with LEs). However, its precision and complexity are no longer acceptable in modern MA designs, because an empirical equivalent-transformation technique must be employed in the TLM method to get the parameters of the equivalent model from a distributed circuit. To improve the accuracy and decrease the complexity of implementation, an LE-finite-difference time-domain (FDTD) hybrid method [17]–[19] has been proposed to simulate some simple LEs. Supplemented by other work [20]–[23], a lumped network (LN)-FDTD method has been developed to analyze an arbitrary LN in the Z-transform domain. So far, the FDTD method has become one of the most popular time-domain numerical methods in EM-circuit analysis. In addition to the popular FDTD method, the time-domain finite-element method (TDFEM) and the discontinuous Galerkin time-domain (DGTD) method are also good candidates for analyzing LEs. In [24], the TDFEM is first used to simulate a multiport LN, which is represented by an admittance matrix and interfaced with TDFEM through the cell edges and circuit nodes. To further improve its efficiency, a flexible time-stepping scheme [25] has been proposed to

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


solve a multiscale EM-circuit problem. Meanwhile, DGTD is extended to simulate an LN in [26] and [27]. Similar to LN-TDFEM, the system in LN-DGTD is divided into the EM subsystem and the circuit subsystem. The EM subsystem is analyzed by DGTD, and the circuit subsystem is modeled by a modified nodal analysis method. The coupling between the EM part and the circuit part is achieved through the port voltage and port current.

In the methods mentioned above, an LE or its network is generally implemented by incorporating it on a cell’s edge, and its size is always approximated as zero-volume. However, the effectiveness of such a zero-volume LE (ZVLE) is based on two crucial assumptions. First, the effective length of the lumped part is far shorter than the pump wavelength. Second, compared with the size of the distributed part, the volume of the lumped part is negligible. Only when both of these assumptions are satisfied can a realistic LE be approximated with the ZVLE solution. However, modern MAs are becoming smaller and smaller, and their working frequencies are becoming much higher. For these reasons, the coupling effect between the lumped part and the distributed part becomes more and more noticeable. These new characteristics pose two major challenges to the classic ZVLE solution in accurately analyzing LE-based MAs. First, since a fine metal pattern needs to be meshed with highly dense grids in simulations, the undesirable grid-dispersion error becomes much more serious. Second, some new frequency-response errors emerge in MA designs, since the coupling effect has not been considered in ZVLE solutions. To overcome these challenges, a finite-volume LE (FVLE) model based on the time-domain finite-integral theorem (TDFIT) is proposed in this paper.

This paper is organized as follows. A new ZVLE-TDFIT scheme used to solve the EM-circuit problem is presented in Section II, where the challenges and errors of traditional ZVLE solutions are discussed. To overcome these challenges and errors, a novel FVLE model is proposed in Section III. In Section IV, two examples are given to verify the advantages of the proposed FVLE model. At the end of this paper, a brief conclusion is given.

II. ZERO-VOLUME LUMPED ELEMENT AND CHALLENGES

In our past research, we proposed a TDFIT scheme [28], which has since been successfully applied to accurately analyze the antenna–radome interaction [29] and the passive-intermodulation phenomenon [30]. It has also been used to predict the multipactor threshold [31], [32] in high-power microwave devices. Different from our previous studies, the TDFIT scheme in this paper is focused on solving the EM-circuit problem in MA designs. In this section, we present a more flexible way to directly implement the ZVLE series/parallel network in the time domain. Meanwhile, the challenges and errors of the classic ZVLE solutions in MA designs are discussed.

A. Zero-Volume Lumped Element in TDFIT Method

To demonstrate more clearly the new way to incorporate a ZVLE series/parallel network into TDFIT, a brief


Fig. 1. ZVLE series network in a TDFIT cell.

introduction of TDFIT is appended in Section VI. Unlike in the FDTD method [33], the units of $e$ and $h$ are volt and ampere, respectively. This means that the quantities in the TDFIT method are equivalent voltages and equivalent currents rather than EM field components. Thus, the TDFIT algorithm is more natural and explicit than the FDTD method for simulating a ZVLE.

To incorporate a ZVLE into TDFIT, the quantities $e$ and $h$ in regular regions are iterated by the normal recursions (24) and (25). For the grids loaded with a ZVLE, however, the port current density $i_d^{e,L}$ of the ZVLE should be added to the conduction current density $i_d^{e,s}$ in (24). In Sections II-A1 and II-A2, we detail how to obtain the port current of the ZVLE series/parallel networks and then incorporate them into TDFIT.

1) ZVLE Series Network: To obtain the port current of a ZVLE series network, we first study its voltage–current (U–I) relationship. As illustrated in Fig. 1, the U–I relationship is expressed by

$$U = U_1 + U_2 + U_3 = I \cdot R + L \cdot \frac{dI}{dt} + \frac{1}{C} \cdot \int_0^t I \cdot dt \tag{1}$$

where the symbols $R$, $L$, and $C$ are the resistance, inductance, and capacitance, respectively. If we realize the time derivative with a second-order central difference and approximate the time integral by an accumulation operator, (1) is rewritten as

$$U^{t+\Delta t/2} = I^{t+\Delta t/2}\cdot R + L\cdot\frac{I^{t+\Delta t/2} - I^{t-\Delta t/2}}{\Delta t} + \frac{\Delta t}{C}\cdot\sum_{i=0}^{\lfloor t/\Delta t\rfloor - 1} I^{\,i\cdot\Delta t+\Delta t/2} + \frac{\Delta t}{C}\cdot I^{t+\Delta t/2}$$

in which the symbol $\lfloor\cdots\rfloor$ is the round-down operator. Moving the terms containing $I^{t+\Delta t/2}$ to the left of the equal sign, we obtain the port current of a ZVLE series network as

$$i_d^{e,L}\Big|_{i,j,k}^{t+\Delta t/2} = C_1\cdot\frac{e_d\big|_{i,j,k}^{t+\Delta t} + e_d\big|_{i,j,k}^{t}}{2} - C_2 \tag{2}$$

where

$$C_1 = \frac{C\Delta t}{RC\Delta t + LC + (\Delta t)^2} \tag{3a}$$

$$C_2 = \frac{(\Delta t)^2}{RC\Delta t + LC + (\Delta t)^2}\cdot\sum_{i=0}^{\lfloor t/\Delta t\rfloor - 1} I^{\,i\cdot\Delta t+\Delta t/2} - \frac{LC}{RC\Delta t + LC + (\Delta t)^2}\cdot I^{t-\Delta t/2}. \tag{3b}$$
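The recursion defined by (2) and (3) can be sketched in a few lines. The class below is an illustration only, not the authors' code: the name `SeriesRLCPort`, its attributes, and the choice to pass the half-step port voltage in directly are assumptions. It keeps a running sum of past half-step currents so that $C_2$ can be evaluated without storing the whole history.

```python
class SeriesRLCPort:
    """Port current of a zero-volume series R-L-C branch, following (2)-(3).

    Hypothetical helper for illustration; the port voltage at t + dt/2 is
    supplied by the caller instead of being taken from a field solver.
    """

    def __init__(self, R, L, C, dt):
        if R < 0 or L < 0 or C <= 0:
            raise ValueError("series branch requires R >= 0, L >= 0, C > 0")
        self.L, self.C, self.dt = L, C, dt
        self.denom = R * C * dt + L * C + dt ** 2
        self.c1 = C * dt / self.denom   # Eq. (3a)
        self.i_prev = 0.0               # I at t - dt/2
        self.i_sum = 0.0                # sum of all past half-step currents

    def step(self, u_mid):
        """Return I at t + dt/2 for the port voltage u_mid at t + dt/2."""
        c2 = (self.dt ** 2 * self.i_sum
              - self.L * self.C * self.i_prev) / self.denom   # Eq. (3b)
        i_new = self.c1 * u_mid - c2                          # Eq. (2)
        self.i_sum += i_new
        self.i_prev = i_new
        return i_new


# Example: drive a 10-ohm, 1-nH, 1-pF branch with a constant 1-V port voltage.
port = SeriesRLCPort(R=10.0, L=1e-9, C=1e-12, dt=1e-12)
currents = [port.step(1.0) for _ in range(5)]
```

With $L = 0$ and a very large $C$, the coefficient `c1` approaches $1/R$ and the accumulated term becomes negligible, so the same code reduces to a plain resistor.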


Fig. 2. ZVLE parallel network in a TDFIT cell.

Since the ZVLEs are mounted on a cell’s edge (Fig. 1), their port voltage $U^{t+\Delta t/2}$ can be approximated by $(e_d|_{i,j,k}^{t+\Delta t} + e_d|_{i,j,k}^{t})/2$. Adding the port current to the conduction current in (24), we get the final iterative formula as follows:

$$e_d\big|_{i,j,k}^{t+\Delta t} = \frac{2D_e - T_e\cdot C_1}{2 + T_e\cdot C_1}\cdot e_d\big|_{i,j,k}^{t} + \frac{2T_e}{2 + T_e\cdot C_1}\cdot\left(h_{d_{\text{prior}}}\big|_{i,j,k+1_{d_{\text{next}}}}^{t+\Delta t/2} - h_{d_{\text{next}}}\big|_{i,j,k+1_{d_{\text{prior}}}}^{t+\Delta t/2} + C_2\right). \tag{4}$$

The range of the R/L/C values in (3) is limited to $R \ge 0$, $L \ge 0$, and $C > 0$. Note that if $R = 0$ and $L = 0$, (4) reduces to a way to incorporate a single zero-volume capacitor into TDFIT. Similarly, if $L = 0$ and $C \to +\infty$, a single zero-volume resistor can be modeled by (4). This means that the same code based on (4) can be used to simulate each of the R/L/C elements as well as their series networks. Thus, (4) is more flexible and general than classic ZVLE solutions.

2) ZVLE Parallel Network: To obtain the port current of a ZVLE parallel network (Fig. 2), its U–I relationship is expressed as

$$I = I_1 + I_2 + I_3 = \frac{U}{R} + \frac{1}{L}\cdot\int_0^t U\cdot dt + C\cdot\frac{dU}{dt}. \tag{5}$$

Replacing the time derivative with a second-order central difference and approximating the time integral with an accumulation operator, we rewrite (5) as

$$i_d^{e,L}\Big|_{i,j,k}^{t+\Delta t/2} = C_1\cdot e_d\big|_{i,j,k}^{t+\Delta t} + C_2\cdot e_d\big|_{i,j,k}^{t} + C_3 \tag{6}$$

where

\begin{align}
C_1 &= \frac{1}{2R} + \frac{C}{2\Delta t} + \frac{\Delta t}{2L} \tag{7a}\\
C_2 &= \frac{1}{2R} \tag{7b}\\
C_3 &= \frac{\Delta t}{L}\sum_{i=1}^{\lfloor t/\Delta t\rfloor} e_d\big|_{i,j,k}^{i\Delta t} - \frac{C}{2\Delta t}\, e_d\big|_{i,j,k}^{t-\Delta t}. \tag{7c}
\end{align}

Adding the port current to the conduction current in (24), we obtain the modified recursion (8), which can be used to

Fig. 3. Grid dispersion error in traditional ZVLE solutions.

incorporate a ZVLE parallel network into the TDFIT method:

$$e_d\big|_{i,j,k}^{t+\Delta t} = \frac{D_e - T_e\cdot C_2}{1 + T_e\cdot C_1}\cdot e_d\big|_{i,j,k}^{t} + \frac{T_e}{1 + T_e\cdot C_1}\cdot\left(h_{d_{\text{prior}}}\big|_{i,j,k+1_{d_{\text{next}}}}^{t+\Delta t/2} - h_{d_{\text{next}}}\big|_{i,j,k+1_{d_{\text{prior}}}}^{t+\Delta t/2} - C_3\right). \tag{8}$$

Unlike for the ZVLE series network, the range of the R/L/C values in (7) is limited to $R > 0$, $L > 0$, and $C \ge 0$. However, as with the ZVLE series network, it can be proved that (8) is also a general formula that can be used to simulate each of the R/L/C elements as well as their parallel networks.

B. Challenges

In Section II-A, we obtained two general formulas, (4) and (8), to incorporate a ZVLE series/parallel network into the TDFIT method. Although the proposed ZVLE-TDFIT method is more general and flexible than classic methods, all ZVLE solutions face two inevitable challenges in high-frequency or microscale EM-circuit simulations.

1) Challenge One (Grid Dispersion Error): As discussed above, the effectiveness of the ZVLE solution is based on two crucial assumptions. One of them is that the effective length of the lumped parts should be far shorter than the pump wavelength. Only in this case can its realistic length be approximated by the length of one cell’s edge (Figs. 1 and 2). However, with increasing operating frequency, the realistic length of an LE is no longer far smaller than the wavelength. For this reason, the approximation error becomes very serious in modern MA designs. To demonstrate the grid-approximation error clearly, a simple example is given in Fig. 3, where an LE is mounted on a microstrip line. In this example, the number of grids on the microstrip line does not vary with the grid number between the points P1 and P2. The field distributions of ports 1 and 2 are modulated as the TEM mode, and port 1 is excited with a 10-GHz sinusoidal waveform. To record the responding voltage, a voltage monitor is created from the point P1 to P2. Simulating with different mesh intervals, we get different resulting voltages. From Fig. 3, it is

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. 4

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES

obvious that the normalized resulting voltages of each R/L/C element decrease remarkably rather than keeping constant, when the grid number of the gap is increased. In this paper, this numerical error is named grid dispersion error. In fact, such grid dispersion error mainly occurs in the analysis of high-frequency or microscale hybrid circuits. In these cases, the original single LE grid is split into several shorter grids. To keep the principle of current continuity, the ZVLE in Fig. 3 is connected to the metal microstrip with two thin conductive wires. Since the tangential electric field on the conductive wire is zero, the integral port voltage decreases with the increment of the grid number between the gap (from P1 to P2 ). 2) Challenge Two (Coupling and Frequency Response Error): In ZVLE solutions, LEs are treated as zero-volume components, since their size is negligible compared with the distributed circuit. Thus, the coupling effect between the lumped part and the distributed part is ignored in ZVLE solutions. However, in modern MA designs, the structure of MA’s unit cell becomes much smaller and finer. Therefore, the coupling effect is more and more significant. Since the coupling effect has not been considered in ZVLE solutions, many kinds of numerical distortions were found in existing simulations of LE-based MA. Two dominant frequency response errors are summed here. First, the frequency of the maximum absorbing peak obtained by ZVLE solutions is always shifted, compared with the measured result. This frequency offset is a serious challenge to design a narrow-band MA. Second, a number of weak sidelobes can be commonly measured in experiments. But they cannot be observed in ZVLE solutions. This numerical error is unacceptable to develop a high-power frequency-selective MA for multicarrier communications, because some of the valuable carrier signals would be dissipated by the designed frequency-selective MAs.

III. FINITE-VOLUME LUMPED ELEMENT AND MERITS

To overcome the challenges in ZVLE solutions, a new FVLE solution based on TDFIT is proposed in this section. Since chip resistors and capacitors are the most common LEs in MA designs, this paper focuses on the implementation of the finite-volume resistor and capacitor.

A. Finite-Volume Resistor in TDFIT

In (5), if we assume C = 0 and L = +∞, a time-domain iteration to incorporate a lumped resistor into the TDFIT method is derived as

$$
e_d\big|_{i,j,k}^{t+\Delta t} = \frac{D_e \cdot 2R - T_e}{2R + T_e}\, e_d\big|_{i,j,k}^{t} + \frac{T_e \cdot 2R}{2R + T_e} \left( h_{d_{\mathrm{prior}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{next}}}^{+1} - h_{d_{\mathrm{next}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{prior}}}^{+1} \right). \tag{9}
$$

Substituting (26a) into (9), we have

$$
\frac{D_e \cdot 2R - T_e}{2R + T_e} = \left( \frac{M_\varepsilon}{\Delta t} - \frac{M_\sigma + \frac{1}{R}}{2} \right) \Big/ \left( \frac{M_\varepsilon}{\Delta t} + \frac{M_\sigma + \frac{1}{R}}{2} \right) \tag{10a}
$$

$$
\frac{T_e \cdot 2R}{2R + T_e} = 1 \Big/ \left( \frac{M_\varepsilon}{\Delta t} + \frac{M_\sigma + \frac{1}{R}}{2} \right). \tag{10b}
$$

Comparing (26a) with (10), we can clearly see that they have very similar rational expressions. In the TDFIT method, the medium parameter $M_\sigma$ is defined as

$$
M_\sigma = \frac{i^e}{e} = \frac{\iint_{\tilde{s}_k} \mathbf{J}_e \cdot d\mathbf{s}}{\int_{L_k} \mathbf{E} \cdot d\mathbf{l}} = \frac{J_e \cdot \tilde{s}_k}{\bar{\sigma}^{-1} J_e \cdot l_k} = \bar{\sigma}\,\frac{\tilde{s}_k}{l_k}. \tag{11}
$$

The classic definition of a resistor's conductance is

$$
\frac{1}{R} = \sigma_R\,\frac{\tilde{s}_k}{l_k}. \tag{12}
$$

Here, $l_k$ is the length of the cell's edge on which the resistor is loaded, $\tilde{s}_k$ is the cross-sectional area of its dual cell, and $\sigma_R$ is the conductivity. Comparing (11) with (12), we find that the medium parameter $M_\sigma$ in TDFIT can be interpreted as a conductance. Based on this characteristic, we redefine the medium parameter as $M'_\sigma$:

$$
M'_\sigma = M_\sigma + \frac{1}{R} = \bar{\sigma}\,\frac{\tilde{s}_k}{l_k} + \sigma_R\,\frac{\tilde{s}_k}{l_k} = (\bar{\sigma} + \sigma_R)\,\frac{\tilde{s}_k}{l_k}. \tag{13}
$$

Substituting (13) into (10), we obtain the redefined iterative coefficients $D'_e$ and $T'_e$:

$$
D'_e = \frac{D_e \cdot 2R - T_e}{2R + T_e} = \left( \frac{M_\varepsilon}{\Delta t} - \frac{M'_\sigma}{2} \right) \Big/ \left( \frac{M_\varepsilon}{\Delta t} + \frac{M'_\sigma}{2} \right) \tag{14a}
$$

$$
T'_e = \frac{T_e \cdot 2R}{2R + T_e} = 1 \Big/ \left( \frac{M_\varepsilon}{\Delta t} + \frac{M'_\sigma}{2} \right). \tag{14b}
$$

If we substitute (14) into (9), (9) can be rewritten as

$$
e_d\big|_{i,j,k}^{t+\Delta t} = D'_e\, e_d\big|_{i,j,k}^{t} + T'_e \left( h_{d_{\mathrm{prior}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{next}}}^{+1} - h_{d_{\mathrm{next}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{prior}}}^{+1} \right). \tag{15}
$$

From the above derivation, we have shown that a lumped resistor can be incorporated into TDFIT by replacing the original grid's conductivity $\bar{\sigma}$ with $(\bar{\sigma} + \sigma_R)$. Comparing (9) and (15) with (24), it is clear that (15) is more consistent with the regular iteration formula (24) than (9) is. Furthermore, if (9) is used to incorporate a lumped resistor into the TDFIT method, extra code must be executed at each time step for the special grids that carry an LE. If (15) is used instead, a single modification is made once, in the preprocessing stage of the TDFIT code, to replace the original conductivity $\bar{\sigma}$ with $(\bar{\sigma} + \sigma_R)$. Thus, (15) reduces the complexity of implementation and saves computational time compared with previous methods.

In fact, the concept of a finite-volume resistor has already appeared in (12) and (13), where the size of the finite-volume resistor equals the size of a cell $(\tilde{s}_k, l_k)$. We now extend (12)–(15) to model a finite-volume cuboid (S, L) whose size is the same as that of a real-world chip resistor [Fig. 4(a)]. If we assume that the cuboid's conductivity is homogeneous, we can conveniently implement an arbitrary finite-volume resistor [Fig. 4(b)] in the TDFIT method by replacing its original conductivity $\bar{\sigma}$ with the new one $\sigma_R = L/(R \cdot W \cdot H)$.
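The equivalence between the per-step resistor recursion (9) and the preprocessed form (15) can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the medium parameters are treated as scalars, the bracketed magnetic term is replaced by an arbitrary drive named `curl_h`, and all numerical values and function names are hypothetical.

```python
import math


def update_coefficients(m_eps, m_sigma, dt):
    """D_e and T_e of (26a) for scalar medium parameters."""
    denom = m_eps / dt + m_sigma / 2.0
    return (m_eps / dt - m_sigma / 2.0) / denom, 1.0 / denom


def zvle_resistor_coefficients(d_e, t_e, resistance):
    """Coefficients of the zero-volume resistor recursion (9)."""
    denom = 2.0 * resistance + t_e
    return (d_e * 2.0 * resistance - t_e) / denom, (t_e * 2.0 * resistance) / denom


# Illustrative values: M_eps in farads, M_sigma in siemens, dt in seconds.
m_eps, m_sigma, dt, resistance = 8.854e-15, 2.0e-4, 1.0e-13, 50.0

d_e, t_e = update_coefficients(m_eps, m_sigma, dt)
d_zv, t_zv = zvle_resistor_coefficients(d_e, t_e, resistance)          # form (9)
d_fv, t_fv = update_coefficients(m_eps, m_sigma + 1.0 / resistance, dt)  # form (14)

# Drive both recursions with the same stand-in for the bracketed h-term.
e_zv = e_fv = 0.0
for n in range(200):
    curl_h = math.sin(2.0 * math.pi * 1.0e10 * n * dt)
    e_zv = d_zv * e_zv + t_zv * curl_h
    e_fv = d_fv * e_fv + t_fv * curl_h
```

Both recursions produce the same edge voltage up to rounding, which is the point of (10)–(15): the resistor is absorbed into the medium parameter once, and the time loop itself does not change.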


Fig. 4. Physical structure of a chip resistor and its simplified model. (a) Physical model. (b) Simplified model.
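For the simplified cuboid of Fig. 4(b), the relation $\sigma_R = L/(R \cdot W \cdot H)$ implies that the total resistance does not depend on how finely the cuboid is meshed. The sketch below illustrates this with a strongly simplified picture that is an assumption of this example, not the authors' implementation: a uniform mesh, current flowing only along L, and each cell treated as a conductance $\sigma_R \tilde{s}_k / l_k$ as in (12). The chip dimensions are hypothetical.

```python
import math


def cuboid_conductivity(resistance, length, width, height):
    """Homogeneous conductivity sigma_R = L / (R * W * H) of the cuboid."""
    return length / (resistance * width * height)


def cell_conductance(sigma, length, width, height, n_l, n_w, n_h):
    """Conductance sigma * s_k / l_k of one cell on a uniform mesh."""
    l_k = length / n_l
    s_k = (width / n_w) * (height / n_h)
    return sigma * s_k / l_k


def network_resistance(sigma, length, width, height, n_l, n_w, n_h):
    """n_l cells in series along L, n_w * n_h such chains in parallel."""
    g_cell = cell_conductance(sigma, length, width, height, n_l, n_w, n_h)
    return n_l / (g_cell * n_w * n_h)


# Hypothetical chip size in metres and a 5-ohm target resistance.
length, width, height, resistance = 1.0e-3, 0.5e-3, 0.35e-3, 5.0
sigma_r = cuboid_conductivity(resistance, length, width, height)

meshes = [(1, 1, 1), (4, 2, 2), (10, 5, 3)]
totals = [network_resistance(sigma_r, length, width, height, *m) for m in meshes]
```

Every entry of `totals` comes out at the target resistance, so refining the mesh changes the per-cell conductance but not the element as a whole.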

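B. Finite-Volume Capacitor in TDFIT

Similar to the derivation in Section III-A, we can derive a time-domain iteration to incorporate a lumped capacitor into TDFIT by specifying R = +∞ and L = +∞ in (5). To be consistent with the derivation of the finite-volume resistor, a modified U–I relationship of the lumped capacitor is given as

$$
i_d^{C}\big|_{i,j,k}^{t+\Delta t/2} = \frac{C \left( e_d\big|_{i,j,k}^{t+\Delta t} - e_d\big|_{i,j,k}^{t} \right)}{\Delta t}. \tag{16}
$$

Substituting (16) into (24), a new iteration to incorporate a lumped capacitor into TDFIT is obtained as

$$
e_d\big|_{i,j,k}^{t+\Delta t} = \frac{D_e \Delta t + T_e C}{\Delta t + T_e C}\, e_d\big|_{i,j,k}^{t} + \frac{T_e \Delta t}{\Delta t + T_e C} \left( h_{d_{\mathrm{prior}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{next}}}^{+1} - h_{d_{\mathrm{next}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{prior}}}^{+1} \right). \tag{17}
$$

Substituting (26a) into (17), we have

$$
\frac{D_e \Delta t + T_e C}{\Delta t + T_e C} = \left( \frac{M_\varepsilon + C}{\Delta t} - \frac{M_\sigma}{2} \right) \Big/ \left( \frac{M_\varepsilon + C}{\Delta t} + \frac{M_\sigma}{2} \right) \tag{18a}
$$

$$
\frac{T_e \Delta t}{\Delta t + T_e C} = 1 \Big/ \left( \frac{M_\varepsilon + C}{\Delta t} + \frac{M_\sigma}{2} \right). \tag{18b}
$$

Comparing (26a) with (18), we find that they have very similar rational expressions. In TDFIT, the medium parameter $M_\varepsilon$ is defined as

$$
M_\varepsilon = \frac{d}{e} = \frac{\iint_{\tilde{s}_k} \mathbf{D} \cdot d\mathbf{s}}{\int_{L_k} \mathbf{E} \cdot d\mathbf{l}} = \frac{D \cdot \tilde{s}_k}{\varepsilon^{-1} D \cdot l_k} = \varepsilon\,\frac{\tilde{s}_k}{l_k}. \tag{19}
$$

Based on the classic definition of capacitance, the capacitance of a capacitor is calculated by

$$
C = \varepsilon_c\,\frac{\tilde{s}_k}{l_k} \tag{20}
$$

where $\tilde{s}_k$ is the effective cross-sectional area of an equivalent parallel-plate capacitor on the dual cell, $l_k$ is the effective length, and $\varepsilon_c$ is the permittivity. Comparing (19) with (20), we find that the medium parameter $M_\varepsilon$ in the TDFIT method can be interpreted as a capacitance. Based on this characteristic, a new parameter $M'_\varepsilon$ is defined as

$$
M'_\varepsilon = M_\varepsilon + C = \bar{\varepsilon}\,\frac{\tilde{s}_k}{l_k} + \varepsilon_c\,\frac{\tilde{s}_k}{l_k} = (\bar{\varepsilon} + \varepsilon_c)\,\frac{\tilde{s}_k}{l_k}. \tag{21}
$$

Fig. 5. Physical structure of a chip capacitor and its simplified model. (a) Physical model. (b) Simplified model.

Substituting (21) into (18), we obtain the redefined iterative coefficients $D'_e$ and $T'_e$:

$$
D'_e = \frac{D_e \Delta t + T_e C}{\Delta t + T_e C} = \left( \frac{M'_\varepsilon}{\Delta t} - \frac{M_\sigma}{2} \right) \Big/ \left( \frac{M'_\varepsilon}{\Delta t} + \frac{M_\sigma}{2} \right) \tag{22a}
$$

$$
T'_e = \frac{T_e \Delta t}{\Delta t + T_e C} = 1 \Big/ \left( \frac{M'_\varepsilon}{\Delta t} + \frac{M_\sigma}{2} \right). \tag{22b}
$$

If (22) is substituted into (17), (17) is rewritten as

$$
e_d\big|_{i,j,k}^{t+\Delta t} = D'_e\, e_d\big|_{i,j,k}^{t} + T'_e \left( h_{d_{\mathrm{prior}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{next}}}^{+1} - h_{d_{\mathrm{next}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{prior}}}^{+1} \right). \tag{23}
$$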
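The capacitor case can be checked in the same spirit: the coefficients of the per-step recursion (17) should coincide with those obtained by folding C into the medium parameter, as in (21) and (22). The Python sketch below assumes scalar medium parameters and uses hypothetical function names and illustrative values; it is not the authors' code.

```python
import math


def update_coefficients(m_eps, m_sigma, dt):
    """D_e and T_e of (26a) for scalar medium parameters."""
    denom = m_eps / dt + m_sigma / 2.0
    return (m_eps / dt - m_sigma / 2.0) / denom, 1.0 / denom


def zvle_capacitor_coefficients(d_e, t_e, capacitance, dt):
    """Coefficients of the lumped-capacitor recursion (17)."""
    denom = dt + t_e * capacitance
    return (d_e * dt + t_e * capacitance) / denom, (t_e * dt) / denom


# Illustrative values: M_eps in farads, M_sigma in siemens, dt in seconds.
m_eps, m_sigma, dt, capacitance = 8.854e-15, 1.0e-3, 1.0e-13, 4.0e-12

d_e, t_e = update_coefficients(m_eps, m_sigma, dt)
d_17, t_17 = zvle_capacitor_coefficients(d_e, t_e, capacitance, dt)  # form (17)
d_22, t_22 = update_coefficients(m_eps + capacitance, m_sigma, dt)    # form (22)

coefficients_match = (
    math.isclose(d_17, d_22, rel_tol=1e-12)
    and math.isclose(t_17, t_22, rel_tol=1e-12)
)
```

With these values `coefficients_match` evaluates to true, so the capacitor can be introduced once, by enlarging $M_\varepsilon$, instead of adding a current term at every step.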

Through the above derivation, we have shown that a lumped capacitor can be incorporated into TDFIT by replacing the original grid's permittivity $\bar{\varepsilon}$ with $(\bar{\varepsilon} + \varepsilon_c)$. If the permittivity of a cuboid (S, L) is assumed to be homogeneous and its size is the same as that of a real-world chip capacitor [Fig. 5(a)], we can conveniently implement an arbitrary finite-volume capacitor [Fig. 5(b)] in TDFIT by replacing its original permittivity $\bar{\varepsilon}$ with the new one $\varepsilon_c = (L \cdot C)/(W \cdot H)$.

Note that the equivalent parameters $\sigma_R$ in (13) and $\varepsilon_c$ in (21) can be represented by a broadband expression. In addition, the length L and width W in Figs. 4 and 5 should cover more than one spatial cell in the FVLE model. In some special cases, however, L or W may cover a noninteger number of cells, which results in undesirable numerical errors. To overcome this problem, we have implemented a fix-point technique and a conformal technique in our code. With the fix-point technique, fix points are assigned at the boundaries of an FVLE model, which ensures that both L and W cover an integer number of cells. In some applications, the fix points may excessively increase the number of cells. To address this, the natural conformal technique of TDFIT [28] is applied in the FVLE model. With the conformal technique, the fractional part of a partially covered cell is accounted for in the FVLE model through the material average method.

C. Merits of Finite-Volume Lumped-Element Model

Compared with ZVLE solutions, the proposed FVLE model has four major advantages. First, it is more efficient than ZVLE solutions, since the LE is implemented only once, in the preprocessing stage of the FVLE-TDFIT code, rather than through an extra iteration at each time step. Second, the grid dispersion error of ZVLE solutions is effectively eliminated, since the size of an FVLE is independent of the mesh density. Third, the coupling between the lumped part and the distributed part is considered in the FVLE model; thus, the frequency response errors of ZVLE solutions can be restrained. Finally, the simplification of the physical structures in Figs. 4 and 5 is simple but effective; thus, the FVLE model can readily be integrated into existing commercial software and other solvers, such as FDTD, FEM, and DGTD.

Moreover, compared with directly modeling the realistic chip LEs [Figs. 4(a) and 5(a)], the FVLE models [Figs. 4(b) and 5(b)] have two extra merits. First, an arbitrary resistance or capacitance can be achieved in the same FVLE model just by modifying its conductivity or permittivity. Otherwise, the geometry in the simulation would have to be modified for each real-world chip LE whenever its resistance or capacitance changes. Second, some singular structures (thin sheets, fine lines, etc.) of realistic chip LEs are avoided by the FVLE model. Therefore, the number of cells can be greatly reduced by the proposed FVLE model.

IV. DISCUSSION AND VALIDATION

In Section II-B, we have shown that the grid dispersion error and the frequency response error of classic ZVLE solutions are serious in MA designs. To verify that these two challenges can be overcome by the proposed FVLE model, a realistic LE-based MA is analyzed in this section. Section IV-A demonstrates that the proposed FVLE model eliminates the grid dispersion error. Section IV-B shows that the frequency response error can be greatly restrained by the FVLE model. In Figs. 7 and 10, the numerical results without a special annotation are obtained with our self-developed TDFIT code package.

A. Unit Cell With Lumped Elements

In this example, we analyze a unit cell of an MA. As illustrated in Fig. 6(a) and (b), a symmetrical split-ring resonator

Fig. 6. Simulation of a unit cell mounted with LEs. (a) Simulation with ZVLEs. (b) Simulation with FVLEs. (c) Field distribution simulated by the proposed FVLE model. (d) Field distribution simulated by the ZVLE solution.

is applied as the metal pattern of a unit cell to realize a polarization-insensitive feature. In order to achieve perfect impedance matching, two lumped capacitors are mounted on the metal pattern to flexibly adjust the resonant frequency of the unit cell. Meanwhile, two lumped resistors are used to obtain better absorptivity through increasing its loss of joule heat. To clearly demonstrate that the proposed FVLE model is valuable in eliminating the grid dispersion error of ZVLE solutions, a ZVLE model [Fig. 6(a)] and an FVLE model [Fig. 6(b)] are simulated with the same excitation. As shown in Fig. 6(c), the unit cell is vertically illuminated by a plane wave, which carries a 10-GHz ascending cosine waveform. The near-field distributions calculated by the FVLE model and the ZVLE solution are, respectively, given in Fig. 6(c) and (d). Comparing Fig. 6(c) with (d), it is shown that the coupling effect between the LEs and the metal pattern can be observed in Fig. 6(c). However, the same effect is almost negligible in Fig. 6(d). This difference results in different frequency responses, which will be detailed in Section IV-B. In this example, our attention is focused more on the grid dispersion error. A result comparison is given in Fig. 7, where the horizontal axis indicates the cell number of the gap [d in Fig. 6(a)] where the LEs are mounted. As illustrated in Fig. 7, there is a serious grid dispersion error in the result of the ZVLE solution. On the contrary, the result obtained by the proposed FVLE model is converging and only slightly varies within a small error range. Actually, this error of the FVLE model is caused by grid discretization, which can be easily restrained by reducing the spatial interval. Different from the grid discretization error of the FVLE model, the grid dispersion error in ZVLE solutions is uncontrollable and would be


Fig. 9. MA measured by the waveguide measurement method.

Fig. 7. Comparison of the capacitor’s responding voltage to verify that the grid dispersion error in traditional ZVLE solutions can be restrained by the proposed FVLE model.

Fig. 10. Comparison of the reflection coefficients between the simulated results and the measured result.

Fig. 8. MA. (a) Simulated model and its geometrical parameters. (b) Measured model.

more serious as the spatial interval is reduced, because its grid dispersion error is related to the mesh interval (Fig. 3). In Fig. 7, there is an overlap region between Regions 1 and 2. This phenomenon proves that the ZVLE solution is still effective in the low-frequency or large-scale cases. But in the high-frequency or microscale EM-Circuit applications, the proposed FVLE model is more applicable, since the grid dispersion error can be effectively restrained. B. Metamaterial Absorber With Lumped Elements In this example, eight unit cells are fabricated on an FR-4b substrate. As shown in Fig. 8(a), the length of the substrate is 22.5 mm. Its width W = 10 mm, and thickness

H = 2 mm. A copper layer with thickness d = 0.035 mm is coated on the back of the substrate. The measured prototype is shown in Fig. 8(b). To tune the frequency of the maximum absorptivity peak from 10 to 15 GHz, 16 chip capacitors (C = 4 pF) are mounted on the metal patterns. Meanwhile, 16 chip resistors (R = 5 Ω) are applied to design a narrowband MA, which is more suitable for observing the different frequency responses of the ZVLE solutions and the proposed FVLE model. In fact, if we increase the resistance of each resistor in Fig. 8, a wideband MA can be achieved. To measure the absorptivity, a waveguide measurement method is applied in this example. As illustrated in Fig. 9, the MA is placed in two standard BJ-100 waveguide transmission lines. The measurement system is excited by a modulated Gaussian signal with a frequency range from 9 to 15 GHz. The reflection coefficients are measured by a vector network analyzer and given in Fig. 10. Compared with the measured result, it is obvious that there are two major frequency-response errors in the ZVLE solution. 1) The maximum peak of absorptivity is mismatched. 2) The sidelobes of absorptivity cannot be observed. In contrast, the result simulated by the proposed FVLE model matches the measured result much better, and the frequency-response error of the ZVLE model is greatly restrained by the FVLE model, since the coupling effect (Fig. 6) between the lumped part and the distributed part is considered in the proposed method. Finally, we should note that the difference between the measured result and the FVLE result in Fig. 10 is mainly caused by real-world assembly error and the size error between the


simulated model and the measured prototypes. In the numerical simulations, the size of the simulated model [Fig. 8(a)] is almost the same as that of a BJ-100 waveguide port. However, the length and width of the measured prototype [Fig. 8(b)] are slightly reduced in the experiments so that the absorber can be placed into the waveguide transmission line (Fig. 9). For this reason, there is an inevitable tiny gap between the measured prototype and the inner walls of the waveguide. Since this paper focuses on whether the proposed FVLE model is effective in eliminating the grid dispersion error and restraining the frequency-response error of classic ZVLE solutions, we consider the difference between the measured result and the FVLE result in Fig. 10 acceptable.

V. CONCLUSION

In this paper, we first present a more general approach to incorporate a ZVLE or its series/parallel network into the TDFIT method. Based on this approach, the major challenges of classic ZVLE solutions in MA designs are then discussed. Through mathematical derivation, we show that an arbitrary finite-volume lumped resistor or capacitor can be conveniently realized in the TDFIT method by modifying its conductivity or permittivity, respectively. Using this principle, we propose a new FVLE model to overcome the challenges of traditional ZVLE models. The effectiveness of the proposed FVLE model has been verified in many real-world applications, two of which are given in this paper.

The main innovations and contributions of this paper are summarized as follows. First, a new numerical approach is presented to implement the ZVLE series/parallel network in the TDFIT method. Compared with previous ZVLE solutions, the new ZVLE model is more general and flexible for simulating LEs in low-frequency or large-scale EM-circuit applications, because each R/L/C element and its series/parallel network can be realized by the same ZVLE-TDFIT code. Second, to overcome the challenges of ZVLE solutions in the analysis of high-frequency or microscale MAs, a novel FVLE model is proposed after rigorous mathematical derivation, and its merits are detailed in Section III-C. Third, the effectiveness of the proposed FVLE model in eliminating the grid dispersion error and the frequency response error of ZVLE solutions has been validated by experiments. These improvements are helpful for designing high-performance metamaterial devices in high-power multicarrier communications.

APPENDIX

Similar to the classic FIT [34], the TDFIT method [28] is also based on the integral form of Maxwell's equations, and its scalar iteration formulas are expressed as follows:

$$
e_d\big|_{i,j,k}^{t+\Delta t} = D_e\, e_d\big|_{i,j,k}^{t} + T_e \left( h_{d_{\mathrm{prior}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{next}}}^{+1} - h_{d_{\mathrm{next}}}\big|_{i,j,k}^{t+\Delta t/2}\Big|_{d_{\mathrm{prior}}}^{+1} - i_d^{e,s}\big|_{i,j,k}^{t+\Delta t/2} \right) \tag{24}
$$

$$
h_d\big|_{i,j,k}^{t+\Delta t/2} = D_h\, h_d\big|_{i,j,k}^{t-\Delta t/2} + T_h \left( e_{d_{\mathrm{next}}}\big|_{i,j,k}^{t}\Big|_{d_{\mathrm{prior}}}^{+1} - e_{d_{\mathrm{prior}}}\big|_{i,j,k}^{t}\Big|_{d_{\mathrm{next}}}^{+1} + i_d^{m,s}\big|_{i,j,k}^{t} \right) \tag{25}
$$

where

$$
D_e = \left( \frac{M_\varepsilon}{\Delta t} - \frac{M_\sigma}{2} \right) \Big/ \left( \frac{M_\varepsilon}{\Delta t} + \frac{M_\sigma}{2} \right); \qquad T_e = 1 \Big/ \left( \frac{M_\varepsilon}{\Delta t} + \frac{M_\sigma}{2} \right) \tag{26a}
$$

$$
D_h = \left( \frac{M_\mu}{\Delta t} - \frac{M_\kappa}{2} \right) \Big/ \left( \frac{M_\mu}{\Delta t} + \frac{M_\kappa}{2} \right); \qquad T_h = 1 \Big/ \left( \frac{M_\mu}{\Delta t} + \frac{M_\kappa}{2} \right). \tag{26b}
$$

To simplify the derivation, some new symbols have been defined in [34] as follows:

$$
e_d = \int_{L_d} \mathbf{E} \cdot d\mathbf{l}, \qquad h_d = \int_{L_d} \mathbf{H} \cdot d\mathbf{l}; \qquad i_d = \iint_{s_d} \mathbf{J} \cdot d\mathbf{s}.
$$

More details on the physical meaning of each symbol in the above formulas can be found in [28] and [31]. Here, we should note that the quantities M in (26) can be tensors. R EFERENCES [1] T. J. Cui, D. R. Smith, and R. Liu, Metamaterials: Theory, Design, and Applications. New York, NY, USA: Springer, 2010. [2] N. I. Landy, S. Sajuyigbe, J. J. Mock, D. R. Smith, and W. J. Padilla, “Perfect metamaterial absorber,” Phys. Rev. Lett., vol. 100, pp. 207402-1–207402-4, May 2008. [3] D. Ye et al., “Towards experimental perfectly-matched layers with ultrathin metamaterial surfaces,” IEEE Trans. Antennas Propag., vol. 60, no. 11, pp. 5164–5172, Nov. 2012. [4] T. Liu, X. Cao, J. Gao, Q. Zheng, W. Li, and H. Yang, “RCS reduction of waveguide slot antenna with metamaterial absorber,” IEEE Trans. Antennas Propag., vol. 61, no. 3, pp. 1479–1484, Mar. 2013. [5] B.-Y. Wang et al., “A novel ultrathin and broadband microwave metamaterial absorber,” J. Appl. Phys., vol. 116, no. 9, pp. 094504-1–094504-7, Sep. 2014. [6] H. Tao, N. I. Landy, C. M. Bingham, X. Zhang, R. D. Averitt, and W. J. Padilla, “A metamaterial absorber for the terahertz regime: Design, fabrication and characterization,” Opt. Exp., vol. 16, no. 10, pp. 7181–7188, May 2008. [7] Y. Ma, Q. Chen, J. Grant, S. C. Saha, A. Khalid, and D. R. S. Cumming, “A terahertz polarization insensitive dual band metamaterial absorber,” Opt. Lett., vol. 36, no. 6, pp. 945–947, Mar. 2011. [8] F. Costa, S. Genovesi, A. Monorchio, and G. Manara, “A circuit-based model for the interpretation of perfect metamaterial absorbers,” IEEE Trans. Antennas Propag., vol. 61, no. 3, pp. 1201–1209, Mar. 2013. [9] W. Padilla and X. Liu, “Perfect electromagnetic absorbers from microwave to optical,” SPIE Newsroom, Oct. 2010, pp. 1–3. [10] H. A. Atwater and A. Polman, “Plasmonics for improved photovoltaic devices,” Nature Mater., vol. 9, no. 3, pp. 205–213, Feb. 2010. [11] J.-J. Greffet, “Applied physics: Controlled incandescence,” Nature, vol. 478, pp. 
191–192, Oct. 2011. [12] F. Costa, A. Monorchio, and G. Manara, “Analysis and design of ultra thin electromagnetic absorbers comprising resistively loaded high impedance surfaces,” IEEE Trans. Antennas Propag., vol. 58, no. 5, pp. 1551–1558, May 2010. [13] Y. Z. Cheng, Y. Wang, Y. Nie, R. Z. Gong, X. Xiong, and X. Wang, “Design, fabrication and measurement of a broadband polarizationinsensitive metamaterial absorber based on lumped elements,” J. Appl. Phys., vol. 111, no. 4, pp. 044902-1–044902-4, Feb. 2012. [14] M. Yoo and S. Lim, “Polarization-independent and ultrawideband metamaterial absorber using a hexagonal artificial impedance surface and a resistor-capacitor layer,” IEEE Trans. Antennas Propag., vol. 62, no. 5, pp. 2652–2658, May 2014.


[15] P. B. Johns, “On the relationship between TLM and finite-difference methods for Maxwell’s equations,” IEEE Trans. Microw. Theory Techn., vol. MTT-35, no. 1, pp. 60–61, Jan. 1987. [16] R. H. Voelker and R. J. Lomax, “A finite-difference transmission line matrix method incorporating a nonlinear device model,” IEEE Trans. Microw. Theory Techn., vol. 38, no. 3, pp. 302–312, Mar. 1990. [17] W. Sui, D. A. Christensen, and C. H. Durney, “Extending the twodimensional FDTD method to hybrid electromagnetic systems with active and passive lumped elements,” IEEE Trans. Microw. Theory Techn., vol. 40, no. 4, pp. 724–730, Apr. 1992. [18] M. Piket-May, A. Taflove, and J. Baron, “FD-TD modeling of digital signal propagation in 3-D circuits with passive and active loads,” IEEE Trans. Microw. Theory Techn., vol. 42, no. 8, pp. 1514–1532, Aug. 1994. [19] P. Ciampolini, P. Mezzanotte, L. Roselli, and R. Sorrentino, “Accurate and efficient circuit simulation with lumped-element FDTD technique,” IEEE Trans. Microw. Theory Techn., vol. 44, no. 12, pp. 2207–2215, Dec. 1996. [20] J. A. Pereda, F. Alimenti, P. Mezzanotte, L. Roselli, and R. Sorrentino, “A new algorithm for the incorporation of arbitrary linear lumped networks into FDTD simulators,” IEEE Trans. Microw. Theory Techn., vol. 47, no. 6, pp. 943–949, Jun. 1999. [21] T.-L. Wu, S.-T. Chen, and Y.-S. Huang, “A novel approach for the incorporation of arbitrary linear lumped network into FDTD method,” IEEE Microw. Wireless Compon. Lett., vol. 14, no. 2, pp. 74–76, Feb. 2004. [22] O. González, J. A. Pereda, A. Herrera, and Á. Vegas, “An extension of the lumped-network FDTD method to linear two-port lumped circuits,” IEEE Trans. Microw. Theory Techn., vol. 54, no. 7, pp. 3045–3051, Jul. 2006. [23] O. González, A. Grande, J. A. Pereda, A. Herrera, and Á. Vegas, “A study on the stability and numerical dispersion of the lumpednetwork FDTD method,” IEEE Trans. Antennas Propag., vol. 57, no. 7, pp. 2023–2033, Jul. 2009. [24] R. 
Wang and J.-M. Jin, “Incorporation of multiport lumped networks into the hybrid time-domain finite-element analysis,” IEEE Trans. Microw. Theory Techn., vol. 57, no. 8, pp. 2030–2037, Aug. 2009. [25] R. Wang and J. M. Jin, “A flexible time-stepping scheme for hybrid field-circuit simulation based on the extended time-domain finite element method,” IEEE Trans. Adv. Packag., vol. 33, no. 4, pp. 769–776, Nov. 2010. [26] P. Li and L. J. Jiang, “Integration of arbitrary lumped multiport circuit networks into the discontinuous Galerkin time-domain analysis,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 7, pp. 2525–2534, Jul. 2013. [27] P. Li and L. J. Jiang, “A hybrid electromagnetics-circuit simulation method exploiting discontinuous Galerkin finite element time domain method,” IEEE Microw. Compon. Lett., vol. 23, no. 3, pp. 113–115, Mar. 2013. [28] J. You, S. Tan, J. Zhang, W. Cui, and T. Cui, “A uniform time-domain finite integration technique (TDFIT) using an efficient extraction of conformal information,” IEEE Antennas Propag. Mag., vol. 56, no. 2, pp. 63–75, Apr. 2014. [29] J. W. You, S. R. Tan, X. Y. Zhou, W. M. Yu, and T. J. Cui, “A new method to analyze broadband antenna-radome interactions in timedomain,” IEEE Trans. Antennas Propag., vol. 62, no. 1, pp. 334–344, Jan. 2014. [30] J. W. You, H. G. Wang, J. F. Zhang, S. R. Tan, and T. J. Cui, “Accurate numerical analysis of nonlinearities caused by multipactor in microwave devices,” IEEE Microw. Wireless Compon. Lett., vol. 24, no. 11, pp. 730–732, Nov. 2014. [31] J. W. You, H. G. Wang, J. F. Zhang, S. R. Tan, and T. J. Cui, “Accurate numerical method for multipactor analysis in microwave devices,” IEEE Trans. Electron Devices, vol. 61, no. 5, pp. 1546–1552, May 2014. [32] J. W. You, H. G. Wang, J. F. Zhang, Y. Li, W. Z. Cui, and T. J. Cui, “Highly efficient and adaptive numerical scheme to analyze multipactor in waveguide devices,” IEEE Trans. Electron Devices, vol. 62, no. 4, pp. 1327–1333, Apr. 2015. [33] A. 
Taflove and S. C. Hagness, Computational Electrodynamics: The Finite-Difference Time-Domain Method, 3rd ed. Norwood, MA, USA: Artech House, 2005. [34] M. Clemens and T. Weiland, “Discrete electromagnetism with the finite integration technique,” Prog. Electromagn. Res., vol. 32, no. 1, pp. 65–87, 2001.


Jian Wei You (GSM’14–M’16) received the B.Sc. degree in electrical engineering from Xidian University, Xi’an, China, in 2010, and the Ph.D. degree in electromagnetic and microwave technology from Southeast University, Nanjing, China, in 2016. He served as a Research Visitor with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX, USA, in 2011. In 2016, he joined the Department of Electronic and Electrical Engineering, University College London, London, U.K., as a Research Associate. His current research interests include computational electromagnetics, nonlinear microwave and optics, quantum metamaterial, and multiphysics and plasma physics via computer simulation.

Jian Feng Zhang was born in Shandong, China, in 1979. He received the B.E. degree from Shandong University, Shandong, in 2000, the M.E. degree from the 14th Research Institute of CETC, Nanjing, China, in 2004, and the Ph.D. degree from Southeast University, Nanjing, China, in 2008. He is currently a Lecturer with the School of Information Science and Engineering, Southeast University. His current research interests include computational electromagnetics and fast algorithms.

Wei Xiang Jiang was born in Jiangsu, China, in 1981. He received the B.S. degree in mathematics from Qingdao University, Qingdao, China, in 2004, and the M.S. degree in applied mathematics and Ph.D. degree in electrical engineering from Southeast University, Nanjing, China, in 2007 and 2011, respectively. He was promoted to Professor in 2015. His current research interests include electromagnetic theory and metamaterials. Prof. Jiang served as the Organization Committee Co-Chair of the International Workshop on Metamaterials in 2012. His research has been selected as Research Highlights in the Europhysics News Journal, the Journal of Physics D: Applied Physics, and Applied Physics Letters.

Hui Feng Ma was born in Jiangsu, China, in 1981. He received the B.Sc. degree in electronics engineering from the Nanjing University of Science and Technology, Nanjing, China, in 2004, and the Ph.D. degree from Southeast University, Nanjing, China, in 2010. He joined the School of Information Science and Engineering, Southeast University, in 2010, where he became a Professor in 2015. He has published his work in Nature Communications, Applied Physic Letters, and Optics Express. His current research interests include metamaterial antennas, invisible cloaks, and other novel metamaterial functional devices, including theoretical design and experimental realization. Prof. Ma’s research of 3-D ground carpet cloak realized by the use of metamaterials has been selected as one of the 10 Breakthroughs of Chinese Science in 2010.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

Wan Zhao Cui received the B.S. degree in electrical engineering from the Lanzhou Railway Institute, Lanzhou, China, in 2002, and the Ph.D. degree in electrical engineering from Xi’an Jiaotong University, Xi’an, China, in 2006. He has been with the China Academy of Space Technology, Xi’an, China, since 2006. He has been the Vice Director of the National Key Laboratory of Science and Technology on Space Microwave since 2008. Since 2013, he has been a Professor with the China Academy of Space Technology. His current research interests include artificial metamaterials, high-power microwave components, and artificial intelligence.

Tie Jun Cui (M’98–SM’00–F’15) received the B.Sc., M.Sc., and Ph.D. degrees from Xidian University, Xi’an, China, in 1987, 1990, and 1993, respectively, all in electrical engineering. In March 1993, he joined the Department of Electromagnetic Engineering, Xidian University, where he became an Associate Professor in November 1993. From 1995 to 1997, he was a Research Fellow with the Institut fur Hochstfrequenztechnik und Elektronik, University of Karlsruhe, Karlsruhe, Germany. In 1997, he joined the Center for Computational Electromagnetics, Department of Electrical and Computer Engineering, University of Illinois at Urbana–Champaign, Champaign, IL, USA, first as a Post-Doctoral Research Associate and then a Research Scientist. In 2001, he became a Cheung-Kong Professor with the Department of Radio Engineering, Southeast University, Nanjing, China. He is currently the Associate Dean of the School of Information Science and Engineering, and the Associate Director of the State Key Laboratory of Millimeter Waves.


He is a Co-Editor of Metamaterials-Theory, Design, and Applications (Springer, 2009). He has authored six book chapters. He has published over 300 peer-reviewed journal papers in Science, PNAS, Nature Communications, Physical Review Letters, and IEEE TRANSACTIONS. His research has been cited more than 12 000 times. His current research interests include metamaterials, computational electromagnetics, wireless power transfer, and millimeter-wave technologies. Prof. Cui was the recipient of a Research Fellowship from the Alexander von Humboldt Foundation, Bonn, Germany, in 1995, a Young Scientist Award from the International Union of Radio Science in 1999, a Cheung Kong Professorship under the Cheung Kong Scholar Program by the Ministry of Education, China, in 2001, the National Science Foundation of China for Distinguished Young Scholars in 2002, the Special Government Allowance by the Department of State, China, in 2008, the Award of Science and Technology Progress from the Shaanxi Province Government in 2009, a May First Labour Medal by the Jiangsu Province Government in 2010, the First Prize of Natural Science from the Ministry of Education, China, in 2011, and the Second Prize of National Natural Science, China, in 2014. According to Elsevier, he is one of the Most Cited Chinese Researchers. His research has been selected as one of the 10 Breakthroughs of China Science in 2010, Best of 2010 in the New Journal of Physics, and Research Highlights in Europhysics News, the Journal of Physics D: Applied Physics, Applied Physics Letters, and Nature China. His work has been reported by Nature News, Science, MIT Technology Review, Scientific American, and New Scientist. He is an active Reviewer for Science, Nature Materials, Nature Photonics, Nature Physics, Nature Communications, Physical Review Letters, Advanced Materials, and a series of the IEEE TRANSACTIONS. He was an Associate Editor of the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING and a Guest Editor of Science China Information Sciences. He served as an Editorial Staff Member of the IEEE Antennas and Propagation Magazine, and is on the Editorial Boards of Progress in Electromagnetic Research and the Journal of Electromagnetic Waves and Applications. He served as the General Co-Chair of the International Workshops on Metamaterials in 2008 and 2012, and a TPC Co-Chair of the Asian–Pacific Microwave Conference in 2005 and the Progress in Electromagnetic Research Symposium in 2004.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES


Designing Optimal Surface Currents for Efficient On-Chip mm-Wave Radiators With Active Circuitry

Kaushik Sengupta, Member, IEEE, and Ali Hajimiri, Fellow, IEEE

Abstract— Integrated antennas have become an attractive solution as the electromagnetic (EM) interface for mm-Wave and terahertz ICs. However, on-chip antennas lying at the interface between two different dielectrics (such as air and substrate) can channel most of their power into multiple nonradiative surface-wave modes, reducing efficiency drastically. In this paper, we consider the following problem: given a dielectric substrate, what is the theoretical optimal 2-D surface-current configuration that collectively suppresses surface waves and maximizes radiation efficiency with the desirable radiation pattern? This paper also discusses demonstrative examples of a circuit-EM codesign approach to realize the approximation of such current configurations. Measurement results of radiating arrays in CMOS at mm-Wave frequencies (250–300 GHz) are presented and compared with theoretical predictions.

Index Terms— Beamforming, CMOS, distributed active radiation, EIRP, near field, on-chip antenna, power combining, power generation, radiation, substrate modes, terahertz (THz).

I. INTRODUCTION

IN THE past few years, high-frequency integrated systems in the millimeter-wave (mm-Wave) and terahertz (THz) range have demonstrated many promising new applications in high-speed wireless communication, imaging, sensing, health care, and global environment monitoring [1]–[16]. One of the critical challenges has been the electromagnetic (EM) interface. Notably, at such high frequencies, the die size itself can become comparable with the wavelength of operation, and therefore, in principle, the antennas can be integrated directly into the chip. However, an on-chip antenna embedded in a dielectric interface between air and silicon can excite multiple substrate modes, which can critically affect radiation patterns, efficiency, and the antenna impedance. The surface-wave effect, therefore, is traditionally mitigated using off-chip silicon lenses [5], [6], [13] or by using high-dielectric superstrates [16]. While a single oscillating current element can excite these unwanted modes, a complex 2-D current configuration $J(x,y)e^{j\psi(x,y)}$ can collectively suppress them. The suppression of surface waves with careful passive antenna designs

Manuscript received August 17, 2015; revised January 16, 2016 and April 28, 2016; accepted May 22, 2016.
K. Sengupta is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: [email protected]).
A. Hajimiri is with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMTT.2016.2573278

Fig. 1. Active EM current synthesis on chip for desirable radiated and surface-wave fields.

and employing antenna arrays has been demonstrated in [17]. However, modern state-of-the-art IC technology allows the integration of a very large number of high-speed transistors and fine-resolution back-end-of-the-line metal structures, and it may be possible to create a more distributed and multifunctional EM current configuration at length scales much smaller than the wavelength, compared with the classical passive antenna structures or periodic array configurations. The concept is shown in Fig. 1, which shows a multitude of coupled high-frequency active devices driven with amplitude and phase control into a distributed EM structure that collectively generates an efficient radiating current surface. In this paper, we consider the following problem: for a given dielectric substrate and frequency of operation, what is the optimal $J(x,y)e^{j\psi(x,y)}$ for maximum radiation efficiency and desirable radiation patterns, and how do we synthesize such a current configuration on-chip through circuit-EM codesign techniques?

This paper is organized as follows. In Section II, we introduce the framework of describing the dielectric environment by its EM impulse response and transfer function. In Section III, we discuss the methods to reach near-optimal current surface configurations using a circuit-EM codesign approach. Section IV presents the measurement results of some demonstrative examples of power-generating and radiating silicon ICs in the range of 250–300 GHz.

II. GENERAL ANALYSIS FRAMEWORK: ELECTROMAGNETIC IMPULSE RESPONSE

Radiating elements can excite multiple surface-wave modes inside a dielectric (grounded/ungrounded), due to which the radiation efficiency [$\eta = P_{rad}/(P_{rad} + P_{sw})$] of an integrated

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 2. Distribution of generic TE and TM modes excited by an infinitesimal Hertzian dipole placed on a semi-infinite dielectric with finite thickness (h).

antenna can be substantially less than one even for a lossless dielectric of thickness $h$ [18]–[27]. The surface-wave field configuration of a 2-D radiating surface can be evaluated from the dyadic Green’s function, which captures the field configuration of a single Hertzian current element, which we refer to as $h_{rad}$ and $h_{sw}$, as shown in Fig. 2. The closed-form expressions for radiative and surface-wave fields ($h_{rad}$ and $h_{sw}$) for a current impulse on a grounded dielectric have been covered in the previous works by treating it as the classical Sommerfeld problem of an antenna on lossy earth [28], [29]. The determination of exact surface waves requires the calculations of residues of the poles of Sommerfeld integrals [30], and the extensive work on antennas embedded in multilayered dielectrics based on this method has been reported in [31]–[50]. In general, in the far-field regime, due to power conservation, Green’s dyadic varies as $h_{rad}(r,\phi,\theta) = h_{rad}(\phi,\theta)\,(e^{-jk_0 r}/r)$ (spherical coordinates) and $h_{sw}(\rho,\phi,z) = h_{sw}(\phi,z)\,(e^{-j\beta_{sw}\rho}/\sqrt{\rho})$ (polar coordinates), where $k_0$ and $\beta_{sw}$ represent the wavenumbers of the respective modes. From the spectral properties of Green’s function, given the EM radiative and surface-wave impulse responses ($h_{rad}$ and $h_{sw}$), the radiated fields for a general surface-current profile $J(x,y)e^{j\psi(x,y)}$ at any far-field point $P(r,\phi,\theta)$ in space can be represented as

$$F_{rad}(\xi,\gamma) = H_{rad}(\xi,\gamma)\,J(\xi,\gamma) \tag{1}$$

where $\xi = k_0\sin\theta\cos\phi$, $\gamma = k_0\sin\theta\sin\phi$, $J(\xi,\gamma) = \mathcal{F}\big(J(x,y)e^{j\psi(x,y)}\big) = \int_y\int_x J(x,y)e^{j\psi(x,y)}e^{j(x\xi+y\gamma)}\,dx\,dy$ is the Fourier transform of the surface-current profile, and $F_{rad}(\xi,\gamma)$ and $H_{rad}(\xi,\gamma)$ are the transformations of $F_{rad}(\phi,\theta)$ and $h_{rad}(\phi,\theta)$ from the spatial domain into the $(\xi,\gamma)$ domain. Therefore, given a transfer function $H_{rad}(\xi,\gamma)$ and a desired EM radiative field profile described by $F_{rad}(\xi,\gamma)$, one can evaluate the necessary surface-current profile on the chip as

$$J(x,y)e^{j\psi(x,y)} = \mathcal{F}^{-1}\left[\frac{F_{rad}(\xi,\gamma)}{H_{rad}(\xi,\gamma)}\right] \tag{2}$$

if the inverse Fourier transform exists. It can be noted that this solution, even if it exists, can be impractical to realize. Since there is no assumption of continuity of the currents at any point, the realization of such a surface may necessitate multiple tiny radiators with precise amplitude and phase control, which may

Fig. 3. EM response in a dielectric can be completely described by multiple parallel spectral filters [$H_{rad}(\xi,\gamma)$ and $H_{sw}(\xi,\gamma)$]. The input to these filters is the planar current configuration $J(x,y)e^{j\psi(x,y)}$, and the output represents the spatial distribution of all the modes.

be very inefficient to realize. In Section III, we will discuss practical examples, where the desired surface-current profile can be approximated to a great extent by efficient actively driven novel EM structures. This representation also allows us to evaluate the radiated power as the output energy of the $H_{rad}(\xi,\gamma)$ filter whose input is $J(\xi,\gamma)$

$$P_{rad} = \int_S |F_{rad}(\phi,\theta)|^2\,dS_{\theta,\phi} = \int_S |H_{rad}(\xi,\gamma)|^2\,|J(\xi,\gamma)|^2\,dS_{\xi,\gamma}. \tag{3}$$
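The spectral-filter view of (1) and (3) can be sketched numerically. The snippet below is only an illustration under stated assumptions: the aperture size, the grids, and in particular `H_toy` are hypothetical placeholders (the actual $H_{rad}$ follows from the Appendix expressions, which are not reproduced here), and restricting the integral to the visible region $\xi^2+\gamma^2 \le k_0^2$ is an assumption of this sketch. The function names `current_spectrum` and `radiated_power` are likewise made up for this example.

```python
import numpy as np

def current_spectrum(J, x, y, xi, gamma):
    """Discretized Fourier integral of the surface current, using the sign
    convention of the text: J(xi, gamma) = sum J(x, y) e^{+j(x*xi + y*gamma)} dx dy.
    J is indexed as J[ix, iy]; x and y are uniformly spaced sample points."""
    dx, dy = x[1] - x[0], y[1] - y[0]
    kernel_x = np.exp(1j * np.outer(xi, x))       # shape (n_xi, n_x)
    kernel_y = np.exp(1j * np.outer(gamma, y))    # shape (n_gamma, n_y)
    return kernel_x @ J @ kernel_y.T * dx * dy    # shape (n_xi, n_gamma)

def radiated_power(H_rad, J_spec, xi, gamma, k0):
    """Grid version of (3): sum |H_rad|^2 |J|^2 over the visible region
    xi^2 + gamma^2 <= k0^2 (assumption: only this region maps to real angles)."""
    XI, GA = np.meshgrid(xi, gamma, indexing="ij")
    visible = XI**2 + GA**2 <= k0**2
    dxi, dga = xi[1] - xi[0], gamma[1] - gamma[0]
    return np.sum(np.abs(H_rad)**2 * np.abs(J_spec)**2 * visible) * dxi * dga

# --- illustrative numbers (all assumptions) ---
lam0 = 1.0e-3                        # roughly the free-space wavelength at 300 GHz, in m
k0 = 2 * np.pi / lam0
L = 2.0e-3                           # hypothetical square aperture side, in m
n = 64
dx = L / n
x = (np.arange(n) + 0.5) * dx - L / 2    # midpoint samples across the aperture
y = x.copy()
xi = np.linspace(-k0, k0, 81)
gamma = np.linspace(-k0, k0, 81)

# Constant-amplitude, constant-phase current surface.
J_uniform = np.ones((n, n), dtype=complex)
J_spec = current_spectrum(J_uniform, x, y, xi, gamma)

# Placeholder filter magnitude, NOT the substrate's real transfer function.
XI, GA = np.meshgrid(xi, gamma, indexing="ij")
H_toy = np.sqrt(np.clip(1 - (XI**2 + GA**2) / k0**2, 0, None))
P_rad = radiated_power(H_toy, J_spec, xi, gamma, k0)

peak = np.unravel_index(np.argmax(np.abs(J_spec)), J_spec.shape)
print("spectral peak index:", peak, " P_rad (arbitrary units):", P_rad)
```

For the uniform current, the spectrum peaks at the center of the $(\xi,\gamma)$ grid, i.e. at $\xi = \gamma = 0$, which is the discrete counterpart of a broadside-directed current surface.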

In a similar fashion, it can be shown that the surface-wave field configuration of a given mode excited by an arbitrary $J(\xi,\gamma)$ can be represented as

$$F_{sw}(\xi,\gamma) = H_{sw}(\xi,\gamma)\,J(\xi,\gamma)\big|_{\gamma^2+\xi^2=\beta_{sw}^2} \tag{4}$$

where $\xi = \beta_{sw}\cos\phi$, $\gamma = \beta_{sw}\sin\phi$, and $H_{sw}(\xi,\gamma)$ is the surface-wave transfer function of the desired mode. Unlike the radiative case, $\gamma$ and $\xi$ are not independent but related as $\gamma^2 + \xi^2 = \beta_{sw}^2$. In summary, given the dielectric substrate, the radiative and various surface-wave excitations can be represented by multiple parallel spectral filters [$H_{rad}(\xi,\gamma)$ and $H_{sw}(\xi,\gamma)$] excited by the 2-D current distribution $J(x,y)e^{j\psi(x,y)}$. The output of the filters represents the spatial distribution of all possible modes, and therefore, finding the optimally efficient current distribution ($J_{opt}$) reduces to finding the input which maximizes the output energy of the $H_{rad}$ filter, while minimizing the output energy of the surface-wave filters $H_{sw}$. This is shown in Fig. 3. The analytical expressions for the radiative and surface-wave impulse responses ($h_{rad}$ and $h_{sw}$) for a grounded lossless infinite dielectric have been covered in [17] (summarized in the Appendix) and allow us to evaluate the various excited mode configurations for a given dielectric and frequency, thereby


Fig. 4. Total surface-wave power broken down into its different modes and the total radiated power for a Hertzian dipole for different substrate thicknesses.

Fig. 6. Optimal surface-current configuration maximizing the radiated power and minimizing the surface-wave power. The 1-D temporal analogy is only used as a surrogate for easier understanding and intuition on the 2-D spatial problem at hand.

Fig. 5. Radiation efficiency of a Hertzian dipole at 300 GHz against different substrate thicknesses.

constructing the desirable $J_{opt}$. As an illustrative example, Fig. 4 shows the power trapped in the individual surface-wave modes, normalized to the maximum radiated power, for different substrate heights at 300 GHz for a planar grounded silicon substrate. It can be seen that the radiated power is maximized for $h = 70$ μm and $h = 220$ μm. At these substrate thicknesses, the waves radiated at 300 GHz, after suffering reflection from the ground below, add coherently at the top of the substrate. This is also illustrated in the radiation efficiency variation in Fig. 5, which shows that the peak radiation efficiency of a current element in an optimized lossless substrate is approximately 30% for $h > \lambda_{sub}/4$. The efficiency can theoretically be improved with thinner dielectrics, but the radiation resistance can then become much smaller than any practical conductor losses, reducing efficiency drastically. In a practical implementation, efficiency will be further degraded due to substrate losses. The power carried by different modes is shown in Fig. 4, which shows that for an optimized substrate height of 220 μm, the first excited TM0 mode (strongest for $h = 70$ μm) is responsible for only 15% of the surface-wave power, while TM1 contributes 52% and TE1 contributes the remaining 33%. In the following section, we focus on lossless substrates to emphasize the effect of the excitation of surface waves. However, the case of a lossy substrate can be encompassed through a complex dielectric coefficient, as shown in Section IV.

Therefore, as shown in Fig. 3, the EM response of a substrate can be represented by a set of spectral transfer functions $H_{rad}(\xi,\gamma)$ and $H_{sw}(\xi,\gamma)$, where $|H(\xi,\gamma)|^2|J(\xi,\gamma)|^2$ represents the output power flux density of the corresponding mode. We are interested in shaping $J(\xi,\gamma)$ with a given input power ($\iint |J(\xi,\gamma)|^2\,d\xi\,d\gamma = C$) so as to maximize the area under the curve $|H_{rad}(\xi,\gamma)|^2|J(\xi,\gamma)|^2$, while simultaneously minimizing the area under $|H_{sw}(\xi,\gamma)|^2|J(\xi,\gamma)|^2$ (Fig. 6). Evidently, simultaneous optimization may not always be possible, but a cost function can be defined that considers radiated as well as leaked surface-wave power. For a given spectral filter $H_{rad}(\xi,\gamma)$, the optimal current configuration $J_{opt,rad}$ that maximizes the filter output energy (which corresponds to radiated power) can be obtained by concentrating all the spectral components of the input signal in that portion of the spectrum where the filter response is strongest, as designated by $H_{rad,max}$ in Fig. 6. This can also be proved by splitting the input spectrum and the filter response into discrete segments $((\xi_1,\gamma_1),(\xi_2,\gamma_1),\ldots,(\xi_i,\gamma_j),\ldots)$ and $(H_{1,1},H_{2,1},\ldots,H_{i,j},\ldots)$ and evaluating the output energy as

$$E_y = \frac{1}{4\pi^2}\iint |H(\xi,\gamma)|^2\,|J(\xi,\gamma)|^2\,d\xi\,d\gamma \approx \frac{1}{4\pi^2}\sum_{i,j}|H_{i,j}|^2\,|J_{i,j}|^2 \le \frac{|H_{rad,max}|^2\,E_x}{4\pi^2}$$

where $E_x = \sum_{i,j}|J_{i,j}|^2$ is the input signal energy. Therefore, radiated power is maximized by designing the current configuration as a spectral impulse located at $(\xi_{max},\gamma_{max})$, as shown in Fig. 6. In a similar fashion, the surface-wave power can be minimized by designing the current configuration as an impulse located at $(\xi_{min},\gamma_{min})$, which represents the location of the minimum of the relevant or dominant $H_{sw}$. In case both cannot be achieved simultaneously, a numerical procedure can be applied to shape $J(\xi,\gamma)$ so that it optimizes a cost function which considers both radiated and leaked powers. Two such examples are shown in Fig. 7 for two silicon-based dielectric thicknesses of $h = 220$ μm and $h = 380$ μm. For $h = 220$ μm, the radiation pattern from (10) and (11) has a broadside maximum ($\phi_{max} = 0$, $\theta_{max} = 0$) [equivalently, ($\xi_{max} = 0$, $\gamma_{max} = 0$)], which implies $J_{opt}(\xi,\gamma)$ is an impulse located at ($\xi_{max} = 0$, $\gamma_{max} = 0$). In the $(x, y)$ plane, this is


Fig. 7. Optimal surface-current configuration for different radiative impulse responses in different geometries. (a) Radiative impulse response $h_{rad}(\theta,\phi)$. (b) Radiative transfer function $H_{rad}(\xi,\gamma)$ and $J(\xi,\gamma)$ for maximum radiated power. (c) Equivalent $J(x,y)$. (d) Surface-wave excitation for such $J(x,y)$.

Fig. 8. Circuit-EM codesign. Realization of an optimal 2-D surface current through an array of traveling-wave radiators.

equivalent to a constant current surface, as shown in Fig. 7. This choice of 2-D current also simultaneously minimizes surface-wave excitation by minimizing the overlap of $|J(\xi,\gamma)|^2$ and $|H_{sw}(\xi,\gamma)|^2$ on the circle $\gamma^2 + \xi^2 = \beta_{sw}^2$ (Fig. 7). The intuitive argument behind the result is that in a constant current surface, for each current element, there exists another current element separated by $\lambda_{TM}/2$ and $\lambda_{TE}/2$, so that the elements suppress each other's substrate modes. For a finite substrate with finite surface area, the optimal current remains similar since in that case, $|J(\xi,\gamma)|^2 \sim |(\sin(\gamma)/\gamma)(\sin(\xi)/\xi)|^2$ still concentrates the maximum power at ($\xi_{max} = 0$, $\gamma_{max} = 0$). For a substrate thickness of $h = 380$ μm, whose $H_{rad}(\xi,\gamma)$ is shown in Fig. 7, the optimal radiating current is given by $J_{opt,rad}(\xi,\gamma) = \delta(\xi - \xi_{max}, \gamma - \gamma_{max})$, which corresponds to $J_{opt,rad}(x,y) = Ce^{-j\xi_{max}x}e^{-j\gamma_{max}y} = Ce^{-j\gamma_{max}y}$ (since $\xi_{max} = 0$ here) and represents an embodiment of the continuous phased-array surface. However, unlike the previous example, $J_{opt,rad}(\xi,\gamma)$ has a strong overlap with $H_{sw}$ (TE) at $|\gamma| = 1$, $|\xi| = 0$, implying a strong TE mode excitation, as shown in Fig. 7. Therefore, the optimal current solution for maximizing radiation efficiency can only be analyzed by simultaneously considering both the radiative and surface-wave impulse responses together.
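The spectral-concentration argument of this section (output energy bounded by $|H_{rad,max}|^2 E_x$, with the bound reached by a spectral impulse) can be checked on a small grid. This is a minimal sketch, assuming a made-up Gaussian-shaped filter magnitude `H` in place of the substrate's real $H_{rad}$; the constant factor $1/(4\pi^2)$ is dropped from both sides, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical smooth filter magnitude on a discretized (xi, gamma) grid.
xi = np.linspace(-1.0, 1.0, 41)
gamma = np.linspace(-1.0, 1.0, 41)
XI, GA = np.meshgrid(xi, gamma, indexing="ij")
H = np.exp(-((XI - 0.3)**2 + GA**2) / 0.2)   # toy |H_rad|, strongest near xi = 0.3

def output_energy(H, J):
    """Discrete sum_{i,j} |H_ij|^2 |J_ij|^2 (common constant factors dropped)."""
    return np.sum(np.abs(H)**2 * np.abs(J)**2)

def input_energy(J):
    """Discrete input energy E_x = sum_{i,j} |J_ij|^2."""
    return np.sum(np.abs(J)**2)

# A random input spectrum, normalized to unit input energy.
J_rand = rng.normal(size=H.shape) + 1j * rng.normal(size=H.shape)
J_rand /= np.sqrt(input_energy(J_rand))

# A spectral impulse with the same input energy, placed where |H| is largest.
J_imp = np.zeros_like(J_rand)
i_max = np.unravel_index(np.argmax(np.abs(H)), H.shape)
J_imp[i_max] = 1.0

bound = np.max(np.abs(H))**2 * input_energy(J_imp)
print("random input :", output_energy(H, J_rand))
print("impulse input:", output_energy(H, J_imp), " bound:", bound)
```

Any input with the same energy stays at or below the bound, while the impulse at the filter maximum attains it. The same construction with `np.argmin` on a surface-wave filter magnitude would locate the impulse that minimizes that filter's output.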

III. CIRCUIT-ELECTROMAGNETIC CODESIGN FOR OPTIMAL SURFACE CONFIGURATION: ACTIVE SYNTHESIS OF OPTIMAL CURRENT SURFACES

While the spectral transfer functions allow us to evaluate, qualitatively and quantitatively, the radiative properties of the current surface, the realization of this optimal solution is neither unique nor straightforward, but it sets a guiding principle on which an approximation of such a current surface can be realized. The following example is an illustration of this. Based on the design example of the optimal current surface in Fig. 7, the continuous current surface is first discretized in 2-D space with multiple discrete current elements separated by approximately $\lambda_{TM}/2$ and $\lambda_{TE}/2$, as shown in Fig. 8. This is done to suppress the strongest substrate modes TM1 and TE1, as shown in Fig. 4, for a dielectric thickness of $h = 220$ μm. It is neither necessary nor practical to realize each current element as an individual antenna with a separate driving signal. An example of a more practical design methodology is illustrated here. Consider one such element in Fig. 8, where the four current vectors are separated by $\lambda_{TE}/2$. Since substrate mode suppression relies on the separation among the current elements


Fig. 9. Full-wavelength traveling-wave structure embedded in dielectric sustains traveling-wave current distribution in oxide, while the substrate modes are excited in the silicon substrate.

A. Circuit Design and Optimal Impedance Matching

Fig. 10. Variation of the excited TM and TE mode wavelengths inside grounded silicon substrate at 300 GHz.

oriented in the same direction, the two current elements in the x-direction are phase rotated by $\pi/2$ without significantly degrading efficiency, as shown in Fig. 8. This reasoning is only approximate, since in any other direction the two orthogonal current elements do interact as their surface fields superimpose. If we now allow this configuration to flow on a metal loop, it represents an approximation of a full-wavelength rotary traveling-wave antenna current distribution, as shown in Fig. 9. The instantaneous current distribution in such a rotary traveling-wave antenna is shown in Fig. 9. The current distribution follows a $\sin(\phi)$ variation with angle, and the strongest current elements are located at $\phi = \pi/2$ and $\phi = 3\pi/2$ in the same direction. The important condition for such a traveling-wave structure to emulate the four current element group in Fig. 8 is that the loop diameter needs to be $\lambda_{TE}/2$ to cancel the strong TE mode, while the circumference needs to be one wavelength long. This can be simultaneously achieved since the loop antenna is embedded in an oxide environment interfacing with air, while the substrate modes propagate in the silicon substrate, as shown in Fig. 9. Due to the ratio of the dielectric constants of oxide ($\varepsilon_{r,oxide} = 4$) and silicon ($\varepsilon_{r,si} = 11.7$), if $2a = \lambda_{TE}/2 \approx \lambda_{si}/2$, then $2\pi a \approx \pi\lambda_{si}/2 \approx \lambda_{ox}$ [55] (Fig. 10). As we will see later, this current configuration can be efficiently realized in actively driven multifunctional EM structures. This illustrates an example, where an optimal continuous 2-D current surface is progressively approximated to a distribution which can be realized with active-EM structures, in this case the array of distributed traveling-wave radiators. With time, the entire configuration just rotates about the perpendicular axis due to its traveling-wave nature. As a result, the surface-wave profile as well as the radiation pattern is circularly symmetric.
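The loop-sizing condition above can be checked with simple arithmetic. The sketch below assumes plane-wave wavelengths in bulk oxide and silicon, $\lambda = \lambda_0/\sqrt{\varepsilon_r}$, as stand-ins for the guided wavelengths (the actual mode wavelengths are those plotted in Fig. 10), and uses the approximation $\lambda_{TE} \approx \lambda_{si}$ stated in the text.

```python
import math

c = 299_792_458.0              # speed of light in vacuum, m/s
f = 300e9                      # operating frequency from the text, Hz
eps_ox, eps_si = 4.0, 11.7     # relative permittivities quoted in the text

lam0 = c / f
lam_si = lam0 / math.sqrt(eps_si)    # plane-wave wavelength in silicon (assumption)
lam_ox = lam0 / math.sqrt(eps_ox)    # plane-wave wavelength in oxide (assumption)

diameter = lam_si / 2                # 2a = lambda_si / 2, using lambda_TE ~ lambda_si
circumference = math.pi * diameter   # 2*pi*a
ratio = circumference / lam_ox       # 1.0 would mean exactly one oxide wavelength

print(f"2a = {diameter * 1e6:.1f} um, 2*pi*a = {circumference * 1e6:.1f} um, "
      f"lambda_ox = {lam_ox * 1e6:.1f} um, ratio = {ratio:.3f}")
```

Under these assumptions the ratio reduces to $\pi\sqrt{\varepsilon_{r,oxide}}/(2\sqrt{\varepsilon_{r,si}})$, which is within about 10% of unity, consistent with the approximate equality $2\pi a \approx \lambda_{ox}$.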

The traveling-wave radiator can be realized by pumping currents over the circumference with appropriate phase distribution, as shown in Fig. 8. The phase distribution can be generated in multiple ways. In [51] and [52], the distributed ports are unilaterally driven to sustain traveling-wave radiation, as shown in Fig. 9. In [53]–[57], the driving signals were generated from the loop itself, by turning the loop into a resonator and a traveling-wave oscillator at the fundamental frequency $f_0$ and a radiator at the desired harmonic frequency. Such an active EM structure is shown in Fig. 9, where signal generation, frequency multiplication, and radiation of the second harmonic in-phase currents happen simultaneously, while the fundamental mode gets filtered quasi-optically. The close circuit-EM codesign opens up the space for active synthesis of current configurations for the desired EM behavior at different frequencies and also simultaneous impedance matching for optimal power transfer. This is especially important at THz frequencies, where parasitic detuning, modeling inaccuracies, and ohmic losses can critically degrade efficiency. Subsuming the matching networks into the radiator structure can eliminate the problem. This is an enabling feature of a traveling-wave radiator, where symmetry of the driving structure ensures uniform radiative impedance over the loop. This is shown in Fig. 9, where the antenna is designed to produce optimal power-generation radiation impedance for maximizing conversion efficiency from the driving signal at 150 GHz to radiated second harmonic power at 300 GHz. Simultaneously, due to the traveling-wave nature, there exist no current nulls or resonance peaks (unlike in a standing-wave radiator). This yields broadband impedance characteristics, unlike on-chip patch antennas [58]. Due to such favorable driving-point impedances, desirable radiation properties, and surface-wave suppression, a traveling-wave multiport driven radiator can be effectively realized as a high-efficiency integrated THz source and antenna [51]. Now, we will quantitatively analyze the radiative and surface-wave modes of a traveling-wave radiator.

B. Surface-Wave Suppression

In Fig. 9, assuming that the instantaneous current distribution along the loop is $I(\phi) = |I_0\sin(\phi)|$, the strongest current elements at $\phi = \pi/2$ and $\phi = 3\pi/2$ contribute to the strongest TE modes along the y-axis (Fig. 2). By design, these should be canceled mutually. From (13), the TE mode field due to


Fig. 11. Variation of the strong TE1 power suppression with the radius of the traveling-wave radiator at 300 GHz with $h = 220$ μm. The minimum $a \approx \lambda_{TE1}/4$ corresponds to one full-wavelength radiator in the oxide.

Fig. 12. Total surface-wave power broken down into its different modes and the total radiated power for a traveling-wave loop for different substrate thicknesses. The plot shows almost complete suppression of the TE1 mode.

the traveling-wave loop of current is given by

$$H_{zd,loop}(a) = \max_{|\phi_O|\le\pi/2}\left|\int_0^{2\pi} H_{zd0}\,\sin\phi\,\cos(\phi-\phi_O)\,e^{-ja\beta_{TE}\cos(\phi-\phi_O)}\,d\phi\right| \tag{5}$$

where the maximization is done over the observation angle $\phi_O$ because of the circularly symmetric nature of the surface-wave fields. As expected, the TE1 mode is considerably suppressed when the phase shift corresponding to one diameter traversal ($2a\beta_{TE1}$) is approximately 180° ($2a \approx \lambda_{TE1}/2$), as shown in Fig. 11. It can be seen that the angle is slightly more than 180°, due to the contribution of the current elements along the circumference of the loop and not just at $\phi = \pi/2$ and $\phi = 3\pi/2$. The TM mode, however, has a maximum in the $\phi = 0$ direction, as shown in Fig. 2. Along this direction, the maximum current elements at $\phi = \pi/2$ and $\phi = 3\pi/2$ excite the TM mode in phase. Therefore, we expect much less cancelation of the TM modes.

C. Total Radiated Power and Radiation Efficiency

In free space, due to much longer wavelengths ($\lambda_0 \approx 3\lambda_{TE,1}$), the maximum phase difference between two current elements at any angle is much less than $\pi$. Therefore, the radiated power is hardly suppressed. The resultant radiated far fields from the traveling-wave loop can be calculated from (10) and (11) as

$$E_{\theta,loop}(a,\theta,\phi) = F_1(\theta)\max_{|\phi_O|\le\pi/2}\left|\int_0^{2\pi}\sin\phi\,\sin(\phi-\phi_O)\,e^{-ja\beta_0\sin\theta\cos(\phi-\phi_O)}\,d\phi\right|$$

$$E_{\phi,loop}(a,\theta,\phi) = F_2(\theta)\max_{|\phi_O|\le\pi/2}\left|\int_0^{2\pi}\sin\phi\,\cos(\phi-\phi_O)\,e^{-ja\beta_0\sin\theta\cos(\phi-\phi_O)}\,d\phi\right|. \tag{6}$$

As expected and shown in Fig. 12, at the designed substrate thickness of 220 μm, we see an almost complete suppression of the TE1 mode (compared with Fig. 4) for a distributed rotary traveling-wave radiator.
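The dependence of the TE-mode loop integral in (5) on the loop radius can be explored numerically. The following is a minimal sketch, assuming the integral runs over the full loop ($0$ to $2\pi$), that the magnitude is taken before maximizing over $\phi_O$, and that the result is normalized by $H_{zd0}$; the function name `te_loop_field` and the sampling densities are choices made for this example only.

```python
import numpy as np

def te_loop_field(x, n_phi=360, n_obs=91):
    """Magnitude of the loop integral in (5), normalized by H_zd0.
    x = a * beta_TE is the electrical radius of the loop."""
    # Uniform samples over one full period; for a periodic integrand the
    # plain average times 2*pi is an accurate quadrature.
    phi = np.arange(n_phi) * (2 * np.pi / n_phi)
    phi_obs = np.linspace(-np.pi / 2, np.pi / 2, n_obs)
    d = phi[None, :] - phi_obs[:, None]                   # phi - phi_O
    integrand = np.sin(phi)[None, :] * np.cos(d) * np.exp(-1j * x * np.cos(d))
    integral = integrand.mean(axis=1) * 2 * np.pi         # one value per phi_O
    return np.max(np.abs(integral))                       # maximize over phi_O

x_grid = np.linspace(0.5, 3.0, 251)                       # sweep of a * beta_TE
field = np.array([te_loop_field(x) for x in x_grid])
x_min = x_grid[np.argmin(field)]

print(f"minimum near a*beta_TE = {x_min:.2f} rad, "
      f"2*a*beta_TE = {np.degrees(2 * x_min):.0f} deg")
```

In this sketch the minimum falls at a diameter phase shift somewhat above 180°, in line with the remark that the elements distributed along the circumference, and not only those at $\phi = \pi/2$ and $\phi = 3\pi/2$, contribute to the cancelation.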

Fig. 13. Ratio of total power radiated by a dipole pair to the sum of the power of two single dipoles for the configuration shown.

Fig. 14. Radiation power suppression in a coupled dipole pair. Ratio of total power radiated by a dipole pair to the sum of the power of two single dipoles is also shown.

D. Array of Traveling-Wave Radiators: Optimal Surface-Current Distribution

A suitably designed array configuration with the proposed radiators can suppress the remaining TM1 mode. In an array, as the radiated fields of the constituent elements interfere, the total radiated power can be significantly manipulated due to mutual coupling of the antennas. This is illustrated as an example with two Hertzian dipole pairs in free space arranged in the parallel and broadside configurations, as shown in Figs. 13 and 14. Assuming equal excitations in both the current elements, the radiated fields at any elevation angle $\theta$ for the two cases can be obtained using superposition as

$$E_{\theta,dpair1} = \frac{E_0}{r}\sin\theta\,\big(1 + e^{-j\beta_0 d\cos\theta}\big)$$

$$E_{\theta,dpair2} = \frac{E_0}{r}\sin\theta\,\big(1 + e^{-j\beta_0 d\sin\theta}\big). \tag{7}$$


Fig. 15. 2 × 2 arrays of traveling-wave radiators implementing an optimal surface-current distribution suppressing the surface-modes significantly, as shown in Figs. 16 and 17. This surface-current configuration emulates the near-optimal current distribution in Fig. 8.

The total radiated power in Figs. 13 and 14 can be obtained as

$$P_{rad,dpair1} = \frac{1}{2\eta_0}\int_0^{\pi}\!\!\int_0^{2\pi}|E_{\theta,dpair1}|^2\,r^2\sin\theta\,d\phi\,d\theta = \frac{E_0^2\pi}{\eta_0}\int_0^{\pi}4\sin^3\theta\,\cos^2\!\left(\frac{\beta_0 d\cos\theta}{2}\right)d\theta \tag{8}$$

$$P_{rad,dpair2} = \frac{E_0^2\pi}{\eta_0}\int_0^{\pi}4\sin^3\theta\,\cos^2\!\left(\frac{\beta_0 d\sin\theta}{2}\right)d\theta. \tag{9}$$

The total array power, as a fraction of the power of two single dipoles, is plotted against the distance of separation. As can be seen from Fig. 13, the ratio progresses toward unity for $d > 2\lambda$, when the dipoles behave as noninteracting. On the other hand, when the two current elements are separated along the direction of maximum directivity, almost 90% of the radiated power can be suppressed, as shown in Fig. 14. These effects will be reflected in the finite driving-point impedance variations due to nonnegligible near-field interactions. The configuration in Fig. 14 is similar to the case of the traveling-wave radiator internally suppressing its strongest TE surface mode. If an array of the traveling-wave loops is now deployed with element spacing $d \approx \lambda_{TM1}/2$, then the surface-wave power contained in the remaining strongest TM1 mode can be significantly suppressed. However, in the intended direction of radiation, the array behaves similarly to Fig. 13, thereby retaining the radiated power. The set of 2 × 2 arrays of such traveling-wave radiators is shown in Fig. 15. The wavelength corresponding to the strongest TM1 mode for $h = 220$ μm is $\lambda_{TM1} = 782$ μm, as shown in Fig. 10. Keeping the element spacing at $d \approx \lambda_{TM1}/2 \approx 400$ μm, the radiated power, total surface-wave power, and its components are shown in Fig. 16 for different substrate heights. At $h = 220$ μm, it shows almost complete suppression of both the surface-wave modes. This results in the radiation efficiency increasing to 90%, as shown in Fig. 17. Furthermore, the suppressed surface-wave power is not very sensitive to element spacing, as shown in Fig. 18, illustrating the efficacy of the array structure in reducing surface-wave power.
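The dipole-pair power ratio discussed with (8) and (9) can be reproduced with a short numerical integration. This is a sketch under the assumption that a single dipole radiates $P = (E_0^2\pi/\eta_0)\int_0^\pi \sin^3\theta\,d\theta$, so the common prefactor cancels in the ratio; the function name `pair_to_singles_ratio` and the midpoint-rule resolution are choices made for this example.

```python
import numpy as np

def pair_to_singles_ratio(d_over_lambda, config, n=20001):
    """P_pair / (2 * P_single) using the integrands of (8) and (9).
    config = "cos" uses the phase term beta0*d*cos(theta) of (8);
    config = "sin" uses the phase term beta0*d*sin(theta) of (9)."""
    # Midpoint rule on theta in (0, pi).
    dtheta = np.pi / n
    theta = (np.arange(n) + 0.5) * dtheta
    beta_d = 2 * np.pi * d_over_lambda
    path = np.cos(theta) if config == "cos" else np.sin(theta)
    pair = np.sum(4 * np.sin(theta)**3 * np.cos(beta_d * path / 2)**2) * dtheta
    single = np.sum(np.sin(theta)**3) * dtheta    # isolated dipole, same prefactor
    return pair / (2 * single)

for d in (0.0, 0.5, 1.0, 2.0):
    print(f"d = {d:.1f} lambda: "
          f"eq. (8) ratio = {pair_to_singles_ratio(d, 'cos'):.3f}, "
          f"eq. (9) ratio = {pair_to_singles_ratio(d, 'sin'):.3f}")
```

With these expressions, coincident dipoles ($d = 0$) give a ratio of 2 because the fields add coherently everywhere, the ratio from (8) settles close to unity by $d = 2\lambda$, and the ratio from (9) drops well below unity near $d = \lambda/2$.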
The radiation patterns, analytically predicted and EM simulated, for the 2 × 2 arrays of traveling-wave loops are shown in Fig. 19. It is


Fig. 16. Total surface-wave power broken down into its different modes and the total radiated power for 2 × 2 arrays of traveling-wave loops, separated by d ∼ λTM1 /2, for different substrate thicknesses. The plot shows almost complete suppression of the TE1 and TM1 modes.

Fig. 17. Radiation efficiency of 2×2 arrays of traveling-wave loops, separated by d ∼ λTM1 /2, against frequency.

Fig. 18. Radiated and suppressed surface-wave powers for different element spacings.

noteworthy to compare this with that of a single current element (Figs. 4 and 5), which establishes that surface currents can be carefully designed in conjunction with circuits to arrive at novel and multifunctional active EM structures with desirable EM and circuit properties.

IV. MEASUREMENT RESULTS

The theoretical formulations in Sections II and III provide a guideline for efficient radiating current configurations on-chip. Arrays of traveling-wave radiators, as described in Section III, whose elements are synchronized to each other by different coupling methods, were fabricated in a 45-nm CMOS SOI process [53]–[55] and a 65-nm bulk CMOS process [56] with bulk conductivity of σsub


Fig. 19. Analytically predicted and electromagnetically simulated radiation patterns for 2 × 2 arrays of traveling-wave loops, separated by d ∼ λTM1/2.

Fig. 20. Die micrographs of the traveling-wave radiator arrays.

Fig. 21. Schematic of the 2 × 2 array mutually coupled through t-line/near-field EMs and 4 × 4 array centrally locked to a reference oscillator on chip.

Fig. 22. Measurement setup to measure the beam pattern in the 2-D space.

Fig. 23. Photo of the measurement setup at 0.3 THz for the chipsets in Fig. 20.

of 13.5 Ω-cm. The radiators are based on the self-oscillating, frequency-doubling, and radiating structures shown in Fig. 9. When biased above the oscillation condition, the two loops behave as a differential transmission line and sustain a traveling-wave oscillation at a fundamental frequency (f0), where the loop phase is 180°. The currents in the two loops are also antiphase at f0, which suppresses radiation at f0. However, due to the differential swing at f0, the second-harmonic currents are injected in phase into the two structures. As a result, the structure behaves as two concentric full-wavelength traveling-wave radiators, as shown in Fig. 9. The die micrographs are shown in Fig. 20.

Four sets of radiating arrays with different coupling modes, frequencies, and substrate heights were fabricated and measured. Fig. 21 shows the architecture of the mutually coupled 2 × 2 arrays locked through t-lines/near-field electromagnetic coupling and the 4 × 4 array centrally locked to a reference oscillator on chip [54]–[56]. No silicon lens was used to suppress the substrate modes. Depending on the design, either the frontside or the backside was the intended direction of radiation, as described later. The general measurement setup is shown in Fig. 22, and a photograph of the setup is shown in Fig. 23. The radiation from the die is received by a 25-dB directional horn antenna in the appropriate waveguide band, downconverted by a subharmonic mixer, and analyzed in a spectrum analyzer. The entire setup is calibrated using a calorimeter-based Erickson power meter. The radiation from the chip is expected to be circularly polarized because of the traveling-wave nature of the on-chip radiators. This was verified experimentally by rotating the linearly polarized receiver antenna and measuring the captured power: the variation over the 2π angle was within 0.5 dB in the broadside direction.

As the first example, a 2 × 2 array of synchronized traveling-wave radiators at 291 GHz was measured for two different substrate heights (200 and 300 μm). The elements were coupled using transmission lines. The measured beam patterns for azimuth and elevation are


Fig. 24. Measured, electromagnetically simulated, and analytically predicted beam patterns for a 2 × 2 array of mutually synchronized radiators at 291 GHz for two different substrate heights (h = 300 μm and h = 200 μm).

Fig. 25. Simulated E and H fields for two optimal substrate heights of 70 and 220 μm and bulk resistivity of 10 Ω-cm.

Fig. 26. Measured and analytically predicted beam patterns for a 2 × 2 beam-steering array of mutually synchronized radiators at 191 GHz. The electronically tunable beam patterns are shown for both azimuth and elevation. The substrate thickness is h = 250 μm.

shown in Fig. 24. The traveling-wave nature of the radiator arrays results in a nearly circularly symmetric pattern. As can be seen from the measurement results in Fig. 24, the patterns and maximum directivity are substantially different for the same array on two different substrate thicknesses. The beam is much sharper, with a directivity of 10 dBi, for h = 300 μm, and much broader, with a directivity of 7.5 dBi, for h = 200 μm. This is also predicted by the analytical results and EM simulation, as shown by the close correspondence in Fig. 24. Substrate conductivity was incorporated in the analytically predicted radiation pattern by using a complex dielectric permittivity at the radiation frequency of 291 GHz [εr(ω) ≈ 11.7 − j(σsub/ωε0) ≈ 11.7 − j1]. Furthermore, the absence of unpredictable lobes in the measurement results for a lossy finite substrate, and their close correspondence with the analytical prediction and EM simulations for a lossy semi-infinite substrate, qualitatively show the effectiveness of surface-mode suppression. Note that while the array radiates from the backside, the frontside has a metal ground plane, which makes the overall behavior similar to that of the analysis model in this paper. This is shown in the simulated results in Fig. 25, which shows the E and H fields and radiation patterns for two optimal substrate heights of 70 and 220 μm, demonstrating that the majority of the power is radiated through the backside.

A beam-scanning array of 2 × 2 elements fabricated in 65-nm CMOS was also measured. Unlike the previous example, the elements were coupled using a near-field EM coupling method [56]. The resonant frequencies of the constituent

Fig. 27. Measured and analytically predicted beam pattern (in azimuth) for a 4 × 4 synchronized array under two different substrate heights h = 70 μm and h = 180 μm at 280 GHz.

elements are tunable with the help of integrated varactors. Within the locking range, as the free-running oscillation frequency of one or more of the synchronized oscillators is varied, additional phase shifts are introduced, which translate into electronic beam steering [59]. The measurement results, along with the analytically predicted results, for the array radiating at 191 GHz with a substrate thickness of h = 250 μm are shown in Fig. 26. As can be seen, the measurement results and the analysis are in close correspondence. The offset angle for the measured maximum directivity is due to the mismatch of the traveling-wave oscillator elements.

As another example, a 4 × 4 element phased array at 280 GHz, where the elements were centrally locked to a frequency reference, was fabricated in 45-nm SOI CMOS [54], [55]. In this case, the substrate was thinned to 70 and 180 μm, and the radiation pattern was measured from the frontside and compared with the analysis in Fig. 27. Measurement and analysis are in close correspondence and demonstrate the variation of beamwidth with substrate thickness. The measurement results, along with the analytically predicted results, in the two orthogonal planes under beamforming for the 70-μm substrate are also shown in Fig. 28 [56]. As before, the substrate-wave suppression can be noticed in the absence of significant power leakage in the lateral directions. The analytically predicted results show a directivity of the order of 16 dB and correspond very closely with the measured patterns.


Fig. 28. Measured and analytically predicted beam pattern (in azimuth) for a 4 × 4 beam-steering array of centrally synchronized radiators at 280 GHz. The substrate thickness is h = 70 μm, and the radiation was measured from the frontside.

Fig. 29. Radiation efficiency of an integrated dipole at 300 GHz on a lossless silicon substrate of h = 220 μm as the length is varied.

V. CONCLUSION

In this paper, we created a framework for the design, analysis, and realization of optimal 2-D surface currents that collectively suppress surface waves and maximize radiation efficiency with a desirable radiation pattern. The theoretical framework sets a guiding principle on which approximations to optimal surface currents are realized with actively driven EM structures through a closed circuit-EM codesign approach. Demonstrative examples and measurement results of radiating arrays in CMOS at mm-wave frequencies (250–300 GHz) are presented and compared with the theoretical predictions.

APPENDIX

While we consider the case of a grounded dielectric, an ungrounded substrate can be accounted for in a similar manner. The far-field radiated fields for a current impulse located on the surface of a lossless dielectric are a strong function of the frequency of operation, substrate height, and dielectric constant, and can be evaluated as [40]

$$E_\theta(r,\theta,\phi) \approx \frac{j e^{-jk_0 r}}{r}\,\frac{k_0 J_x \eta_0}{2\pi}\,\cos\phi\; \frac{\cos\theta\,(\varepsilon_r-\sin^2\theta)^{1/2}}{j\varepsilon_r\cos\theta\,\cot\!\left(k_0(\varepsilon_r-\sin^2\theta)^{1/2}h\right)-(\varepsilon_r-\sin^2\theta)^{1/2}} \tag{10}$$

$$E_\phi(r,\theta,\phi) \approx \frac{-j e^{-jk_0 r}}{r}\,\frac{k_0 J_x \eta_0}{2\pi}\,\sin\phi\; \frac{\cos\theta}{j(\varepsilon_r-\sin^2\theta)^{1/2}\cot\!\left(k_0(\varepsilon_r-\sin^2\theta)^{1/2}h\right)-\cos\theta}. \tag{11}$$

In addition to the radiative modes, Green's dyadic for the E and H fields inside the substrate for the TM and TE modes, respectively, for a current impulse is given by [17]

$$E_{zd}\,\bigl(h_{sw,\mathrm{TM}}(\xi,\gamma,z)\bigr) = \frac{J_x k_2 k_x \eta_0}{k_0 T_m}\cos(k_1 z) \tag{12}$$

$$H_{zd}\,\bigl(h_{sw,\mathrm{TE}}(\xi,\gamma,z)\bigr) = -\frac{j J_x k_y}{T_e}\sin(k_1 z) \tag{13}$$

where Te = 0 and Tm = 0 represent the respective modes, given by

$$T_m = k_1\cos(k_1 d) + j k_2 \sin(k_1 d) \tag{14}$$

$$T_e = \varepsilon_r k_2\cos(k_1 d) + j k_1 \sin(k_1 d) \tag{15}$$

and k1 and k2 are defined in terms of the free-space wavenumber (k0) as

$$k_1^2 = \varepsilon_r k_0^2 - \beta_{m,n}^2, \qquad k_2^2 = k_0^2 - \beta_{m,n}^2. \tag{16}$$

Equations (10)–(16) represent the complete solution for all EM modes for a given substrate thickness, dielectric constant, and frequency of operation, and are shown in Fig. 7 for h = 220 μm and for h = 380 μm. The graphical representations of these transfer functions (Hrad, Hsw) enable us to develop an intuitive understanding of optimizing current configurations without rigorous solutions of the dielectric EM configurations. The rest of this section presents such an example of length-dependent radiation properties of an on-chip dipole antenna in silicon at 300 GHz for h = 220 μm. As discussed earlier, this can be analyzed by studying the overlap between the transfer functions Hrad(ξ, γ) and Hsw(ξ, γ) and the length-dependent dipole current configuration I(x) = I0 sin(keff(l/2 − |x|)) (keff = k0 √εr,eff), which translates to

$$J(\xi,\gamma) = 2\,\frac{\cos\!\left(k_{\mathrm{eff}}\frac{l}{2}\right) - \cos\!\left(\xi\frac{l}{2}\right)}{\xi^2 - k_{\mathrm{eff}}^2} \tag{17}$$

where εr,eff is the effective dielectric constant at the interface between the substrate and the air. For a thin strip dipole, εr,eff is approximated as εr,eff = (εr + 1)/2. A quick analysis of (17) reveals that the global maximum of J(ξ, γ) coincides with that of Hrad in the broadside direction at ξ = 0, γ = 0 for (l < 1/keff), as shown in Fig. 7, for h = 220 μm. Therefore, one possible solution to maximize the total radiated power ∬|Hrad J|² dξ dγ can be achieved by choosing l that maximizes |J(ξ = 0, γ = 0)|. From (17), such an optimal length lopt for maximum radiated power is given by cos(keff lopt/2) = −1, which implies lopt,rad = (2n + 1)λeff. For a silicon substrate at 300 GHz, this implies

$$l_{\mathrm{opt}} \approx (2n+1) \times 400\ \mu\mathrm{m}. \tag{18}$$
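The numbers behind (18) can be reproduced with a short sketch. It assumes εr = 11.7 for silicon (the value used for the substrate permittivity in Section IV) and the thin-strip approximation εr,eff = (εr + 1)/2; the helper names `effective_wavelength_um` and `j_broadside` are hypothetical and serve only this illustration.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)

def effective_wavelength_um(freq_hz, eps_r):
    """lambda_eff = lambda_0 / sqrt(eps_eff), with eps_eff = (eps_r + 1) / 2."""
    eps_eff = (eps_r + 1.0) / 2.0
    return C0 / (freq_hz * math.sqrt(eps_eff)) * 1e6

def j_broadside(length_um, lam_eff_um):
    """|J(xi = 0, gamma = 0)| from (17) for a dipole of the given length."""
    k_eff = 2.0 * math.pi / lam_eff_um
    return abs(2.0 * (math.cos(k_eff * length_um / 2.0) - 1.0) / (0.0 - k_eff ** 2))

lam_eff = effective_wavelength_um(300e9, 11.7)
print(f"lambda_eff = {lam_eff:.1f} um")
for scale in (0.5, 1.0, 2.0, 3.0):
    value = j_broadside(scale * lam_eff, lam_eff)
    print(f"l = {scale:.1f} lambda_eff -> |J(0, 0)| = {value:.0f}")
```

With these assumptions, λeff comes out slightly below 400 μm, and |J(0, 0)| peaks at odd multiples of λeff and vanishes at even multiples, in line with (18).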


Fig. 30. Variation of radiated power and strongest substrate mode TM1 (for Hertzian dipole) as the dipole length is varied.

Fig. 31. Filter output |Hrad (ξ, γ )J (ξ, γ )|2 in the (ξ, γ ) plane and the corresponding radiation patterns in the (φ, θ ) plane for two choices of the dipole lengths.

Substrate mode minimization can be achieved by aligning the minimum of J(ξ, γ) with the maximum of Hsw,TM at |ξ| = βTM (φ = 0), since TM1 is the strongest surface-wave mode with λTM,1 = 782 μm. From (17), the dipole length must satisfy cos(keff lopt/2) − cos(βTM,1 lopt/2) = 0, which gives

$$l_{\mathrm{opt,sw}} = 2n\,\frac{\lambda_{\mathrm{TM},1}\,\lambda_{\mathrm{eff}}}{\lambda_{\mathrm{TM},1} \pm \lambda_{\mathrm{eff}}}. \tag{19}$$
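A quick numerical evaluation of (19) is sketched below, assuming λTM,1 = 782 μm from the text, εr = 11.7, and εr,eff = (εr + 1)/2 at 300 GHz. The function names are hypothetical; `numerator_17` simply evaluates the numerator of (17) at |ξ| = βTM,1 to confirm that the chosen length places a null on the TM1 mode.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum (m/s)
LAM_TM1_UM = 782.0  # TM1 wavelength for h = 220 um, as quoted in the text

def l_opt_sw_um(n, lam_tm_um, lam_eff_um, sign=1):
    """Dipole length from (19); sign = +1 or -1 selects the branch."""
    return 2.0 * n * lam_tm_um * lam_eff_um / (lam_tm_um + sign * lam_eff_um)

def numerator_17(length_um, lam_xi_um, lam_eff_um):
    """cos(k_eff l / 2) - cos(xi l / 2), with xi = 2 pi / lam_xi."""
    return (math.cos(math.pi * length_um / lam_eff_um)
            - math.cos(math.pi * length_um / lam_xi_um))

# Assumed effective wavelength: eps_eff = (11.7 + 1) / 2 at 300 GHz.
lam_eff = C0 / (300e9 * math.sqrt((11.7 + 1.0) / 2.0)) * 1e6

length = l_opt_sw_um(1, LAM_TM1_UM, lam_eff, sign=1)
print(f"l_opt,sw (n = 1, + branch) = {length:.0f} um")
print(f"numerator of (17) at beta_TM1 = {numerator_17(length, LAM_TM1_UM, lam_eff):.2e}")
```

For n = 1 and the + branch, the length evaluates to a little under 530 μm, and the numerator of (17) vanishes at βTM,1 to within floating-point error.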

Dipole lengths satisfying (19) minimize the strongest TM1 mode; they are multiples of approximately 530 μm at 300 GHz on a 220-μm silicon substrate. Evidently, the radiation efficiency is maximized at dipole lengths between the optimal values given by (18) and (19), as observed in both the analytical results and the EM simulation in Fig. 29. This is further elaborated in Fig. 30, where the variation of the TM1 surface-wave power, total surface-wave power, and radiated power is plotted against the dipole length. The correspondence between the radiation pattern of the dipole in the (φ, θ) plane and the output of the Hrad filter in the (ξ, γ) domain, given by Hrad(ξ, γ)J(ξ, γ), is shown in Fig. 31.

REFERENCES

[1] M. Tonouchi, “Cutting-edge terahertz technology,” Nature Photon., vol. 1, no. 2, pp. 97–105, 2007.
[2] P. H. Siegel, “Terahertz technology,” IEEE Trans. Microw. Theory Techn., vol. 50, no. 3, pp. 910–928, Mar. 2002.
[3] E. Pickwell and V. P. Wallace, “Biomedical applications of terahertz technology,” J. Phys. D, Appl. Phys., vol. 39, no. 17, pp. R301–R310, 2006.


[4] K. B. Cooper, R. J. Dengler, N. Llombart, B. Thomas, G. Chattopadhyay, and P. H. Siegel, “THz imaging radar for standoff personnel screening,” IEEE Trans. THz Sci. Technol., vol. 1, no. 1, pp. 169–182, Sep. 2011. [5] A. Babakhani, G. Xiang, A. Komijani, A. Natarajan, and A. Hajimiri, “A 77-GHz phased-array transceiver with on-chip antennas in silicon: Receiver and antennas,” IEEE J. Solid-State Circuits, vol. 41, no. 12, pp. 2795–2806, Dec. 2006. [6] R. Al Hadi et al., “A 1 k-pixel video camera for 0.7–1.1 terahertz imaging applications in 65-nm CMOS,” IEEE J. Solid-State Circuits, vol. 47, no. 12, pp. 2999–3012, Dec. 2012. [7] O. Momeni and E. Afshari, “A broadband mm-wave and terahertz traveling-wave frequency multiplier on CMOS,” IEEE J. Solid-State Circuits, vol. 46, no. 12, pp. 2966–2976, Dec. 2011. [8] X. Wu and K. Sengupta, “Dynamic waveform shaping for reconfigurable radiated periodic signal generation with picosecond timewidths,” in Proc. IEEE Custom Integr. Circuits Conf. (CICC), Sep. 2015, pp. 1–4. [9] X. Wu and K. Sengupta, “Programmable picosecond pulse generator in CMOS,” in IEEE MTT-S Int. Microw. Symp. Dig., May 2015, pp. 1–4. [10] X. Wu and K. Sengupta, “A 40-to-330 GHz synthesizer-free THz spectroscope-on-chip exploiting electromagnetic scattering,” in IEEE ISSCC Dig. Tech. Papers, Jan./Feb. 2016, pp. 428–429. [11] C. R. Chappidi and K. Sengupta, “A frequency-reconfigurable mm-wave power amplifier with active-impedance synthesis in an asymmetrical non-isolated combiner,” in IEEE ISSCC Dig. Tech. Papers, Jan./Feb. 2016, pp. 344–345. [12] K. Sengupta, D. Seo, L. Yang, and A. Hajimiri, “Silicon integrated 280 GHz imaging Chipset with 4 × 4 SiGe receiver array and CMOS source,” IEEE Trans. THz Sci. Technol., vol. 5, no. 3, pp. 427–437, May 2015. [13] Y. Zhao, J. Grzyb, and U. R. Pfeiffer, “A 288-GHz lens-integrated balanced triple-push source in a 65-nm CMOS technology,” in Proc. Eur. Solid-State Circuits Conf. (ESSCIRC), Bordeaux, France, Sep. 
2012, pp. 289–292. [14] J.-D. Park, S. Kang, S. V. Thyagarajan, E. Alon, and A. M. Niknejad, “A 260 GHz fully integrated CMOS transceiver for wireless chip-to-chip communication,” in Symp. VLSI Circuits Dig., Jun. 2012, pp. 48–49. [15] Z. Wang, P. Chiang, P. Nazari, C.-C. Wang, Z. Chen, and P. Heydari, “A 210 GHz fully integrated differential transceiver with fundamentalfrequency VCO in 32 nm SOI CMOS,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2013, pp. 136–137. [16] F. Golcuk, D. O. Gurbuz, and G. M. Rebeiz, “A 0.39–0.44 THz 2 × 4 amplifier-quadrupler array with peak EIRP of 3–4 dBm,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 12, pp. 4483–4491, Dec. 2013. [17] D. M. Pozar and D. H. Schaubert, “Scan blindness in infinite phased arrays of printed dipoles,” IEEE Trans. Antennas Propag., vol. 32, no. 6, pp. 602–610, Jun. 1984. [18] C. A. Balanis, Antenna Theory: Analysis and Design, 3rd ed. Hoboken, NJ, USA: Wiley, 2005. [19] J. D. Kraus, Antennas. New York, NY, USA: McGraw-Hill, 1988. [20] S. Drabowitch, A. Papiernik, H. Griffiths, J. Encinas, and B. L. Smith, Modern Antennas, 2nd ed. Dordrecht, The Netherlands: Springer, 2005. [21] C. E. Baum, “General properties of antennas,” IEEE Trans. Electromagn. Compat., vol. 44, no. 1, pp. 18–24, Feb. 2002. [22] S. A. Schelkunoff, “Theory of antennas of arbitrary size and shape,” Proc. IRE, vol. 29, no. 9, pp. 493–521, Sep. 1941. [23] D. B. Rutledge et al., “Integrated-circuit antennas,” in Infrared and Millimeter Waves. New York, NY, USA: Academic, 1983. [24] N. G. Alexopoulos, P. B. Katehi, and D. B. Rutledge, “Substrate optimization for integrated circuit antennas,” IEEE Trans. Microw. Theory Techn., vol. MTT-31, no. 7, pp. 550–557, Jul. 1983. [25] H. Kogelnik, “Theory of dielectric waveguides,” in Integrated Optics, T. Tamir, Ed. New York, NY, USA: Springer-Verlag, ch. 2. [26] G. M. Rebeiz, “Millimeter-wave and terahertz integrated circuit antennas,” Proc. IEEE, vol. 80, no. 11, pp. 1748–1770, Nov. 1992. [27] D. F. 
Filipovic, G. P. Gauthier, S. Raman, and G. M. Rebeiz, “Off-axis properties of silicon and quartz dielectric lens antennas,” IEEE Trans. Antennas Propag., vol. 45, no. 5, pp. 760–766, May 1997. [28] A. Sommerfeld, “Über die Ausbreitung der Wellen in der drahtlosen Telegraphie,” Ann. Phys., vol. 28, no. 4, pp. 665–736, 1909. [29] A. Baños, Dipole Radiation in the Presence of a Conducting Halfspace. Oxford, U.K.: Pergamon, 1966.


[30] R. E. Collin, Field Theory of Guided Waves, 2nd ed. New York, NY, USA: Wiley, 1990. [31] R. W. P. King, “The electromagnetic field of a horizontal electric dipole in the presence of a three-layered region,” J. Appl. Phys., vol. 69, no. 12, pp. 79–87, 1991. [32] S. R. J. Brueck, “Radiation from a dipole embedded in a dielectric slab,” IEEE J. Sel. Topics Quantum Electron., vol. 6, no. 6, pp. 899–910, Nov./Dec. 2000. [33] L. Liu and K. Li, “Radiation from a vertical electric dipole in the presence of a three-layered region,” IEEE Trans. Antennas Propag., vol. 55, no. 12, pp. 3469–3475, Dec. 2007. [34] A. K. Bhattacharyya, “Characteristics of space and surface waves in a multilayered structure [microstrip antennas],” IEEE Trans. Antennas Propag., vol. 38, no. 8, pp. 1231–1238, Aug. 1990. [35] C.-Y. E. Tong and R. Blundell, “An annular slot antenna on a dielectric half-space,” IEEE Trans. Antennas Propag., vol. 42, no. 7, pp. 967–974, Jul. 1994. [36] D. R. Jackson and N. G. Alexopoulos, “Microstrip dipoles on electrically thick substrates,” Int. J. Infr. Millim. Waves, vol. 7, no. 1, pp. 1–26, 1986. [37] P. B. Katehi and N. G. Alexopoulos, “On the effect of substrate thickness and permittivity on printed circuit dipole properties,” IEEE Trans. Antennas Propag., vol. AP-31, no. 1, pp. 34–39, Jan. 1983. [38] N. K. Uzunoglu, N. G. Alexopoulos, and J. G. Fikioris, “Radiation properties of microstrip dipoles,” IEEE Trans. Antennas Propag., vol. AP-27, no. 6, pp. 853–858, Nov. 1979. [39] G. V. Eleftheriades and M. Qiu, “Efficiency and gain of slot antennas and arrays on thick dielectric substrates for millimeter-wave applications: A unified approach,” IEEE Trans. Antennas Propag., vol. 50, no. 8, pp. 1088–1098, Aug. 2002. [40] W. B. Dou and Z. L. Sun, “Surface wave fields and power in millimeter wave integrated dipole antennas,” Int. J. Infr. Millim. Waves, vol. 18, no. 3, pp. 711–721, 1997. [41] S. F. Mahmoud, Y. M. M. Antar, H. F. Hammad, and A. P. 
Freundorfer, “Theoretical considerations in the optimization of surface waves on a planar structure,” IEEE Trans. Antennas Propag., vol. 52, no. 8, pp. 2057–2063, Aug. 2004. [42] D. M. Pozar, “Considerations for millimeter wave printed antennas,” IEEE Trans. Antennas Propag., vol. AP-31, no. 5, pp. 740–747, Sep. 1983. [43] M. Kominami, D. M. Pozar, and D. H. Schaubert, “Dipole and slot elements and arrays on semi-infinite substrates,” IEEE Trans. Antennas Propag., vol. AP-33, no. 6, pp. 600–607, Jun. 1985. [44] D. M. Pozar, “Microstrip antennas,” Proc. IEEE, vol. 80, no. 1, pp. 79–91, Jan. 1992. [45] J.-G. Yook and L. P. B. Katehi, “Micromachined microstrip patch antenna with controlled mutual coupling and surface waves,” IEEE Trans. Antennas Propag., vol. 49, no. 9, pp. 1282–1289, Sep. 2001. [46] R. L. Rogers and D. P. Neikirk, “Radiation properties of slot and dipole elements on layered substrates,” Int. J. Infr. Millim. Waves., vol. 10, no. 6, pp. 697–728, 1989. [47] W. Lukosz and R. E. Kunz, “Light emission by magnetic and electric dipoles close to a plane interface. I. Total radiated power,” J. Opt. Soc. Amer., vol. 67, no. 12, pp. 1607–1615, Dec. 1977. [48] W. Lukosz and R. E. Kunz, “Light emission by magnetic and electric dipoles close to a plane dielectric interface. II. Radiation patterns of perpendicular oriented dipoles,” J. Opt. Soc. Amer., vol. 67, no. 12, pp. 1615–1619, Dec. 1977. [49] W. Lukosz, “Light emission by magnetic and electric dipoles close to a plane dielectric interface. III. Radiation patterns of dipoles with arbitrary orientation,” J. Opt. Soc. Amer., vol. 69, no. 11, pp. 1495–1503, 1979. [50] W. Lukosz and R. E. Kunz, “Fluorescence lifetime of magnetic and electric dipoles near a dielectric interface,” Opt. Commun., vol. 20, pp. 195–199, Feb. 1977. [51] S. M. Bowers and A. Hajimiri, “Multi-port driven radiators,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 12, pp. 4428–4441, Dec. 2013. [52] S. M. Bowers and A. 
Hajimiri, “An integrated multi-port driven radiating source,” in IEEE MTT-S Int. Microw. Symp. Dig., Jun. 2013, pp. 1–3. [53] K. Sengupta and A. Hajimiri, “Distributed active radiation for THz signal generation,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2011, pp. 288–289. [54] K. Sengupta and A. Hajimiri, “A 0.28 THz 4 × 4 power-generation and beam-steering array,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2012, pp. 256–257.


[55] K. Sengupta and A. Hajimiri, “A 0.28 THz power-generation and beamsteering array in CMOS based on distributed active radiators,” IEEE J. Solid-State Circuits, vol. 47, no. 12, pp. 3013–3031, Dec. 2012. [56] K. Sengupta and A. Hajimiri, “Sub-THz beam-forming using near-field coupling of distributed active radiator arrays,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2011, pp. 1–4. [57] K. Sengupta and A. Hajimiri, “Mutual synchronization for power generation and beam-steering in CMOS with on-chip sense antennas near 200 GHz,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 9, pp. 2867–2876, Sep. 2015. [58] D. M. Pozar, Microwave Engineering, 3rd ed. New York, NY, USA: Wiley, 2004. [59] R. A. York and R. C. Compton, “Quasi-optical power combining using mutually synchronized oscillator arrays,” IEEE Trans. Microw. Theory Techn., vol. 39, no. 6, pp. 1000–1009, Jun. 1991.

Kaushik Sengupta (M’12) received the B.Tech. and M.Tech. degrees in electronics and electrical communication engineering from IIT Kharagpur, Kharagpur, India, in 2007, and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology (Caltech), Pasadena, CA, USA, in 2008 and 2012, respectively. He performed research with the University of Southern California, Los Angeles, CA, USA, and the Massachusetts Institute of Technology, Cambridge, MA, USA, in 2005 and 2006, where he was involved in nonlinear integrated systems for high-purity signal generation and low-power RF identification tags. He joined as a Faculty Member the Department of Electrical Engineering, Princeton University, Princeton, NJ, USA, in 2013. His current research interests include high-frequency ICs, electromagnetics, and optics for various applications in sensing, imaging, and high-speed communication. Dr. Sengupta was a recipient of the IBM Ph.D. Fellowship from 2011 to 2012, the IEEE Solid State Circuits Society Predoctoral Achievement Award in 2012, the IEEE Microwave Theory and Techniques Graduate Fellowship in 2012, and the Analog Devices Outstanding Student Designer Award in 2011. He was the recipient of the Charles Wilts Prize in 2013 from the Department of Electrical Engineering, Caltech, for outstanding independent research in electrical engineering leading to a Ph.D. He was also a recipient of the Prime Minister Gold Medal Award of IIT Kharagpur in 2007, the Caltech Institute Fellowship, the Most Innovative Student Project Award of the Indian National Academy of Engineering in 2007, and the IEEE Microwave Theory and Techniques Undergraduate Fellowship in 2006. He was a corecipient of the IEEE RFIC Symposium Best Student Paper Award in 2012 and the 2015 IEEE MTT-S Microwave Prize. He serves on the Technical Program Committee of the European Solid-State Circuits Conference. 
He was selected to the Princeton Engineering Commendation List for Outstanding Teaching in 2014.


Ali Hajimiri (F’10) received the B.S. degree in electronics engineering from the Sharif University of Technology, Tehran, Iran, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, USA, in 1996 and 1998, respectively. He was a Design Engineer with Philips Semiconductors, Eindhoven, The Netherlands, from 1993 to 1994, where he was involved in a BiCMOS chipset for global system for mobile communication and cellular units. In 1995, he was with Sun Microsystems, Santa Clara, CA, USA, where he was involved in the UltraSPARC microprocessor's cache RAM design methodology. In 1997, he was with Lucent Technologies (Bell Laboratories), Murray Hill, NJ, USA, where he investigated low-phase-noise integrated oscillators. In 1998, he joined the California Institute of Technology, Pasadena, CA, USA, as a Faculty Member, where he is currently a Thomas G. Myers Professor of Electrical Engineering and the Director of the Microelectronics Laboratory. In 2002, he co-founded Axiom Microdevices Inc., Woburn, MA, USA, whose fully integrated CMOS PA has shipped close to 250 million units, and which was acquired by Skyworks Inc., Woburn, MA, USA, in 2009. He authored The Design of Low Noise Oscillators (Springer, 1999) and has authored or coauthored more than 150 refereed journal and conference technical articles. He holds more than 60 U.S. and European patents. His research interests are high-speed and high-frequency integrated circuits for applications in sensors, biomedical devices, photonics, and communication systems.


Dr. Hajimiri has served on the Technical Program Committee of the IEEE International Solid-State Circuits Conference (ISSCC), as an Associate Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS, an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, a Member of the Technical Program Committee of the International Conference on Computer Aided Design (ICCAD), Guest Editor of the IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, and on the Guest Editorial Board of the Transactions of the Institute of Electronics, Information and Communication Engineers of Japan. He is a Fellow of the National Academy of Inventors. He was selected to the TR35 Top Innovator’s List in 2004. He has served as a Distinguished Lecturer of the IEEE Solid-State Circuits Society and the IEEE Microwave Theory and Techniques Society (IEEE MTT-S). He was a recipient of the California Institute of Technology’s Graduate Students Council Teaching and Mentoring Award, as well as the Associated Students of California Institute of Technology’s Undergraduate Excellence in Teaching Award. He was the Gold Medal recipient of the National Physics Competition and the Bronze Medal recipient of the 21st International Physics Olympiad, Groningen, The Netherlands. He was a corecipient of the IEEE JOURNAL OF SOLID-STATE CIRCUITS Best Paper Award (2004), the ISSCC Jack Kilby Outstanding Paper Award, a two-time corecipient of the CICC Best Paper Award, and a three-time recipient of the IBM Faculty Partnership Award, as well as the National Science Foundation CAREER Award, Okawa Foundation Award, and the 2015 IEEE MTT-S Microwave Prize.


A Miniaturized Evanescent Mode Waveguide Filter Using RRRs

Jun Ye Jin, Student Member, IEEE, Xian Qi Lin, Senior Member, IEEE, and Quan Xue, Fellow, IEEE

Abstract— In this paper, a miniaturized evanescent mode waveguide filter using rectangular ring resonators (RRRs) is presented. The evanescent mode waveguide resonator cell (EWRC), composed of a section of evanescent mode waveguide loaded with a pair of RRRs, is designed and analyzed. Quality factors of the EWRC are simulated and calculated. The resonant mechanism and the values of the equivalent lumped elements of EWRCs are given for further design and application of this kind of filter. A miniaturized waveguide filter is designed and fabricated to operate at Ka-band, using the W-band standard waveguide (WR-10) with a cross-sectional size of 2.54 mm × 1.27 mm. The proposed filter can realize a large size reduction compared with the conventional waveguide filter, especially leading to a cross-sectional size reduction of 87.2%. Despite a slight frequency shift, good agreement has been achieved between the simulation and measurement results. The proposed filter has large out-of-band rejection, wide stopband, and high selectivity. In addition, a compact diplexer with high isolation is presented.

Index Terms— Compact size, diplexer, E-plane, evanescent mode, Ka-band, waveguide filter.

Manuscript received November 10, 2015; revised May 19, 2016; accepted May 28, 2016. This work was supported in part by the National Key Basic Research Program of the Ministry of Science and Technology of China (973 Program) under Grant 2014CB339900, in part by the National Natural Science Foundation of China under Grant 61372056 and Grant 61571084, in part by NCET under Grant NCET-13-0095, and in part by FRF for CU under Grant ZYGX2014J016.
J. Y. Jin and Q. Xue are with the State Key Laboratory of Millimeter Waves, Department of Electronic Engineering, City University of Hong Kong, Hong Kong, and also with the Shenzhen Key Laboratory of Millimeter Waves, CityU Shenzhen Research Institute, Shenzhen 518057, China (e-mail: [email protected]; [email protected]).
X. Q. Lin is with the EHF Key Laboratory of Fundamental Science, School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMTT.2016.2574988

I. INTRODUCTION

Over the past decades, the waveguide has been an indispensable solution for designing robust, low-loss, and high-power filters [1]–[4] at microwave and millimeter-wave frequencies. With the development of microwave and millimeter-wave systems, high-performance narrow-band bandpass filters with low insertion loss, compact size, wide stopband, and high selectivity are increasingly required. In the design of waveguide filter structures, reduction of the physical size and weight has become one of the primary goals. To realize miniaturization, most waveguide filter designs have focused on reducing the total length of the filter [5]–[7] by using compact resonators or dual-mode cavities [8], [9]. However, in addition to the length, the cross-sectional size should also be considered in order to design compact waveguide filters. The evanescent mode waveguide filter [10]–[13] has been a suitable choice to realize this goal.

Evanescent mode bandpass filters have several advantages over the conventional type of bandpass filter (waveguide used above the cutoff frequency). On the one hand, larger out-of-band rejection can be obtained. On the other hand, evanescent mode waveguide filters offer advantages in both volume and weight compared with traditional waveguide filters. However, evanescent mode filters using conventional capacitive elements such as tuning screws [11], [12] are costly and difficult to mass-produce because of their complicated structure. In addition, the tuning screws limit the size of the evanescent mode waveguide, and they are not applicable at higher frequencies. Besides, the transverse inserts used in evanescent mode waveguide filters make the fabrication more expensive and complex [10], [14].

To enable easy fabrication, mass production, and low cost, nontouching E-plane fins [13] were proposed as the capacitive elements for evanescent mode filters operating at Ka-band. The height of the fins must be smaller than the height of the evanescent mode waveguide. An evanescent mode waveguide filter with two E-plane fins on a Duroid substrate was simulated in [13, Fig. 14]; it had a center frequency of about 30.5 GHz and a 3-dB bandwidth of 5%. The larger waveguide and the smaller waveguide were WR-28 (7.11 mm × 3.56 mm) and WR-15 (3.76 mm × 1.88 mm), respectively, leading to a cross-sectional size reduction of 72%. In the design of conventional evanescent mode waveguide filters, the equivalent capacitors were emphasized, and the design theory was also based on termination in capacitors. However, the equivalent inductance also plays an important role in the filter performance. It should be studied for potential design and application.
Moreover, most previous works designed the evanescent mode waveguide filter with nonstandard cross-sectional sizes, which complicates measurement or requires input and output (I/O) ports of a suitable waveguide. For example, in [15], the passband revealed below the cutoff frequency had a large measured transmission loss of more than 30 dB. In this paper, a miniaturized evanescent mode waveguide filter operating at Ka-band is presented, and a cross-sectional size reduction of 87.2% is achieved by using WR-10. The resonances of rectangular ring resonators (RRRs) in the evanescent mode waveguide are exploited. The effect of the equivalent inductance is fully considered, as well as that of the capacitance. Meanwhile, this filter has the advantages of low cost

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. 2

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES

Fig. 2. Simulation results of WR-10 loaded with the RRRs and SRRs (g = 0.1 mm).

Fig. 1. Proposed EWRC. (a) Side view. (b) Top view. (c) Simulation results of the EWRC versus l (with h = 1.1 mm and t = 0.1 mm).

and easy fabrication. The basic theory and design analysis are presented in Section II, including an equivalent circuit analysis in which the lumped elements are given in closed form. One sample is designed, fabricated, and measured in Section III. A compact diplexer with high isolation is further proposed in Section IV, based on the proposed filter design. Finally, conclusions are drawn in Section V.

II. BASIC DESIGN THEORY

It is well known that the W-band standard waveguide WR-10, with a cross section of 2.54 mm × 1.27 mm, is used to transmit W-band electromagnetic waves ranging from 75 to 110 GHz, but cannot transmit Ka-band waves ranging from 26.5 to 40 GHz. Therefore, when a WR-10 is directly connected to WR-28 I/O ports with a cross-sectional size of 7.112 mm × 3.556 mm, the evanescent mode WR-10 section reflects all of the Ka-band electromagnetic wave [16].

A. Evanescent Mode Waveguide Resonator Cell

Here, the evanescent mode waveguide resonator cell (EWRC) is first designed using the WR-10 and RRRs, operating at Ka-band. A pair of identical RRRs, placed back-to-back and supported by a Duroid substrate with a relative permittivity of εr = 2.2 and a thickness of 0.127 mm, is inserted in the center E-plane of the WR-10, as shown in Fig. 1. For extremely compact size, the evanescent mode waveguide is fully used by the rectangular ring with

Fig. 3. Simulation results of WR-10 loaded with the RRRs and SRRs (h = 1.1 mm, l = 1.6 mm, lsrr = 0.75 mm, g = 0.2 mm, and t = 0.1 mm).

h = 1.1 mm, and h is limited by the narrow wall of the WR-10. A passband is produced at the resonant frequency, which is designed to lie in the Ka-band, as shown in Fig. 1(c). The operating frequency of the EWRC can be adjusted through the size parameters of the RRRs. It should be mentioned that split ring resonators (SRRs) were also evaluated, but the results show that the RRRs have superior performance. In Fig. 2, for SRRs with the same l and h as the RRRs but with a gap of g = 0.1 mm, two resonant modes are observed. The first transmission pole is introduced by a magnetic resonance at a lower frequency, while the second one is induced at a higher frequency by an electric resonance whose current distribution is symmetrical with respect to the center line. Because the waveguide field is electrically coupled to the resonators, the electric resonance plays a more important role than the magnetic resonance. It can be seen that the dominant electric resonant mode has better transmission performance than the magnetic resonant mode. Thus, on the one hand, the magnetic mode is accompanied by large insertion loss and a narrow passband, and it should be suppressed when designing a bandpass filter with better out-of-band rejection in the lower stopband. On the other hand, the electric resonant mode, with lower insertion loss and a wider passband, is similar to that of the proposed EWRC, which realizes magnetic mode suppression. In Fig. 3, the split of the SRR is perpendicular to the waveguide longitudinal axis. In this case, only the magnetic resonant mode is excited, as shown in the inset of Fig. 3, while the

JIN et al.: MINIATURIZED EVANESCENT MODE WAVEGUIDE FILTER USING RRRs

Fig. 5. Parallel LC equivalent circuit of waveguide loaded with the RRRs. (a) Including impedance scaling. (b) Excluding impedance scaling.

Fig. 4. Evanescent mode waveguide filter. (a) $Q_u$ of EWRC. (b) $Q_e$ of EWRC.

electric resonance cannot be induced. Consequently, the field coupling of the SRRs is much weaker than that of the RRRs and the insertion loss is much higher. Considering the above discussion, the RRRs are preferred.

B. Quality Factors of EWRC

The relationship between the loaded quality factor ($Q_l$), unloaded quality factor ($Q_u$), and external quality factor ($Q_e$) is given by

$$\frac{1}{Q_l} = \frac{1}{Q_u} + \frac{1}{Q_e}. \tag{1}$$

Then, the unloaded quality factor ($Q_u$) of the EWRC is investigated, which is determined by its corresponding waveguide section, the supporting substrate (dielectric loss of $\tan\delta = 9 \times 10^{-4}$), and the metallic RRRs (copper with a conductivity of $\sigma = 5.8 \times 10^{7}$ S/m). By enlarging the length of the evanescent mode waveguide at the I/O ports, i.e., $l_{ew}$, to reduce the I/O couplings to the EWRC (until $S_{21} < -20$ dB and $Q_e$ is approximately infinite), as shown in Fig. 4(a), $Q_u$ is simulated and calculated by

$$Q_u \approx Q_l = \frac{1}{\mathrm{FBW}} \tag{2}$$

where FBW is the fractional 3 dB bandwidth. Here, $Q_u$ of the EWRC with h = 1.1 mm and l = 1.6 mm is 778. Because of its structure, this $Q_u$ should be lower than that of a metallic cavity, but higher than that of microstrip resonators. A high $Q_u$ is thus obtained by the proposed EWRC.
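The relations (1) and (2) are simple enough to check numerically. The following is a minimal sketch; the function names are hypothetical and not from the paper. It evaluates (1) in both directions and estimates a quality factor from a fractional 3 dB bandwidth as in (2). Combining the values $Q_u = 778$ and $Q_e = 23.6$ quoted in this section into one loaded Q is purely illustrative.

```python
def loaded_q(q_u: float, q_e: float) -> float:
    """Loaded Q from unloaded and external Q, per (1): 1/Q_l = 1/Q_u + 1/Q_e."""
    return 1.0 / (1.0 / q_u + 1.0 / q_e)


def external_q(q_l: float, q_u: float) -> float:
    """External Q recovered from loaded and unloaded Q by rearranging (1)."""
    return 1.0 / (1.0 / q_l - 1.0 / q_u)


def q_from_fbw(fbw: float) -> float:
    """Quality factor estimated from the fractional 3 dB bandwidth, per (2)."""
    return 1.0 / fbw


if __name__ == "__main__":
    q_l = loaded_q(778.0, 23.6)          # illustrative combination of quoted values
    print(f"Q_l = {q_l:.2f}, implied 3 dB FBW = {100.0 / q_l:.2f}%")
```

Because $Q_u \gg Q_e$ here, the loaded Q is dominated by the external coupling, which is why the strongly coupled case can be used to read $Q_e$ directly from the bandwidth.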

By reducing the I/O couplings to the EWRC, the unloaded $Q_u$ can be calculated. In contrast, by increasing the couplings, the external quality factor ($Q_e$) is obtained similarly from (2). Because the bandwidth of a filter is directly related to $Q_e$, $Q_e$ is also investigated. As shown in Fig. 4(b), $l_{ew}$ is the same as in Fig. 4(a), while $l_{sw}$ is the length by which the RRRs protrude into the standard WR-28 waveguide. $Q_e$ of the EWRC with $l_{ew} = l_{sw} = 0$ is 23.6. As shown in Fig. 4(b), $Q_e$ is reduced to 20 when the RRRs protrude into the standard waveguide, which can be used to design a filter with a wider bandwidth. On the other hand, $Q_e$ is increased to 27.3 when there is a section of evanescent mode waveguide at the I/O ports. However, in practical fabrication it is undesirable to have the RRRs extend out of the evanescent mode waveguide, because the substrate circuit is easily damaged. Moreover, if only the filter structure is fabricated, the extended substrate and the metallic structure on it make assembly and measurement inconvenient. Therefore, it is better to keep the substrate circuit inside the waveguide. Filters with much narrower bandwidth can be designed by adjusting $l_{ew}$. In this design, however, we focus on $l_{ew} = 0$ mm for a filter with minimum length.

C. Equivalent Circuit Analysis

The EWRC, a WR-10 loaded with a pair of identical RRRs as in Fig. 1, is equivalent to a parallel LC circuit, as shown in Fig. 5(a), where L and C are the equivalent inductance and capacitance. Further, let $L_n$ and $C_n$ be the equivalent inductance and capacitance before impedance scaling, as shown in Fig. 5(b), so that

$$L_n = \frac{L}{Z_0} \tag{3}$$

$$C_n = C Z_0 \tag{4}$$

where $Z_0$ is the characteristic impedance of the I/O WR-28, which can be calculated by

$$Z_0 = K \cdot \frac{b_1}{a_1} \sqrt{\frac{\mu_0}{\varepsilon_0}} \, \frac{1}{\sqrt{1 - \left(\dfrac{\lambda_0}{2a_1}\right)^2}} \tag{5}$$

where $a_1$, $b_1$, $\mu_0$, $\varepsilon_0$, and $\lambda_0$ are the wide dimension, the narrow dimension, the permeability, and the permittivity of the I/O waveguide WR-28, and the free-space wavelength, respectively, and K is a coefficient that equals 2 when $Z_0$ is calculated from the power and the voltage at the center of the broad wall. The values of $Z_0$ calculated using (5) are shown in Fig. 6 for further design in Section III. From Fig. 6, it can be seen that $Z_0$ varies with frequency. The resonant frequency $f_0$, however, only determines the product of L and C,

Fig. 6. Characteristic impedance $Z_0$ of WR-28.

Fig. 7. $f_0$ versus the length of RRRs.

Fig. 8. Comparisons of the equivalent LC circuit and the evanescent mode cell.

or the product of $L_n$ and $C_n$, i.e., $LC = L_n C_n = 1/\omega_0^2$, where $\omega_0 = 2\pi f_0$ is the angular frequency at the resonant frequency $f_0$ of the EWRC. By careful investigation of the EWRC, it is found that $C_n$ increases with increasing l, and a larger h leads to a smaller $C_n$. Moreover, since $C_n$ follows from $L_n$ through this product, we focus on $L_n$ only. With the width of the metal strip fixed (t = 0.1 mm), the parameters h and l determine the resonant frequency $f_0$, as shown in Fig. 7. A smaller l results in a higher resonant frequency. The relationship between $L_n$ and the sizes of the RRRs (l and h in mm) is analyzed here and can be simply expressed as

$$L_n = [0.37 - 0.1(l + h)] \times 10^{-3}\ \text{nH}. \tag{6}$$
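As a numerical check of (3)–(6), the sketch below evaluates $Z_0$ of the WR-28 ports, $L_n$ from the empirical fit, and the resulting L and C for a given resonant frequency. It is only an illustration under stated assumptions: the function names are hypothetical, $\sqrt{\mu_0/\varepsilon_0}$ is taken as the free-space value, and l and h are assumed to be in millimeters in (6).

```python
import math

C0 = 299_792_458.0        # speed of light in vacuum, m/s
ETA0 = 376.730313668      # sqrt(mu0/eps0) in free space, ohms


def z0_waveguide(f_hz: float, a_m: float, b_m: float, k: float = 2.0) -> float:
    """Characteristic impedance of a rectangular waveguide above cutoff, per (5)."""
    ratio = (C0 / f_hz) / (2.0 * a_m)          # lambda_0 / (2 a_1)
    if ratio >= 1.0:
        raise ValueError("frequency is at or below the TE10 cutoff")
    return k * (b_m / a_m) * ETA0 / math.sqrt(1.0 - ratio ** 2)


def ln_nh(l_mm: float, h_mm: float) -> float:
    """Normalized inductance L_n in nH from the empirical fit (6)."""
    return (0.37 - 0.1 * (l_mm + h_mm)) * 1e-3


def ewrc_lumped(f0_hz: float, l_mm: float, h_mm: float,
                a_m: float = 7.112e-3, b_m: float = 3.556e-3):
    """Return (Z0 in ohms, L in H, C in F) of the parallel LC model of one cell."""
    z0 = z0_waveguide(f0_hz, a_m, b_m)
    l_n = ln_nh(l_mm, h_mm) * 1e-9             # nH -> H
    w0 = 2.0 * math.pi * f0_hz
    c_n = 1.0 / (w0 ** 2 * l_n)                # from L_n * C_n = 1 / w0^2
    return z0, l_n * z0, c_n / z0              # (3) and (4) solved for L and C


if __name__ == "__main__":
    z0, ind, cap = ewrc_lumped(31.6e9, l_mm=1.6, h_mm=1.1)
    print(f"Z0 = {z0:.1f} ohm, L = {ind * 1e9:.5f} nH, C = {cap * 1e12:.4f} pF")
```

For h = 1.1 mm, l = 1.6 mm, and $f_0$ = 31.6 GHz, the results should land within about 1% of the values quoted in the text; small differences are expected from rounding of $f_0$ and the physical constants.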

A comparison of the simulation results of the equivalent LC circuit and the EWRC is shown in Fig. 8, for the EWRC with t = 0.1 mm, h = 1.1 mm, and l = 1.6 mm. The resonant frequency is 31.6 GHz, at which $Z_0$ is 505.8 Ω. $L_n$ is calculated using (6) to be $0.1 \times 10^{-3}$ nH, so L is determined by (3) to be 0.05058 nH, and C is then 0.5023 pF, as shown in the inset of Fig. 8. Slight differences are observed far from the center frequency, because the substrate and the EWRC extend along the direction of wave propagation: different phases of the electromagnetic field are present at the two edges of the structure, so the out-of-band rejection is not symmetrical about the center frequency.

III. FILTER DESIGN AND EXPERIMENT

Using the proposed EWRC as a filter cell, it is possible to design a miniaturized evanescent mode waveguide bandpass filter. The schematic of the cascaded cells is

Fig. 9. Equivalent LC circuit of the proposed filter. Two cells with (a) length of waveguide, (b) with π -network, and (c) using J-inverter.

represented by the equivalent LC circuit and a length of evanescent mode waveguide, as shown in Fig. 9(a). For the TE evanescent modes, a section of evanescent mode waveguide of length $l_e$ is equivalent to a T- or π-network of lumped reactances [17]; the π-network is used in Fig. 9(b). When the evanescent mode waveguide is air-filled, the reactances $Z_1$ and $Z_2$ can be calculated by

$$Z_1 = j\,\frac{120\pi b_2}{a_2\sqrt{\left(\dfrac{\lambda_0}{2a_2}\right)^2 - 1}}\,\sinh \gamma l_e \tag{7}$$

$$Z_2 = j\,\frac{120\pi b_2}{a_2\sqrt{\left(\dfrac{\lambda_0}{2a_2}\right)^2 - 1}}\,\coth \frac{\gamma l_e}{2} \tag{8}$$

where the propagation constant γ is given by

$$\gamma = \frac{2\pi}{\lambda_0}\sqrt{\left(\frac{\lambda_0}{2a_2}\right)^2 - 1}. \tag{9}$$
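To get a feel for the magnitudes in (7)–(9), the sketch below evaluates γ, $Z_1$, and $Z_2$ for an air-filled section. It assumes that $a_2$ and $b_2$ are the wide and narrow dimensions of the WR-10 section (2.54 mm × 1.27 mm), which the text does not state explicitly; the function name and default arguments are hypothetical.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s


def evanescent_pi_network(f_hz: float, le_m: float,
                          a_m: float = 2.54e-3, b_m: float = 1.27e-3):
    """Return (gamma in Np/m, Z1, Z2) of the pi-equivalent, per (7)-(9).

    a_m and b_m default to WR-10 dimensions (an assumption for a2, b2).
    """
    if le_m <= 0.0:
        raise ValueError("section length must be positive")
    lam0 = C0 / f_hz
    ratio = lam0 / (2.0 * a_m)
    if ratio <= 1.0:
        raise ValueError("waveguide is not below cutoff at this frequency")
    root = math.sqrt(ratio ** 2 - 1.0)
    gamma = 2.0 * math.pi / lam0 * root                    # (9)
    x0 = 120.0 * math.pi * b_m / (a_m * root)              # common reactance factor
    z1 = 1j * x0 * math.sinh(gamma * le_m)                 # (7)
    z2 = 1j * x0 / math.tanh(gamma * le_m / 2.0)           # (8), coth = 1/tanh
    return gamma, z1, z2


if __name__ == "__main__":
    g, z1, z2 = evanescent_pi_network(31e9, 2e-3)
    print(f"gamma = {g:.1f} Np/m, Z1 = {z1:.1f} ohm, Z2 = {z2:.1f} ohm")
```

Both elements are purely inductive reactances below cutoff, which is why the section can act as an inverter between the capacitively loaded resonator cells.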

A J-inverter, formed by this section of evanescent mode waveguide, is employed to analyze the proposed evanescent mode filter, as shown in Fig. 9(c). This J-inverter is used

Fig. 10. Proposed evanescent mode waveguide filter structures.

Fig. 11. Calculated $\Delta f$ versus $l_e$.

Fig. 12. Photograph of the proposed evanescent mode waveguide filter.

to couple the two EWRCs, resulting in a bandpass filter. From Fig. 9(c), it can be seen that the reactance $Z_t$, i.e., $Z_t = Z_1 + Z_2$, shifts the original resonant frequency $f_0$ of the EWRCs to a higher frequency. This frequency shift $\Delta f$ can be calculated by

$$\Delta f = f - f_0 = \frac{1}{2\pi}\left(\frac{1}{\sqrt{\dfrac{L L'}{L + L'}\,C}} - \frac{1}{\sqrt{LC}}\right) \tag{10}$$

where f is the center frequency of the proposed filter and $L'$ is the equivalent inductance of $Z_t$. $Z_t$ and $L'$ are determined by

$$Z_t = Z_1 + Z_2 = j\,\frac{120\pi b_2}{a_2\sqrt{\left(\dfrac{\lambda_0}{2a_2}\right)^2 - 1}}\,\tanh \gamma l_e \tag{11}$$

$$L' = \frac{Z_t}{j\omega}\ \text{H}. \tag{12}$$
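The frequency shift in (10)–(12) can be evaluated directly. The sketch below is an illustration only: it assumes WR-10 dimensions for $a_2$ and $b_2$, evaluates $L'$ at a fixed frequency (as the text does at 31 GHz), and uses hypothetical function and parameter names.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s


def frequency_shift(le_m: float, ind_h: float, cap_f: float, f_eval_hz: float,
                    a_m: float = 2.54e-3, b_m: float = 1.27e-3) -> float:
    """Frequency shift (Hz) per (10), with Z_t from (11) and L' from (12).

    a_m and b_m default to WR-10 dimensions (an assumption for a2, b2).
    """
    lam0 = C0 / f_eval_hz
    root = math.sqrt((lam0 / (2.0 * a_m)) ** 2 - 1.0)      # requires below cutoff
    gamma = 2.0 * math.pi / lam0 * root                    # (9)
    x_t = 120.0 * math.pi * b_m / (a_m * root) * math.tanh(gamma * le_m)  # Z_t = j*x_t
    l_prime = x_t / (2.0 * math.pi * f_eval_hz)            # (12): L' = Z_t / (j*omega)
    l_par = ind_h * l_prime / (ind_h + l_prime)            # L in parallel with L'
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(ind_h * cap_f))
    f = 1.0 / (2.0 * math.pi * math.sqrt(l_par * cap_f))
    return f - f0


if __name__ == "__main__":
    for le_mm in (1.0, 2.0, 3.0):
        df = frequency_shift(le_mm * 1e-3, 0.048e-9, 0.5427e-12, 31e9)
        print(f"le = {le_mm:.1f} mm -> delta f = {df / 1e9:.2f} GHz")
```

With L = 0.048 nH and C = 0.5427 pF, the computed shift stays above 1.2 GHz and decreases toward a constant as $l_e$ grows, since $\tanh \gamma l_e$ saturates at 1.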

Using the HFSS software, the simulation model of the proposed filter is built as shown in Fig. 10, where two EWRCs are placed at an appropriate interval d and the WR-28 I/O ports are modeled for accurate simulation. It should be noted that d should be larger than $l_e$, because the RRRs of the EWRCs introduce fringing fields into the evanescent mode waveguide section. The design procedure in this paper consists of the following steps.

Step 1: From the required operating center frequency of the filter, determine the center frequency $f_0$ of the EWRC, and then obtain the sizes of the EWRC from Fig. 7.
Step 2: Calculate $L_n$ using (6) and $C_n$ from its relationship with $L_n$ ($L_n C_n = 1/\omega_0^2$).
Step 3: Calculate the characteristic impedance $Z_0$ using (5), or read it from Fig. 6, and then obtain L and C.
Step 4: Verify by simulation and fine-tune to determine the required value of d.

To demonstrate this design method, a sample was designed and fabricated. The EWRCs with h = 1.1 mm and l = 1.66 mm are chosen, which resonate at ∼31 GHz. The parameters are therefore $L_n = 0.094 \times 10^{-3}$ nH, $C_n = 278.8$ pF, $Z_0 = 513.7$ Ω, L = 0.048 nH, and C = 0.5427 pF. Moreover, the value of $\Delta f$ is calculated using (10) at 31 GHz versus the length $l_e$, as shown in Fig. 11. It can be seen that a frequency shift of more than 1.2 GHz is induced for a section of

air-filled evanescent mode waveguide theoretically. Considering the difference between the ideal lumped components and the realistic lossy structures, the discrepancy between the air-filled waveguide and the substrate-inserted waveguide, and the existence of fringing fields, this frequency deviation may be larger than the calculated one. A photograph of the filter is shown in Fig. 12. This sample of the proposed miniaturized evanescent mode filter is tested with a PNA Network Analyzer N5244A. The RRRs are fabricated on a Duroid 5880 substrate with a relative permittivity of εr = 2.2 and a thickness of 0.127 mm, which is inserted in the center E-plane of the standard WR-10. The I/O ports are standard WR-28 waveguides. It should be mentioned that this kind of waveguide filter can be realized by fabricating an RRR-loaded W-band waveguide with a Ka-band flange, which is used to connect to the WR-28 I/O ports. However, in the experiment, the total length of the proposed filter with l = 1.66 mm and d = 2.54 mm is 5.9 mm, which is too short to mount on the test equipment, so extra lengths of the I/O ports are fabricated together with the filter, as shown in Fig. 12, for easy assembly and testing. The step-discontinuity effect at the boundary between the evanescent mode waveguide and the input standard waveguide is included in the design procedure and the measurement results. Comparisons between the simulation and measurement results are shown in Fig. 13. The simulation and measurement results show that the waveguide filter has large out-of-band rejection and high selectivity. The steepness of the out-of-band insertion loss curve on the higher frequency side is almost equal to that on the lower frequency side. This happens because the waveguide below its cutoff frequency acts


Fig. 13. Comparison between HFSS simulation and measurement results (with l = 1.66 mm, h = 1.1 mm, t = 0.1 mm, and d = 2.54 mm).

Fig. 14. Simulation results of Filter1 and Filter2 (with l1 = l2 = 1.8 mm, t1 = t2 = 0.1 mm, h 1 = 1.1 mm, h 2 = 1 mm, d1 = 2.5 mm, and d2 = 2.6 mm).

TABLE I COMPARISON WITH REPORTED RESEARCH

like a lumped reactance, contributing to large rejection. The HFSS-simulated sample operates at a center frequency of 32.8 GHz with a 3 dB bandwidth of 3%, whereas the measured one operates at a center frequency of 32.3 GHz with a 3 dB bandwidth of 3%. It is noted that the center frequency of the designed filter is higher than the resonant frequency of the EWRCs. The measured minimum insertion loss is 0.86 dB and the skirt coefficient $K_{0.1}$ is calculated to be 1.83. Finally, a frequency shift of 0.5 GHz (1.5%) is observed between the HFSS simulation and measurement results, which is mainly due to fabrication tolerances, assembly errors, and material parameter errors. Despite the slight frequency shift, good agreement is achieved between the simulation and measurement results. Besides, the proposed filter can be designed for other center frequencies by changing the sizes of the RRRs, i.e., l and h. A comparison between a previous waveguide filter operating at Ka-band and ours is summarized in Table I. The proposed filter realizes a very large size reduction compared with the conventional waveguide filter. Its cross section is 2.54 mm × 1.27 mm, which is only 12.8% of that of the conventional ones, typically 7.112 mm × 3.556 mm, such as the filter in [18]. The proposed filter is not only shorter than the previous waveguide filters, but also has a smaller cross section.

IV. DIPLEXER DESIGN

As shown earlier, this kind of compact waveguide filter has large out-of-band rejection, so it can be used to design a compact diplexer. In this section, a compact diplexer is proposed using the described evanescent

Fig. 15. Configuration of the compact diplexer.

mode waveguide filter. First, two filters (denoted Filter1 and Filter2) operating at different frequency bands need to be selected. Here, Filter1 ($h_1$ = 1.1 mm, $l_1$ = 1.8 mm, $t_1$ = 0.1 mm, and $d_1$ = 2.5 mm) and Filter2 ($h_2$ = 1.0 mm, $l_2$ = 1.8 mm, $t_2$ = 0.1 mm, and $d_2$ = 2.6 mm) are chosen. The simulation results of the two filters are shown in Fig. 14. Filter1 operates at 31.86 GHz with a 3 dB bandwidth of 3%, whereas Filter2 operates at 35.82 GHz with a 3 dB bandwidth of 3.1%. These two filters are used separately in the two channels of the diplexer. For easy fabrication, a conventional T-junction is utilized, as shown in Fig. 15. The two filters are located in the two arms of the T-junction. The parameters $d_{cen1}$ and $d_{cen2}$ represent the distances from the center of the T-junction to Filter1 and Filter2, respectively. The three waveguide ports (denoted Port1, Port2, and Port3) are standard WR-28, which are modeled for signal input and accurate simulation. The distances $d_{cen1}$ and $d_{cen2}$ are adjusted to optimize the diplexer performance. By adjusting $d_{cen2}$, the input impedance looking from the center of the T-junction into the upper channel arm loaded with Filter2 can be made equivalent to an open circuit at frequencies in the Filter1 passband, so that the input signal from Port1 is transmitted to Port2 through Filter1. Conversely, by adjusting $d_{cen1}$, the lower channel (Filter1) is made equivalent to an open circuit while the upper channel operates normally. Taking into account the


Fig. 16. Simulation results of diplexer (with dcen1 = 3.2 mm and dcen2 = 2.8 mm).

dielectric loss of $\tan\delta = 9 \times 10^{-4}$ and the conductivity of $\sigma = 5.8 \times 10^{7}$ S/m, the simulation results of a sample are shown in Fig. 16. It can be seen that the isolation between the two channels is more than 37 dB. Except for the I/O ports, the designed diplexer can be realized by only 18.3 mm, which is [...]

[...] 41 dBm output power and >60% efficiency, and for peaking device in class C: >41 dBm output power and >55% efficiency.

Fig. 7. Simulated carrier load impedance of a conventional DPA, the DPA with a short-circuited stub, and the proposed DPA at saturation.

Fig. 8. Simulated peaking load impedance of a conventional DPA, the DPA with a short-circuited stub, and the proposed DPA at saturation.

peaking impedance transformer can achieve the optimum load impedance at saturation the same as that in the conventional design, the effect on the load impedances can be effectively reduced and then proper load modulation can be achieved.

III. DESIGN AND SIMULATION

In this section, based on the idea of the integrated reactance compensation, a systematic design procedure is presented for the proposed DPA to obtain enhanced back-off efficiency over a wide frequency band and simultaneously achieve the desired load modulation at saturation.

A. OMN Design

In order to achieve the desired behavior in the proposed design, the OMNs of the carrier and peaking amplifiers must be properly designed. To describe the design procedure, we use the Cree CGH40010F GaN HEMT device as an example. First, the optimal load impedances at the device package plane of both amplifiers were obtained using load-pull simulations. The Cree GaN HEMT circuit model was used in the simulations with bias conditions of a class AB mode and a predefined class C mode. To enlarge the current ratio (δ) and improve load modulation, the drain voltage was set to 26 and 30 V for the carrier and peaking devices, respectively. For the frequency band of 1.7–2.8 GHz, the results shown in Fig. 9 were generated by sweeping the fundamental load impedances while setting the second- and third-harmonic terminations to predefined loads. For the carrier device, the region inside the contour represents the area on the Smith chart with greater than 41 dBm output power and 60% efficiency. Meanwhile, the impedance region in which the carrier amplifier achieves high efficiency, greater than 70%, at back-off power is also given for frequencies of 1.7, 2.25, and 2.8 GHz. To achieve an output power similar to that of the carrier device for proper load modulation, the criterion of higher than 41 dBm output power and 55% efficiency was chosen for the peaking device, as shown in Fig. 9. In addition, the output impedance of the peaking device in class C operation is also depicted, which can be used to design the peaking OMN in the proposed DPA. As mentioned above, the peaking OMN should achieve a quasi-short circuit in the low-power region and the desired optimum load impedance at saturation. To design such an OMN, a two-point matching technique, deduced using the ABCD matrix, is proposed, as discussed in detail below. The peaking OMN is shown in Fig. 10. If the OMN is considered as a lossless reciprocal two-port network, the S-parameter matrix of such a network can be

XIA et al.: BROADBAND HIGH-EFFICIENCY DPA WITH INTEGRATED CR

Fig. 10. Peaking OMN denoted by ABCD matrix.

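expressed as [8]

$$S = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} = \begin{bmatrix} -S_{22}^{*}\, e^{j2\theta_{21}} & \sqrt{1-|S_{22}|^2}\; e^{j\theta_{21}} \\ \sqrt{1-|S_{22}|^2}\; e^{j\theta_{21}} & S_{22} \end{bmatrix} \tag{10}$$

where $\theta_{21}$ is the phase of $S_{21}$. According to the network parameter conversion, the ABCD parameters of the OMN can be written as

$$A = \frac{\left(1 - S_{22}^{*} e^{j2\theta_{21}}\right)(1 - S_{22}) + \left(1 - |S_{22}|^2\right) e^{j2\theta_{21}}}{2\sqrt{1 - |S_{22}|^2}\; e^{j\theta_{21}}} \tag{11}$$

$$B = Z_0\,\frac{\left(1 - S_{22}^{*} e^{j2\theta_{21}}\right)(1 + S_{22}) - \left(1 - |S_{22}|^2\right) e^{j2\theta_{21}}}{2\sqrt{1 - |S_{22}|^2}\; e^{j\theta_{21}}} \tag{12}$$

$$C = \frac{1}{Z_0}\,\frac{\left(1 + S_{22}^{*} e^{j2\theta_{21}}\right)(1 - S_{22}) - \left(1 - |S_{22}|^2\right) e^{j2\theta_{21}}}{2\sqrt{1 - |S_{22}|^2}\; e^{j\theta_{21}}} \tag{13}$$

$$D = \frac{\left(1 + S_{22}^{*} e^{j2\theta_{21}}\right)(1 + S_{22}) + \left(1 - |S_{22}|^2\right) e^{j2\theta_{21}}}{2\sqrt{1 - |S_{22}|^2}\; e^{j\theta_{21}}} \tag{14}$$

where $Z_0$ is the termination impedance (usually 50 Ω). A set of linear equations of the voltages and currents on both sides of the peaking OMN can then be expressed in terms of the ABCD parameters as

$$\begin{bmatrix} V_{P1} \\ I_{P1} \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} V_{P} \\ -I_{P} \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} V_{P} \\ I'_{P} \end{bmatrix}. \tag{15}$$

In the low-power region, the peaking device is in the OFF state and its output impedance is assumed to be $Z_{OUT\_P}$. Since $V_P = I'_P Z_{OUT\_P}$, the output impedance of the peaking amplifier becomes

$$Z_{OUT\_P1} = \frac{V_{P1}}{I_{P1}} = \frac{Z_{OUT\_P}\,A + B}{Z_{OUT\_P}\,C + D}. \tag{16}$$

At saturation, the following relationship can be obtained:

$$Z_{P1}^{*} = \frac{Z_{P}^{*} A + B}{Z_{P}^{*} C + D}. \tag{17}$$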
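The conversion in (10)–(14) and the bilinear impedance mapping used in (16) and (17) can be sketched as follows. This is an illustration under stated assumptions, not the authors' design code: the function names are hypothetical, and the sanity check uses an ideal quarter-wave line ($S_{22} = 0$, $\theta_{21} = -90°$) rather than any value from Table I.

```python
import cmath
import math


def abcd_from_s22(s22: complex, theta21: float, z0: float = 50.0):
    """ABCD parameters of a lossless reciprocal two-port, per (10)-(14).

    theta21 is the phase of S21 in radians; |s22| must be less than 1.
    """
    s22 = complex(s22)
    e1 = cmath.exp(1j * theta21)
    e2 = e1 * e1
    t = 1.0 - abs(s22) ** 2                      # |S21|^2 for a lossless network
    if t <= 0.0:
        raise ValueError("|S22| must be < 1")
    s11 = -s22.conjugate() * e2                  # (10)
    den = 2.0 * math.sqrt(t) * e1                # 2 * S21
    a = ((1 + s11) * (1 - s22) + t * e2) / den           # (11)
    b = z0 * ((1 + s11) * (1 + s22) - t * e2) / den      # (12)
    c = ((1 - s11) * (1 - s22) - t * e2) / (z0 * den)    # (13)
    d = ((1 - s11) * (1 + s22) + t * e2) / den           # (14)
    return a, b, c, d


def transform(z: complex, abcd) -> complex:
    """Impedance seen through the network, the bilinear form of (16) and (17)."""
    a, b, c, d = abcd
    return (a * z + b) / (c * z + d)


if __name__ == "__main__":
    quarter_wave = abcd_from_s22(0.0, -math.pi / 2)      # ideal 50-ohm quarter-wave line
    print(transform(20.0, quarter_wave))                 # expect about 50**2 / 20
```

In a two-point matching search of the kind described in the text, one would sweep $\theta_{21}$, solve (17) for $S_{22}$, and then evaluate (16) with `transform` to see how close $Z_{OUT\_P1}$ comes to a short circuit.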

In the low-power region, the peaking OMN should transform the impedance $Z_{OUT\_P}$ to the desired quasi-short-circuited impedance $Z_{OUT\_P1}$. On the other hand, the impedance $Z_{P1}$ should also be transformed into the optimum load impedance $Z_P$ when the amplifier is in saturation. When those four impedances have been determined, the design parameters

Fig. 11. Graphical illustration of the two-point matching technique for the peaking OMN design at a frequency of 2.25 GHz.

TABLE I DESIGN PARAMETERS OF THE PEAKING OMN

of the OMN, i.e., $S_{22}$ and $\theta_{21}$, can be obtained according to (10)–(14), (16), and (17). An optimized OMN that achieves two-point matching can then be designed accordingly. Taking the peaking OMN as an example and assuming a center frequency of 2.25 GHz, the optimum load impedance $Z_P = 20$ Ω can be obtained using the load-pull results in Fig. 9, while $Z_{P1}$ is chosen as 50 Ω. The output impedance of the peaking device in class C operation is found to be $Z_{OUT\_P} = 0.5 - j41$ Ω. Considering the phase of $S_{21}$ ($\theta_{21}$) as a degree of freedom, the parameter $S_{22}$ can then be calculated using (17) and a set of output impedances $Z_{OUT\_P1}$ can be plotted on the Smith chart, as shown in Fig. 11. When $Z_{OUT\_P1}$ is close to a quasi-short circuit, a series of $\theta_{21} = -61° - n \times 180°$ (n = 0, 1, 2, ...) can be obtained according to the periodicity of the exponential function, but only the $\theta_{21}$ with the smallest magnitude should be chosen in order to minimize the phase-delay dispersion over the whole frequency band for wideband operation. Using a similar approach, the design parameters for the frequencies of 1.7 and 2.8 GHz can also be determined, as listed in Table I. Due to the frequency dispersion of the output impedance of the peaking device, it is difficult to achieve a short circuit over the whole frequency band. A low reactance value was therefore used as the target of $Z_{OUT\_P1}$ for low and high frequencies. Considering the obtained S-parameters and $\theta_{21}$, the OMN can then be designed by using network theory and tuned by optimization, as shown in Fig. 12. The load impedance at


TABLE II DESIGN PARAMETERS OF THE CARRIER OMN

Fig. 12. Designed matching network for both the amplifiers and the impedances on the Smith chart for frequency band of 1.7–2.8 GHz.

Fig. 14. Comparison of the back-off efficiency of the carrier amplifier (terminated with 25-Ω load) with and without integrated CR.

saturation power (with 50-Ω load) and the peaking output impedance in class C operation are also depicted. The results show that the OMN designed using the two-point matching technique can satisfy the requirement of the peaking amplifier. Regarding the carrier amplifier, the carrier OMN should achieve the desired effective load impedance for high back-off efficiency in the low-power region and the optimum load impedance at saturation. For convenience, the same optimum load impedance at saturation for 2.25 GHz (20 Ω) was chosen according to the inside region of the contours in Fig. 9. Using a similar approach, the two-point matching for the carrier OMN design can also be achieved, as shown in Fig. 13. Considering the desired carrier back-off impedance region, the phase of $S_{21}$ in the carrier OMN can be chosen to be −61°. After calculating the common load with the integrated CR, the design parameters of the carrier OMN for frequencies of 1.7 and 2.8 GHz can be obtained, as shown in Table II.

To alleviate the difference in the phase response between the carrier and peaking amplifiers over a wide frequency range, a similar network topology, which can satisfy the requirements of both amplifiers mentioned above, was used to design the OMNs. The carrier load impedance in the low-power region with integrated CR is shown in Fig. 12. It is worth mentioning that, in this design, the offset lines normally used in the conventional DPA are not needed in the proposed DPA.

Fig. 13. Graphical illustration of the two-point matching technique for the carrier OMN design at a frequency of 2.25 GHz.

B. IMN Design and System Simulation

After the OMNs are determined, the input matching networks (IMNs) can be designed using the stepped-impedance matching network topology to cover the required bandwidth. To verify the efficiency and bandwidth enhancement of the proposed method, for an output power of about 38 dBm, the efficiencies of the carrier amplifier terminated with a 25-Ω load, with and without the peaking amplifier connected to the combining point, were simulated and compared. In the simulation, the carrier amplifier was biased at $V_{GS} = -3$ V and $V_{DS} = 26$ V, while the peaking amplifier was in the OFF state with the bias condition of $V_{GS} = -5.6$ V and $V_{DS} = 30$ V. The results in Fig. 14 show that an efficiency higher than 55% can be achieved over the whole frequency range with the proposed integrated CR generated by the peaking amplifier. Especially at low frequencies, the efficiency is considerably higher than that of the conventional design without CR. The final circuit of the proposed DPA is shown in Fig. 15. To precisely convert 50 Ω into 25 Ω at the combining point over a large frequency band, a fourth-order real-to-real matching network was used. Meanwhile, because the phase response of

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. XIA et al.: BROADBAND HIGH-EFFICIENCY DPA WITH INTEGRATED CR

Fig. 15. Complete schematic of the proposed DPA.

Fig. 16. Simulated load traces of the carrier device in the Doherty operation and the peaking device at saturation.

the carrier and the peaking amplifiers are different over a wide frequency range, a 3-dB 90° hybrid coupler, instead of the Wilkinson power divider, is used as the input splitter to ensure a suitable phase relationship between the output currents of the two amplifiers. In Fig. 16, the simulated load impedance traces of the carrier device (in the Doherty region) and the peaking device (at saturation) are depicted. As discussed in Section II, using the integrated CR does not affect the Doherty load modulation. The load impedances of both the carrier and peaking devices at saturation thus still satisfy the design criteria used in the load-pull simulations. More importantly, because of the integrated CR, the load impedances of the carrier amplifier at low output powers can match the desired load-pull results for the carrier amplifier, and thus high efficiency at back-off powers can be achieved. Fig. 17 shows the simulated efficiency of the proposed DPA at different back-off powers versus frequency, as well as the saturated output power. With the saturated power around 44 dBm, the DPA achieves efficiencies higher than 54% and 40% at 6- and 9-dB back-off powers, respectively, within the entire frequency band of 1.7–2.8 GHz.

IV. REALIZATION AND EXPERIMENTAL VERIFICATION

To validate the proposed method, a broadband DPA was fabricated using Cree CGH40010F GaN HEMTs on a Taconic

Fig. 17. Simulated efficiency and saturated output power of the proposed DPA at frequency band of 1.7–2.8 GHz.

Fig. 18. Photograph of the fabricated DPA.

RF35 substrate with εr = 3.55 and a thickness of 30 mil over the frequency range of 1.7–2.8 GHz, as shown in Fig. 18. For design convenience, a 3-dB 90° hybrid coupler (Anaren X3C22E1-03S) was used as the input splitter. The carrier amplifier was biased in the class AB condition with 0.05-A quiescent current and VDS = 26 V, while the peaking amplifier was biased in the class C mode to turn ON at about 6-dB back-off power with VDS = 30 V.

A. Measurement Results With Continuous Wave Signals

First, the DPA was measured under continuous wave (CW) signal stimulus from 1.7 to 2.8 GHz. Fig. 19 shows the


TABLE III COMPARISON WITH REPORTED WIDEBAND DPAs

Fig. 19. Measured efficiency and gain of the DPA versus output power at different frequencies.

Fig. 21. Measured output powers, average efficiencies, and ACLRs of the proposed DPA for modulated signal at 6.5-dB back-off power.

Fig. 20. Measured efficiency and gain of the DPA versus frequency at saturation, 6- and 9-dB back-off powers, respectively.

measured efficiency and gain characteristics of the DPA as a function of the output power under CW measurements. The saturated power is greater than 44 dBm from 1.7 to 2.8 GHz with a peak of 44.5 dBm. The results show that the gain, output power, and efficiency remain consistent over the entire operating frequency band at back-off powers. Fig. 20 shows the measured efficiency and gain at different back-off powers in the frequency band of 1.7–2.8 GHz (approximately 49% fractional bandwidth). For back-off operation, the efficiency is within 50%–55% and 37%–41% at 6- and 9-dB back-off power, respectively, while the maximum gain is about 14.5 dB with a gain fluctuation of less than 2.5 dB. Regarding the saturated operation, the efficiency is between 57% and 71% with a gain of roughly 12 dB. A comparison with the reported wideband DPAs is outlined in Table III. For CW measurements, with good back-off efficiency and highly consistent saturated output power, the proposed DPA achieves a wider operating bandwidth than most DPAs in Table III, except the DPA designed using the bare-die device with slightly degraded 6-dB back-off efficiency [21].

B. Measurement Results With Modulated Signals

To evaluate the performance of the proposed DPA for modulated signal applications, a 20-MHz LTE signal with a PAPR of 6.5 dB was used to test the efficiency and linearity performance over the whole frequency band. As shown in Fig. 21, the proposed DPA can deliver an average efficiency of 50%–55% at 6.5-dB back-off power, while the adjacent channel leakage ratio (ACLR) is around −30 dBc for most operational frequencies. To satisfy the linearity requirement of common wireless systems (i.e., ACLR < −45 dBc), the digital predistortion (DPD) technique has been widely used to improve the linearity of DPAs when they are excited by wideband modulated signals [2], [4], [6], [12], [14]–[17], [21]. To validate the linearity improvement of the DPA, the DPD measurement was conducted at 1.96, 2.14, 2.355, and 2.655 GHz, considering the frequency division duplex LTE frequency band allocations. The block diagram of the experimental test bench is shown in Fig. 22. The baseband


Fig. 22. Block diagram of the linearization test bench.

Fig. 23. AM/AM and AM/PM plots for the 20-MHz LTE signal before and after DPD at 2.14 GHz.

TABLE IV COMPARISONS OF THE DPD PERFORMANCES AT DIFFERENT FREQUENCIES

in-phase/quadrature signal was modulated and upconverted to an RF signal in a vector signal generator (Keysight E4438C), and then amplified by the DPA. In the feedback loop, the output signal was downconverted and demodulated by a spectrum analyzer (Keysight E4445A) to baseband and then used for DPD inverse model extraction. The piecewise decomposition technique [24] together with the second-order dynamic deviation reduction-based Volterra model [25] was employed in the DPD modeling. The magnitude thresholds were set to {0.4, 0.7} for the normalized data, while the nonlinearity order was selected as {7, 7, 7} and the memory length was set to {3, 3, 3}. After the inverse model was obtained, the predistorted signal was generated. It was then modulated and upconverted by the vector signal generator again as the input signal of the DPA for DPD linearization.
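The two magnitude thresholds quoted above split the normalized input into three regions, each of which then gets its own model. A minimal sketch of one way to perform such a split is given below. It assumes a clip-and-subtract decomposition in which every sub-signal keeps the phase of the input sample; the function name `decompose` and the sample value are illustrative only, and the per-region Volterra fitting that would follow is not shown.

```python
import cmath


def decompose(sample, thresholds=(0.4, 0.7)):
    """Split one normalized complex sample into len(thresholds) + 1 pieces.

    Assumed scheme (illustrative): piece k carries the part of the magnitude
    that falls between threshold k-1 and threshold k, with the input phase.
    """
    magnitude = abs(sample)
    phase = cmath.phase(sample)
    edges = (0.0,) + tuple(thresholds) + (float("inf"),)
    pieces = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        # Clip the magnitude excess over `lower` to the width of this region.
        piece_magnitude = min(max(magnitude - lower, 0.0), upper - lower)
        pieces.append(cmath.rect(piece_magnitude, phase))
    return pieces


if __name__ == "__main__":
    x = cmath.rect(0.5, 0.3)  # hypothetical sample: |x| = 0.5, phase 0.3 rad
    parts = decompose(x)
    print([abs(p) for p in parts])  # magnitudes assigned to each region
    print(abs(sum(parts) - x))      # reconstruction error of the split
```

By construction the pieces add back up to the original sample, so the split itself loses no information; only the model fitted to each piece differs.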

Fig. 24. Measured signal spectra before and after linearization for 20-MHz LTE modulated signal at 1.96, 2.14, 2.355, and 2.655 GHz.
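Two figures quoted for the measurements can be cross-checked with simple arithmetic: the fractional bandwidth of the 1.7–2.8 GHz band and the average output power at 6.5-dB back-off. The sketch below assumes the usual definition of fractional bandwidth (band width divided by the arithmetic center frequency) and takes the 44.5-dBm peak saturated power as the back-off reference; the function names are illustrative.

```python
def fractional_bandwidth(f_low, f_high):
    """Band width divided by the arithmetic center frequency."""
    center = (f_low + f_high) / 2.0
    return (f_high - f_low) / center


def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


if __name__ == "__main__":
    fbw = fractional_bandwidth(1.7, 2.8)   # band edges in GHz
    p_backoff_dbm = 44.5 - 6.5             # assumed reference: peak saturated power
    print(f"fractional bandwidth: {100 * fbw:.1f} %")
    print(f"back-off power: {p_backoff_dbm:.1f} dBm "
          f"({dbm_to_watts(p_backoff_dbm):.2f} W)")
```

The center frequency used here, 2.25 GHz, is the same frequency at which the optimum load impedance was chosen in the design section.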

To evaluate the linearization performance, taking the frequency of 2.14 GHz as an example, the time-domain AM/AM and AM/PM characteristics at the average power of 38 dBm are shown in Fig. 23, where it can be seen that both the static nonlinearities and memory effects can be effectively removed after DPD linearization. The linearization results at different frequencies are summarized in Table IV. The proposed DPA can achieve an average efficiency higher than 50% and an ACLR better than −47 dBc at 6.5-dB back-off power (about 38 dBm) when using DPD linearization. Fig. 24 shows the measured signal spectra of normalized outputs before and after DPD. The results show that the distortion caused by the DPA can be removed effectively. For modulated signal measurements at about 6.5-dB back-off power, the proposed DPA can simultaneously achieve good efficiency and high linearity over the entire design frequency band.

V. CONCLUSION

In this paper, a broadband high-efficiency GaN DPA with 49% fractional bandwidth and its linearization results have been reported. The frequency response of the combining network for bandwidth extension of the DPA was analyzed, and a broadband DPA with integrated CR was proposed


to extend the bandwidth in the back-off power region. A two-point matching technique using the ABCD matrix was employed to design the OMNs in the DPA. Experimental results showed that the proposed DPA achieves very good average efficiency with high linearity after linearization over a wide frequency band when driven with modern modulated signals.

APPENDIX

Considering the DPA topology in Fig. 1, the voltages and currents on both sides of the impedance transformer can be expressed in terms of the ABCD parameters as

$$\begin{bmatrix} V_C \\ I_C \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} V_{C1} \\ I_{C1} \end{bmatrix}. \tag{A.1}$$

In the low-power region, the impedance seen on the right of the impedance transformer becomes $Z_L$. Considering $V_{C1} = I_{C1} Z_L$, the carrier load impedance is given by

$$Z_{C\_\mathrm{low}} = \frac{V_C}{I_C} = \frac{Z_L A + B}{Z_L C + D}. \tag{A.2}$$

For the Doherty region, the fundamental voltage and current at the output of the carrier amplifier are written using (A.1) as

$$V_C = A V_P + B I_{C1} \tag{A.3}$$

$$I_C = C V_P + D I_{C1}. \tag{A.4}$$

From Fig. 1, the fundamental voltage at the output of the peaking amplifier becomes

$$V_P = V_{C1} = I_{C1} Z_{C1}. \tag{A.5}$$

By means of (A.3) and (A.5), the following expression is obtained:

$$I_{C1} = \frac{V_P}{Z_{C1}} = \frac{V_C}{A Z_{C1} + B}. \tag{A.6}$$

According to the active load modulation, the following relationship arises:

$$Z_{C1} = Z_L \left( 1 + \frac{I_P}{I_{C1}} \right). \tag{A.7}$$

Substituting (A.6) into (A.7), the impedance $Z_{C1}$ is expressed as

$$Z_{C1} = \frac{Z_L V_C + Z_L I_P B}{V_C - Z_L I_P A}. \tag{A.8}$$

According to the impedance transformation, the impedance $Z_C$ can be expressed as

$$Z_C = \frac{Z_{C1} A + B}{Z_{C1} C + D}. \tag{A.9}$$

Thus, the output voltage of the carrier amplifier becomes

$$V_C = I_C Z_C = I_C \frac{Z_{C1} A + B}{Z_{C1} C + D}. \tag{A.10}$$

Substituting (A.8) into (A.10) and rearranging the terms results in

$$V_C = \frac{I_P Z_L (AD - BC) + I_C (Z_L A + B)}{Z_L C + D}. \tag{A.11}$$

For further derivation, the following relationship is assumed:

$$\frac{I_P}{I_C} = \delta e^{j\phi} \tag{A.12}$$

where $\delta = |I_{P\_\mathrm{sat}}| / |I_{C\_\mathrm{sat}}|$ and $\phi$ is the phase of the output current of the peaking amplifier. Taking into account (A.11) and (A.12), the carrier load impedance at saturation power can be given as follows:

$$Z_{C\_\mathrm{sat}} = \frac{V_C}{I_C} = \frac{\delta e^{j\phi} Z_L (AD - BC) + Z_L A + B}{Z_L C + D}. \tag{A.13}$$

By means of (A.3) and (A.5), the following expression can be obtained:

$$V_P = \frac{V_C Z_{C1}}{A Z_{C1} + B}. \tag{A.14}$$

Considering (A.8), (A.11), and (A.14), the peaking load impedance at saturation is written as

$$Z_{P\_\mathrm{sat}} = \frac{V_P}{I_P} = \frac{e^{-j\phi} Z_L + \delta Z_L D}{\delta Z_L C + \delta D}. \tag{A.15}$$
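The closed-form impedances (A.2), (A.13), and (A.15) are straightforward to evaluate numerically. As a sanity check, the sketch below assumes that the impedance transformer is an ideal lossless transmission line (an assumption made here for illustration, not the OMN designed in the paper) and uses hypothetical values: an optimum load of 50 Ω, Z_L equal to half of it, δ = 1, and ϕ = −90°. Under these assumptions the formulas should recover the textbook Doherty impedances at the center frequency.

```python
import cmath
import math


def line_abcd(z0, theta):
    """ABCD parameters of an ideal lossless line (assumed transformer model)."""
    return (math.cos(theta), 1j * z0 * math.sin(theta),
            1j * math.sin(theta) / z0, math.cos(theta))


def z_carrier_low(abcd, z_l):
    """Carrier load impedance in the low-power region, (A.2)."""
    a, b, c, d = abcd
    return (z_l * a + b) / (z_l * c + d)


def z_carrier_sat(abcd, z_l, delta, phi):
    """Carrier load impedance at saturation, (A.13)."""
    a, b, c, d = abcd
    ratio = delta * cmath.exp(1j * phi)  # I_P / I_C as defined in (A.12)
    return (ratio * z_l * (a * d - b * c) + z_l * a + b) / (z_l * c + d)


def z_peaking_sat(abcd, z_l, delta, phi):
    """Peaking load impedance at saturation, (A.15)."""
    a, b, c, d = abcd
    numerator = cmath.exp(-1j * phi) * z_l + delta * z_l * d
    return numerator / (delta * z_l * c + delta * d)


if __name__ == "__main__":
    r_opt = 50.0                           # hypothetical optimum load, ohms
    abcd = line_abcd(r_opt, math.pi / 2)   # quarter-wave line at center frequency
    z_l = r_opt / 2                        # common load of a symmetric Doherty
    print("Z_C_low :", z_carrier_low(abcd, z_l))
    print("Z_C_sat :", z_carrier_sat(abcd, z_l, 1.0, -math.pi / 2))
    print("Z_P_sat :", z_peaking_sat(abcd, z_l, 1.0, -math.pi / 2))
```

With these inputs the carrier sees twice the optimum load at back-off and the optimum load at saturation, and the peaking device also sees the optimum load at saturation. Sweeping `theta` away from π/2 then shows how the three impedances drift with frequency for this idealized transformer.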

According to (A.2), (A.13), and (A.15), the frequency response of the DPA, including the load impedance of the carrier amplifier in the low-power region and the load impedances of both amplifiers at saturation, can be analyzed using the ABCD parameters of the impedance transformer at given frequencies.

REFERENCES

[1] J. Moon, J. Kim, J. Kim, I. Kim, and B. Kim, “Efficiency enhancement of Doherty amplifier through mitigation of the knee voltage effect,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 1, pp. 143–152, Jan. 2011.
[2] J. Kim, B. Fehri, S. Boumaiza, and J. Wood, “Power efficiency and linearity enhancement using optimized asymmetrical Doherty power amplifiers,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 2, pp. 425–434, Feb. 2011.
[3] P. Colantonio, F. Giannini, R. Giofre, and L. Piazzon, “Increasing Doherty amplifier average efficiency exploiting device knee voltage behavior,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 9, pp. 2295–2305, Sep. 2011.
[4] R. Darraji and F. M. Ghannouchi, “Digital Doherty amplifier with enhanced efficiency and extended range,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 11, pp. 2898–2909, Nov. 2011.
[5] S. Chen and Q. Xue, “Optimized load modulation network for Doherty power amplifier performance enhancement,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 11, pp. 3474–3481, Nov. 2012.
[6] J. Xia, X. Zhu, L. Zhang, J. Zhai, and Y. Sun, “High-efficiency GaN Doherty power amplifier for 100-MHz LTE-advanced application based on modified load modulation network,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 8, pp. 2911–2921, Aug. 2013.
[7] K. Bathich, A. Z. Markos, and G. Boeck, “Frequency response analysis and bandwidth extension of the Doherty amplifier,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 4, pp. 934–944, Apr. 2011.
[8] M. Akbarpour, M. Helaoui, and F. M. Ghannouchi, “A transformerless load-modulated (TLLM) architecture for efficient wideband power amplifiers,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 9, pp. 2863–2874, Sep. 2012.
[9] D. Kang, D. Kim, Y. Cho, B. Park, J. Kim, and B. Kim, “Design of bandwidth-enhanced Doherty power amplifiers for handset applications,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 12, pp. 3474–3483, Dec. 2011.
[10] J. M. Rubio, J. Fang, V. Camarchia, R. Quaglia, M. Pirola, and G. Ghione, “3–3.6-GHz wideband GaN Doherty power amplifier exploiting output compensation stages,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 8, pp. 2543–2548, Aug. 2012.
[11] G. Sun and R. H. Jansen, “Broadband Doherty power amplifier via real frequency technique,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 1, pp. 99–111, Jan. 2012.


[12] D. Y.-T. Wu and S. Boumaiza, “A modified Doherty configuration for broadband amplification using symmetrical devices,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 10, pp. 3201–3213, Oct. 2012.
[13] D. Gustafsson, C. M. Andersson, and C. Fager, “A modified Doherty power amplifier with extended bandwidth and reconfigurable efficiency,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 1, pp. 533–542, Jan. 2013.
[14] R. Giofre, L. Piazzon, P. Colantonio, and F. Giannini, “A Doherty architecture with high feasibility and defined bandwidth behavior,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 9, pp. 3308–3317, Sep. 2013.
[15] L. Piazzon, R. Giofrè, P. Colantonio, and F. Giannini, “A wideband Doherty architecture with 36% of fractional bandwidth,” IEEE Microw. Wireless Compon. Lett., vol. 23, no. 11, pp. 626–628, Nov. 2013.
[16] C. M. Andersson, D. Gustafsson, J. C. Cahuana, R. Hellberg, and C. Fager, “A 1–3-GHz digitally controlled dual-RF input power-amplifier design based on a Doherty-outphasing continuum analysis,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 10, pp. 3743–3752, Oct. 2013.
[17] M. N. A. Abadi, H. Golestaneh, H. Sarbishaei, and S. Boumaiza, “An extended bandwidth Doherty power amplifier using a novel output combiner,” in IEEE MTT-S Int. Microw. Symp. Dig., Tampa, FL, USA, Jun. 2014, pp. 1–4.
[18] X. Fang and K.-M. M. Cheng, “Broadband, wide efficiency range, Doherty amplifier design using frequency-varying complex combining load,” in IEEE MTT-S Int. Microw. Symp. Dig., Phoenix, AZ, USA, May 2015, pp. 1–4.
[19] X.-H. Fang and K.-K. M. Cheng, “Improving power utilization factor of broadband Doherty amplifier by using bandpass auxiliary transformer,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 9, pp. 2811–2820, Sep. 2015.
[20] S. Watanabe, Y. Takayama, R. Ishikawa, and K. Honjo, “A miniature broadband Doherty power amplifier with a series-connected load,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 2, pp. 572–579, Feb. 2015.
[21] R. Giofrè, L. Piazzon, P. Colantonio, and F. Giannini, “A closed-form design technique for ultra-wideband Doherty power amplifiers,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 12, pp. 3414–3424, Dec. 2014.
[22] S. C. Cripps, RF Power Amplifiers for Wireless Communications, 2nd ed. Norwood, MA, USA: Artech House, 2006.
[23] X. H. Fang and K.-M. M. Cheng, “Extension of high-efficiency range of Doherty amplifier by using complex combining load,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 9, pp. 2038–2047, Sep. 2014.
[24] A. Zhu, P. J. Draxler, C. Hsia, T. J. Brazil, D. F. Kimball, and P. M. Asbeck, “Digital predistortion for envelope-tracking power amplifiers using decomposed piecewise Volterra series,” IEEE Trans. Microw. Theory Techn., vol. 56, no. 10, pp. 2237–2247, Oct. 2008.
[25] L. Guan and A. Zhu, “Simplified dynamic deviation reduction-based Volterra model for Doherty power amplifiers,” in Proc. IEEE Int. Integr. Nonlinear Microw. Millim.-Wave Circuits Workshop, Vienna, Austria, Apr. 2011, pp. 1–4.

Jing Xia (S’12–M’15) received the M.E. degree in computer science and technology from Jiangsu University, Zhenjiang, China, in 2007, and the Ph.D. degree in electromagnetic field and microwave technology from Southeast University, Nanjing, China, in 2014. He was a Post-Doctoral Research Fellow with the RF and Microwave Research Group, University College Dublin, Dublin, Ireland, from 2015 to 2016. He is currently an Associate Professor with the School of Computer Science and Communication Engineering, Jiangsu University. His current research interests include high back-off efficiency power amplifier (PA) design, wideband efficient PA design, and digital pre-distortion techniques.


Mengsu Yang received the B.E. degree in information engineering and the M.E. degree in electromagnetic fields and microwave technology from Southeast University, Nanjing, China, in 2008 and 2012, respectively. He is currently pursuing the Ph.D. degree with University College Dublin (UCD), Dublin, Ireland. He is currently with the RF and Microwave Research Group, UCD. His current research interests include the design of high-efficiency, broadband power amplifiers.

Yan Guo (S’13) received the B.E. degree in information science and engineering from East China Jiaotong University, Nanchang, China, in 2007, the M.E. degree in communication and information systems from Southeast University, Nanjing, China, in 2011, and the Ph.D. degree in electronics engineering from University College Dublin (UCD), Dublin, Ireland. He is currently a Post-Doctoral Research Fellow with the RF and Microwave Research Group, UCD. His current research interests include digital predistortion for RF power amplifiers and RF digital to analog converter, and related field-programmable gate-array hardware implementations.

Anding Zhu (S’00–M’04–SM’12) received the B.E. degree in telecommunication engineering from North China Electric Power University, Baoding, China, in 1997, the M.E. degree in computer applications from the Beijing University of Posts and Telecommunications, Beijing, China, in 2000, and the Ph.D. degree in electronics engineering from University College Dublin (UCD), Dublin, Ireland, in 2004. He is currently an Associate Professor with the School of Electrical and Electronic Engineering, UCD. His current research interests include high-frequency nonlinear system modeling and device characterization techniques with a particular emphasis on behavioral modeling and linearization for RF power amplifiers, wireless and RF system design, digital signal processing, and nonlinear system identification algorithms.


High-Efficiency Microwave and mm-Wave Stacked Cell CMOS SOI Power Amplifiers

Sultan R. Helmi, Student Member, IEEE, Jing-Hwa Chen, Student Member, IEEE, and Saeed Mohammadi, Senior Member, IEEE

Abstract: Design and implementation of high-efficiency microwave and mm-wave CMOS silicon-on-insulator (SOI) power amplifiers (PAs) based on a stacked cell approach are presented. Two stacked cell PAs have been implemented in GlobalFoundries 45-nm CMOS SOI technology. The first PA, operating at K-band (24–28 GHz), is designed with three stacked triple Cascode cells. Each cell uses three standard transistors with separate layout. At 24 GHz, the K-band PA biased under a supply voltage of 10.8 V measures a maximum linear power gain of 13 dB, a saturated output power PSAT of 25.3 dBm, a −1-dB output power P1dB of 23.8 dBm, and a peak power-added efficiency (PAE) of 20%. The second PA, targeted at U-band frequencies, is designed with two stacked triple Cascode cells. Transistors in each cell have a combined layout that reduces parasitic capacitances, leading to significant improvement in the PAE at mm-wave frequencies. The U-band PA operates from 42 to 54 GHz. At 46 GHz, and under a supply voltage of 6 V, it measures a saturated output power (PSAT) of 22.4 dBm, a linear gain of 17.4 dB, and an unprecedented peak PAE of 42%.

Index Terms: CMOS, fifth-generation (5G), high efficiency, microwave frequency, millimeter-wave (mm-wave) frequency, power amplifier (PA), silicon-on-insulator (SOI), stacked transistors.

I. INTRODUCTION

The increasing demand for faster mobile communication has recently shifted attention to the under-utilized spectrum in the microwave and millimeter-wave (mm-wave) frequencies as a potentially viable solution for achieving an order of magnitude higher data capacity than that of existing fourth-generation (4G) cellular networks. Beamforming is a key technology for microwave and mm-wave mobile broadband transceivers, which reduces the peak power requirements to a range not far from what can be achieved by nanoscale CMOS technologies [1]. Integration capability and the relatively low cost of CMOS technology, on the other hand, have propelled it into the wireless market. Low efficiencies and poor linearities of CMOS power amplifiers (PAs) are currently the bottlenecks of developing fully integrated transceivers for microwave and mm-wave applications. The main challenges in implementing high-performance CMOS PAs are:

Manuscript received August 9, 2015; revised January 8, 2016; accepted April 10, 2016. S. R. Helmi and S. Mohammadi are with the Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907 USA (e-mail: [email protected]; [email protected]). J.-H. Chen was with the Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907 USA. He is now with Qualcomm Technologies Inc., Boxborough, MA 01719 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2570212

1) reduced transistor breakdown voltages with CMOS scaling, which limits the safe operating voltage of nanoscale CMOS transistors to ∼1 V and, consequently, limits the output power;
2) frequency, bandwidth, and efficiency limitations caused by parasitic capacitances of both active and passive components to the conductive Si substrate; and
3) low quality factor (Q) of on-chip passive components (inductors, transformers, and transmission lines), which degrades power gain, output power, and efficiency.

To increase the PA output power, three general power-combining topologies may be used: 1) parallel connection (open-circuit power combining); 2) series connection or stacked approach (short-circuit power combining); and 3) power combining using an arbitrary impedance to add up powers (both voltage and current signals) of individual power cells into one large output power.

In the first topology, a parallel combination of transistors is used to increase the overall output current and, hence, output power, while the individual transistors are subjected to an identical output voltage signal. The technique reduces both input and output impedances and necessitates impedance matching networks with large impedance transformation ratios because very wide multifinger transistors are utilized. Matching networks, when implemented on a Si chip, are lossy and degrade the output power, bandwidth, and efficiency of the PA. Additionally, a high current flowing through a parallel combination of multifinger transistors raises both short- and long-term reliability concerns, leading to thermal runaway and catastrophic failure of the PA [2].

The second topology to increase the output power of a PA is to use a series connection of transistors to add the output voltage signals of individual transistors towards a large overall voltage swing and enhanced output impedance [3], [4].
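Stacking raises the usable supply voltage because the supply divides across the series-connected transistors. As a rough check against the ∼1-V limit mentioned above, the sketch below divides the supply voltages reported in the abstract (10.8 V over three triple Cascode cells, 6 V over two) by the number of stacked transistors. It assumes, for illustration only, that the dc voltage splits equally across all transistors; the function name is hypothetical.

```python
def volts_per_transistor(supply_v, cells, transistors_per_cell=3):
    """Average dc voltage per stacked device, assuming an equal split."""
    return supply_v / (cells * transistors_per_cell)


if __name__ == "__main__":
    k_band = volts_per_transistor(10.8, cells=3)  # nine stacked transistors
    u_band = volts_per_transistor(6.0, cells=2)   # six stacked transistors
    print(f"K-band PA: {k_band:.2f} V per transistor")
    print(f"U-band PA: {u_band:.2f} V per transistor")
```

Under the equal-split assumption both designs keep each transistor close to the ∼1-V level even though the total supply is several times larger.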
Possible limitations of this topology are substrate breakdown, signal leakage to the substrate, drain–source reach-through, gate–oxide breakdown, instability, and output power and efficiency degradations due to phase and gain variations across individual power transistors [5]. If the dc current path is shared among the stacked transistors, the overall current in this topology is limited by the current of the least conductive transistor in the stack, thus preventing thermal runaway and improving both the short-term and long-term reliability of the PA.

The third approach, namely, the power-combining topology in arbitrary impedance, is widely used in III–V microwave and mm-wave PAs, but suffers from loss and parasitics of power-combining circuits implemented in Si-based technologies. The loss of the power-combining network in combination with its parasitic capacitors to the conductive Si substrate limits the output power, efficiency, and bandwidth of power-combined



Fig. 1. Cell designs for stacked-cell CMOS SOI PAs. (a) CS cell with input transformer. (b) Cascode cell (CS and a CG transistor) with an input transformer. (c) Triple cascode cell (CS followed by two CGs) with an input transformer.

Si-based PAs. Moreover, power combiners occupy large Si area, leading to high implementation cost. Nevertheless, in the absence of other viable approaches, this technique has been widely utilized to implement microwave and mm-wave CMOS PAs [6]–[13]. Although the power-combining approach achieves high output powers, it is not area efficient, and leads to low PA efficiencies [11]–[13]. On the other hand, the stacking approach requires small areas and has shown good efficiencies for only moderate output powers [5], [14], [15]. Several stacked PA designs have been reported in the literature, including stacking a common-source (CS) stage with several common-gate (CG) transistors [14], stacking several CS cells with input transformers [15], or a combination of the two approaches by implementing cells of CS and/or Cascode with input transformers used for dc isolation [5]. The output power of the first stacking approach (one CS and several CGs) is limited because, in practice, a maximum of only four transistors has been successfully stacked. The limitation in the number of stacked transistors in this approach is due to the gate–oxide breakdown voltage of transistors, stability concerns, as well as the fact that the bypass capacitor for the top CG stage approaches parasitic capacitance values. If more power is required, power combining must be utilized [12], [13]. The power gain and power-added efficiency (PAE) of the second approach (stacked CS cells) are usually limited because this approach uses a relatively lossy transformer for each CS transistor in the stack. The third approach combines the advantages of the two other approaches by facilitating stacking of more than four transistors (Cascodes instead of CS stages) through utilizing input transformers and dynamic biasing of transistors. Power performance metrics such as output power, gain, and PAE can presumably increase in this approach as more transistors are stacked.
In reality, however, the power performance is limited by inefficient combining of stacked transistors caused by variations in the phase and amplitude of output voltage signals of stacked transistors. To date, PAs implemented using this approach have achieved low PAEs (below 32%) at microwave and mm-wave frequencies [5], [16], [17]. In this paper, cell designs suitable for stacked cell PAs are first introduced. The self-biased triple Cascode

cell (CS–CG–CG) with an input transformer outperforms its CS and Cascode counterparts by providing high swing voltage, high gain, and high output impedance while maintaining small area through using a single transformer for every three transistors. Next, an overview of efficiency mechanisms and an analysis of combining efficiency and output power degradation of stacked cell PAs stemming from parasitic capacitances and feeding effects are provided. The analysis is based on amplitude and phase variations among combining stacked cells, which could lead to inefficient combination of voltage waveforms. Such sub-optimal voltage combining leads to degraded efficiency and low output power. The proposed PA designs target the frequency ranges of 24–28 and 46–54 GHz. The former frequency range is a potential band for the next generation of cellular communication [fifth generation (5G)], selected for its small path loss due to atmospheric absorption [18], [19], while the latter design is appropriate for radars and short-range, ultrahigh-bandwidth mm-wave communication systems. Both PAs, implemented in a standard GlobalFoundries 45-nm CMOS silicon-on-insulator (SOI) process, achieve good power performance, with the U-band PA measuring PAE >40%. To the authors’ best knowledge, this is the first CMOS-based PA operating at microwave and mm-wave frequencies with a measured PAE >40%. The contributions of this paper are: 1) an analytical design technique with a simplified linear model for each cell that predicts efficiency and output power accurately; 2) utilizing triple Cascode transistors as the amplifying cell for stacked PA design at microwave and mm-wave frequencies; 3) combining the layout of triple Cascode transistors to mitigate parasitic capacitances and achieve high efficiency; and 4) demonstration of PAE >40% for the first time for mm-wave PAs.

II. CELL ANALYSIS AND DESIGN

In this section, we first demonstrate how different cell designs for microwave and mm-wave stacked CMOS SOI PAs can be simplified and represented by a linear equivalent circuit. It turns out that the PA analysis using such a simplified equivalent circuit is not only accurate for class-A and class-AB amplification modes, but also provides insight into the design of these CMOS SOI PAs. Fig. 1 shows schematics of the three different cell designs suitable for implementing stacked cell PAs. The core of each cell may


Fig. 2. AC equivalent circuit of the triple Cascode cell.

be a CS amplifier stage, a Cascode (CS–CG) stage, or a triple Cascode (CS–CG–CG) stage. Note that as the cells are designed to be stackable, they utilize input-coupling transformers that provide dc isolation. Consequently, a large supply voltage may be used to provide the PA with high voltage swings and high output powers. A self-biasing scheme through a resistor ladder sets the bias voltages of gate terminals of individual transistors within the dc voltages across the cell output terminals. Capacitor Cc is a coupling capacitor, while capacitors C1 and C2 are bypass capacitors in these designs. The triple Cascode cell has an apparent advantage over the other two cells in that it can provide three times the voltage swing of the CS cell (and 1.5 times that of the Cascode cell). On the other hand, the triple Cascode cell occupies essentially the same chip area as the two other cells since the cell area is dominated by the footprint of the input transformer. Therefore, a PA with a fixed number of stacked transistors, designed based on triple Cascode cells, occupies a smaller area and has less transformer loss, compared to PAs with the same number of stacked transistors designed with the other two types of cells. As a promising building block for the design of stacked cell PAs, the triple Cascode cell warrants closer scrutiny, which is carried out in the following analysis. The analysis can be easily extended to the CS and Cascode cells. The performance of the stacked cell PA is often limited by various internodal parasitic capacitances [16]. To understand the importance of these capacitors, the small-signal equivalent circuit of the triple Cascode cell, shown in Fig. 2, is analyzed. The method of open-circuit time constants is utilized to find the importance of each capacitor by looking at the resistance facing the capacitor while the others are open circuited [20].
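In the open-circuit time-constant method, each capacitor contributes a time constant equal to its capacitance times the resistance it sees with the other capacitors open circuited, and the bandwidth is estimated from the sum of these time constants. A minimal sketch of that bookkeeping follows; the resistance and capacitance values are placeholders chosen for illustration and are not taken from the paper, and the function name is hypothetical.

```python
import math


def octc_summary(rc_pairs):
    """Estimate bandwidth and per-capacitor shares from (R, C) pairs.

    rc_pairs maps a capacitor name to (resistance seen in ohms, capacitance in F).
    """
    taus = {name: r * c for name, (r, c) in rc_pairs.items()}
    total = sum(taus.values())
    bandwidth_hz = 1.0 / (2.0 * math.pi * total)  # f_3dB estimate from sum of taus
    shares = {name: tau / total for name, tau in taus.items()}
    return bandwidth_hz, shares


if __name__ == "__main__":
    # Placeholder values only; real ones come from the circuits in Fig. 3.
    example = {
        "Cgs2": (20.0, 30e-15),
        "Cds2": (60.0, 10e-15),
        "Cpar3": (100.0, 15e-15),
    }
    bandwidth, shares = octc_summary(example)
    print(f"estimated bandwidth: {bandwidth / 1e9:.1f} GHz")
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {100 * share:.0f} % of total time constant")
```

The capacitor with the largest share is the one whose reduction would raise the estimated bandwidth the most, which is the sense in which the method ranks the importance of each capacitor.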
In the figure, capacitors Cgsi, Cgdi, and Cdsi (i = 1, 2, 3) are the gate-to-source, gate-to-drain, and drain-to-source capacitors, respectively, and gmi (i = 1, 2, 3) represents the transconductance of each transistor in the stack. The following simplifications are made to keep the analysis tractable. First, resistors in the self-biasing resistor ladder are ignored in the ac equivalent circuit since they are usually large. The output resistance of each transistor (Rds) is also ignored to simplify the analysis. Capacitors Cc, C1, and C2 are coupling and bypass capacitors that do not limit the performance of the circuit at high frequencies and are hence shorted in the analysis. Their role is to equate the drain–source voltage swings of the three transistors in the triple Cascode configuration [14]. Parasitic capacitors that exist between gate terminals of stacked transistors and substrate appear in parallel with Cgs1, C1, and C2,

Fig. 3. Small-signal equivalent circuits to measure the open-circuit resistance seen by: (a) capacitors Cgs2, Cds1, and Cpar1, (b) capacitors Cgd2, Cgs3, and Cpar2, (c) capacitor Cds2, (d) capacitor Cds3, and (e) capacitors Cgd3 and Cpar3.

and are absorbed by those capacitors. Capacitors Cgs1 and Cgd1 are often tuned out with the inductance seen from the secondary winding of the input transformer, and thus are ignored in the analysis (considered open circuit). No power dividing is caused by the source resistance Rs (hence, ignored) as it appears in parallel with a large impedance from tuning of Cgs1 and Cgd1 . Capacitors Cpar1 , Cpar2 , and Cpar3 are parasitic capacitances due to drain-to-substrate and source-to-substrate capacitance (negligible for SOI transistors), the interconnect lines, and layout of the transistors. These capacitors form internodal capacitors that may not be ignored in the analysis. Following the method of open-circuit time constants, Fig. 3(a) is used to find the resistance seen by capacitors Cgs2 , Cds1 , and Cpar1 . All other capacitors in the circuit are open circuited and a test voltage v t is placed instead of capacitors


Cgs2, Cds1, and Cpar1, as its current i_t is measured. The ratio of the test voltage v_t to the test current i_t is the resistance seen by these capacitors:

$$R_{o1} = \frac{v_t}{i_t} = \frac{-v_{gs2}}{-g_{m2}\,v_{gs2}} = \frac{1}{g_{m2}}. \tag{1}$$

For a power transistor with a width of several hundred micrometers biased under a high current, the transconductance gm2 is large, leading to a small resistance (Ro1) seen by capacitors Cgs2, Cds1, and Cpar1. Hence, these capacitors do not have much impact in limiting the bandwidth and performance of the triple Cascode cell. Fig. 3(b) shows the circuit used to find the resistance seen by capacitors Cgs3, Cgd2, and Cpar2. The resistance is calculated according to

$$R_{o2} = \frac{v_t}{i_t} = \frac{v_{gs3}}{g_{m3}\,v_{gs3}} = \frac{1}{g_{m3}}. \tag{2}$$

As the transconductance gm3 is also large, the resistance (Ro2) seen by these capacitors is small; thus, they do not have much impact on the performance of the triple Cascode cell. The analysis of the resistance seen by capacitor Cds2 is more involved, as shown in Fig. 3(c). Assuming that the two transistors have identical transconductance, gm2 = gm3 (same drain currents and same W/L ratios), the resistance seen by the capacitor Cds2 is calculated according to

$$R_{o3} = \frac{v_t}{i_t} = \frac{v_{gs2} - v_{gs3}}{g_{m2}\,v_{gs2} - g_{m3}\,v_{gs3}} = \frac{1}{g_{m2}}. \tag{3}$$

Thus, the impact of Cds2 on the performance of the triple Cascode cell can be ignored. The impact of the capacitor Cds3 is found by calculating its open-circuit resistance according to Fig. 3(d). Note that the current flowing into RL is zero, hence the voltage that appears across it is also zero, leading to the following equation:

$$R_{o4} = \frac{v_t}{i_t} = \frac{v_{gs3}}{g_{m3}\,v_{gs3}} = \frac{1}{g_{m3}}. \tag{4}$$

Thus, the impact of Cds3 can be ignored. Finally, the impact of capacitors Cgd3 and Cpar3 is found from Fig. 3(e). As the current source gm3 vgs3 is open circuited and must be set to zero, the resistance seen by these two capacitors is calculated according to

$$R_{o5} = \frac{v_t}{i_t} = R_L. \tag{5}$$

The load resistance of the cell, RL, is typically much larger than 1/gmi. Therefore, Cgd3 and Cpar3 set the dominant pole of the triple Cascode transfer function and their impact cannot be ignored in the analysis. In summary, the triple Cascode cell has only two capacitors (Cgd3 and Cpar3) that have a significant impact in limiting its high-frequency performance. These two capacitors (Cgd3 and Cpar3 shown in Fig. 2) are effectively in parallel and appear at the output node of the cell. Let us now analyze the condition under which the triple Cascode cell provides maximum output voltage swing. As the performance of scaled CMOS SOI transistors is limited by the gate–drain–oxide breakdown and drain–source reach-through

breakdown mechanisms, the maximum swing voltage of the triple Cascode cell is obtained under two necessary conditions. First, there should be negligible phase difference among the three drain–source voltage signals of the transistors in the triple Cascode cell such that the signals add up constructively. This condition is satisfied by negligible phase contributions from the internodal capacitances of the cell, as demonstrated in the above discussion. Secondly, the swing voltage across the drain–source of each transistor must be identical (vds1 = vds2 = vds3) such that at the maximum output swing voltage of the cell, the three transistors have identical drain–source swing voltages just below the maximum value set by the reach-through breakdown condition. Considering the small-signal equivalent circuit of the triple Cascode cell shown in Fig. 2, the values of the capacitors C1 and C2 may be adjusted in order to achieve identical swing voltages across the three transistors, i.e., vds1 = vds2 = vds3. By assuming identical transconductance gm1 = gm2 = gm3 due to the same drain current and same W/L ratios of the three transistors, one obtains identical voltage swings across the gate–source of each transistor, i.e., vgs1 = vgs2 = vgs3. The following equations describe the ac gate–source voltages of the top two transistors:

$$v_{gs2} = -\frac{C_1}{C_1 + C_{gs2}}\,v_{ds1} \tag{6}$$

$$v_{gs3} = -\frac{C_2}{C_2 + C_{gs3}}\,(v_{ds1} + v_{ds2}) \tag{7}$$

$$v_{gs3} = -\frac{v_{ds1} + v_{ds2} + v_{ds3}}{g_{m3} R_L}. \tag{8}$$

Therefore, the values of the capacitors C1 and C2 to obtain identical swing voltages are calculated by setting vds1 = vds2 = vds3. Equations (6) and (8) lead to

$$C_1 = \frac{3\,C_{gs2}}{g_{m3} R_L - 3}. \tag{9}$$

Equations (7) and (8) lead to

$$C_2 = \frac{3\,C_{gs3}}{2\,g_{m3} R_L - 3}. \tag{10}$$
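As a rough numerical illustration of (9) and (10), the sketch below evaluates C1 and C2 for a given gate–source capacitance and gm3 RL product. The function name `bypass_capacitors` and the value Cgs = 500 fF are hypothetical and chosen only for illustration; they are not taken from the measured design.

```python
def bypass_capacitors(c_gs2, c_gs3, gm_rl):
    """Return (C1, C2) from Eqs. (9) and (10).

    c_gs2, c_gs3 : gate-source capacitances of the two CG transistors (F)
    gm_rl        : dimensionless product gm3 * RL
    """
    if gm_rl <= 3.0:
        # Eq. (9) has a non-positive denominator unless gm3 * RL > 3.
        raise ValueError("Eqs. (9)-(10) require gm3 * RL > 3")
    c1 = 3.0 * c_gs2 / (gm_rl - 3.0)
    c2 = 3.0 * c_gs3 / (2.0 * gm_rl - 3.0)
    return c1, c2


# Hypothetical gate-source capacitance, identical for both CG transistors.
C_GS = 500e-15

for gm_rl in (4.0, 5.0, 10.0):
    c1, c2 = bypass_capacitors(C_GS, C_GS, gm_rl)
    print(f"gm3*RL = {gm_rl:4.1f}: C1 = {c1 * 1e15:7.1f} fF, "
          f"C2 = {c2 * 1e15:6.1f} fF, C1/C2 = {c1 / c2:.2f}")
```

With identical Cgs2 and Cgs3, the ratio C1/C2 = (2 gm3 RL − 3)/(gm3 RL − 3) depends only on gm3 RL, and both capacitors shrink as gm3 RL grows.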

If it is assumed that the transistors have the same W and L, capacitors Cgs2 and Cgs3 become identical. Note that a minimum transconductance value of gm3 > 3/R L is necessary to achieve proper circuit operation. On the other hand, if transconductance gm3 is too large, both C1 and C2 approach values below parasitic capacitances seen at the corresponding gate terminals to GND (substrate), leading to an impractical design. The layout design of the transistors has a significant impact on the performance of the PA. Fig. 4 shows a partial transistor layout of two different triple Cascode cells. The first layout shown in Fig. 4(a) depicts three standard transistors with their layout generated directly by the process design kit. Therefore, the three transistors are completely separated from each other and are connected using metal interconnects. To reduce the parasitic capacitances of the transistors to the substrate and the overall area of the cell, the layout of the three transistors has been combined into one cell, as shown in Fig. 4(b). In this design, the drain of the CS transistor and the source of the first


CG transistor share the same N+ region. Similarly, the drain of the first CG transistor and the source of the second CG transistor share the same N+ region with no metallization in between them. As a result, each transistor and consequently each cell have lower parasitic capacitances to the substrate. Layout parasitic extraction and fitting to the PA measurement data revealed that the values of the internodal parasitic capacitances (Cpar1, Cpar2, and Cpar3) of the combined transistor layout are on average reduced to one-third of the original capacitances where separate transistors with identical W and L values are utilized. As experimentally demonstrated by PA measurements, suppressing these capacitors has a significant impact in boosting the overall efficiency of the PA.

Fig. 4. (a) Standard (non-combined) transistor layout and (b) combined transistor layout of the triple Cascode cell.

III. EFFICIENCY MECHANISMS

There are several mechanisms that influence the PAE of a PA (including a stacked cell PA), as described by the following equation [21]:

$$\mathrm{PAE} = \left(1 - \frac{1}{G}\right)\eta < \left(1 - \frac{1}{G}\right)\eta_{class}\,\eta_{knee}\,\eta_m\,\eta_{tran}\,\eta_{comb}. \tag{11}$$

In the above equation, η is the drain efficiency (DE) and G is the power gain of the amplifier. The maximum DE that can be attained depends on several efficiency limiting mechanisms including the efficiency of the class of amplification ηclass (for example, a maximum of 50% for class A), the knee-voltage efficiency ηknee, the matching network efficiency ηm, the impedance transformation efficiency ηtran, and the combining efficiency ηcomb of the PA. Some of these efficiency-limiting mechanisms are described briefly in the following, whereas a detailed analysis on the combining efficiency is provided in Section IV. The maximum voltage swing at the output of an amplifier is the difference between the peak voltage Vmax set by the safe-operating voltage of the transistor and a minimum voltage known as the knee-voltage Vknee, the voltage across the transistor when its current is at a maximum. The knee-voltage efficiency is defined according to the following equation:

$$\eta_{knee} = \left(\frac{V_{max} - V_{knee}}{V_{max}}\right)^2 = \left(1 - \frac{V_{knee}}{V_{max}}\right)^2. \tag{12}$$

To maximize the knee-voltage efficiency of a PA, a small Vknee/Vmax ratio is desired. In III–V technologies utilizing transistors with large breakdown voltages, this ratio is small, leading to knee-voltage efficiencies close to 100%. On the contrary, nanoscale CMOS power transistors operating at large current densities and small safe operating voltages present significant Vknee/Vmax. By operating a CMOS transistor under low current densities, the knee voltage can be suppressed, leading to improved knee-voltage efficiencies. Stacked PAs can operate with large voltage swings and relatively small current densities, facilitating improved knee-voltage efficiencies compared to power-combined CMOS PA designs. Low quality factors (Q) of on-chip output matching networks reduce the power delivered to the load and lower the matching network efficiency according to [21]

$$\eta_m = \left(\frac{R_{Load}}{R_{Load} + R_{par}}\right)^2 \tag{13}$$

where RLoad = 50 Ω is the load impedance and Rpar is the effective series parasitic resistance of the output matching network at the operating frequency. Utilizing small inductors with high quality factors in the output matching network improves this efficiency. Impedance transformation from the output impedance of the PA to a 50-Ω load also causes degradation in the efficiency. For a simple L-match that uses an inductor with a quality factor of QL and an impedance transformation ratio of 50/Rout, where Rout is the output resistance of the PA before the output matching network, the impedance transformation efficiency is expressed by [21]

$$\eta_{tran} = \left(\frac{Q_L}{Q_L + \sqrt{\dfrac{50}{R_{out}} - 1}}\right)^2. \tag{14}$$

Using a stacked PA topology, one can design the output impedance of the PA to be close to 50 Ω, leading to the elimination of the matching network and, thus, 100% impedance transformation efficiency.
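The three efficiency factors in (12)–(14) are simple enough to evaluate directly. The following minimal sketch does so; the function names and all numeric inputs (knee voltage, peak voltage, parasitic resistance, inductor Q, and output resistance) are illustrative assumptions rather than values extracted from the fabricated PAs.

```python
import math


def eta_knee(v_knee, v_max):
    """Knee-voltage efficiency, Eq. (12)."""
    return (1.0 - v_knee / v_max) ** 2


def eta_match(r_par, r_load=50.0):
    """Matching-network efficiency, Eq. (13)."""
    return (r_load / (r_load + r_par)) ** 2


def eta_tran(q_l, r_out, r_load=50.0):
    """Impedance-transformation efficiency of a simple L-match, Eq. (14)."""
    ratio = r_load / r_out
    if ratio < 1.0:
        raise ValueError("Eq. (14) as written assumes Rout <= 50 ohm")
    return (q_l / (q_l + math.sqrt(ratio - 1.0))) ** 2


# Illustrative (assumed) inputs.
print(f"eta_knee = {eta_knee(v_knee=0.3, v_max=2.4):.3f}")
print(f"eta_m    = {eta_match(r_par=2.0):.3f}")
print(f"eta_tran = {eta_tran(q_l=10.0, r_out=10.0):.3f}")
print(f"eta_tran = {eta_tran(q_l=10.0, r_out=50.0):.3f}  (Rout already 50 ohm)")
```

The last line reflects the point made in the text: when Rout equals 50 Ω the square-root term vanishes and (14) evaluates to unity.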
The overall efficiency of any PA, including a stacked cell PA, may be affected by amplitude and phase variations across the combining signals (current, voltage, or both), leading to PA output power and efficiency degradations. These amplitude


and phase variations may be due to variations in the bias or temperature across power transistors or may be due to process nonuniformities (device-to-device variations). Such random variations are expected to be negligible in CMOS PAs with large transistor sizes. Instead, systematic errors in the amplitudes and phases of combining signals caused by asymmetry in the PA circuit should be considered. The asymmetry stems from parasitic capacitances or inductances of the PA circuit. PAs with parallel connection of transistors (open-circuit power combining) may be affected by parasitic series inductors, which force extra delays among current signals flowing into each transistor. If the paths from the drain of each transistor to the output node are identical, the systematic amplitude and phase differences of transistor currents are negligible, leading to 100% combining efficiency. For asymmetric output signal paths, the efficiency is maintained high at low frequencies due to negligible extra phase (electrical delay) introduced by the inductance of the asymmetric interconnect design. A similar argument holds for power-combined PAs, where the paths from each cell to the output node are usually designed to be symmetric. On the contrary, stacked CMOS PAs on conductive Si substrate have inherent asymmetry seen from both input and output of each transistor due to parasitic capacitors from the PA's internal nodes to the conductive Si substrate. The asymmetry at high frequency causes systematic phase and amplitude variations of combining voltage signals, leading to the degradation of combining efficiency. Assuming the voltage swings across combining cells of a stacked PA are identical and current signals flowing through parasitic capacitors at all internal nodes are negligible, the overall combining efficiency (ηcomb) may be expressed by [22]

$$\eta_{comb} = \frac{1}{n^2}\left|\sum_{k=1}^{n} e^{\,j\varnothing_{out-k}}\right|^2 \tag{15}$$

where ∅out−k is the relative phase of the drain–source voltage signal of each transistor. In an earlier work [3], a constant phase difference between drain–source voltage waveforms of adjacent transistor cells (θd) was assumed, leading to a linear increase of phase with the number of transistors (k) in the stack

$$\varnothing_{out-k} = (k - 1)\,\theta_d. \tag{16}$$

In practice, the amplitudes of the combining voltage signals and their phase differences are not constant, as discussed in Section IV.

IV. COMBINING EFFICIENCY ANALYSIS

Parasitic capacitors to the conductive Si substrate in a stacked cell PA cause both amplitude and phase variations of combining voltage signals. Phases vary due to both RC and LC delays caused by parasitic capacitors, while amplitudes vary since parasitic capacitors modify the voltage gain of each cell. Assume that a stacked cell PA composed of n identical amplifying cells is directly connected to a 50-Ω load impedance and is operating in weak nonlinear regime (below its −1-dB compression point) such that a linear gain analysis is still valid. Each cell can be any of the three topologies shown in Fig. 1. Each cell is represented using a small-signal model

Fig. 5. (a) Simplified small-signal model of the stacked PA. (b) Input impedance equivalent circuit of each amplifying cell of the stacked PA.

with an equivalent transconductance Gm (Gm = gm1 for the triple Cascode cell) and a capacitance (Cout = Cpar3 + Cgd3 for the triple Cascode cell) at the output node, as shown in Fig. 5(a). Note that this assumption is valid as the analysis of the triple Cascode cell (and, by the same token, the Cascode cell) in Section II demonstrated that only the capacitor seen at the output node (Cout = Cpar3 + Cgd3) is significant. To calculate the output voltage signal of each amplifying cell, the variations in the input signals Vgs,k have to be considered. If amplifying cells with high Gm are utilized, the input impedance of each cell shown in Fig. 5(b) is almost constant, despite varying Zout,k−1 at the source of each cell. Therefore, the input signal Vgs,k of each cell is constant (Vgs), leading to the following equations deduced from Fig. 5(a) for the output voltage signal and phase of each amplifying cell:

$$v_{out,k} = \left(2 + j\omega\,\frac{R_L C_{out}}{n}\right) v_{out,k-1} - v_{out,k-2} \tag{17}$$

$$v_{DS,k} = v_{out,k} - v_{out,k-1} \tag{18}$$

$$\varnothing_{out-k} = \tan^{-1}\!\left(\frac{\mathrm{Imag}\left(V_{out,k} - V_{out,k-1}\right)}{\mathrm{Real}\left(V_{out,k} - V_{out,k-1}\right)}\right). \tag{19}$$

The above equations show how the phase and amplitude of the combining signals vary across the cells of the stacked cell PA. According to this analysis, the first cell (the bottom


Fig. 6. Combining efficiency across the frequency range of 20–30 GHz for a different number of stacked cells. The parasitic capacitance Cout = 67 fF is extracted from post-layout simulation for a 320-μm-wide and 45-nm-long CMOS SOI RF NMOS transistor.

of the stack) has the lowest peak-to-peak output voltage. For the top cell in the stack, on the contrary, the peak-to-peak output voltage is the highest among all cells. When a stacked cell PA is pushed to higher output powers (and higher output voltage swings), the disparity in the output voltage swings across the cells results in premature source–drain reach-through breakdown of the transistors in the top cell before the bottom cells have any appreciable voltage swings. To prevent the breakdown, the swing across the top cell must be kept below the maximum value, leading to limited overall output voltage swing and, hence, limited output power. Overall, output voltage and combining efficiency of a stacked cell PA can be calculated incorporating the effects of phase variation at the output drain line and output voltage variations due to asymmetry [see (17)–(19)]. Combining efficiency is calculated when the drain–source peak-to-peak voltage of the top cell, |Vout,n − Vout,n−1|, reaches its maximum allowed voltage. As opposed to (15), where combining efficiency was calculated for varying phases but constant amplitudes of combining signals, the combining efficiency calculated here takes into account varying phases and amplitudes of combining signals as derived by (18) and (19):

$$V_{out,n} = \sum_{k=1}^{n} V_{DS,k}\,e^{\,j\varnothing_{out-k}} \tag{20}$$

$$\eta_{comb} = \left|\frac{V_{out,n}}{n\left(V_{out,n} - V_{out,n-1}\right)}\right|^2. \tag{21}$$
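The recurrence (17) and the efficiency definitions (15) and (21) can be explored numerically. The sketch below is one possible reading of those equations and not the authors' code: it assumes the bottom node is grounded (v_out,0 = 0) and normalizes the first cell's output to 1, both of which are starting conditions chosen here for illustration. RL = 50 Ω and Cout = 67 fF follow the values quoted in the text; all function names are hypothetical.

```python
import cmath
import math


def eta_comb_equal_amplitude(phases_rad):
    """Eq. (15): equal-amplitude signals with the given relative phases."""
    n = len(phases_rad)
    return abs(sum(cmath.exp(1j * p) for p in phases_rad)) ** 2 / n ** 2


def node_voltages(n, freq_hz, r_load=50.0, c_out=67e-15):
    """Iterate Eq. (17); returns [v_out,0, v_out,1, ..., v_out,n]."""
    a = 2.0 + 1j * 2.0 * math.pi * freq_hz * r_load * c_out / n
    # Assumed starting conditions: grounded node 0, first cell normalized to 1.
    v = [0j, 1 + 0j]
    for _ in range(2, n + 1):
        v.append(a * v[-1] - v[-2])
    return v


def cell_phases_deg(n, freq_hz, r_load=50.0, c_out=67e-15):
    """Phase of each cell voltage v_DS,k, Eqs. (18)-(19), in degrees.

    cmath.phase uses atan2, which agrees with tan^-1(Imag/Real) whenever the
    real part is positive.
    """
    v = node_voltages(n, freq_hz, r_load, c_out)
    return [math.degrees(cmath.phase(v[k] - v[k - 1])) for k in range(1, n + 1)]


def combining_efficiency(n, freq_hz, r_load=50.0, c_out=67e-15):
    """Eq. (21): stack output relative to n times the top-cell swing."""
    v = node_voltages(n, freq_hz, r_load, c_out)
    v_ds_top = v[n] - v[n - 1]
    return abs(v[n] / (n * v_ds_top)) ** 2


for n in range(1, 6):
    eta = combining_efficiency(n, 24e9)
    phases = ", ".join(f"{p:5.1f}" for p in cell_phases_deg(n, 24e9))
    print(f"n = {n}: eta_comb = {eta:.3f}, cell phases (deg) = [{phases}]")
```

Under these assumptions the sketch gives a combining efficiency of about 0.86 for three cells and about 0.74 for four cells at 24 GHz, with the top cell carrying the largest swing and phase.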

Fig. 6 plots the combining efficiency [see (21)] across the frequency range from 20 to 30 GHz for one of the PAs designed in this work using a standard 45-nm CMOS SOI technology. RF NMOS transistors with 320 fingers and a total width of 320 μm are utilized in the design. Stacking more than three cells results in a noticeable drop in combining efficiency (below 80%). The degradation is more pronounced at higher frequencies (such as ∼50 GHz), practically limiting the

Fig. 7. (a) Bottom feeding and (b) top feeding.

number of stacked cells to two cells at such high frequencies, leading to reduced output powers for mm-wave designs [17]. Note that the overall efficiency will be much less than the targeted 80% due to other efficiency limiting mechanisms discussed in Section III. For a class-AB PA with a combining efficiency of 80%, an overall DE of ∼30% is expected. In plotting Fig. 6, Cout = 67 fF is extrapolated from post-layout parasitic extraction of a single-finger transistor. The actual value of the capacitor may be slightly higher due to parasitics caused by interconnect metallization.

V. FEEDING EFFECT

Fig. 7(a) and (b) shows simplified schematics of two different topologies for the feeding of the input signal, one from the bottom and one from the top of the stack, respectively. The effects of feeding on the output power and efficiency have been discussed in [3], with feeding from the bottom resulting in a better power performance. A similar experimental observation is made in this work (i.e., a top feeding 20–30-GHz PA has no gain) and an analysis of this observation is provided in the following. Note that the input signal in our stacked PA design is fed to series connected transformers, as schematically shown in Fig. 7(a). The combination of inductances and mutual inductances of transformers with the parallel combination of parasitic capacitances from the primary windings of the transformers to GND and capacitances transferred from the secondary to the primary form an LC ladder network rather than an RC ladder network. The LC ladder network of transformers combined with distributed transmission line effects of interconnects introduces a constant phase between adjacent cells, leading to a linear increase of phase with the number of cells in the stack. The relative phase for the kth cell is calculated according to

$$\theta_{in-k} = (k - 1)\,\theta_g \tag{22}$$

where θg is the constant phase difference between the input voltages across the primary windings of adjacent cells.


Fig. 8. Total phase for top and bottom feedings at 24 GHz.

Fig. 9. Circuit schematic of the K-band PA with standard (non-combined) transistors.

The total phase differences for feeding from bottom and top can be calculated according to

$$\theta_{total-n} = \varnothing_{out-n} \pm \theta_{in-n} \tag{23}$$

where θtotal−n is the total phase difference, ∅out−n is the overall output phase of the stack [see (19)], and θin−n is the overall input phase of the stack [see (22)]. The plus sign represents the feeding from the top while the minus sign represents the feeding from the bottom. The feeding from the bottom not only improves the combining efficiency, but also increases the isolation between the input and output ports, leading to improved circuit stability. Fig. 8 shows the change of phase for different numbers of stacked cells when the PA of the previous example is fed from the top and bottom under the assumption of θg = 3° estimated at the operating frequency of 24 GHz. Moreover, the figure shows the effect of feeding on combining efficiency. By feeding from the top, the phase variation increases, leading to lower combining efficiency, while feeding from the bottom compensates for part of the phase variations and facilitates higher combining efficiencies.

VI. PA CIRCUIT DESIGN

The two main design variables of a stacked cell PA operating at a specific frequency are transistor sizing and the number of stacked transistors. Transistor sizing affects input and output impedance, internodal parasitic capacitances, current drive capability, and gain. On the other hand, the number of stacked transistors in the PA design determines input and output impedance, as well as the combining efficiency and maximum output power. To avoid large impedance transformation ratios, the sizing of stacked transistors should be estimated from the output impedance at a specific frequency, as described in [5]. The design of a bottom-fed 20–30-GHz stacked cell PA that operates in class-AB amplification mode is first pursued. We assume utilizing triple Cascode cells as the building block for the PA. Such a cell designed with NMOS SOI transistors with total widths of 320 μm and lengths of 45 nm provides suitable output impedance (far from the edge of

the Smith chart and limited change with frequency) in the frequency range from 20 to 30 GHz. To achieve good output power and PAE performance, the combining efficiency shown in Fig. 8 should be maximized. An expected class efficiency for a class-AB amplifier is ∼60%, while an estimated knee-voltage efficiency of 76%, based on a safe operating voltage of 1.2 V and a knee voltage of approximately 0.3 V, is considered. If the output impedance is close to 50 Ω and no impedance transformation is used, the loss from the matching network and impedance transformation network is minimized, leading to an estimated 90% matching network and impedance transformation efficiencies. Therefore, the dominant factor in the overall efficiency will be the combining efficiency. If an overall DE of higher than 30% is desired, the combining efficiency must be higher than 73%, leading to a maximum number of four cells (12 transistors) that can be implemented in the proposed 20–30-GHz stacked cell PA design. By a similar argument, a PA with an operating frequency of 45–55 GHz may only utilize a maximum of three stacked triple Cascode cells (nine transistors) to keep the overall DE above 30%. In this work, two fully integrated stacked cell PAs are implemented in a standard 45-nm CMOS SOI technology. Both PAs are designed based on triple Cascode cells with a bottom feed configuration. The first PA is designed for K-band/5G applications (20–30 GHz), while the second PA is targeted at U-band (45–55 GHz) frequencies for ultrahigh-bandwidth communications. The CMOS SOI technology surrounds each transistor with buried and trench oxide, leading to electrically isolated transistors. When transistors are stacked, no transistor-to-substrate or transistor-to-transistor leakage currents and breakdown are observed. As shown in Fig. 9, the K-band PA is designed with three stacked cells to keep the DE high at the cost of reduced saturated output power.
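The efficiency budget used above to bound the combining efficiency follows from the product form of (11). The short sketch below reproduces the arithmetic under one stated assumption: the "90% matching network and impedance transformation efficiencies" are treated as a single combined factor of 0.90. The function name is hypothetical.

```python
def required_combining_efficiency(target_de, eta_class, eta_knee, eta_match_tran):
    """Minimum eta_comb so that the product of efficiency factors meets target_de."""
    return target_de / (eta_class * eta_knee * eta_match_tran)


eta_comb_min = required_combining_efficiency(
    target_de=0.30,       # desired overall drain efficiency
    eta_class=0.60,       # class-AB estimate quoted in the text
    eta_knee=0.76,        # knee-voltage efficiency quoted in the text
    eta_match_tran=0.90,  # assumed single combined matching/transformation factor
)
print(f"required combining efficiency = {eta_comb_min:.3f}")
```

Read this way, the budget gives a required combining efficiency of roughly 0.73, matching the 73% threshold quoted in the text.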
To boost the output power, each cell of the PA consists of a triple Cascode design with one transformer-coupled CS stage and two CG stages (M1–M3). Each cell utilizes one transformer for every three transistors, leading to high isolation and high gain of the cell at microwave frequencies within a very compact area. The utilization of three transistors per cell allows higher supply voltage and higher output


Fig. 10. Voltage waveform across each cell for K-band PA.

Fig. 11. Simulated coupling coefficient of the transformer using ANSYS HFSS for K-band PA.

power without degrading the combining efficiency when compared to those of a standard Cascode cell design [16]. The triple Cascode configuration adds extra parasitic capacitances, but as discussed in Section II, most of these parasitics can be ignored to the first degree. Only the parasitic capacitance between the output node and GND (Cout = Cgd3 + Cpar3) is considered in the design. The triple Cascode configuration provides higher gain, which helps boost the PAE. A network of feedback resistors dynamically biases NMOS transistors in class-AB mode and overcomes the gate–oxide breakdown in the stacked cell PA design [23], [24]. Transistor sizes (each with 320 fingers and an overall width of 320 μm) are optimized to provide optimum output impedance close to 50 Ω in the frequency range from 20 to 30 GHz. To obtain high power, it is essential to achieve uniform drain–source voltage swing contribution from each transistor within the cell with minimum phase differences. The capacitors at the gate of CG stages calculated from (9) and (10) are further optimized to achieve uniform voltage swings across each transistor, as confirmed by simulation, leading to capacitor values of Cc = C1 = 800 fF and C2 = 230 fF (gm RL = 5 is estimated) [25]. Fig. 10 presents simulation results of drain–source voltage waveforms of the three transistors within each triple Cascode cell. Uniform waveforms are achieved by tuning the value of bypass capacitors C1 and C2. ANSYS HFSS is utilized to optimize the radii and metal widths and gaps of input transformers to achieve the required center frequency (24 GHz) and bandwidth (∼10 GHz) while maintaining a reasonably high Q (>12). These transformers are utilized to couple the input signal to the circuit with a coupling factor above 0.7 in the frequency range from 20 to 30 GHz, as shown in Fig. 11. A similar procedure was followed for the design of the U-band stacked cell PA shown in Fig. 12.
The PA utilizes only two stacked triple Cascode cells to maintain high efficiency. To reduce the parasitic capacitances of the transistors to the substrate and the overall area of the cell, the layout of the three transistors has been combined into one cell, as discussed

Fig. 12. Circuit schematic of the U-band PA with combined transistors used for each triple Cascode cell.

in Section II. The capacitor values at the gate of the two CG transistors are optimized to maintain identical voltage swing across the drain–source of each transistor (C1 = 302 fF and C2 = 60 fF for an estimated gm RL = 4). Similar to the K-band PA design, this design uses dynamic biasing through resistor ladders to overcome the gate–oxide breakdown of the top CG transistor, as shown in Fig. 12. Following the approach taken for the K-band transformer design, ANSYS HFSS was used to optimize the design of the transformer for U-band frequencies and maintain a coupling factor above 0.7 for frequencies from 45 to 55 GHz. As the design uses only two cells, each designed with the combined transistor layout, small variations in the amplitude and phase of the combining voltage signals are expected, leading to high combining efficiency, high gain, high output power, and high PAE.

VII. K-BAND PA IMPLEMENTATION AND MEASUREMENT

As shown by the chip photomicrograph in Fig. 9, the K-band PA occupies a small area of 0.59 × 0.48 mm², facilitated by the absence of power-combining networks. The PA is implemented with only three stacked cells to maintain the overall DE close to 30%. To boost the output power,


Fig. 13. Measured output power Pout , gain, PAE, and DE at 24 and 28 GHz of the K-band PA when biased under supply voltage of 10.8 V.

each cell of the PA is implemented with a triple Cascode design with one transformer-coupled CS stage and two CG stages. Standard RF NMOS transistors in a 45-nm CMOS SOI technology each with 320 fingers and a total width of 320 μm and a length of 45 nm are utilized. The transistor layouts are non-overlapping and are connected to each other using metal interconnect lines, leading to extra parasitic capacitance in between the neighboring transistors. Each cell utilizes one transformer for every three transistors, leading to high isolation and high gain of the cell at microwave frequencies within a very compact area. Power measurement was conducted using a continuous wave (CW) signal provided by a Keysight 83640L signal generator, a Gigatronics GT-1050A driver amplifier, and a Keysight spectrum analyzer E4448A. Fig. 13 shows the PA power performance when biased under a supply voltage of 10.8 V (1.2 V per transistor to maintain safe operating mode). A saturated output power (PSAT ) of 25.3 dBm and a 1-dB compression power (P1 dB ) of 23.8 dBm with a DE of 25% and a peak PAE of 20% are measured at 24 GHz. The DE is slightly lower than the predicted 30% value from our analysis. The difference is partially attributed to the loss in 120-pH output biasing inductor utilized in the design. At 28 GHz, and biased under the same supply voltage, PSAT and P1 dB slightly decrease to 24.6 and 22.6 dBm, respectively, with a DE of 18% and PAE of 15%. A gain expansion characterized by a slight hump in the 24-GHz power gain curve (for input powers from 5 to 10 dBm) is due to operation in the class-AB amplification mode. Slight gain expansion was also observed at other measured frequencies (see 28-GHz power gain curve in Fig. 13). The PA was also measured for a range of frequencies from 20 to 30 GHz under a supply bias of 10.8 V, as shown in Fig. 14. The PA achieves the highest power gain of 17.3 dB at 28 GHz while its best DE of 25% and PAE of 20% are measured at 24 GHz. 
P1 dB remains above 23 dBm for the frequency range from 24 to 28 GHz. Frequencies below 24 GHz show lower gain resulting in limited output powers and efficiencies.
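The relation between the DE and PAE figures quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes the standard definitions DE = Pout/Pdc and PAE = (Pout − Pin)/Pdc, so that PAE = DE × (1 − 1/G), and the function names are hypothetical rather than taken from the paper.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def implied_gain_db(de, pae):
    """Power gain (dB) implied by DE and PAE at the same drive level.

    Assumes PAE = DE * (1 - 1/G), i.e. G = 1 / (1 - PAE/DE).
    """
    if not 0.0 < pae < de:
        raise ValueError("expected 0 < PAE < DE")
    return 10.0 * math.log10(1.0 / (1.0 - pae / de))


# Values reported at 24 GHz under a 10.8-V supply.
psat_mw = dbm_to_mw(25.3)
gain_db = implied_gain_db(de=0.25, pae=0.20)
print(f"PSAT = {psat_mw:.0f} mW, implied gain = {gain_db:.1f} dB")
```

Under these assumptions, a DE of 25% together with a PAE of 20% corresponds to a compressed gain of roughly 7 dB, provided both peaks occur at the same input power, which the text does not state explicitly.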


Fig. 14. Measured large-signal performance of the K-band PA across 20– 30-GHz band when biased under 10.8 V.

Fig. 15. Measured power gain, PAE, and DE of the K-band PA at 24 GHz when biased under supply voltages of 7.2 V (0.8 V per transistor), 8 V (0.88 V per transistor), and 10.8 V (1.2 V per transistor).

Fig. 15 illustrates the performance of the PA at 24 GHz under different supply biasing conditions of 7.2, 8, and 10.8 V. The PA achieves the highest efficiencies at a supply voltage of 8 V with DE and PAE of 27% and 22.5%, respectively. Biasing at higher supply voltages allows higher output powers and power gains with slight degradation in efficiencies. By adjusting the supply voltage, on the other hand, the PA can operate efficiently at different output power levels. For instance, at 4-dB back-off output power, the PA maintains a PAE of 20% if biased at a lower supply voltage of 7.2 V. Two-tone measurements were also conducted with a 10-MHz offset frequency between the two tones using a Keysight 83640L signal generator and Keysight E8361A PNA network analyzer with a power combiner under a power supply of 10.8 V, as shown in Fig. 16. By measuring both output power (POUT ) and third-order inter-modulation distortion (IM3) using a Keysight E4448A spectrum analyzer, the output third-order intercept point (OIP3 ) of 27.3 and 25.3 dBm are extrapolated for operating frequencies of 24 and 28 GHz, respectively. Moreover, the PA was tested under two different supply voltages of 10.8 and 8 V at 28 GHz


Fig. 16. Linearity measurement of the K-band PA at 24 and 28 GHz when biased under a supply voltage of 10.8 V. The two tones are within 10-MHz spacing from each other.
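The OIP3 values in the linearity measurements are extrapolated from two-tone data. A minimal sketch of that extrapolation is given here, assuming the usual small-signal slopes of 1 dB/dB for the fundamental and 3 dB/dB for the IM3 product; the function name and the sample reading are hypothetical and are not measured values from the paper.

```python
def oip3_dbm(p_fund_dbm, p_im3_dbm):
    """Extrapolated output third-order intercept point in dBm.

    p_fund_dbm: output power of one fundamental tone (dBm).
    p_im3_dbm:  output power of one IM3 product (dBm).
    Assumes the fundamental rises 1 dB/dB and IM3 rises 3 dB/dB.
    """
    return p_fund_dbm + (p_fund_dbm - p_im3_dbm) / 2.0


# Hypothetical reading for illustration: 10 dBm per tone with
# IM3 products at -24.6 dBm (34.6 dB below the tones).
print(f"OIP3 = {oip3_dbm(10.0, -24.6):.1f} dBm")
```

The extrapolation is only meaningful for readings taken well below compression, where the assumed slopes hold.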

Fig. 17. Two-tone linearity measurement of the K-band PA at 28 GHz when biased under supply voltages of 10.8 and 8 V with frequency spacing of 5, 10, 20, and 50 MHz.

with different frequency spacings of 5, 10, 20, and 50 MHz, as shown in Fig. 17. The OIP3 is above 23.7 dBm for all four frequency spacings when the PA is biased under a supply voltage of 10.8 V. When the supply voltage is reduced to 8 V, the highest OIP3 is 16 dBm at 10-MHz spacing while the lowest is 12 dBm at 50-MHz spacing.

VIII. U-BAND PA IMPLEMENTATION AND MEASUREMENT

As mentioned before, both K-band and U-band PAs are designed based on stacked triple Cascode cells with design details provided. In terms of implementation, the U-band PA has several important differences from the K-band design. First, the U-band PA uses only two stacked triple Cascode cells as opposed to the three cells used for the K-band PA. Therefore, a lower power supply voltage must be used, leading to lower saturated and 1-dB compression output powers. Secondly, the U-band PA uses RF NMOS transistors each with 120 fingers and a total width of 120 μm and a length of 45 nm. The smaller transistors compared to those used for the K-band design are necessary to boost the gain at the higher frequency (frequency is almost doubled) and to provide a reasonable output matching transformer ratio. Thirdly, in the U-band PA design, the layout of the three transistors in the triple Cascode cells


Fig. 18. Power measurements of the U-band PA with two triple-stacked cells at 46 GHz under supply voltages of 4.8 and 6 V.

Fig. 19. Power measurements of the U-band PA under two supply voltages of 4.8 and 6 V for the frequency range from 42 to 54 GHz.

is combined into one cell without using interconnect metallization in between neighboring transistors. As discussed, the combined layout leads to lower parasitic capacitances (by approximately two-thirds) and an expected increase in the overall PA efficiency. Fig. 12 shows the photomicrographs of the U-band PA with a total chip area of 0.28 mm2 . The large-signal performance of the U-band PA is measured using Keysight E4419B power meters with input power provided by the Keysight E8361A PNA network analyzer and the Giga-Tronic GT-1050A driver amplifier. Fig. 18 shows the measured power performance of the U-band PA at two different supply voltages. Output power, linear gain, PAE, and DE versus input power at 46 GHz are measured under supply voltages of 4.8 V (0.8 V per transistor) and 6 V (1 V per transistor). The U-band PA delivers a PSAT of 20 dBm, a linear gain of 16 dB with a peak PAE of 45%, and a DE of 53% under the supply voltage of 4.8 V. By increasing the supply voltage to 6 V, PSAT increases to 22.4 dBm (∼174 mW) at 46 GHz while the peak PAE and DE drop to 42.5% and 49%, respectively. To the authors’ best knowledge, the PAE values achieved for the U-band PA (45% and 42.5%) are the highest efficiencies for microwave and mm-wave CMOS-based PAs reported to date. Fig. 19 shows the power performance of the U-band PA measured over a range of frequencies. Output power


TABLE I PERFORMANCE COMPARISON OF K-BAND CMOS PAs WITH CALCULATED INTERNATIONAL TECHNOLOGY ROADMAP FOR SEMICONDUCTORS FOM (ITRS FOM)

at peak PAE, −1-dB output compression power, linear gain, peak PAE, and the corresponding DE are measured at different frequencies and under two biasing conditions of 4.8 and 6 V and are plotted for the frequency range from 42 to 54 GHz. Under a supply voltage of 6 V, the output powers at peak PAE are higher than 18 dBm with a gain above 8.6 dB for the entire frequency range from 42 to 54 GHz, while a DE > 16% is maintained. The U-band PA reaches its highest PSAT of 22.4 dBm, highest P1 dB of 18.6 dBm, and highest DE of 52.3% at 46 GHz.

IX. CONCLUSION

Analysis and design of stacked cell PAs have been presented. The parasitics of the triple Cascode cell used as the building block for the PA design have been analyzed. It was found that, to first order, only the output capacitance of the triple Cascode cell has a significant impact in determining the cell high-frequency performance. Different mechanisms that affect the efficiency of stacked cell PAs have also been discussed. Moreover, an analysis of the amplitude and phase variation in the stacked cell PAs has been presented. The calculated DE of the designed K-band PA with three stacked cells is ∼30% while its saturated power is expected to be around $P_{SAT} = (2V_{DD})^2/(8R_{opt}) \times \eta_{knee} \times \eta_{comb} \times \eta_{m} = 27.5$ dBm. With only one transformer per cell (three overall)

in the K-band design, the loss in the input matching network and the overall chip area are much reduced. A U-band PA with only two stacked cells was also designed. By combining the layout of the three transistors in the triple Cascode cell, parasitic capacitances have been significantly reduced, leading to an overall DE of ∼50% and a saturated output power of $(2V_{DD})^2/(8R_{opt}) \times \eta_{knee} \times \eta_{comb} \times \eta_{m} = 23.6$ dBm. The expected high efficiencies of the two PAs are attributed to limiting the overall phase and amplitude variations across the stacked cells according to the analytical studies provided in this paper. Table I presents a summary of the K-band PA performance in comparison with other reported K-band CMOS PAs. The presented results show the highest linear output power with a small area of 0.28 mm2, due to the absence of the output power-combining network. The efficiency of the K-band PA is comparable with other designs while a much larger operating bandwidth (14%) is achieved. The large gain of this PA at microwave frequencies is attributed to utilizing triple stacked Cascode cells, which have inherently higher gain than CS and standard Cascode cells. Additionally, with only one transformer per cell (three overall), the loss in the input matching network and the overall chip area are further reduced. The high output power and high bandwidth are due to the elimination of the output matching network in the K-band design.
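The saturated-power expression used in the conclusion can be evaluated numerically as a sanity check. The sketch below simply codes that expression; the function name is hypothetical, and the example inputs (a 50-Ω optimum load and three efficiency factors of 0.8) are placeholders chosen for illustration, not values reported by the authors.

```python
import math


def psat_dbm(vdd, r_opt, eta_knee, eta_comb, eta_m):
    """Saturated output power in dBm from
    PSAT = (2*VDD)^2 / (8*Ropt) * eta_knee * eta_comb * eta_m,
    with VDD in volts and Ropt in ohms."""
    p_watt = (2.0 * vdd) ** 2 / (8.0 * r_opt) * eta_knee * eta_comb * eta_m
    return 10.0 * math.log10(p_watt * 1e3)


# Placeholder inputs (assumptions, not data from the paper):
# 10.8-V supply, 50-ohm load, and 0.8 for each efficiency factor.
example = psat_dbm(vdd=10.8, r_opt=50.0, eta_knee=0.8, eta_comb=0.8, eta_m=0.8)
print(f"PSAT = {example:.1f} dBm")
```

Because the three efficiency factors multiply, a 1-dB shortfall in any one of them lowers the predicted PSAT by the same 1 dB.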


TABLE II PERFORMANCE COMPARISON OF U-BAND CMOS PAs WITH CALCULATED INTERNATIONAL TECHNOLOGY ROADMAP FOR SEMICONDUCTORS FOM (ITRS FOM)

The relatively good efficiency of the PA is attributed to limiting the overall phase and amplitude variations across the three stacked cells according to the analysis provided. The U-band PA uses one less cell (six stacked transistors instead of nine) with smaller transistors (120-μm width instead of 320 μm) with combined transistor layouts. The relatively high saturated output power, high gain, and especially high PAE of the U-band PA (> 40% at 46 GHz) are attributed to suppression of amplitude and phase variations of combining voltage signals, partially achieved by reducing the number of combining cells. Further improvement is achieved by optimizing the layout of the transistors (combined layout for three transistors) in each cell, which leads to lower parasitic capacitances to the substrate. Table II shows a comparison among CMOS PAs reported at U-band frequencies. The figure of merit (FOM), used in the table for fair comparison, is reported in [26]. Among all reported U-band PAs, our PA with two triple Cascode cells shows the highest FOM and highest PAE with a very small active area of 0.08 mm2 . To the authors’ best knowledge, the U-band PA reported here is the first CMOS PA with PAE > 40% operating in mm-wave frequencies. The output power of the U-band PA is comparable to other designs, except the ones based on power combining, which have extremely high output powers at these frequencies, but consume a large

area and suffer from low efficiencies. It is important to note that a differential design based on our U-band PA would achieve output power comparable to the power-combining approaches, but with much smaller area and perhaps a much higher efficiency.

REFERENCES

[1] V. Vidojkovic et al., “A low-power radio chipset in 40 nm LP CMOS with beamforming for 60 GHz high-data-rate wireless communication,” in IEEE Int. Solid-State Circuits Conf. Tech. Dig., 2013, pp. 236–237.
[2] J.-W. Lee and J. Lin, “Series-biased CMOS power amplifiers operating at high voltage for 24 GHz radar applications,” in IEEE Int. SoC Design Conf., Nov. 2010, pp. 360–363.
[3] P.-C. Huang, Z.-M. Tsai, K.-Y. Lin, and H. Wang, “A high-efficiency, broadband CMOS power amplifier for cognitive radio applications,” IEEE Trans. Microw. Theory Techn., vol. 58, no. 12, pp. 3556–3565, Dec. 2010.
[4] S. Pornpromlikit, J. Jeong, C. D. Presti, A. Scuderi, and P. M. Asbeck, “A watt-level stacked-FET linear power amplifier in silicon-on-insulator CMOS,” IEEE Trans. Microw. Theory Techn., vol. 58, no. 1, pp. 57–64, Jan. 2010.
[5] J.-H. Chen, S. R. Helmi, R. Azadegan, F. Aryanfar, and S. Mohammadi, “A broadband stacked power amplifier in 45-nm CMOS SOI technology,” IEEE J. Solid-State Circuits, vol. 48, no. 11, pp. 2775–2784, Nov. 2013.
[6] J.-W. Lee and B.-S. Kim, “A K-band high-voltage four-way series-bias cascode power amplifier in 0.13 μm CMOS,” IEEE Microw. Wireless Compon. Lett., vol. 20, no. 7, pp. 408–410, Jul. 2010.
[7] P.-C. Huang, Z.-M. Tsai, K.-Y. Lin, and H. Wang, “A 22-dBm 24-GHz power amplifier using 0.18-μm CMOS technology,” in IEEE MTT-S Int. Microw. Symp. Dig., 2010, no. 1, pp. 248–251.


[8] C.-C. Hung, J.-L. Kuo, K.-Y. Lin, and H. Wang, “A 22.5-dB gain, 20.1-dBm output power K-band power amplifier in 0.18-μm CMOS,” in IEEE Radio Freq. Integr. Circuits Symp., 2010, pp. 557–560.
[9] R. Bhat, A. Chakrabarti, and H. Krishnaswamy, “Large-scale power-combining and linearization in watt-class mmWave CMOS power amplifiers,” in IEEE Radio Freq. Integr. Circuits Symp., 2013, pp. 283–286.
[10] Y. Kawano, A. Mineyama, T. Suzuki, M. Sato, T. Hirose, and K. Joshin, “A fully-integrated K-band CMOS power amplifier with Psat of 23.8 dBm and PAE of 25.1%,” in IEEE Radio Freq. Integr. Circuits Symp., 2011, vol. 2, pp. 1–4.
[11] I. Aoki, S. D. Kee, D. B. Rutledge, and A. Hajimiri, “Fully integrated CMOS power amplifier design using the distributed active-transformer architecture,” IEEE J. Solid-State Circuits, vol. 37, no. 3, pp. 371–383, Mar. 2002.
[12] R. Bhat, A. Chakrabarti, and H. Krishnaswamy, “Large-scale power combining and mixed-signal linearizing architectures for watt-class mmWave CMOS power amplifiers,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 2, pp. 703–718, Feb. 2015.
[13] B. Hanafi, O. Gurbuz, H. Dabag, S. Pornpromlikit, G. Rebeiz, and P. Asbeck, “A CMOS 45 GHz power amplifier with output power > 600 mW using spatial power combining,” in IEEE MTT-S Int. Microw. Symp. Dig., 2014, pp. 1–3.
[14] H. T. Dabag, B. Hanafi, F. Golcuk, A. Agah, J. F. Buckwalter, and P. M. Asbeck, “Analysis and design of stacked-FET millimeter-wave power amplifiers,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 4, pp. 1543–1556, Apr. 2013.
[15] J. G. McRory, G. G. Rabjohn, and R. H. Johnston, “Transformer coupled stacked FET power amplifiers,” IEEE J. Solid-State Circuits, vol. 34, no. 2, pp. 157–161, Feb. 1999.
[16] J.-H. Chen, S. R. Helmi, A. Y. Jou, and S. Mohammadi, “A wideband power amplifier in 45 nm CMOS SOI technology for X band applications,” IEEE Microw. Wireless Compon. Lett., vol. 23, no. 11, pp. 587–589, Nov. 2013.
[17] S. R. Helmi, J. Chen, and S. Mohammadi, “A stacked Cascode CMOS SOI power amplifier for mm-wave applications,” in IEEE MTT-S Int. Microw. Symp. Dig., 2014, pp. 1–3.
[18] Y. Azar et al., “28 GHz propagation measurements for outdoor cellular communications using steerable beam antennas in New York City,” in IEEE Int. Commun. Conf., 2013, pp. 5143–5147.
[19] W. Hong and K. Baek, “Design and analysis of a low-profile 28 GHz beam steering antenna solution for Future 5G cellular applications,” in IEEE MTT-S Int. Microw. Symp. Dig., 2014, pp. 1–4.
[20] P. E. Gray and C. L. Searle, Electronic Principles, Physics, Models and Circuits, 1st ed. New York, NY, USA: Wiley, 1969.
[21] T.-P. Hung, A. G. Metzger, P. J. Zampardi, M. Iwamoto, and P. M. Asbeck, “Design of high-efficiency current-mode class-D amplifiers for wireless handsets,” IEEE Trans. Microw. Theory Techn., vol. 53, no. 1, pp. 144–151, Jan. 2005.
[22] M. Lei, Z. Tsai, K.-Y. Lin, and H. Wang, “Design and analysis of stacked power amplifier in series-input and series-output configuration,” IEEE Trans. Microw. Theory Techn., vol. 55, no. 12, pp. 2802–2812, Dec. 2007.
[23] J.-H. Chen, S. R. Helmi, H. Pajouhi, Y. Sim, and S. Mohammadi, “A wideband RF power amplifier in 45-nm CMOS SOI technology with substrate transferred to AlN,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 12, pp. 4089–4096, Dec. 2012.
[24] J.-H. Chen, S. R. Helmi, D. Nobbe, and S. Mohammadi, “A fully-integrated high power wideband power amplifier in 0.25 μm CMOS SOS technology,” in IEEE MTT-S Int. Microw. Symp. Dig., 2013, pp. 1–3.
[25] J. Jeong, S. Pornpromlikit, P. M. Asbeck, and D. Kelly, “A 20 dBm linear RF power amplifier using stacked silicon-on-sapphire MOSFETs,” IEEE Microw. Wireless Compon. Lett., vol. 16, no. 12, pp. 684–686, Dec. 2006.


[26] A. Chakrabarti and H. Krishnaswamy, “High-power high-efficiency class-E-like stacked mmWave PAs in SOI and bulk CMOS: Theory and implementation,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 8, pp. 1686–1704, Aug. 2014.
[27] J.-L. Kuo and H. Wang, “A 24 GHz CMOS power amplifier using reversed body bias technique to improve linearity and power added efficiency,” in IEEE MTT-S Int. Microw. Symp. Dig., Jun. 2012, pp. 1–3.
[28] C. W. Kuo, H.-K. Chiou, and H.-Y. Chung, “An 18 to 33 GHz fully-integrated Darlington power amplifier with Guanella-type transmission-line transformers in 0.18 μm CMOS technology,” IEEE Microw. Wireless Compon. Lett., vol. 23, no. 12, pp. 668–670, Dec. 2013.
[29] K. Kim and C. Nguyen, “A 16.5–28 GHz 0.18-μm BiCMOS power amplifier with flat 19.4±1.2 dBm output power,” IEEE Microw. Wireless Compon. Lett., vol. 24, no. 2, pp. 108–110, Feb. 2014.
[30] A. Balteanu et al., “A 2-bit, 24 dBm, millimeter-wave SOI CMOS power-DAC cell for watt-level high-efficiency, fully digital m-ary QAM transmitters,” IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1126–1137, May 2013.
[31] A. Chakrabarti and H. Krishnaswamy, “High power, high efficiency stacked mmWave class-E-like power amplifiers in 45 nm SOI CMOS,” in Proc. IEEE Custom Integr. Circuits Conf., 2012, pp. 1–4.

Sultan R. Helmi (S’11) was born in Jeddah, Saudi Arabia, in 1982. He received the B.S. degree in electrical and computer engineering from King AbdulAziz University, Jeddah, Saudi Arabia, in 2005, the M.S. degree in electrical and computer engineering from Purdue University, West Lafayette, IN, USA, in 2011, and is currently working toward the Ph.D. degree in electrical and computer engineering at Purdue University. His research interests include the design of power amplifiers, RF circuits, and RF components using CMOS technology.

Jing-Hwa Chen (S’10) received the B.S. degree in electrical engineering from National Central University, Jhongli, Taiwan, in 2007, and the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, IN, USA, in 2013. He is currently a Senior Engineer with Qualcomm Technologies Inc., Boxborough, MA, USA.

Saeed Mohammadi (S’89–M’92–SM’02) received the Ph.D. degree in electrical engineering from the University of Michigan, Ann Arbor, MI, USA, in 2000. He is currently an Associate Professor of electrical and computer engineering with Purdue University, West Lafayette, IN, USA. His research interests include RF devices and circuits and nanotechnology.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES


A 70–80-GHz SiGe Amplifier With Peak Output Power of 27.3 dBm

Hsin-Chang Lin, Member, IEEE, and Gabriel M. Rebeiz, Fellow, IEEE

Abstract— This paper presents a fully integrated 16-way power-combining amplifier for 67–92-GHz applications in an advanced 90-nm silicon germanium HBT technology. The 16-way amplifier is implemented using three-stage common-emitter single-ended power amplifiers (PAs) as building blocks, and reactive λ/4 impedance transformation networks are used for power combining. The three-stage single PA breakout has a small-signal gain of 22 dB at 74 GHz, and saturation output power (Psat) of 14.3–16.4 dBm at 68–99 GHz. The power-combining PA achieves a small-signal gain of 19.3 dB at 74 GHz, and Psat of 25.3–27.3 dBm at 68–88 GHz with a maximum power added efficiency of 12.4%. The 16-way amplifier occupies 6.48 mm2 (including pads) and consumes a maximum current of 2.1 A from a 1.8 V supply. To the best of our knowledge, this is the highest power silicon-based E-band amplifier to date.

Index Terms— E-band, HBT, millimeter-wave (mm-wave) integrated circuits, power amplifier (PA), silicon germanium (SiGe).

I. INTRODUCTION

SILICON-BASED millimeter-wave (mm-wave) systems at E-band (71–76, 81–86, and 92–95 GHz) have been developed over the past few years for point-to-point multi-Gb/s communication [1]–[5] and automotive radar systems [6]–[11]. Although advanced CMOS and silicon germanium (SiGe) technologies provide low-cost, high-yield, and high-integration solutions for such systems, the transmit output power is still limited due to the scaling down in transistor sizes and lower breakdown voltages. Therefore, III–V technologies, such as GaAs [12]–[14], GaN [15]–[18], and InP [19]–[21], still dominate the power amplifier (PA) area at these frequencies despite their relatively high cost. Power-combining techniques, such as voltage combining where transformers are used [22]–[25], current combining where Wilkinson combiners or T-junctions are used [26]–[30], and spatial power combining [31]–[33], are usually employed to increase the output power of CMOS and SiGe technologies. Wilkinson- and transformer-based power-combining techniques result in relatively high loss when the number of combining elements increases, and the free-space power-combining technique requires on-chip antennas, which

Manuscript received March 28, 2016; revised May 15, 2016; accepted May 25, 2016. This work was supported in part by Analog Devices and in part by the Defense Advanced Research Projects Agency (DARPA) under the Elastx Program. The authors are with the Electrical and Computer Engineering Department, University of California at San Diego, La Jolla, CA 92093 USA (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2574863

Fig. 1. 16-way reactive power-combined amplifier.

occupy large area on the chip and have a 40%–55% radiation efficiency [31]–[35]. In this paper, a 16-way reactively power-combined amplifier with low-loss λ/4 combining networks is presented (Fig. 1). The output combining network is based on λ/4 microstrip transmission lines with wide signal lines to achieve low-loss power combining. The simulated 16-way combining loss is 0.75 dB at 70–90 GHz when all the amplifiers are driven in phase. Furthermore, the output T-network combiner is robust to variations in the electrical length and Z0 of the constituent transmission lines. The amplifier unit cell is implemented using a three-stage common-emitter amplifier with a small-signal gain of 22 dB at 74 GHz. The 16-way power-combining amplifier results in a Psat of 25–27.3 dBm at 68–88 GHz. In addition, an 8-way combined PA using the same amplifier unit with a λ/4 combining network is demonstrated. The 8-way power-combined PA delivers a Psat > 21 dBm at 66–95 GHz with a peak value of 24 dBm at 72–80 GHz.

II. TECHNOLOGY

The 80-GHz PAs are designed using the IBM 9HP BiCMOS process [36]. It is a 90-nm SiGe HBT process built on top of a 90-nm CMOS process, with a ten-layer copper metal backend and high-density metal–insulator–metal (MIM) capacitors (12.2 fF/μm2). The 4 × 0.1 μm2 transistor model with a single emitter finger and dual collector and base fingers (C–B–E–B–C) from the Cadence library results in a peak ft/fmax of 310/350 GHz at 1.5–2.5-mA/μm bias current when referred to M1. However, when the interconnect parasitics from M2 to LD are considered for the collector and base fingers, and from M2 to M2_4B for the emitter finger, the peak ft/fmax becomes

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 2. (a) Measured H21 and U at 2.2 mA/μm. (b) ft and fmax of 4 × 0.1 μm2 transistor versus current density.

260/300 GHz (Fig. 2). The emitter finger is connected to ground through a higher-level metal, M2_4B, instead of the lower-level metals (M1–M3) for reliability considerations and to pass electromigration rules. The interconnects for the 4 × 0.1 μm2 transistor [Fig. 3(a)] are inserted into a full electromagnetic (EM) simulator (Sonnet EM suite ver. 13.54) and the extracted parasitic lumped elements are shown in Fig. 3(b). The interconnect parasitics are much less severe than in CMOS transistors, where they typically reduce ft/fmax from 460 GHz (referred to M1) to 260 GHz (referred to top metal) [37]. The better performance is due to the all-copper backend and the thick dielectrics used in the IBM 9HP process. Fig. 4(a) shows the 50-Ω microstrip transmission lines used in this paper with two different ground planes. The 10-μm wide transmission line (TL1) is implemented using the top-metal LD and M2_4B, and is used in the PA cell for the matching stubs, resulting in a small area and compact layout. The 20-μm wide transmission line (TL2) implemented using LD and M3 is used in the input distribution and output combining networks, and has lower loss than TL1. EM simulations using Sonnet show a loss of 0.45–0.6 and 0.28–0.38 dB/mm at 60–100 GHz for TL1 and TL2, respectively [Fig. 4(b)]. The corresponding transmission line Q is 24–30 and 39–48, respectively, at 60–100 GHz. In practice, and from different measurements done on IBM 8HP and other processes, the loss is ∼20% higher than simulated.
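The link between the simulated loss per unit length and the quoted transmission line Q can be illustrated with the common low-loss relation Q = β/(2α). The sketch below is an assumption-laden check, not the authors' procedure: the effective permittivity of 3.9 is an assumed, oxide-like value that does not appear in the paper, and the function name is hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def line_q(loss_db_per_mm, freq_hz, eps_eff):
    """Transmission-line quality factor, Q = beta / (2 * alpha).

    loss_db_per_mm: attenuation in dB/mm.
    eps_eff: effective relative permittivity (an assumed input).
    """
    nepers_per_db = math.log(10.0) / 20.0
    alpha = loss_db_per_mm * 1e3 * nepers_per_db              # Np/m
    beta = 2.0 * math.pi * freq_hz * math.sqrt(eps_eff) / C0  # rad/m
    return beta / (2.0 * alpha)


# TL1 and TL2 simulated losses at the band edges, with eps_eff = 3.9 assumed.
for name, loss, freq in [("TL1", 0.45, 60e9), ("TL1", 0.60, 100e9),
                         ("TL2", 0.28, 60e9), ("TL2", 0.38, 100e9)]:
    print(f"{name} at {freq / 1e9:.0f} GHz: Q = {line_q(loss, freq, 3.9):.1f}")
```

With this assumed permittivity the computed values fall close to the 24–30 and 39–48 ranges given for TL1 and TL2, which suggests the quoted Q follows this definition.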


Fig. 3. (a) Interconnections from M2 to top-metal LD. (b) Lumped-element model for the interconnect parasitics from M2 to LD.

Fig. 4. (a) 50-Ω microstrip transmission lines with M2_4B (TL1) or M3 (TL2) as ground plane and (b) simulated loss.

Still, a loss of 0.45–0.6 and 0.28–0.38 dB/mm is taken for TL1 and TL2 in the Cadence and Sonnet simulations, since there are no measurements yet in our group for this process at mm-wave frequencies.

III. DESIGN

A. Single-Ended PA

The PA consists of three common-emitter gain stages biased in the class A region [Fig. 5(a)]. The transistors are implemented by aggregating smaller, high-ft n-p-n standard cells.


Fig. 5. (a) Schematic of the 70–80-GHz three-stage amplifier and (b) EM modeling of the amplifier done using Sonnet. The ft/fmax simulations include the interconnections to the top-metal layer.

Transistors with the dimensions of 4 × 0.1 μm2 or 8 × 0.1 μm2 are used as the standard cells with C–B–E–B–C configuration, and each cell is surrounded by a deep-trench isolation ring. The transistors are biased near their peak ft current density at a quiescent current of 1.6 mA/μm. The first stage (Q1) consists of two parallel 4 × 0.1 μm2 n-p-n cells, while the second stage (Q2) is implemented using two 8 × 0.1 μm2 n-p-n cells connected in parallel. The output transistor (Q3) is implemented using four 8 × 0.1 μm2 n-p-n cells connected in parallel to form a 32-μm emitter-length device. The first-stage amplifier provides a small-signal gain of 9 dB at 80 GHz while the second and third stages provide a gain of 8–9 dB each. All amplifiers are driven by a single Vdd plane (1.7–1.8 V) distributed using M1_4B (0.81 μm) when TL1 is used and M1 and M2 when TL2 is used. The EM simulation environment of the matching networks for the 80-GHz PA is shown in Fig. 5(b). The input matching, interstage matching, and output matching networks are implemented using LC-resonant circuits and are all modeled in Sonnet. Compact metal–oxide–metal (MOM) capacitors, implemented using the top-metal LD to metal M1_2B, are used as series matching elements where dc blocks are required. Inductors are implemented using shorted 50-Ω TL1 lines of different lengths. Customized MOM between LD and M1_2B

capacitors are chosen over the process design kit MIM and vertical-natural (VN) capacitors for impedance matching, because they show a higher Q value (∼30) than VN (∼25) and MIM (∼8) capacitors [27]. The Vdd stubs connected to the collector nodes are followed by dual 710-fF MIM capacitors, which operate close to self-resonance and provide a very low impedance (∼1 Ω) at 70–80 GHz. The M2_4B layer is used as a ground plane and the M1_4B layer is used as the Vdd plane for the Vdd stubs. Both ground and Vdd planes are cheesed to pass the metal density rule. A quarter-wavelength 50-Ω microstrip line is used at the Q3 collector as an RF choke [38], which presents an open circuit (>500 Ω) to the low-impedance collector node (∼10 Ω), and eases the implementation of the output matching network. Fig. 6 shows the biasing circuitry for each stage of the three-stage PA when used in the single-ended PA [Fig. 6(a)] or in the combined 8- and 16-element PA [Fig. 6(b)]. For each stage, the current ratio in the n-p-n current mirror is 1:2, 1:4, and 1:8 for Q1, Q2, and Q3, respectively. The current ratio of the preceding two-stage MOS current mirror is set to 1:10 for each stage to reduce the power consumption in the biasing network. The RF isolation between each amplifier base voltage and the bias-circuitry base voltage is >30 dB due to


Fig. 6. Bias circuit for each stage of the three-stage PA (a) when used in the single-ended PA or (b) when used in the combined PA.


Fig. 8. Interstage matching between Q1 and Q2, Q2 and Q3, and output matching for Q3.

Fig. 7. (a) Simulated Psat and Vbe versus Rb for the Q3 stage alone with an optimized load and (b) bias circuit using λ/4 transmission line allowing an even lower Rb to be used, but taking much larger area than a simple 300-Ω bias resistor.

the 300-Ω impedance of the bias circuitry and the low input impedance of each amplifier stage (10.5 − j10.1 Ω for Q1 to 4.5 − j1 Ω for Q3). Fig. 7 shows the simulated Psat versus Rb when only the Q3 stage is considered. A higher Psat is achieved when a lower Rb is used at the base node, because a lower Rb results in a lower IbRb drop, which allows for a higher Vbe at large currents. An Rb of 300 Ω is chosen, since it is much higher than the base input impedance of Q1 (10.5 − j10.1 Ω), Q2 (7 − j3.3 Ω), and Q3 (4.5 − j1 Ω), and does not load the network. A lower-value Rb could have been used with λ/4 transmission-line bias circuitry [Fig. 7(b)] [38] to isolate the base input impedance and achieve an even higher Psat, but is not used here due to area considerations. Since the transistors have sufficient small-signal gain at 70–90 GHz, the interstage matching between Q1 and Q2, Q2 and Q3, and Q3 to the 50-Ω load is all optimized using load-pull analysis for maximum power transfer at 80 GHz instead of complex-conjugate matching for maximum gain. This explains why the peak small-signal gain is not centered at 80 GHz, but the maximum output power is at 80 GHz. Load-pull analysis is first done on Q1, and the optimum load impedance for maximum power delivery is 41 + j22 Ω for an output power of 9.4 dBm. Then, the input impedance of Q2 (base node) is transformed using a series capacitor

Fig. 9. (a) Simulated power contours for Q3. (b) Simulated load lines for Q3 with different output power levels. The three-stage amplifier is simulated and not Q3 alone.

and shunt inductor to the optimum load impedance for Q1 so as to ensure maximum power delivery. Similarly, load-pull analysis and interstage matching between Q2 and Q3 are done separately, and the optimum load impedance for Q2 is found to be 22 + j8 Ω with an output power of 14.6 dBm. Due to the collector-to-base capacitor (Cbc) and the relatively large S12 for Q2 and Q3 (approximately −20 dB at 80 GHz), the stages cannot be designed independently, and manual iterations are done in Cadence on the interstage matching networks for optimal power transfer at 80 GHz. The final design is shown in Fig. 8. The optimum load impedance for Q3 is 15 − j2 Ω with a maximum power of 17.2 dBm at 80 GHz [Fig. 9(a)], and is


obtained using a TL2 with Z0 = 27 Ω and an electrical length of 90°. The 27-Ω TL2 line is 48 μm wide with a simulated loss of 0.3 dB/mm at 80 GHz (Q = 48). Note that the transistor output impedance is 8.5 − j6 Ω and is not conjugate matched to the load impedance (the power-match condition is used). Fig. 9(b) shows the Q3 load lines at small signal, OP1dB (output 1-dB compression point), OP1dB + 3 dB, and Psat (OP1dB + 4.6 dB). The load lines are all ∼15 Ω due to the TL2 matching network, and the dc bias current shifts up for large signals due to the transistor self-bias characteristics. The three-stage PA has a simulated peak small-signal gain of 27.2 dB at 70 GHz with a 3-dB bandwidth of 65–82 GHz. The simulated Psat is 17.4 dBm at 80 GHz with a linear gain of 25 dB, a peak power-added efficiency (PAE) of 16.4%, and an OP1dB of 12.8 dBm. The small-signal gain peaks at 70 GHz while Psat peaks at 80 GHz because a power-matching condition, rather than a complex-conjugate matching condition, is used at 80 GHz. The 4.6-dB difference between Psat and OP1dB arises because the first and second stages, Q1 and Q2, start to saturate earlier than the last stage Q3, and because the matching networks are optimized for maximum Psat instead of maximum OP1dB (in hindsight, this was a design error and will be changed in future work). The quiescent bias current is 110 mA and the bias current at Psat is 180 mA for a Vdd of 1.7 V. The simulated S12 is less than −55 dB at 60–90 GHz. The small-signal gain and Psat variation versus temperature are also checked in simulations. The simulated small-signal gain at 70 GHz decreases to 25–24 dB and Psat at 80 GHz decreases to 16.5–16 dBm at 85–120 °C, respectively. This is still acceptable performance at 120 °C.

B. 8-Way Power Combining

The 8-way power-combining PA is implemented using eight three-stage PAs with λ/4 matching networks connected at the input and output ports [Fig. 10(a)].
Two adjacent single PAs are first tied together at the input and output nodes to form a PA pair. The Q3 collector nodes in the PA pair share the same λ/4 Vdd stub in order to reduce the chip area. The λ/4 Vdd stub (TL1, W = 27 μm, Z0 = 30 Ω) is designed to meet the electromigration rules and reliability considerations, and is used to bias two 32-μm transistors (∼160 mA at Psat). The IR drop for the λ/4 Vdd stub is minimal (0.012–0.04 V) due to the large width of the stub. The optimum load impedance at the output of the single-ended PA is 15 − j2 Ω. In order to connect four PA pairs to a common 50-Ω output port, a 38-Ω quarter-wave TL2 is used to transform the 50-Ω port impedance (100 Ω in common mode) to 14.4 Ω. Next, 14.4 Ω (28.8 Ω in common mode) is transformed to 15 − j2 Ω at the output of each amplifier using a 15-Ω (TL1, W = 66 μm) quarter-wave line and a T-junction network. Note that TL1 is employed close to the PA pair, since a TL2 would have been too wide. At the input port, the RF signal is distributed to each PA pair using λ/4 T-junction networks. The input impedance of the PA pair is 25 Ω, and is transformed successively to 50 and 100 Ω using λ/4 networks. The final T-junction


Fig. 10. (a) Schematic of 8-way power-combining amplifier. (b) Simulated output combining and input distribution network S21 and S22 versus frequency.

combines two 100-Ω loads into a 50-Ω impedance. All input networks are done in TL2 for low loss. The λ/4 power-combining networks are wideband and have low ohmic loss. The combining bandwidth and loss are studied as follows [27]. The output impedance of the single-ended PA is assumed to be constant at 15 + j2 Ω versus frequency (this impedance is the complex conjugate of the optimum load impedance), and the combining loss and S22 are simulated for the two λ/4 T-junction networks using Sonnet. A loss of 0.42–0.5 dB is found together with an S22 0 at dc 200 GHz for all cases (three cases for 8-way and four cases for 16-way combined PA) due to the very low S12 of the three-stage amplifier. The amplifiers are unconditionally stable at all frequencies.
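As a quick numerical check of the output combiner described above, an ideal lossless λ/4 line of characteristic impedance Z0 presents Z0²/ZL at its input. The sketch below walks the quoted values from the 50-Ω port toward the PAs. The helper names are hypothetical, the lines are treated as ideal, only the real part of the impedance is tracked, and the assumption that each T-junction splits into two identical branches is one reading of Fig. 10(a), not something the text states explicitly.

```python
def quarter_wave_input(z0, z_load):
    """Input impedance of an ideal, lossless quarter-wave line: Z0**2 / ZL."""
    return z0 ** 2 / z_load


def per_branch(z_node, n_branches=2):
    """Impedance each of n identical parallel branches must present
    so that their parallel combination equals z_node."""
    return n_branches * z_node


# Single PA: 27-ohm, 90-degree TL2 loaded by the 50-ohm port
z_single = quarter_wave_input(27.0, 50.0)

# 8-way output combiner, walking from the 50-ohm port toward the PAs
z_port_branch = per_branch(50.0)                  # 100 ohm per branch at the port
z_mid = quarter_wave_input(38.0, z_port_branch)   # after the 38-ohm TL2
z_mid_branch = per_branch(z_mid)                  # per branch at the next T-junction
z_pair = quarter_wave_input(15.0, z_mid_branch)   # at the PA-pair node
z_pa = per_branch(z_pair)                         # load seen by each PA in the pair

if __name__ == "__main__":
    for name, z in [("single PA load", z_single),
                    ("after 38-ohm line", z_mid),
                    ("per branch", z_mid_branch),
                    ("per PA in a pair", z_pa)]:
        print(f"{name}: {z:.2f} ohm")
```

Under these assumptions the values come out near 14.6, 14.4, 28.9, and 15.6 Ω, which is consistent with the 14.4-Ω, 28.8-Ω, and ∼15-Ω figures quoted in the text.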

The S-parameters are measured using two different setups. An Agilent dc–67-GHz vector network analyzer (VNA) is first used to measure S-parameters from dc to 70 GHz, and then an Agilent 50-GHz VNA is used with an Agilent mm-wave head controller and VDI WR-10 extenders for measurements at 70–110 GHz (Fig. 15). Standard short-open-load-thru calibration on the CS-5 calibration substrate is done in both cases. The results include the input and output GSG pad loss (measured 0.3–0.5 dB total loss at 60–110 GHz). The chip microphotographs of the 16-way and 8-way power-combining PAs and the single PA are shown in Fig. 16. Fig. 17 shows the measured small-signal gain of the single, 8-way combined, and 16-way combined PAs. The single PA has a peak gain of 22 dB at 74 GHz with a 3-dB bandwidth of 68–82 GHz. The 4.4-dB difference between simulations and measurements is due to the actual transmission-line loss, which is higher than the value simulated in Sonnet, and to the inaccuracy in transistor modeling. The IBM9HP process is still in development and the transistor performance can vary from process run to run. The 8-way PA has a peak gain of 21.2 dB at 73 GHz with a 3-dB bandwidth of 68–82 GHz, and a 5.8-dB difference between simulations and measurements. This is an additional 1.4-dB difference compared with the three-stage PA. The S22 of the 8-way combined PA (−10 dB) is better than the S22 of the single PA (−7 dB) at 74 GHz, and considering this, the extra transmission-line loss in the 8-way combined PA is 0.9 dB. This is due to additional input and output transmission-line loss not taken into account in simulations. The 16-way power-combining amplifier has a peak gain of 19.3 dB at 74 GHz with a 3-dB bandwidth of 69–81 GHz. There is yet another 1-dB difference between measurements for the 16-way and 8-way PAs. This is mainly due to the long input transmission line. The measured S12 is 0 at dc 110 GHz for all the amplifiers.


Fig. 16. Chip microphotograph of (a) 16-way power-combining amplifier (2.7 × 2.4 mm2 including pads), (b) 8-way power-combining amplifier (1.6 × 2.2 mm2 ), and (c) three-stage amplifier (1 × 0.6 mm2 ). Unlabeled pads are all for ground and Vdd power supply.

Fig. 18. Power measurement setup for the 70–80-GHz amplifiers.

Fig. 17. Measured S-parameters of single, 8-way combined, and 16-way combined amplifier.

The small-signal power consumption in each case is 257 mW (143 mA), 1.94 W (1.03 A), and 3.65 W (2.03 A) from a 1.8 V supply, respectively. Measurements agree with simulations versus frequency except for the extra transmission-line loss and lower transistor gain, and show the accuracy of the Cadence design and the Sonnet EM models.

B. Power Measurements

Fig. 18 shows the power measurement setup. The input 65–105-GHz signal is generated using VDI-AMC 332–334 multiplier chains and the output power is monitored using a PM4 Erickson power sensor (manufactured by Virginia Diodes). A WR-10 variable attenuator (−6 dB) is used in front of the power sensor due to the 200-mW upper limit of the power sensor. The WR-10 GSG probe loss is measured using a thru on the SiGe wafer, and the measured thru loss is 3 dB at 80 GHz with a GSG pad loss of 0.2 dB each. This means that the probe loss is 1.3 dB each, which agrees well with the 1.25-dB loss provided by the manufacturer data sheet (Picoprobe model 120, GGB Industries, Inc.). All measurements are referenced to the WR-10 GSG probe tips, and include the 0.2-dB GSG pad loss at the input and output ports. The single PA results in 15.8 dBm at 80 GHz with a peak PAE of 10.7%. OP1dB is 11.2 dBm with an associated PAE of 5.1% (not shown for brevity). The 8-way combined PA results in an output power of 24.1 dBm at 76 GHz with a peak PAE of 11.3% and an associated gain of 9 dB. OP1dB is 20.4 dBm with an associated PAE of 5.8%. For the 16-way power-combining PA, an output power of 27.3 dBm is achieved with a peak PAE of 12.4% at 76 GHz and with an associated gain of 9 dB (Fig. 19). OP1dB is 21.2 dBm with an associated PAE of 3.8%.
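The probe de-embedding arithmetic above, and the relation between the quoted output power, gain, and PAE, can be sketched as follows. The function names are hypothetical, and the PAE definition used, (Pout − Pin)/Pdc, is the standard one and is assumed to be what the paper uses. The dc power it implies is a derived illustration, not a value reported in the paper.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def probe_loss_db(thru_loss_db, pad_loss_db):
    """Per-probe loss when a thru consists of two probes and two pads."""
    return (thru_loss_db - 2 * pad_loss_db) / 2


def implied_dc_power_mw(p_out_dbm, gain_db, pae_percent):
    """DC power implied by PAE = (Pout - Pin) / Pdc (assumed definition)."""
    p_out = dbm_to_mw(p_out_dbm)
    p_in = dbm_to_mw(p_out_dbm - gain_db)
    return (p_out - p_in) / (pae_percent / 100)


probe = probe_loss_db(thru_loss_db=3.0, pad_loss_db=0.2)  # values quoted at 80 GHz
pdc_8way = implied_dc_power_mw(24.1, 9.0, 11.3)           # 8-way PA at 76 GHz
pdc_16way = implied_dc_power_mw(27.3, 9.0, 12.4)          # 16-way PA at 76 GHz

if __name__ == "__main__":
    print(f"probe loss: {probe:.2f} dB each")
    print(f"implied dc power, 8-way: {pdc_8way / 1000:.2f} W")
    print(f"implied dc power, 16-way: {pdc_16way / 1000:.2f} W")
```

The probe loss reproduces the 1.3 dB stated in the text. The implied dc powers come out at roughly 2.0 W and 3.8 W, slightly above the small-signal consumption quoted earlier, which is consistent with the bias current rising under large-signal drive.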


TABLE I. Performance Summary for E-Band Amplifiers in SiGe and CMOS Technology

Fig. 19. Measured output power, gain, and PAE versus input power of the 8-way and 16-way power-combining amplifier at 76 GHz.

Fig. 20. Measured Psat , OP1dB , and PAE versus frequency of single and 8-way power-combining amplifier.

The same measurement setup is used for output power versus frequency, and the results are calibrated to the probe tips by carefully deembedding the input and output losses at 65–105 GHz. The measured Psat for the single PA is >14 dBm at 64–105 GHz with ≥16 dBm at 80–95 GHz.

Measurements on the 8-way combined PA show ≥21 dBm at 66–95 GHz with 24 dBm at 72–80 GHz (Fig. 20). The 16-way power-combining PA achieves a Psat > 25 dBm at 68–88 GHz (Fig. 21). In particular, the 16-way power-combining amplifier results in 26.5–27.3 dBm at 72–84 GHz with PAE greater


than 8%. We have measured four different chips with nearly identical results. Table I and Fig. 22 summarize the measured results and compare this paper with recently published amplifiers. The 16-way power-combining PA shows the highest output power in this frequency range using silicon technology to date.

Fig. 21. Measured Psat, OP1dB, and PAE versus frequency of the 16-way combining amplifier.

Fig. 22. Psat and PAE of the 8-way and 16-way amplifiers compared with published work.

V. CONCLUSION

This paper presented a high-power SiGe amplifier at 70–80 GHz with reactive power division at the input and reactive power combining at the output. The four-stage reactive combiner results in a loss of 0.6–1 dB at 65–95 GHz for the 16-way combined amplifier, and the state-of-the-art output power over a wide frequency range. The design technique does not employ λ/4 chokes in the biasing network and can be used with different silicon processes (advanced CMOS, SiGe with ground vias) for high-power, high-efficiency W-band PAs.

ACKNOWLEDGMENT

The authors would like to thank Analog Devices for technical discussions, M. Chang at the University of Michigan for process and technical discussions, and D. Harame at IBM for the 9HP wafer space.

REFERENCES

[1] Y. Yang, S. Zihir, H. Lin, O. Inac, W. Shin, and G. M. Rebeiz, “A 155 GHz 20 Gbit/s QPSK transceiver in 45nm CMOS,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Tampa, FL, USA, Jun. 2014, pp. 365–368. [2] X. Yu, S. P. Sah, H. Rashtian, S. Mirabbasi, P. P. Pande, and D. Heo, “A 1.2-pJ/bit 16-Gb/s 60-GHz OOK transmitter in 65-nm CMOS for wireless network-on-chip,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 10, pp. 2357–2369, Oct. 2014. [3] S. Shopov, A. Balteanu, and S. P. Voinigescu, “A 19 dBm, 15 Gbaud, 9 bit SOI CMOS power-DAC cell for high-order QAM W-band transmitters,” IEEE J. Solid-State Circuits, vol. 49, no. 7, pp. 1653–1664, Jul. 2014. [4] I. Sarkas et al., “An 18-Gb/s, direct QPSK modulation SiGe BiCMOS transceiver for last mile links in the 70–80 GHz band,” IEEE J. Solid-State Circuits, vol. 45, no. 10, pp. 1968–1980, Oct. 2010. [5] Y. Zhao, E. Ojefors, K. Aufinger, T. F. Meister, and U. R. Pfeiffer, “A 160-GHz subharmonic transmitter and receiver chipset in an SiGe HBT technology,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 10, pp. 3286–3299, Oct. 2012. [6] B.-H. Ko et al., “A 77–81-GHz 16-element phased-array receiver with ±50° beam scanning for advanced automotive radars,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 11, pp. 2823–2832, Nov. 2014. [7] J. Park, H. Ryu, K.-W. Ha, J.-G. Kim, and D. Baek, “76–81-GHz CMOS transmitter with a phase-locked-loop-based multichirp modulator for automotive radar,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 4, pp. 1399–1408, Apr. 2015. [8] J. Oh, J. Jang, C. Y. Kim, and S. Hong, “A W-band 4-GHz bandwidth phase-modulated pulse compression radar transmitter in 65-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 8, pp. 2609–2618, Aug. 2015. [9] I. Sarkas, J. Hasch, A. Balteanu, and S. P. Voinigescu, “A fundamental frequency 120-GHz SiGe BiCMOS distance sensor with integrated antenna,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 3, pp. 795–812, Mar. 2012. [10] B. P. Ginsburg, S. 
M. Ramaswamy, V. Rentala, E. Seok, S. Sankaran, and B. Haroun, “A 160 GHz pulsed radar transceiver in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 49, no. 4, pp. 984–995, Apr. 2014. [11] I. Sarkas, M. G. Girma, J. Hasch, T. Zwick, and S. P. Voinigescu, “A fundamental frequency 143–152 GHz radar transceiver with built-in calibration and self-test,” in Proc. IEEE Compound Semicond. Integr. Circuit Symp., Monterey, CA, USA, Oct. 2012, pp. 1–4. [12] F. D. Canales and M. Abbasi, “A 75–90 GHz high linearity MMIC power amplifier with integrated output power detector,” in IEEE MTT-S Int. Microw. Symp. Dig., Seattle, WA, USA, Jun. 2013, pp. 1–4. [13] M. Gavell, I. Angelov, M. Ferndahl, and H. Zirath, “A high voltage mm-wave stacked HEMT power amplifier in 0.1 μm InGaAs technology,” in IEEE MTT-S Int. Microw. Symp. Dig., Phoenix, AZ, USA, May 2015, pp. 1–3. [14] M. Abbasi et al., “A broadband 60-to-120 GHz single-chip MMIC multiplier chain,” in IEEE MTT-S Int. Microw. Symp. Dig., Boston, MA, USA, Jun. 2009, pp. 441–444. [15] M. Micovic et al., “92–96 GHz GaN power amplifiers,” in IEEE MTT-S Int. Microw. Symp. Dig., Montreal, QC, Canada, Jun. 2012, pp. 1–3. [16] J. Schellenberg, B. Kim, and T. Phan, “W-band, broadband 2W GaN MMIC,” in IEEE MTT-S Int. Microw. Symp. Dig., Seattle, WA, USA, Jun. 2013, pp. 1–4. [17] A. Margomenos et al., “GaN technology for E, W and G-band applications,” in Proc. Compound Semiconductor Integr. Circuit Symp. (CSICs), La Jolla, CA, USA, Oct. 2014, pp. 1–4.


[18] D. Schwantuschke et al., “Q- and E-band amplifier MMICs for satellite communication,” in IEEE MTT-S Int. Microw. Symp. Dig., Tampa, FL, USA, Jun. 2014, pp. 1–4. [19] H.-C. Park, S. Daneshgar, Z. Griffith, M. Urteaga, B.-S. Kim, and M. Rodwell, “Millimeter-wave series power combining using sub-quarter-wavelength baluns,” IEEE J. Solid-State Circuits, vol. 49, no. 10, pp. 2089–2102, Oct. 2014. [20] Z. Griffith, M. Urteaga, P. Rowell, and R. Pierson, “A >0 mW SSPA from 76–94 GHz, with peak 28.9% PAE at 86 GHz,” in Proc. IEEE Compound Semiconductor Integr. Circuit Symp., La Jolla, CA, USA, Oct. 2014, pp. 1–4. [21] H. Takahashi, T. Kosugi, A. Hirata, J. Takeuchi, K. Murata, and N. Kukutsu, “120-GHz-band fully integrated wireless link using QSPK for realtime 10-Gbit/s transmission,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 12, pp. 4745–4753, Dec. 2013. [22] G. J. Gu, Z. Xu, and M.-C. F. Chang, “Two-way current-combining W-band power amplifier in 65-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 5, pp. 1365–1374, May 2012. [23] Y. Zhao and J. R. Long, “A wideband, dual-path, millimeter-wave power amplifier with 20 dBm output power and PAE above 15% in 130 nm SiGe-BiCMOS,” IEEE J. Solid-State Circuits, vol. 47, no. 9, pp. 1981–1997, Sep. 2012. [24] M. Thian, M. Tiebout, N. B. Buchanan, V. F. Fusco, and F. Dielacher, “A 76–84 GHz SiGe power amplifier array employing low-loss four-way differential combining transformer,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 2, pp. 931–938, Feb. 2013. [25] E. Kaymaksut, D. Zhao, and P. Reynaert, “Transformer-based Doherty power amplifiers for mm-wave applications in 40-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 4, pp. 1186–1192, Apr. 2015. [26] W. Tai, L. R. Carley, and D. S. Ricketts, “A 0.7 W fully integrated 42 GHz power amplifier with 10% PAE in 0.13 μm SiGe BiCMOS,” in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), San Francisco, CA, USA, Feb. 2013, pp. 142–143. [27] H.-C. Lin and G.
M. Rebeiz, “A 110–134-GHz SiGe amplifier with peak output power of 100–120 mW,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 12, pp. 2990–3000, Dec. 2014. [28] A. Y.-K. Chen, Y. Baeyens, Y.-K. Chen, and J. Lin, “An 83-GHz high-gain SiGe BiCMOS power amplifier using transmission-line current-combining technique,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 4, pp. 1557–1569, Apr. 2013. [29] Y.-H. Hsiao, Z.-M. Tsai, H.-C. Liao, J.-C. Kao, and H. Wang, “Millimeter-wave CMOS power amplifiers with high output power and wideband performances,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 12, pp. 4520–4533, Dec. 2013. [30] W. Tai and D. S. Ricketts, “A W-band 21.1 dBm power amplifier with an 8-way zero-degree combiner in 45 nm SOI CMOS,” in Proc. IEEE Int. Microw. Symp., Tampa, FL, USA, Jun. 2014, pp. 1–3. [31] Y. A. Atesal, B. Cetinoneri, M. Chang, R. Alhalabi, and G. M. Rebeiz, “Millimeter-wave wafer-scale silicon BiCMOS power amplifiers using free-space power combining,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 4, pp. 954–965, Apr. 2011. [32] W. Shin, B.-H. Ku, O. Inac, Y.-C. Ou, and G. M. Rebeiz, “A 108–114 GHz 4×4 wafer-scale phased array transmitter with high-efficiency on-chip antennas,” IEEE J. Solid-State Circuits, vol. 48, no. 9, pp. 2041–2055, Sep. 2013. [33] J. Jayamon et al., “Spatially power-combined W-band power amplifier using stacked CMOS,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Tampa, FL, USA, Jun. 2014, pp. 151–154. [34] Y.-C. Ou and G. M. Rebeiz, “Differential microstrip and slot-ring antennas for millimeter-wave silicon systems,” IEEE Trans. Antennas Propag., vol. 60, no. 6, pp. 2611–2619, Jun. 2012. [35] J. M. Edwards and G. M. Rebeiz, “High-efficiency elliptical slot antennas with quartz superstrates for silicon RFICs,” IEEE Trans. Antennas Propag., vol. 60, no. 11, pp. 5010–5020, Nov. 2012. [36] J.-S. Rieh et al., “SiGe HBTs with cut-off frequency of 350 GHz,” in Proc. Int.
Electron Devices Meeting, San Francisco, CA, USA, Dec. 2002, pp. 771–774. [37] O. Inac, M. Uzunkol, and G. M. Rebeiz, “45-nm CMOS SOI technology characterization for millimeter-wave applications,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 6, pp. 1301–1311, Jun. 2014.


[38] M. Chang and G. Rebeiz, “A wideband high-efficiency 79–97 GHz SiGe linear power amplifier with >90 mW output,” in Proc. IEEE Bipolar/BiCMOS Circuits Technol. Meeting, Oct. 2008, pp. 69–72. [39] E. Öjefors, C. Stoij, B. Heinemann, and H. Rücker, “An 8-way power-combining E-band amplifier in a SiGe HBT technology,” in Proc. IEEE Eur. Microw. Integr. Circuit Conf. (EuMIC), Rome, Italy, Oct. 2014, pp. 45–48. [40] D. Zhao and P. Reynaert, “An E-band power amplifier with broadband parallel-series power combiner in 40-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 2, pp. 683–690, Feb. 2015. [41] J. Kim, H. Dabag, P. Asbeck, and J. F. Buckwalter, “Q-band and W-band power amplifiers in 45-nm CMOS SOI,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 6, pp. 1870–1877, Jun. 2012.

Hsin-Chang Lin (GSM’12–M’16) received the B.S. degree in engineering and system science from National Tsing Hua University, Hsinchu, Taiwan, in 2008, and the M.S. degree in electrical and computer engineering from the University of California at San Diego, La Jolla, CA, USA, in 2012, where he is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering. His Ph.D. study includes analog, RF, and millimeter-wave integrated circuits in silicon technologies for phased array systems and wireless communications.

Gabriel M. Rebeiz (S’86–M’88–SM’93–F’97) received the Ph.D. degree from the California Institute of Technology, Pasadena, CA, USA. He was with the University of Michigan, Ann Arbor, MI, USA, from 1988 to 2004. He contributed to planar millimeter-wave (mmw) and terahertz (THz) antennas and imaging arrays from 1988 to 1996, and his group has optimized the dielectric-lens antennas, which are the most widely used antennas at mmw and THz frequencies. His group also developed 6–18 and 40–50 GHz 8- and 16-element phased arrays on a single silicon chip, making them among the most complex RFICs in this frequency range. His group also demonstrated high-Q RF MEMS tunable filters at 1–6 GHz (Q > 200) and the new angular-based RF MEMS capacitive and metal-contact switches. As a Consultant, he helped develop the USM/ViaSat 24-GHz single-chip automotive radar, phased arrays operating at X-band, Ku-band, and W-band for defense and commercial applications, the RFMD RF MEMS switch, and the Agilent RF MEMS switch. He is currently a Professor of Electrical and Computer Engineering with the University of California at San Diego (UCSD), La Jolla, CA, USA. He is the Director of the UCSD/DARPA Center on RF MEMS Reliability and Design Fundamentals. He has graduated over 40 Ph.D. students, and leads a group of 20 Ph.D. students and 5 post-doctoral fellows in the areas of mmw RFIC, microwave circuits, RF MEMS, planar mmw antennas, and THz systems. He authored RF MEMS: Theory, Design and Technology (Wiley, 2003). Prof. Rebeiz was a recipient of an URSI Koga Gold Medal. He was a recipient of the IEEE MTT 2000 Microwave Prize and the IEEE MTT 2010 Distinguished Educator Award. He was the recipient of the 1998 Eta Kappa Nu Professor of the Year Award and the 1998 Amoco Teaching Award given to the Best Undergraduate Teacher at the University of Michigan, and the 2008 Teacher of the Year Award at the Jacobs School of Engineering, UCSD.
His students have received a total of 19 Best Paper Awards at the IEEE MTT, RFIC, and AP-S conferences. He is an NSF Presidential Young Investigator. He was the IEEE MTT-S 2003 Distinguished Young Engineer. He has been an Associate Editor of the IEEE MTT-S, and a Distinguished Lecturer of the IEEE MTT-S and the IEEE AP-S.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES


A SiGe Multiplier Array With Output Power of 5–8 dBm at 200–230 GHz Hsin-Chang Lin, Student Member, IEEE, and Gabriel M. Rebeiz, Fellow, IEEE

Abstract— This paper presents an integrated four-way power-combining multiplier for 200–230-GHz applications in an advanced 90-nm silicon germanium HBT technology. The active multiplier is implemented using balanced transistor pairs driven by pseudodifferential power amplifiers (PAs) at 100–120 GHz, and the outputs at 200–240 GHz are power combined using reactive λ/4 impedance transformation networks. A single multiplier breakout results in an output power of 1.8 dBm at 245 GHz with a peak conversion gain of −15.5 dB. The power-combining multiplier achieves an output power of 8 dBm at 215 GHz and >5 dBm at 200–230 GHz. A peak conversion gain of 1.6 dB is achieved at an output power of ∼0 dBm at 215 GHz. The four-way combined multiplier occupies 3.63 mm2, including pads, and consumes 1.2 A from a 1.8 V supply, mainly due to the differential 100–120-GHz PAs. To the best of our knowledge, this is the highest output power achieved in multipliers and amplifiers among all silicon-based technologies to date at frequencies above 200 GHz, and with a record wide bandwidth.

Index Terms— Frequency doubler, millimeter-wave (mmw) integrated circuits, multiplier, power amplifier (PA), silicon germanium (SiGe) HBT, terahertz (THz).

Manuscript received October 16, 2015; revised April 25, 2016; accepted May 21, 2016. This work was supported in part by the Defense Advanced Research Projects Agency (DARPA) within the Terahertz Program through Teledyne Scientific, Thousand Oaks, CA, USA, and in part by DARPA within the Elastx Program through the University of California at San Diego. The authors are with the Electrical and Computer Engineering Department, University of California at San Diego, La Jolla, CA 92093 USA. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2574310

I. INTRODUCTION

Millimeter-wave (mmw) systems above 120 GHz have been in active development in the past few years, and significant progress has been made to enable complex communication and radar systems to operate at these frequencies. For example, high data-rate communication systems with OOK, QPSK, and 16 QAM modulation schemes have been demonstrated using silicon germanium (SiGe), CMOS, and III–V technologies up to 340 GHz [1]–[5]. Other applications are in radar, security, and imaging systems, which have higher spatial resolution due to the shorter wavelengths at 150–500 GHz [6]–[10]. Advanced CMOS and SiGe currently result in transistor cutoff frequencies ft and fmax up to 300–400 GHz, and provide low-cost, high-yield, and high-integration solutions. However, the transmit power is still in the 0–5-dBm range at >200 GHz, since the power amplifiers (PAs) operate close to ft and fmax and provide low gain and poor power-added efficiency. Therefore, signal generation at >200 GHz is usually

Fig. 1. High power 220–260-GHz source based on four doublers and reactive power combining.

implemented using high-frequency oscillators (or harmonic oscillators) and wideband frequency multipliers [11]–[21]. However, oscillators operating at >200 GHz usually have poor phase noise and require high-frequency dividers to lock to a reference. These dividers are very challenging at >150 GHz and very few publications exist [22]. Another approach is frequency multipliers, which result in better phase noise and wideband power generation. The output power of oscillators and multipliers is quite low at >200 GHz, and power-combining techniques are essential. Voltage/current combining using transformers [23]–[26], current combining using Wilkinson combiners or T-junctions [27]–[31], and spatial combining using on-chip antennas [32]–[36] have all been demonstrated to increase the output power from the chip, with various degrees of combining efficiency. In this paper, a four-way reactively power-combined multiplier with low-loss current-combining networks is presented (Fig. 1). The output combining network is based on λ/4 microstrip transmission lines. The multiplier unit cell is implemented using an active balanced transistor pair with double Vcc stubs for better harmonic suppression. A pseudodifferential four-stage common-emitter amplifier with a wideband balun is used as the driver amplifier for the multiplier cell. The four-way power-combining multiplier results in a peak output power of 8 dBm and >5 dBm at 200–230 GHz.

II. TECHNOLOGY

The IBM 9HP BiCMOS process [37] is chosen for this design. It is a 90-nm SiGe HBT process built on top of a 90-nm CMOS process, with a ten-layer copper metal backend and high-density metal–insulator–metal (MIM) capacitors (12.2 fF/μm2). The 4 × 0.1 μm2 transistor model


Fig. 2. (a) 50-Ω microstrip transmission line implemented using top metal LD with M2_4B as ground plane. (b) Simulated loss.

with a single emitter finger, dual collector, and base fingers (C-B-E-B-C) results in a peak ft/fmax of 310/350 GHz at 1.5–2.5 mA/μm bias current when referred to M1 (Metal 1) [38], and ft/fmax drops to 260/300 GHz when referred to the top metal LD due to the interconnection parasitics [28], [39]. This is much better than CMOS transistors, for which ft/fmax typically drops from 460 GHz (referred to M1) to 260 GHz (referred to the top metal) [40]. The better performance is due to the all-copper backend and the thick dielectrics used in the IBM 9HP process. Fig. 2(a) shows the 50-Ω microstrip transmission line used for the matching stubs and the power-combining and power-distribution networks. It is implemented using the top metal LD with 10 μm width and M2_4B as the ground plane (M1_4B as the Vcc plane). Electromagnetic (EM) simulations using Sonnet show a loss of 0.71–1.06 dB/mm at 125–250 GHz (0.86–1.3 dB/mm for a G-CPW line at these frequencies) [Fig. 2(b)]. The corresponding transmission-line Q is 32–41 at 125–250 GHz (25–35 for a G-CPW at 125–250 GHz). In practice, and from the different measurements done on IBM 8HP and other processes, the actual loss is about 30% higher than simulated. Still, a loss of 0.71–1.06 dB/mm at 125–250 GHz is taken for the 50-Ω microstrip lines in the Cadence and Sonnet version 13.54 simulations, since there are no measurements in our group for this process at mmw frequencies.

III. DESIGN

A. 250-GHz Doubler Unit Cell

Fig. 3(a) shows the schematic of the active balanced doubler. The doubler is designed to deliver maximum output power at 250 GHz with an input power at 125 GHz. The single-ended input is converted into a differential signal using a passive balun and fed into the balanced transistor pair. The collector nodes are connected together to combine all the even harmonics from the transistor pair, and create a broadband short circuit for the fundamental and the odd harmonics.
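The cancellation of the fundamental and odd harmonics at the shared collector node can be illustrated with a toy numerical model. The sketch below is only an illustration under stated assumptions: `device_current` is a hypothetical memoryless exponential nonlinearity standing in for a transistor, not the Cadence device model, and the drive amplitude is arbitrary. Two such devices are driven with +v and −v, their currents are summed, and the harmonic content is evaluated with a direct Fourier sum over one period.

```python
import math


def device_current(v):
    """Toy memoryless nonlinearity standing in for one transistor (assumption)."""
    return math.exp(2.0 * v) - 1.0


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of one period of uniformly sampled data."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    scale = 1.0 / n if k == 0 else 2.0 / n
    return scale * math.hypot(re, im)


N = 256  # samples per fundamental period
drive = [0.5 * math.cos(2 * math.pi * i / N) for i in range(N)]

# One device alone, and the balanced pair with currents summed at the collectors
single = [device_current(v) for v in drive]
combined = [device_current(v) + device_current(-v) for v in drive]

if __name__ == "__main__":
    for k in range(1, 5):
        print(f"harmonic {k}: single {harmonic_amplitude(single, k):.4f}, "
              f"balanced pair {harmonic_amplitude(combined, k):.4f}")
```

Because the summed current is an even function of the drive voltage, the fundamental and third harmonic vanish to numerical precision, while the second harmonic of the pair is twice that of a single device. This is the behavior the balanced topology relies on, independent of the particular nonlinearity chosen here.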

Fig. 3. (a) Schematic of 250-GHz doubler. (b) Simulated second output power and conversion gain versus transistor size with Pin = 15 dBm with a different transmission line (t-line) loss. (c) Sonnet EM modeling for the input matching network of the doubler unit cell.

The transistor size is optimized using Cadence library models for maximum output power and conversion gain for an input power of 14–15 dBm [Fig. 3(b)]. Note that this optimization is first done without taking into account the loss of the input and output matching networks and using an ideal lossless balun. This power is selected because the 115–130-GHz PA is capable of delivering 14–16 dBm when operating in differential mode [28] without the use of a balun. Transistors with total widths of 24 μm result in an output power of 8.8 dBm and a conversion gain of −6.2 dB at 250 GHz with an input power of 15 dBm. The output power is limited to 8.8 dBm due to the allowed current densities (5 dBm at 234–280 GHz. All unwanted harmonics are 50 dB below the second harmonic when double Vcc stubs are used. The output power and conversion gain at 250 GHz drop to 5.2 dBm and −9.8 dB when 30% additional transmission-line loss is added to the matching networks. This 30% extra loss is based on previous measurements and experience on a similar process (IBM8HP). The doubler consumes 16 mA in the small-signal condition and 40 mA in large signal from a 1.7 V supply.

B. 250-GHz Four-Way Combined Doubler

The multiplier is implemented using four doubler cells and λ/4 impedance transformation networks at the output. Four pseudodifferential PAs are implemented using eight single-ended four-stage common-emitter amplifiers (see [28] for detailed PA design and stability considerations). The baluns are connected at the PA inputs so as not to result in a 1.5-dB loss between the PAs and the doublers (Fig. 6). At the input port, a Wilkinson power divider is first used to divide the RF signal to the left and right circuits with isolation. Then, a 1.5-mm-long 50-Ω transmission line is employed together with a 35-Ω λ/4 line and T-junction network to present 50 Ω to the input of each balun. Finally, the balun converts the single-ended signal to a differential signal at the input of the PAs.
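The conversion-gain figures and the input λ/4 section quoted above can be checked with a few lines of arithmetic. The function names are hypothetical, the line is treated as ideal and lossless, and the assumption that the 35-Ω line is loaded by two 50-Ω balun inputs in parallel is an interpretation of the text, not something it states explicitly.

```python
def conversion_gain_db(p_out_dbm, p_in_dbm):
    """Conversion gain of a multiplier: output power at 2f minus input power at f."""
    return p_out_dbm - p_in_dbm


def quarter_wave_input(z0, z_load):
    """Input impedance of an ideal, lossless quarter-wave line: Z0**2 / ZL."""
    return z0 ** 2 / z_load


# Doubler at 250 GHz with a 15-dBm drive
cg_simulated = conversion_gain_db(8.8, 15.0)    # library models, lossless matching
cg_extra_loss = conversion_gain_db(5.2, 15.0)   # with 30% extra line loss

# Input network: two 50-ohm balun inputs in parallel behind a 35-ohm line (assumed)
z_two_baluns = 50.0 / 2
z_input = quarter_wave_input(35.0, z_two_baluns)

if __name__ == "__main__":
    print(f"conversion gain: {cg_simulated:.1f} dB, {cg_extra_loss:.1f} dB")
    print(f"impedance looking into the 35-ohm line: {z_input:.1f} ohm")
```

Under this reading, the 35-Ω section presents about 49 Ω, close to the 50 Ω of the feed line, and the conversion gains match the −6.2 dB and −9.8 dB stated in the text.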
The balun is designed using the top metal (LD) for the first core and the second metal (OL). The shunt 25-fF MOM capacitor, implemented using LD to M2_2B, is connected to the differential port of the balun to achieve simultaneous matching at the input (50 Ω) and output ports (100 Ω). The simulated balun insertion loss is 1

are designed by including MOM and MOS structures in a dense layout [23]. The LNAs with the RF PADs are matched to the off-chip 50-Ω load. The design approach relies on carefully modeled passive components. The transformers, transmission lines, MOM capacitors, and RF PADs, including the interconnections and the vias, are imported as a single device into ADS Momentum for EM simulation, which ensures the accuracy of the simulation.

VI. MEASUREMENT RESULTS

The two LNAs are fabricated in a 65-nm 7-metal digital CMOS process. The top metal layer has a thickness of 0.77 μm, which to some extent limits the achievable Q of the passive components compared to processes with a thicker top metal layer. The die photographs are shown in Fig. 14. The areas of the TF-based LNA and the TL-based LNA, excluding all the PADs, are 0.22 × 0.51 mm² and 0.30 × 0.47 mm², respectively. Two 110-GHz ground–signal–ground (GSG) probes, two dc probes, a vector network analyzer, a V-band fundamental mixer, a signal generator, a spectrum analyzer, a power meter, and a power sensor are used to perform the measurements. All the measurements include the input and output PADs. Figs. 15 and 16 show the power gain, return loss, and NF measurements of the TF-based LNA and TL-based LNA, respectively. The lowest NF and highest power gain of the TF-based LNA and TL-based LNA are 3.6 and 28.2 dB and

Fig. 15. Measured versus simulated: (a) power gain and return loss and (b) NF of the TF-based LNA.

3.8 and 25.4 dB, respectively, around 54 GHz. As the MOM capacitors used in the matching of the TL-based LNA are composed of thin lower metal layers, the matching insertion loss of the TL-based LNA is degraded slightly due to the high parasitic resistance and substrate losses of the MOM capacitors, making the NF and power gain of the TL-based LNA worse than those of the TF-based LNA. The measured IP1dB of the TF-based LNA and the TL-based LNA are −29.4 and −27.5 dBm, respectively. Both LNAs consume 18 mA from a power supply of 1.1 V. Table I summarizes our LNAs' performance in comparison to the state of the art. Both LNAs achieve comparable or better NF and gain performance compared to existing works. In particular, the TF-based LNA has the best NF and power gain among all the published V-band LNAs in 65-nm CMOS.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. GUO et al.: TRANSFORMER FEEDBACK G m -BOOSTING TECHNIQUE FOR GAIN IMPROVEMENT AND NOISE REDUCTION IN mm-WAVE CASCODE LNAs

Fig. 16. Measured versus simulated: (a) power gain and return loss and (b) NF of the TL-based LNA.

VII. CONCLUSION

In this paper, a gain improvement and noise reduction technique for a single-ended cascode LNA has been proposed. A transformer has been used in a cascode topology to increase both the power gain and voltage gain and reduce the noise factor. The effective transconductance of the cascode transistor has been boosted and the parasitic capacitance at the node between the two transistors has been eliminated, thus improving the gain and reducing the noise. The proposed technique has been theoretically formulated and detailed analyses on the effects of the parasitic capacitance on the cascode performance have been presented. Simulation results in a 65-nm CMOS process have shown that the proposed technique is capable of increasing the power gain by 3.3 dB and reducing the NF by 0.35 dB. Two 54-GHz single-ended cascode LNAs in a 65-nm CMOS process have been implemented to demonstrate the effectiveness of the proposed technique in realizing low-noise high-gain single-ended LNAs at mmWave frequencies.

ACKNOWLEDGMENT

The authors would like to acknowledge S. Sankaran and B. Kramer, both with Texas Instruments (TI) Incorporated, for their great help and support during the chip tape-out. The authors would also like to thank TI Incorporated for chip fabrication.

REFERENCES

[1] H. Darabi and A. A. Abidi, “A 4.5 mW 900-MHz CMOS receiver for wireless paging,” IEEE J. Solid-State Circuits, vol. 35, no. 8, pp. 1085–1096, Aug. 2000.
[2] X. Li, S. Shekhar, and D. J. Allstot, “Gm-boosted common-gate LNA and differential Colpitts VCO/QVCO in 0.18-μm CMOS,” IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2609–2619, Dec. 2005.
[3] D. J. Cassan and J. R. Long, “A 1-V transformer-feedback low-noise amplifier for 5-GHz wireless LNA in 0.18-μm CMOS,” IEEE J. Solid-State Circuits, vol. 38, no. 3, pp. 427–435, Mar. 2003.
[4] D. Gangopadhyay, S. Shekhar, J. S. Walling, and D. J. Allstot, “A 1.6 mW 5.4 GHz transformer-feedback gm-boosted current-reuse LNA in 0.18-μm CMOS,” in Proc. IEEE Int. Circuits Syst. Symp., May 2010, pp. 1635–1638.
[5] A. Liscidini, C. Ghezzi, E. Depaoli, G. Albasini, I. Bietti, and R. Castello, “Common gate transformer feedback LNA in a high IIP3 current mode RF CMOS front-end,” in Proc. IEEE Custom Integr. Circuits Conf., Sep. 2006, pp. 25–28.
[6] T. Yao et al., “Algorithmic design of CMOS LNAs and PAs for 60-GHz radio,” IEEE J. Solid-State Circuits, vol. 42, no. 5, pp. 1047–1054, May 2007.
[7] R. Fujimoto, K. Kojima, and S. Otaka, “A 7-GHz 1.8-dB NF CMOS low-noise amplifier,” IEEE J. Solid-State Circuits, vol. 37, no. 7, pp. 852–856, Jul. 2002.
[8] H. Samavati, H. R. Rategh, and T. H. Lee, “A 5-GHz CMOS wireless LAN receiver front-end,” IEEE J. Solid-State Circuits, vol. 35, no. 5, pp. 765–772, May 2000.
[9] B. Huang, K. Lin, and H. Wang, “Millimeter-wave low power and miniature CMOS multicascode low-noise amplifiers with noise reduction topology,” IEEE Trans. Microw. Theory Techn., vol. 57, no. 12, pp. 3049–3059, Dec. 2009.
[10] H. Hsieh, P. Wu, C. Jou, F. Hsueh, and G. Huang, “60 GHz high-gain low-noise amplifiers with a common-gate inductive feedback in 65 nm CMOS,” in IEEE RF Integr. Circuits Conf., Jun. 5–7, 2011, p. 1, 4.
[11] X. Fan, H. Zhang, and E. Sanchez-Sinencio, “A noise reduction and linearity improvement technique for a differential cascode LNA,” IEEE J. Solid-State Circuits, vol. 43, no. 3, pp. 588–599, Mar. 2008.
[12] S. Guo et al., “54 GHz CMOS LNAs with 3.6 dB NF and 28.2 dB gain using transformer feedback gm-boosting technique,” in Proc. IEEE Asian Solid-State Circuits Conf., Nov. 2014, pp. 185–188.
[13] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. Cambridge, U.K.: Cambridge Univ. Press, 1998.
[14] D. K. Shaeffer and T. H. Lee, “A 1.5 V, 1.5 GHz CMOS low-noise amplifier,” IEEE J. Solid-State Circuits, vol. 32, no. 5, pp. 745–759, May 1997.
[15] D. K. Shaeffer and T. H. Lee, “Corrections to ‘A 1.5 V, 1.5 GHz CMOS low-noise amplifier’,” IEEE J. Solid-State Circuits, vol. 40, no. 6, pp. 1397–1398, Jun. 2005.
[16] T. K. Nguyen, C. H. Kim, G. J. Ihm, M. S. Yang, and S. G. Lee, “CMOS low-noise amplifier design optimization techniques,” IEEE Trans. Microw. Theory Techn., vol. 52, no. 5, pp. 1433–1442, May 2004.
[17] J. R. Long, “Monolithic transformers for silicon RF IC design,” IEEE J. Solid-State Circuits, vol. 35, no. 9, pp. 1368–1382, Sep. 2000.


IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES

[18] H. Yeh, Z. Liao, and H. Wang, “Analysis and design millimeter-wave low-power CMOS LNA with transformer-multicascode topology,” IEEE Trans. Microw. Theory Techn., vol. 59, no. 12, pp. 3441–3454, Jan. 2011.
[19] S. H. M. Lavasani and S. Kiaei, “A new method to stabilize high frequency high gain CMOS LNA,” in Proc. IEEE Electron., Circuits, Syst. Conf., Shadah, UAE, Dec. 2003, pp. 982–985.
[20] T. Yao et al., “Algorithmic design of CMOS LNAs and PAs for 60-GHz radio,” IEEE J. Solid-State Circuits, vol. 42, no. 5, pp. 1044–1057, May 2007.
[21] C. H. Doan, S. Emami, A. M. Niknejad, and R. W. Brodersen, “Millimeter-wave CMOS design,” IEEE J. Solid-State Circuits, vol. 40, no. 1, pp. 144–155, Jan. 2005.
[22] T. O. Dickson et al., “The invariance of characteristic current densities in nanoscale MOSFETs and its impact on algorithmic design methodologies and design porting of Si(Ge) (Bi)CMOS high-speed building blocks,” IEEE J. Solid-State Circuits, vol. 41, no. 8, pp. 1830–1845, Aug. 2006.
[23] E. Cohen, O. Degani, and D. Ritter, “A wideband gain-boosting 8 mW LNA with 23 dB gain and 4 dB NF in 65 nm CMOS process for 60 GHz application,” in Proc. IEEE RF Integr. Circuits, Jun. 17–19, 2012, p. 207, 210.
[24] H. Yeh, C. Chiong, S. Aloui, and H. Wang, “Analysis and design of millimeter-wave low-voltage CMOS cascode LNA with magnetic coupled technique,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 12, pp. 4066–4079, Dec. 2012.
[25] M. Tsai, S. S. H. Hsu, F. Hsueh, C. Jou, and T. Yeh, “Design of 60-GHz low-noise amplifiers with low NF and robust ESD protection in 65-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 1, pp. 553–561, Jan. 2013.
[26] D. Cai, Y. Shang, H. Yu, and J. Ren, “Design of ultra-low-power 60-GHz direct-conversion receivers in 65-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 9, pp. 3360–3372, Sep. 2013.
[27] H. Wu, N. Wang, Y. Du, and M. F. Chang, “A blocker-tolerant current mode 60-GHz receiver with 7.5-GHz bandwidth and 3.8-dB minimum NF in 65-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 3, pp. 1053–1062, Mar. 2015.

Shita Guo (GSM’14–M’15) received the B.S. and M.S. degrees in electrical engineering from the University of Science and Technology of China (USTC), Hefei, China, in 2006 and 2009, respectively, and the Ph.D. degree in electrical engineering from Southern Methodist University (SMU), Dallas, TX, USA, in 2015. From 2009 to 2011, he was with Sychip Inc., Shanghai, China, where he was involved in RF integrated passive devices and wireless subsystems for mobile products. Since 2014, he has been with Texas Instruments Incorporated, Dallas, TX, USA, with the High-Speed Interface Group. He has authored or coauthored 18 peer-reviewed publications. His research interests include RF/mm-wave integrated circuits (ICs) for wireless communication system, and high-performance analog and mixed-signal ICs for high-speed serial data links.

Tianzuo Xi (GSM’14–M’14) received the B.S. and M.S. degrees in electrical engineering from the University of Science and Technology of China (USTC), Hefei, China, in 2007 and 2010, respectively, and the Ph.D. degree from Southern Methodist University (SMU), Dallas, TX, USA, in 2015. In 2013 and 2014, he interned with Qualcomm and Samsung. Since 2015, he has been involved with power amplifiers with Qualcomm, Boston, MA, USA. His research interests are RF and millimeter-wave (mmWave) integrated circuit (IC) design.

Ping Gui (S’03–M’04–SM’09) received the Ph.D. degree in electrical and computer engineering from the University of Delaware, Newark, DE, USA. In 2004, she joined the faculty of the Lyle School of Engineering, Southern Methodist University (SMU), Dallas, TX, USA, where she is currently an Associate Professor of electrical engineering. Her research interests include digital, analog, mixed-signal, and RF integrated circuit (IC) design for a variety of applications including high-speed wireline transceivers, wideband wireless communications using millimeter-wave (mmWave) high-speed ADCs and circuits and systems for harsh and extreme environments. Dr. Gui has served as the Technical Chair of the IEEE Solid-State Circuits Society Dallas Chapter since 2007. Since 2015, she has served on the Technical Program Committee (TPC), IEEE Radio-Frequency Integrated Circuits Symposium (RFIC). She was a recipient of the CERN Scientific Associate Award (2008–2010), IEEE Dallas Section Outstanding Service Award (2011), and SMU Ford Research Fellowship Award (2015).

Daquan Huang (A’04–M’04–SM’06) possesses 30 years of experience, beginning in 1985 when he was with the faculty of the Department of Information and Electronics Engineering, Zhejiang University, where his research interests have included RF integrated circuit (RFIC) and monolithic microwave integrated circuit (MMIC) designs, microwave and antenna theory, optoelectronics, and neural networks. From 1995 to 2002, he served as the Head of the Electronic Information Technology Laboratory and, since 2000, as the Vice Chair of the department. From 2002 to 2007, he was with the University of California at Los Angeles, as a research faculty member, where he pioneered CMOS millimeter-wave (mmWave) technologies that focused on 60-GHz short-range ultra-high-speed communications applications. He has invented various CMOS mmWave technologies including the on-chip transformer folded cascode circuit topology (Origami) and the on-chip digital controllable artificial dielectrics (DiCAD), which were recognized as groundbreaking technologies by the Defense Advanced Research Projects Agency (DARPA). Origami solves the fundamental issues in low-power, low-noise, and highly linear actives operation, and DiCAD addresses the fundamental passives needs for wideband frequency, phase, and impedance tuning. During his stay with the University of California at Los Angeles (UCLA), he helped the High-Speed Electronics Laboratory receive millions of dollars in research funding from DARPA. In 2007, he joined Texas Instruments Incorporated, as a Member of Group Technical Staff. While focusing on the advanced CMOS RFICs for the next generation of wireless communications systems, he has been extensively involved with digital RF processors (DRP). He designed and led the effort in the development of 10–20 GB/s field-effect transistor (FET) switches, 14-GB/s LVDS switches, and multiple other wireline communications products for Apple, Cisco, Huawei, etc. In January 2009, he joined the newly formed research center of Texas Instruments Incorporated, Kilby Laboratories, where he led the effort in developing ultra-high-speed short-distance communications products including the 60-GHz TRx and the state-of-the-art 40-GHz phase-locked loop (PLL) with a root mean square (rms) jitter of 100 fs. From 2011 to 2013, he was involved with the design of 77-GHz radar product development in 45-nm CMOS. He has been a Director with Samsung Research America, Dallas, TX, USA, since 2013. As a Principal Engineer, he leads the activities of CMOS mmWave integrated circuit (IC) development for fifth-generation (5G) communications, including 28-GHz CMOS front-ends for cellular applications and 125-GHz transceiver systems-on-chip (SoCs) with RF beamformers for next-generation Wi-Fi applications. He has authored or coauthored over 100 journal and conference papers with over 700 citations. He holds 18 U.S. patents and several pending patents. Dr. Huang has served as an IEEE journal reviewer. He has given over 30 technical presentations, lectures, and invited talks at technical conferences. He was the recipient of the 2011 Kilby Laboratories’ Innovator Award for mmWave transceivers. In 2013, he was elected a top 30 Texas Instruments Incorporated innovator.


Yanli Fan (M’08) received the M.S. degree in electrical engineering from the University of Maryland at College Park, College Park, MD, USA. She was with IBM and the Hittite Microwave Corporation, prior to joining Texas Instruments Incorporated, Dallas, TX, USA, in 2002, where she is currently a Design Engineer Manager and a Distinguished Member of the Technical Staff. Her research interests include high-speed clock data recovery and transceiver circuit design. She has authored or coauthored many publications appearing in major technical journals. She holds more than 16 patents.


Mark Morgan attended the University of Wisconsin, Madison, WI, USA. He received the BSEE and MSEE degrees from Marquette University, Milwaukee, WI, USA. He is currently a Manager with Kilby Laboratories, Texas Instruments Incorporated, Dallas, TX, USA, where he directs the High Voltage Isolation Technology Program. In 1997, he joined Texas Instruments Incorporated as an Analog Design Engineer in HPA, where he focused on high-speed interface designs. Prior to joining Kilby Laboratories, he was Chief Technical Officer (CTO) for the INT Business Unit, where he managed a design team focused on advanced circuit technologies. He has also served as the Analog Design Branch Manager for the high-speed interface product line. He is also a Distinguished Member of the Technical Staff. He holds 25 patents.


Low-Phase-Noise 54-GHz Transformer-Coupled Quadrature VCO and 76-/90-GHz VCOs in 65-nm CMOS

Tianzuo Xi, Member, IEEE, Shita Guo, Member, IEEE, Ping Gui, Senior Member, IEEE, Daquan Huang, Senior Member, IEEE, Yanli Fan, Member, IEEE, and Mark Morgan, Member, IEEE

Abstract— This paper presents new circuit topologies and design techniques for low-phase-noise (PN) complementary metal–oxide–semiconductor (CMOS) millimeter-wave quadrature voltage-controlled oscillators (QVCOs) and VCOs. A transformer-coupled QVCO topology with extra phase shift is proposed to replace the coupling transistors, which eliminates the coupling transistors’ noise, decouples the tradeoff between PN and phase error, and improves the PN performance. This technique is demonstrated in a millimeter-wave QVCO with a measured PN of −119.2 dBc/Hz at 10-MHz offset from a 56.2-GHz carrier and a tuning range of 9.1%. In addition, an inductive-divider-feedback technique is proposed in an LC VCO design to improve the transconductance linearity, resulting in a larger signal swing and lower PN compared with conventional LC VCOs. The effectiveness of this approach is demonstrated in a 76- and a 90-GHz VCO design, both fabricated in a 65-nm CMOS process, with an FOMT of 173.6 and 173.1 dBc/Hz, respectively.

Index Terms— Oscillator, phase error, phase noise (PN), quadrature voltage-controlled oscillator (QVCO), transconductance linearization, transformer, VCO.

I. INTRODUCTION

Low-phase-noise (PN) voltage-controlled oscillators (VCOs) and quadrature VCOs (QVCOs) are among the most critical and challenging components in millimeter-wave IC design for wireless communications, automotive radar, satellite communications, and other applications. Quadrature signal references enable direct-conversion transceiver architectures and provide the driving signals for the phase rotators in phased-array systems [1]. QVCOs operating near 60 GHz with active transistor coupling were published in [2] and [3], achieving PN of −75 and −85 dBc/Hz at 1-MHz offset, respectively. An injection-locked 60-GHz QVCO with an oscillator source running at 20 GHz was reported in [4], which could reach a PN of −113 dBc/Hz

Manuscript received June 16, 2015; revised December 13, 2015 and May 19, 2016; accepted May 25, 2016. T. Xi, S. Guo, and P. Gui are with the Department of Electric Engineering, Southern Methodist University, Dallas, TX 75205 USA (e-mail: [email protected]; [email protected]; [email protected]). D. Huang was with Texas Instruments Incorporated, Dallas, TX 75243 USA. He is now with Samsung Research America, Richardson, TX 75082 USA (e-mail: [email protected]). Y. Fan and M. Morgan are with Texas Instruments, Dallas, TX 75243 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2574716

at 10-MHz offset. Reference [5] reported a ring-based transformer-coupled 60-GHz QVCO with a PN performance of −117 dBc/Hz at 10-MHz offset. A diode-connected-transistor coupled 60-GHz QVCO in [6] reached low power and a PN of −115 dBc/Hz at 10-MHz offset.

On the VCO side, it is challenging to design complementary metal–oxide–semiconductor (CMOS) W-band VCOs with low power and low PN. One common method is to use a VCO running at lower frequencies together with doublers, multipliers, subharmonic injection locking, or mixing, which can achieve better PN than a fundamental VCO, but consumes more power and silicon area. A 100-GHz active-varactor VCO with four cross-coupled pairs and transformers was introduced in [7]. A 105-GHz VCO using four coupled Colpitts oscillators was introduced in [8]. While these techniques demonstrated low PN, it is still desirable to design low-power and low-PN millimeter-wave VCOs using more straightforward topologies. An approach based on transconductance linearization of the active devices by capacitively dividing the drain voltage of the core transistors was used in a 25-GHz VCO to increase the signal swing to achieve low PN [9]. This topology needed a large biasing inductor choke, the self-resonance frequency of which needed to be much higher than the oscillation frequency, making this topology unsuitable for designing a fundamental W-band VCO. In [10], this linearization approach was used in conjunction with a frequency doubler to realize an 80-GHz frequency synthesizer.

This paper presents new circuit topologies, design techniques, and measurement results of the low-PN millimeter-wave QVCO and VCOs in CMOS [11]. In the QVCO design, a new transformer-coupled QVCO topology with extra phase shift is proposed to realize a low PN of −119.2 dBc/Hz at 10-MHz offset from the carrier frequency of 56.2 GHz. In the VCO design, a new transconductance linearization method by inductively dividing the gate voltage of the cross-coupled transistors is proposed, which reduces the PN by utilizing additional power, and eliminates the large inductor choke required in [9], rendering it more suitable for W-band VCO operation. This approach is applied to the designs of 76- and 90-GHz VCOs, and the measurement results demonstrate good PN and figure of merit (FOM). Section II presents the analysis and design of the proposed 54-GHz low-PN QVCO. Section III presents the analysis and design of the proposed low-PN W-band VCOs.

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 1. Schematic of the novel transformer-coupled QVCO.

Section IV shows the measurement results and Section V draws the conclusions.

II. 54-GHz QVCO BASED ON TRANSFORMER COUPLING

Conventional LC-QVCO designs suffer from a tradeoff between phase error and PN. As the coupling strength increases, the phase error decreases, but the PN rises due to the quadrature coupling between the I and Q tanks. This tradeoff is decoupled in the proposed QVCO by adding an extra phase shift between the I and Q tanks, making the coupling between them almost in phase with each other. In addition, the proposed QVCO is based on transformer coupling, which eliminates the thermal noise and 1/f noise of the active coupling devices, thus improving the PN performance significantly compared with conventional actively coupled QVCOs.

A. Working Mechanism of the Proposed QVCO

The detailed schematic of the proposed QVCO is shown in Fig. 1. The QVCO is composed of two LC VCOs with passive transformer coupling. Each LC VCO contains two cross-coupled transistors, the primary coil of one transformer, a 4-b switched capacitor bank C1 plus a varactor, the secondary coil of the other transformer, and a 3-b switched capacitor bank C2. The primary coil of each transformer at the drain node of each VCO is used to resonate with the output capacitance and to simultaneously couple to the secondary coil at the source node of the other VCO. The proposed structure is similar to the transformer-coupling-based 17-GHz QVCO presented in [12], but with a key component added to decouple the PN and phase accuracy: the switched-capacitor bank C2. Without C2, the coupled current is in quadrature with the tank’s inherent current, which makes the oscillation frequency depart from the resonance frequency of the tank and degrades the PN of the QVCO. By adding C2 in the secondary coil of the transformers, a phase shift of up to 90° can be added to the coupled current to make it in phase with the tank current, which improves the PN performance without affecting the phase accuracy of the QVCO, and decouples the tradeoff between phase accuracy and PN. In this way, the two LC tanks realize an in-phase-coupled QVCO. The 90° phase-shifting coupling network in our proposed QVCO scheme is, in essence, a coupled resonator, which was

first used in the ring-structure quadrature VCO [13] based on capacitive coupling. The ring-structure QVCO was then improved in [5] by replacing the capacitively coupled resonators with magnetically coupled resonators to decrease the silicon area and to make it more suitable for millimeter-wave operation. The ring-structure QVCOs [5], [13] contain only one loop for both the oscillation and the coupling. This means the coupled resonators are not only used for coupling but are also key parts of the ring oscillation loop themselves, which would affect the oscillation loop gain. In our proposed QVCO, the transformer-coupling network plays the same role as the active coupling transistors in a traditional LC-tank QVCO, locking the quadrature LC tanks with 90° phase difference through a weak coupling loop. More importantly, the coupling network is not part of the oscillation loop. The transformer’s secondary coil and C2 in the proposed QVCO introduce source degeneration to the driving transistors, which to some extent would require a larger transconductance compared with the nondegenerated case, but the source degeneration brings an important benefit: it improves the transistors’ linearity and thus the PN performance, albeit with additional power consumption.

B. Detailed Analysis of the Transformer Coupling Network With Phase Shifting

As shown in Fig. 1, the QVCO coupling loop starts from transistors M1/M2, through the coupled resonator CR1 and transistors M3/M4, and then to CR2, which is cross coupled back to the source node of M1/M2. When the QVCO oscillates, the I and Q tanks oscillate with 90° phase difference between each tank’s intrinsic currents, I_I and I_Q. We denote the current through the primary coil of CR1 as I_IL, the coupled current from the primary coil to the secondary coil of CR1 as I_K, and the coupled current from tank I to tank Q as I_KQ. I_I is 90° ahead of I_IL when C1 resonates with the primary coil. Without C2, I_KQ is the same as I_K, having a phase close to that of I_I (in practice, the circuit would have some phase shift due to the transistor load and parasitic capacitance), which is 90° ahead of I_Q. Due to the 90° phase difference between the coupled current, I_KQ, and the tank’s intrinsic current, I_Q, the QVCO would oscillate at a frequency away from the resonance frequency of the main tank and obtain a worse PN than a VCO does [14], which is why the PN degrades in a typical

XI et al.: LOW-PN 54-GHz TRANSFORMER-COUPLED QUADRATURE VCO AND 76-/90-GHz VCOs

QVCO compared with a VCO. When C2 is added to the secondary coil, I_KQ is made to lag I_K by 90° and is thus in phase with I_Q, which avoids the PN degradation of a typical QVCO. Fig. 2 shows the phasor relationship of I_I, I_Q, and I_KQ (with and without C2). In the actual implementation, the finite quality factor Q of the tanks would prevent a perfect 90° phase shift from being accomplished. In our design, the total phase shift added between I_KQ and I_I is around −74.3°. The Q value of the primary coil of the transformer decreases due to the coupling network. However, since the Q of the LC tank is mainly determined by the switched capacitor bank C1 at millimeter-wave frequencies, the Q degradation of the inductor by the coupling network would not affect the PN much. Detailed analysis of these effects is presented below.

Fig. 2. Phasor relationship of I_Q, I_I, and I_KQ.

Fig. 3. Magnetically coupled network with the secondary coil’s equivalent resistance (a) in parallel and (b) in series with C2.

1) Phase Shift by the Secondary Tank of the Transformer: To understand how much phase shift the transformer coupling network can provide, we will find the phase relationship between I_KQ and I_Q in the following analysis. We start with finding the expression for I_K. Fig. 3(a) is the RLC representation of the transformer coupling network (where R_P1, R_P2, R_PL2, R_PC2, and R_Pload represent the equivalent parallel resistance of each of the following: the primary tank, the secondary tank, L2, C2, and the transistors’ loading, respectively). Fig. 3(b) draws the secondary portion in its series format for the convenience of the analysis, where R_S1 and R_S2 are the equivalent resistances in series with L2 and C2 (including the transistor loading from the secondary tank), respectively. From Fig. 3(a), we have

$$V_{in} = L_1 s I_{IL} - M s I_K \tag{1}$$

where V_in is the voltage across the primary coil, I_IL is the current through the first coil L1, M is the mutual inductance (M = k(L1 L2)^{1/2}, k being the transformer coupling coefficient), and I_K is the current through the secondary coil. Likewise, for the secondary coil, we have [based on Fig. 3(b)]

$$I_K\left(\frac{1}{sC_2} + R_{S1} + R_{S2} + sL_2\right) = M s I_{IL} \tag{2}$$

where L2 and C2 are the inductance and the capacitance of the secondary coil, respectively. Combining (1) and (2) yields I_K as

$$I_K = V_{in}\,k\sqrt{\frac{L_2}{L_1}}\;Y_{2nd\_coil}(s) \tag{3}$$

where Y_2nd_coil is defined as

$$Y_{2nd\_coil}(s) = \frac{1}{\dfrac{1}{sC_2} + R_{S1} + R_{S2} + sL_2(1+k^2)}. \tag{4}$$

Based on Fig. 3(a), I_KQ can be obtained as

$$I_{KQ} = I_K\,\frac{1}{1+sR_{P2}C_2} = V_{in}\,k\sqrt{\frac{L_2}{L_1}}\;Y_{2nd\_coil}(s)\,\frac{1}{1+sR_{P2}C_2}. \tag{5}$$

From (4), we find that when s = jω2, where

$$\omega_2 = \frac{1}{\sqrt{C_2L_2(1+k^2)}} \tag{6}$$

the imaginary part of Y_2nd_coil is zero. Thus, at this ω2, (5) can be rewritten as

$$I_{KQ} = V_{in}\,k\sqrt{\frac{L_2}{L_1}}\;\frac{1}{R_{S1}+R_{S2}}\;\frac{1}{1+sR_{P2}C_2}. \tag{7}$$

Furthermore, if the Q of the secondary tank, Q2, is relatively high, i.e., |sR_P2C2| ≫ 1, since

$$Q_2 = \omega_2C_2R_{P2} = \frac{1}{\omega_2C_2(R_{S1}+R_{S2})}$$

(7) can be written as

$$I_{KQ} \approx V_{in}\,k\sqrt{\frac{L_2}{L_1}}\;\frac{1}{(R_{S1}+R_{S2})R_{P2}C_2}\cdot\frac{1}{s}. \tag{8}$$

We can see from (8) that the phase of I_KQ now lags V_in by 90° at ω2; thus it is in phase with I_Q, namely, the two LC tanks couple in phase. However, many factors, including the effect of the secondary tank on the first tank and the strength of the coupling loop, limit the choice of Q2. To understand how the secondary tank affects the Q of the first tank and the oscillation frequency, we examine Z_in, as shown in Fig. 3, the impedance looking into the first coil of the transformer without the loss of the first coil itself (which is already included in R_P1 in Fig. 3). Using (1) and (2), Z_in and its quality factor Q(Z_in) are


obtained as follows: Z in =



Vin = s L1 1 + II L

sk 2 L 2

which allows for more freedom in our approach to optimize both GainCL and GainOL . This is one of the key differences between our approach and [5]. In the steady state, gm would decrease to the equivalent transconductance G m , at which GainOL would equal to 1, and α would decrease a bit because GainCL decrease more than GainOL does. Therefore, α should be chosen to ensure acceptable phase error in the steady state case. From (12) to (15), we can get an approximated analytical expression for θerror as



+ s L 2 + R S1 + R S2  2 k 2 1 − ω2 L1 C 2 2 = s L1 + s L1  2  2 1 − ω2 L1 C + Q12 1 sC 2

2 2

2

k 2 L 1 (R S1 + R S2 )/L 2 + 2  2 + 12 1 − ω2 L1 C Q2 2 2  2    1 Q2 1 2 1− 2 . + Q(Z in ) ≈ 2 k ω L 2 C2 Q2

(9)

(10)

We know from Fig. 3 that the oscillation frequency of the QVCO ωOSC is determined as ωOSC ≈ √

1 im(Z in )C1

(11)

where im(Z in ) is the imaginary part of Z in . From (10), we can tell that the value of k and Q 2 should be chosen low enough to achieve a high Q of Z in , minimizing the degradation on the primary tank’s Q and improving the PN. However, as will be demonstrated in the next section, Q 2 together with k determines the strength of the coupling loop, and their values cannot be chosen too low, or the coupling strength will not be high enough to keep a low phase error. 2) Strength of the Coupling Loop: In conventional activeparallel-coupling QVCO design, the coupling factor α is defined as GainCL (12) α= GainOL where GainOL is the oscillation loop gain from VI + to VI − , and GainCL is the coupling loop gain from VI + to VQ+ . From [14], we get the phase error as θerror ∝

Q 1 ω · α 2 ωOSC

(13)

where ω is the mismatch between the resonant frequencies of the two tanks and Q 1 is the quality factor of the primary tank. α is typically chosen to be in the range of 0.2–0.25 to realize acceptable phase error and reasonable PN (typically 3–5 dB higher than that of a VCO) [14]. From Fig. 1, GainCL from VI + to VQ+ can be expressed as GainCL

IKQ R P2 R P1 R P1 Vout = = gm gm Vin 2 Vin 2  1 L2 R P1 =k Y2nd_coil (s) gm L1 1 + s R P2 C2 2

(14)

and GainOL from VI + to VI − as GainOL ≈

R P1 gm · 1 + gm R P2 /2 2

(15)

where the first term in (15) is due to the source degeneration of the secondary coil and C2 . Note that the transformer coupling coefficient k only appears in GainCL but not in GainOL ,

θerror ∝

Q1 ω · . Q 22 k 2 ωOSC

(16)
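The scaling in (16) can be explored with a short sketch. Because (16) is only a proportionality, the function below returns a relative figure, not an angle; the function name, the unit proportionality constant, and the sample frequency mismatch are assumptions made for illustration.

```python
def relative_phase_error(q1: float, q2: float, k: float, dw_over_w: float) -> float:
    """Quantity proportional to theta_error in (16): Q1 / (Q2^2 * k^2) * (dw / w_OSC).

    The proportionality constant is not given in the text and is taken as 1 here.
    """
    return q1 / (q2 ** 2 * k ** 2) * dw_over_w


# Reference point: k = 0.2, Q1 = 5, Q2 = 1; the 1% mismatch value is hypothetical.
reference = relative_phase_error(q1=5.0, q2=1.0, k=0.2, dw_over_w=0.01)

for k in (0.1, 0.2, 0.4):
    ratio = relative_phase_error(5.0, 1.0, k, 0.01) / reference
    print(f"k = {k:.1f}: phase error is {ratio:.2f}x the k = 0.2 case")
```

Under (16), halving $k$ quadruples the phase error for fixed $Q_1$ and $Q_2$, which is why $k$ and $Q_2$ cannot be made arbitrarily small.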

3) Choice of $C_2$, $k$, Loop Gain, and Coupling Factor: Based on the above analysis, in our design $Q_2$ and $k$ are chosen to be around 1 and 0.2, respectively, which results in $Q(Z_{in})$ close to 27. The Q of the primary tank itself is around 6, determined by the switched nMOS capacitor array, and is decreased to 4.9 by the secondary tank. With such values of $Q_2$ and $k$, the second and third terms in (9) become almost negligible, and the imaginary part of $Z_{in}$ is close to $sL_1$. The oscillation frequency thus becomes

$$\omega_{OSC} \approx \frac{1}{\sqrt{L_1 C_1}} \tag{17}$$

which is close to the original resonant frequency of the primary tank. At the same time, the phase lag from the term $1/(1 + sR_{P2}C_2)$ is ∼49.8° rather than the 90° at $\omega_2$ derived in (8). The additional phase shift desired can be provided by the term $Y_{2nd\_coil}$ in (5). We choose $L_2C_2$ to be 40% larger than $L_1C_1$ (which means $\omega_{OSC}$ is ∼1.183 $\omega_2$), so that $Y_{2nd\_coil}$ can provide 24.5° of phase lag at the oscillation frequency $\omega_{OSC}$. With the phase lag of 49.8° from $1/(1 + j\omega_{OSC}R_{P2}C_2)$, the total phase lag of $I_{KQ}$ relative to $I_I$ is 74.3°, which brings $I_{KQ}$ almost in phase with $I_Q$. In a typical QVCO, $\mathrm{Gain}_{OL}$ is designed as 1.5 (i.e., $g_mR_P = 3$) for robust oscillation startup. In our design, $\mathrm{Gain}_{OL}$ is designed to be around 1.7 ($g_mR_P = 4.9$), and because of the source degeneration it is less sensitive to possible device modeling inaccuracies than in a typical QVCO. We chose the steady-state coupling factor $\alpha$ to be around 0.3 to maintain low phase error while minimizing the PN using the proposed transformer-coupled approach.
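The design numbers quoted above can be cross-checked with a few lines of arithmetic. The sketch below evaluates (10) at $\omega = \omega_{OSC}$ and the phase lag of the RC factor. It assumes that losses combine as $1/Q_{total} = 1/Q_a + 1/Q_b$ and that $\omega_{OSC}/\omega_2 = \sqrt{1.4}$ (neglecting the $k^2$ term in (6)); both are assumptions of this sketch, as are all function and variable names.

```python
import math


def q_zin(q2: float, k: float, w2_l2_c2: float) -> float:
    """Eq. (10): Q(Z_in) ~ (Q2 / k^2) * [(1 - 1/(w^2 L2 C2))^2 + 1/Q2^2]."""
    a = 1.0 - 1.0 / w2_l2_c2
    return (q2 / k ** 2) * (a ** 2 + 1.0 / q2 ** 2)


def combined_q(q_a: float, q_b: float) -> float:
    """Assumed loss combination: 1/Q_total = 1/Q_a + 1/Q_b."""
    return q_a * q_b / (q_a + q_b)


Q2, K = 1.0, 0.2   # design values quoted in the text
LC_RATIO = 1.4     # L2*C2 = 1.4 * L1*C1, so w_OSC^2 * L2 * C2 = 1.4 with w_OSC from (17)

q_in = q_zin(Q2, K, LC_RATIO)
q_loaded = combined_q(6.0, q_in)                       # primary tank Q of 6 loaded by Z_in
freq_ratio = math.sqrt(LC_RATIO)                       # w_OSC / w2 (k^2 term neglected)
rc_lag_deg = math.degrees(math.atan(freq_ratio * Q2))  # lag of 1/(1 + j*w_OSC*R_P2*C2)
total_lag_deg = rc_lag_deg + 24.5                      # 24.5 deg from Y_2nd_coil is taken from the text

print(f"Q(Z_in)          = {q_in:.2f}")
print(f"loaded primary Q = {q_loaded:.2f}")
print(f"w_OSC / w2       = {freq_ratio:.3f}")
print(f"RC phase lag     = {rc_lag_deg:.1f} deg, total = {total_lag_deg:.1f} deg")
```

Under these assumptions the sketch gives values close to the quoted ones: $Q(Z_{in}) \approx 27$, a loaded Q near 4.9, a frequency ratio of about 1.183, and a total lag near 74.3°.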

C. Implementation of the Transformers

The detailed 2-D view, with device dimensions, of the two transformers together with the ground and interconnections is shown in Fig. 4. The QVCO is implemented in a process with seven copper layers plus a top aluminum (Al) layer. The core transformers are implemented using the top copper layer, which has a thickness of 0.77 μm and a distance of 3.48 μm to the substrate. Because the top copper layer is thin, the transformers use a 6-μm-wide metal trace to realize a primary coil of 70 pH with a Q of 22 and an estimated parasitic capacitance of 9 fF to the substrate. In typical QVCO implementations, the two LC tanks would couple with each other magnetically through the air and the substrate, especially at millimeter-wave frequencies. If there


Fig. 5. PN simulation of the proposed transformer coupled QVCO with C2 , without C2 , and a typical VCO at 54 GHz.

Fig. 4. Detailed 2-D view of the two transformers, ground, and wire connections.

is no quadrature coupling loop constraining the two LC tanks, they would magnetically inject into each other and lock to each other with exactly the same phase. In the QVCO, since the two LC tanks oscillate with a 90° phase difference, the magnetic coupling between the two tanks through the air and substrate produces phase error and some PN degradation. If the two LC tanks are located too close to each other, this mechanism introduces large phase error. If they are located too far apart, the interconnections between them have large parasitic inductance and resistance, which decreases the quadrature coupling strength and increases the source degeneration, in turn requiring more power to compensate. In our design, the two core transformer tanks are located 40 μm apart. Furthermore, the top Al layer, with a thickness of 1.063 μm, is used to build a solid ground around the transformers to reshape the magnetic field (MF) distribution, confining most of the field to the area between the tank and the solid Al ground. This reduces the amount of MF reaching the other tank, which would otherwise introduce phase error. Meanwhile, the Al ground also terminates most of the electric field (EF) lines, preventing the EF from penetrating into the lossy silicon substrate, which would otherwise increase the PN and phase error. The large spacing between the two Al grounds decreases the coupling between the two tanks. To ensure the accuracy of the simulation, the two transformers, including the interconnections, the vias, and the Al ground, were imported as a single device into ADS Momentum for EM simulation.

D. Simulation on the Phase Noise of the Transformer Coupled QVCOs With, Without $C_2$, and a Typical VCO

We simulated the PN of QVCOs for cases with $C_2$ and without $C_2$ and the PN of a typical VCO, all at 54 GHz with the same LC tank and the same bias current (the VCO consumes the same current as one tank of the QVCO). The simulation results are shown in Fig. 5. The PN of the QVCO with $C_2$ added in the secondary coil is improved by up to 8 dB at 1-MHz offset and 3.4 dB at 10-MHz offset compared with that without $C_2$. The improvement at 10-MHz offset is mainly due to the reduction in thermal noise, whereas the improvement at 1-MHz offset is due to reduction in both flicker noise and thermal noise. We can also see that the corner frequency between the $1/f^3$ and $1/f^2$ regions moves from around 5.5 MHz without $C_2$ to around 1.5 MHz with $C_2$, which is due to the flicker noise reduction. The QVCO without $C_2$ achieves PN similar to that of a typical VCO at 54 GHz, because the transformer-coupled technique [12] introduces much less noise than the transistor coupling used in a typical QVCO, while also providing inductive source degeneration, which helps decrease the PN.

E. Effectiveness of Al Ground on Decreasing the Systematic Phase Error of QVCO

The systematic phase error is simulated with EM data of the passive devices (including interconnections and the coupling between the two transformers) and the extracted transistor layout, and it is smaller than 0.5°. Without the Al ground around the transformers, the phase error is larger than 5° under the same conditions. This demonstrates that the Al ground is effective in decreasing the systematic phase error caused by magnetic coupling through the air/substrate and electric coupling through the substrate between the two transformers.

F. Simulation of Phase Error Dependence on $k$, $Q_1$, and $Q_2$ of the Proposed QVCO

Fig. 6(a) shows the simulated phase error as a function of $k$, and Fig. 6(b) shows the phase error as a function of $Q_1$, where a 1% mismatch is added between the two inductors in the transformer and $k = 0.2$, $Q_1 = 5$, and $Q_2 = 1$ are assumed. The simulation results agree with (16).

Fig. 6. Phase error simulation of the transformer coupled QVCO with different (a) $k$ and (b) $Q_1$ values.

G. Design Methodology of the Proposed QVCO

The proposed QVCO can be designed in the following steps.

1) Design a standard millimeter-wave VCO with the target center frequency and tuning range.
2) Build a transformer (including the connection wires) with the target $k$ value, and replace the inductor in the standard VCO with the transformer. Then design the QVCO following the proposed topology shown in Fig. 1. Adjust the transformer size and capacitor value to obtain the target oscillation frequency and tuning range.
3) Sweep capacitor $C_2$ and find the value of $C_2$ that gives the lowest PN.
4) Add the ground strips around the transformer, and adjust the transformer size and capacitor $C_1$ to obtain the target oscillation frequency. The spacing between the ground strip and the transformer is chosen to meet the phase error requirement. A smaller spacing leads to lower phase error, but too small a spacing decreases the inductance of the transformer, which requires a larger transformer and introduces more loss from the substrate.

III. 76- AND 90-GHz VCOs USING INDUCTIVE DIVIDER FEEDBACK TECHNIQUE

The PN performance of an LC VCO is largely determined by the quality factor Q of the tank and the maximum achievable signal amplitude $V_{Amp}$. When $V_{Amp}$ is small, the transistors are in the saturation region and the effective large-signal transconductance $G_{m,eff}$ of the devices is equal to the $g_m$ of the driving transistors. When $V_{Amp}$ increases to a value such that

$$V_{Amp} = \frac{V_{G1} - V_{D1}}{2} > \frac{V_{th}}{2} \tag{18}$$

where $V_{G1}$ and $V_{D1}$ are the gate and drain voltages of transistor $M_1$ as shown in Fig. 7, $M_1$ starts to work in the triode region, $G_{m,eff}$ decreases, and the channel resistance $r_{on}(t)$ of $M_1$ also decreases. This reduced $r_{on}(t)$ decreases the Q of the tank as follows:

$$Q(t) = \frac{R_P \,\|\, (2r_{on}(t))}{\omega L}. \tag{19}$$

Fig. 7. Traditional LC VCO.

In particular, $r_{on}(t)$ can be even smaller than $R_P$ when the devices are in the triode region. In practice, the oscillation amplitude $V_{Amp}$ of most millimeter-wave VCOs typically needs to be larger than $0.5V_{th}$ to drive power-efficient buffers and to achieve good PN performance. However, too large an oscillation amplitude makes the transistors spend more time in the triode region, which degrades the Q of the LC tank and the PN performance. This is why the VCO is usually biased at the edge between the current-limited regime and the voltage-limited regime. Beyond this edge, in the voltage-limited regime, further increasing the current for higher signal amplitude does not improve, and may even degrade, the PN performance because of the Q reduction. As illustrated in the following, compared with the traditional LC VCO, our proposed method is able not only to increase the signal amplitude but also to keep the Q from being degraded, thus reducing the PN. The detailed schematic of the proposed VCO based on transconductance linearization of the active devices is shown in Fig. 8(a). The VCO is composed of three inductors $L_1$, $L_2$, and $L_3$; a capacitor $C_1$, which consists of a MOS capacitor array and a varactor $C_V$; nMOS transistors $M_1$ and $M_2$; and a resistor for biasing.
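To illustrate how the tank Q in (19) collapses once the switching transistors enter the triode region, the sketch below evaluates $Q(t) = (R_P \,\|\, 2r_{on}(t))/(\omega L)$ for a few channel resistances. The component values ($R_P = 300\ \Omega$, $\omega L = 50\ \Omega$) and the $r_{on}$ samples are hypothetical and chosen only to show the trend.

```python
def parallel(r_a: float, r_b: float) -> float:
    """Parallel combination of two resistances."""
    return r_a * r_b / (r_a + r_b)


def tank_q(r_p: float, r_on: float, w_l: float) -> float:
    """Eq. (19): Q(t) = (R_P || 2*r_on(t)) / (w*L)."""
    return parallel(r_p, 2.0 * r_on) / w_l


R_P = 300.0   # hypothetical equivalent parallel tank resistance, ohms
W_L = 50.0    # hypothetical inductive reactance w*L, ohms

# r_on is large in saturation and drops in triode; sample values are illustrative.
for r_on in (1e6, 300.0, 100.0):
    print(f"r_on = {r_on:9.1f} ohm -> Q = {tank_q(R_P, r_on, W_L):.2f}")
```

With these assumed values the unloaded Q of 6 falls to 4 when $r_{on}$ equals $R_P$, and to 2.4 when $r_{on}$ drops to one third of $R_P$.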

A. Principle of the Transconductance Linearization Method

In our proposed topology, the gates of $M_1$ and $M_2$ sense only part of the LC tank's oscillation amplitude through the inductive divider made of $L_2/L_3$ and $L_1$. In MOSFETs, the large-signal nonlinearity of the device transconductance $G_{m,eff}$ is determined by the voltage differences among the gate, drain, and source of the transistor. Inductive division decreases the voltage amplitude at the gate, thus achieving a more linear transconductance than in traditional LC VCOs. Fig. 8(b) shows the half circuit of the proposed VCO with transconductance linearization and its equivalent circuit. The ac voltage at the gate effectively becomes $V_G = -k_f V_D$, where $V_D$ is the ac voltage at the drain, and

$$k_f = \frac{0.5L_1}{0.5L_1 + L_2}. \tag{20}$$
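A small sketch can make the effect of the divider ratio concrete. With $V_G = -k_f V_D$, the gate-drain difference is $(1 + k_f)V_D$, so the drain amplitude at which it reaches $V_{th}$ is $V_{th}/(1 + k_f)$, compared with $V_{th}/2$ for $k_f = 1$. The inductance and threshold values below are hypothetical, and the function names are illustrative.

```python
def divider_ratio(l1: float, l2: float) -> float:
    """Eq. (20): k_f = 0.5*L1 / (0.5*L1 + L2)."""
    return 0.5 * l1 / (0.5 * l1 + l2)


def triode_onset_amplitude(v_th: float, k_f: float) -> float:
    """Drain amplitude at which |V_G - V_D| = (1 + k_f) * V_D reaches V_th."""
    return v_th / (1.0 + k_f)


def amplitude_extension(k_f: float) -> float:
    """Ratio of the onset amplitude to that of the k_f = 1 case: 2 / (1 + k_f)."""
    return 2.0 / (1.0 + k_f)


# Hypothetical inductances chosen to give k_f = 0.6; V_th = 0.4 V is also assumed.
k_f = divider_ratio(l1=60e-12, l2=20e-12)
print(f"k_f = {k_f:.2f}")
print(f"triode onset amplitude = {triode_onset_amplitude(0.4, k_f):.3f} V")
print(f"amplitude extension    = {amplitude_extension(k_f):.2f}x")
```

Setting $k_f = 1$ recovers the traditional cross-coupled case with no extension, while smaller $k_f$ raises the amplitude reachable before the devices enter the triode region.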


Fig. 9. 3-D view of the three inductors implementation in the proposed VCOs.

Fig. 8. (a) Schematic of the proposed millimeter-wave VCO with inductive dividing. (b) Half circuit of the proposed VCO. (c) Half circuit showing the reduction of the effective gate capacitance.

When the drain voltage $V_D(t)$ increases beyond the value $V_{Amp,triode} = V_{th}/(1 + k_f)$, we have

$$|V_G(t) - V_D(t)| = |-k_f V_D(t) - V_D(t)| = (1 + k_f)V_D(t) > V_{th} \tag{21}$$

and the transistors enter the triode region. Therefore, in the proposed VCO, the LC tank's oscillation amplitude can be $m$ times larger [$m = 2/(1 + k_f)$] than in the traditional VCO before the transistors enter the triode region. In other words, this method effectively extends the voltage-limited regime, giving rise to a larger oscillation amplitude and a lower PN.

Additional benefits of the proposed inductive divider feedback are the reduction of the gate capacitance and of its nonlinear variation. The gate capacitance is a significant part of the total capacitance of the LC tank in W-band VCOs, which limits the achievable oscillation frequency. As shown in Fig. 8(c), the effective gate capacitance is reduced by the inductive divider to $C_{G,eff} = k_f C_G$, where $C_G$ is the gate capacitance to ac ground. Furthermore, since the gate capacitance changes nonlinearly with the gate voltage, dividing the gate voltage helps decrease the variation of the oscillation frequency caused by the gate capacitance variation, i.e., the gate capacitance is effectively linearized. In summary, the inductive divider feedback technique decreases the effect of the gate capacitance, allowing a higher oscillation frequency than in traditional VCOs.

B. Implementation of the VCO

The 3-D view of the three inductors is shown in Fig. 9. In order to minimize the parasitics of the metal-connection

Fig. 10. PN simulation results at 10-MHz offset for 90-GHz proposed VCOs with different inductor ratios k f .

wires, the gate of transistor $M_1$/$M_2$ is located beneath the interconnection between $L_1$ and $L_2/L_3$ to avoid extra metal wires. The series-connected $L_1$, $L_2$, and $L_3$ constitute the total inductance of the LC tank, and the silicon area required is close to that of one inductor in a conventional LC VCO. The quality factor of the total inductor is slightly lower than that of an inductor with the same inductance implemented in an octagonal shape. However, since the quality factor of the LC tank is mainly limited by that of the switched nMOS capacitor array, the degradation of the overall Q is minimal with a proper $k_f$.

C. Optimal Choice of the Inductor Ratio

The inductor ratio is defined in (20), and from Section III-A, we know that a smaller $k_f$ extends the voltage-limited regime further. However, too small a $k_f$ would considerably decrease the Q of the overall inductor. Therefore, there is an optimal $k_f$ that gives the lowest PN. Fig. 10 shows the PN simulation results versus $k_f$ for the proposed 90-GHz VCO, all under the same bias condition and tuning range. The typical VCO corresponds to $k_f = 1$. We can see that the proposed VCO with $k_f$ around 0.6 has better PN than the typical VCO.

D. Design Methodology of the Proposed VCO

The proposed VCO can be designed in the following steps.

1) Design a standard millimeter-wave VCO with a tuning range that is 1.2∼1.5 times the targeted one.
2) Add schematic inductors $L_2/L_3$ to the VCO, decrease the tuning caps to keep the target oscillation frequency, and build the VCO as shown in Fig. 8(a). Sweep the


Fig. 11. Die photo of QVCO and two VCOs.

Fig. 12. Measurement setup.

Fig. 13. Measured and simulated PN results of the QVCO at 56.2 GHz.

Fig. 14. Frequency tuning and band overlap of the 54-GHz QVCO.

Fig. 15. PN at 10-MHz offset measurement result of the QVCO.

inductance of $L_2/L_3$ and choose the inductance value for the lowest PN.
3) Implement the layout of $L_1$ and $L_2/L_3$ as shown in Fig. 9, and tune the component values to get the target oscillation frequency and PN.

IV. MEASUREMENT RESULTS

The QVCO and VCOs were fabricated in a 65-nm 7-metal digital CMOS process. The die photo is shown in Fig. 11. The area of the QVCO together with its open-drain buffers is 210 μm × 150 μm. Each VCO plus its source follower buffers occupies an area of 110 μm × 120 μm. A 110-GHz waveguide GSG probe, two dc probes, a V-band fundamental mixer (for QVCO measurement), a W-band harmonic mixer (for VCO measurement), an Agilent N9030A PXA Signal Analyzer, and an Agilent E8257D Signal Generator were used to perform the measurements. The measurement setup is shown in Fig. 12. In the QVCO measurement, the output of the 54-GHz QVCO is downconverted by a V-band fundamental mixer, and its PN is measured by the Agilent N9030A PXA from the downconverted signal. The Agilent E8257D provides the LO signal for the mixer. A bias-T is used to bias the open-drain buffer. Fig. 13 shows the PN measurements of the QVCO at 56.2 GHz, which is around −95 dBc/Hz at 1-MHz offset and −119.2 dBc/Hz at 10-MHz offset. As shown in Fig. 14, the measured tuning range of the QVCO over all 4-b digital and one analog (varactor) tuning is from 51.7 to 56.6 GHz (9.1%), and each tuning band overlaps its adjacent ones by ∼30%. The PN at 10-MHz offset across the whole frequency tuning band is within −117.8 to −119.2 dBc/Hz, as shown in Fig. 15. Two V-band calibrated dial-type phase shifters and a V-band balanced phase detector are used to perform the phase error measurement, as shown in Fig. 16. The phase detector

is used to check whether the two input signals have a 90° phase difference. If the phase difference is 90°, its dc output is 0 V; otherwise, its dc output is a positive or negative voltage. The phase shifters are used to adjust the phase delay of the two cable paths from the I and Q outputs of the QVCO chip to the phase detector to set the dc output of the phase


Fig. 16. (a) Phase error measurement setup. (b) Diagram of phase error measurement setup.

Fig. 17. Measured phase error across the frequency tuning bands.

Fig. 18. Measured and simulated PN results of VCO at 76 GHz.

Fig. 19. Measured and simulated PN results of VCO at 90 GHz.

Fig. 20. Frequency tuning and band overlap of the 76-GHz VCO.

detector to be 0 V. We then read the phase difference $\theta_{step\_1}$ between the two phase shifters. $\theta_{step\_1}$ is the sum of the phase error between the I and Q outputs of the QVCO chip, $\theta_{error}$, and the phase delay difference of the two cable paths, $\Delta\theta$:

$$\theta_{step\_1} = \theta_{error} + \Delta\theta. \tag{22}$$

A second step is then performed to cancel $\Delta\theta$: we keep the same setup but swap the I and Q outputs connected to the cables. By adjusting one phase shifter's phase delay until the output of the phase detector is 0 V, we read the phase difference $\theta_{step\_2}$ as

$$\theta_{step\_2} = -\theta_{error} + \Delta\theta. \tag{23}$$

Using (22) and (23), we obtain the phase error of the QVCO as

$$\theta_{error} = 0.5(\theta_{step\_1} - \theta_{step\_2}). \tag{24}$$
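The two-step swap procedure in (22)–(24) can be expressed as a short helper. The sketch below also recovers the cable path difference as the half-sum of the two readings, which follows from adding (22) and (23); the function name and the example readings are assumptions made for illustration.

```python
from typing import Tuple


def phase_error_from_swap(theta_step_1: float, theta_step_2: float) -> Tuple[float, float]:
    """Separate the QVCO phase error from the cable path difference.

    theta_step_1 = theta_error + delta_theta   (22)
    theta_step_2 = -theta_error + delta_theta  (23)
    Returns (theta_error, delta_theta), all in degrees.
    """
    theta_error = 0.5 * (theta_step_1 - theta_step_2)   # (24)
    delta_theta = 0.5 * (theta_step_1 + theta_step_2)   # half-sum of (22) and (23)
    return theta_error, delta_theta


# Hypothetical readings: a 1.5-degree quadrature error and a 12-degree cable mismatch.
error_deg, cable_deg = phase_error_from_swap(theta_step_1=13.5, theta_step_2=10.5)
print(f"theta_error = {error_deg:.2f} deg, delta_theta = {cable_deg:.2f} deg")
```

Because the cable mismatch appears with the same sign in both readings, it cancels in the difference regardless of its size, so the cables do not need to be phase matched.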

The same measurements are repeated at different frequencies, and the phase error over the frequency tuning range is within 2°, as shown in Fig. 17. In our simulation results, assuming a mismatch of 1% between the inductors in the QVCO, the simulated phase error is ∼1.5°. The power consumption of the QVCO is 24 mW with a 0.8-V supply. In the VCO measurement, the outputs of the 76- and 90-GHz VCOs are downconverted by a sixth-harmonic W-band mixer driven by a 7-dBm signal source at 13.5 GHz (for the 76-GHz VCO) or 15.5 GHz (for the 90-GHz VCO). Figs. 18 and 19 display the PN measurements of the 76- and 90-GHz VCOs, which are −109.4 and −108.3 dBc/Hz, respectively, at 10-MHz offset. Figs. 20 and 21 show that the measured tuning ranges are 73.1–78.7 GHz and 87.1–91.7 GHz, respectively. The 76-GHz VCO employs 4-b digital tuning and one varactor pair for analog tuning, and each analog tuning band overlaps its adjacent band by around 50%. The 90-GHz VCO employs 3-b digital tuning and one varactor pair for analog tuning, and each analog tuning band overlaps its adjacent band by around 30%. The power consumptions of the 76- and 90-GHz VCOs are 12 and 11 mW, respectively, with a 1-V supply.
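The measured band edges can be converted to fractional tuning ranges with a short helper. The text does not state which reference frequency is used for normalization, so the sketch below assumes the band center by default and allows a nominal frequency to be passed instead; the function and variable names are illustrative.

```python
from typing import Optional


def fractional_tuning_range(f_lo_ghz: float, f_hi_ghz: float,
                            f_ref_ghz: Optional[float] = None) -> float:
    """Tuning range in percent, normalized to f_ref (band center if not given)."""
    if f_ref_ghz is None:
        f_ref_ghz = 0.5 * (f_lo_ghz + f_hi_ghz)
    return 100.0 * (f_hi_ghz - f_lo_ghz) / f_ref_ghz


# Band edges as reported in the measurement results.
bands = {
    "54-GHz QVCO": (51.7, 56.6),
    "76-GHz VCO": (73.1, 78.7),
    "90-GHz VCO": (87.1, 91.7),
}

for name, (f_lo, f_hi) in bands.items():
    print(f"{name}: {fractional_tuning_range(f_lo, f_hi):.2f}% of band center")
```

For the QVCO, normalizing to the band center gives about 9.05%, and normalizing to a nominal 54 GHz gives about 9.07%, both in line with the quoted 9.1% to within rounding.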


TABLE I. 54-GHz QVCO PERFORMANCE SUMMARY AND COMPARISON WITH STATE OF THE ART

TABLE II. 76- AND 90-GHz VCO PERFORMANCE SUMMARY AND COMPARISON WITH PRIOR ART

The models of the transistors, MIM caps, and MOS caps in our design are extended by curve fitting with measured data below 10 GHz, so their accuracy is limited at our operating frequencies and the skin effect is not captured. The approximately 1.5∼3-dB difference in PN between the simulation and the measurement of the QVCO and VCOs is mainly due to the underestimated ac gate resistance of the transistors and the underestimated series resistance of the MIM and MOS caps in the model. The measured tuning ranges of the three oscillators are about 3%∼5% smaller than the simulated results, which is due to the underestimated parasitic capacitance in the transistor model. The measured output power of the three oscillators is


Fig. 21. Frequency tuning and band overlap of the 90-GHz VCO.

around 1∼2 dB lower than the simulated output power, which can also be attributed to the inaccuracy of the device modeling at the operating frequencies. Table I summarizes our QVCO performance in comparison with the state of the art. The proposed QVCO achieves a low PN at 10-MHz offset with better or comparable FOM/FOMT compared with existing works. Table II summarizes the performance of the 76- and 90-GHz VCOs in comparison with the state of the art.

V. CONCLUSION

We present the design and measurement results of a new low-noise 54-GHz transformer-coupled QVCO and of 76-/90-GHz VCOs with inductive divider feedback, all fabricated in a 65-nm digital CMOS process. The measured results demonstrate the effectiveness of the proposed topologies and techniques in realizing low-PN millimeter-wave QVCOs and VCOs.

ACKNOWLEDGMENT

The authors would like to thank Prof. K. O. Ken, Dr. J. Zhang, and Dr. W. Choi, all with the Texas Analog Center of Excellence, University of Texas at Dallas, and S. Huang at Southern Methodist University for providing help in testing the chip, and S. Sankaran and B. Kramer from Texas Instruments (TI) Incorporated for their help and support during the chip tape-out. They would also like to thank TI for chip fabrication.

REFERENCES

[1] A. Natarajan, A. Komijani, X. Guan, A. Babakhani, and A. Hajimiri, “A 77 GHz phased-array transceiver with on-chip antennas in silicon: Transmitter and local LO-path phase shifting,” IEEE J. Solid-State Circuits, vol. 41, no. 12, pp. 2807–2819, Dec. 2006. [2] K. Scheir, G. Vandersteen, Y. Rolain, and P. Wambacq, “A 57-to-66 GHz quadrature PLL in 45 nm digital CMOS,” in Int. Solid-State Circuits Conf. Tech. Dig., Feb. 2009, pp. 494–495 and 495a. [3] K. Scheir, S. Bronckers, J. Borremans, P. Wambacq, and Y. Rolain, “A 52 GHz phased-array receiver front-end in 90 nm digital CMOS,” IEEE J. Solid-State Circuits, vol. 43, no. 12, pp. 2651–2659, Dec. 2008. [4] A. Musa, R. Murakami, T. Sato, W. Chaivipas, K. Okada, and A.
Matsuzawa, “A low phase noise quadrature injection locked frequency synthesizer for mm-wave applications,” IEEE J. Solid-State Circuits, vol. 46, no. 11, pp. 2635–2648, Nov. 2011.


[5] U. Decanis, A. Ghilioni, E. Monaco, A. Mazzanti, and F. Svelto, “A mm-Wave quadrature VCO based on magnetically coupled resonators,” in Int. Solid-State Circuits Conf. Tech. Dig., Feb. 2011, pp. 280–281. [6] X. Yi, C. C. Boon, H. Liu, J. F. Lin, J. C. Ong, and W. M. Lim, “A 57.9-to-68.3 GHz 24.6 mW frequency synthesizer with in-phase injection-coupled QVCO in 65 nm CMOS,” in Int. Solid-State Circuits Conf. Tech. Dig., Feb. 2013, pp. 354–355. [7] S. Kang and A. M. Niknejad, “A 100 GHz active-varactor VCO and a bi-directionally injection-locked loop in 65 nm CMOS,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2013, pp. 231–234. [8] M. Adnan and E. Afshari, “A 105 GHz VCO with 9.5% tuning range and 2.8 mW peak output power using coupled Colpitts oscillators in 65 nm bulk CMOS,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2013, pp. 239–242. [9] B. Sadhu et al., “A linearized, low-phase-noise VCO-based 25-GHz PLL with autonomic biasing,” IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1138–1150, May 2013. [10] J.-O. Plouchart, M. Ferriss, B. Sadhu, M. Sanduleanu, B. Parker, and S. Reynolds, “A 73.9–83.5 GHz synthesizer with –111 dBc/Hz phase noise at 10 MHz offset in a 130 nm SiGe BiCMOS technology,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2013, pp. 123–126. [11] T. Xi et al., “Low-phase-noise 54 GHz quadrature VCO and 76 GHz/90 GHz VCOs in 65 nm CMOS process,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2014, pp. 257–260. [12] A. W. L. Ng and H. C. Luong, “A 1-V 17-GHz 5-mW CMOS quadrature VCO based on transformer coupling,” IEEE J. Solid-State Circuits, vol. 42, no. 9, pp. 1933–1941, Sep. 2007. [13] A. M. ElSayed and M. I. Elmary, “Low-phase-noise LC quadrature VCO using coupled tank resonators in a ring structure,” IEEE J. Solid-State Circuits, vol. 36, no. 4, pp. 701–705, Apr. 2001. [14] B. Razavi, “Design of millimeter-wave CMOS radios: A tutorial,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 1, pp. 
4–16, Jan. 2009. [15] L. Romano, S. Levantino, A. Bonfanti, C. Samori, and A. L. Lacaita, “Phase noise and accuracy in quadrature oscillators,” in Proc. Int. Symp. Circuits Syst. (ISCAS), 2004, pp. I-161–I-164. [16] K.-H. Tsai and S.-L. Liu, “A 43.7 mW 96 GHz PLL in 65 nm CMOS,” in Int. Solid-State Circuits Conf. Tech. Dig., Feb. 2009, pp. 276–277 and 277a. [17] D. D. Kim et al., “An array of 4 complementary LC-VCOs with 51.4% W-band coverage in 32 nm SOI CMOS,” in Proc. IEEE Solid-State Circuits Conf., Feb. 2009, pp. 278–279 and 279a. [18] H. Wu et al., “A current-mode mm-wave direct-conversion receiver with 7.5 GHz bandwidth, 3.8 dB minimum noise-figure and +1 dBm P1 dB,out linearity for high data rate communications,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2013, pp. 89–92. [19] P. Sakian, E. V. D. Heijden, H. M. Cheema, R. Mahmoudi, and A. van Roermund, “A 57–63 GHz quadrature VCO in CMOS 65 nm,” in Proc. IEEE EuMIC, Sep. 2009, pp. 120–123. [20] L. Wu and H. C. Luong, “A 49-to-62 GHz CMOS quadrature VCO with bimodal enhanced magnetic tuning,” in Proc. ESSCIRC, 2012, pp. 297–300. [21] E. Laskin et al., “Nanoscale CMOS transceiver design in the 90–170-GHz range,” IEEE Trans. Microw. Theory Techn., vol. 57, no. 12, pp. 3477–3490, Dec. 2009. [22] B. Parvais et al., “A 40 nm LP CMOS PLL for high-speed mm-wave communication,” in Proc. ESSCIRC, Sep. 2010, pp. 254–257. [23] V. P. Trivedi, K.-H. To, and W. M. Huang, “A 77 GHz CMOS VCO with 11.3 GHz tuning range, 6 dBm output power, and competitive phase noise in 65 nm bulk CMOS,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2011, pp. 1–4. [24] E. Laskin, M. Khanpour, R. Aroca, K. W. Tang, P. Garcia, and S. P. Voinigescu, “A 95 GHz receiver with fundamental-frequency VCO and static frequency divider in 65 nm digital CMOS,” in IEEE Int. SolidState Circuits Conf. Dig. (ISSCC), Feb. 2008, pp. 180–605. [25] Z.-M. Tsai, C.-S. Lin, C. F. Huang, J. G. J. Chern, and H. 
Wang, “A fundamental 90-GHz CMOS VCO using new ring-coupled quad,” IEEE Microw. Wireless Compon. Lett., vol. 17, no. 3, pp. 226–228, Mar. 2007. [26] Y. J. Shiao, G.-W. Huang, C.-W. Chuang, H.-H. Hsieh, C.-P. Jou, and F.-L. Hsueh, “A 100-GHz varactorless CMOS VCO using source degeneration,” in IEEE MTT-S Int. Microw. Symp. Dig., Jun. 2012, pp. 1–3.


[27] I. Nasr, M. Dudek, R. Weigel, and D. Kissinger, “A 33% tuning range high output power V-band superharmonic coupled quadrature VCO in SiGe technology,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2012, pp. 301–304. [28] J.-C. Chien and L.-H. Lu, “A 32-GHz rotary traveling-wave voltage controlled oscillator in 0.18-μm CMOS,” IEEE Microw. Wireless Compon. Lett., vol. 17, no. 10, pp. 724–726, Oct. 2007. [29] A. Moroni, R. Genesi, and D. Manstretta, “Analysis and design of a 54 GHz distributed ‘hybrid’ wave oscillator array with quadrature outputs,” IEEE J. Solid-State Circuits, vol. 49, no. 5, pp. 1158–1172, May 2014. [30] S. Rong and H. C. Luong, “Design and analysis of varactorless interpolative-phase-tuning millimeter-wave LC oscillators with multiphase outputs,” IEEE J. Solid-State Circuits, vol. 46, no. 8, pp. 1810–1819, Aug. 2011. [31] H.-Y. Chang, Y.-H. Cho, M.-F. Lei, C.-S. Lin, T.-W. Huang, and H. Wang, “A 45-GHz quadrature voltage controlled oscillator with a reflection-type IQ modulator in 0.13-μm CMOS technology,” in IEEE MTT-S Int. Microw. Symp. Dig., Jun. 2006, pp. 739–742. [32] R. Banin, O. Degani, and E. Socher, “V-band low phase noise QVCO in 90 nm CMOS technology using a gate-connected tank,” Electron. Lett., vol. 48, no. 17, pp. 1046–1048, Aug. 2012. [33] J. Yin and H. C. Luong, “A 57.5-to-90.1GHz magnetically-tuned multi-mode CMOS VCO,” in Proc. IEEE Custom Integr. Circuits Conf. (CICC), Sep. 2012, pp. 1–4. [34] A. Chakraborty, S. Trotta, J. Wuertele, and R. Weigel, “A D-band transceiver front-end for broadband applications in a 0.35 μm SiGe bipolar technology,” in Proc. IEEE Radio Freq. Integr. Circuits Symp., Jun. 2014, pp. 405–408. [35] I. Sarkas, J. Hasch, A. Balteanu, and S. P. Voinigescu, “A fundamental frequency 120-GHz SiGe BiCMOS distance sensor with integrated antenna,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 3, pp. 795–812, Mar. 2012. [36] H. Li, H.-M. Rein, T. Suttorp, and J. 
Bock, “Fully integrated SiGe VCOs with powerful output buffer for 77-GHz automotive radar systems and applications around 100 GHz,” IEEE J. Solid-State Circuits, vol. 39, no. 10, pp. 1650–1658, Oct. 2004.

Tianzuo Xi (GSM’14–M’14) received the B.S. and M.S. degrees in electrical engineering from the University of Science and Technology of China, Hefei, China, in 2007 and 2010, respectively, and the Ph.D. degree from Southern Methodist University, Dallas, TX, USA, in 2015. He was with Actions Semiconductor, Shanghai, China, where he was involved in analog IC design in 2010. He was an Intern with Qualcomm, San Diego, CA, USA, and Samsung, Seoul, South Korea, in 2013 and 2014. He has been involved with power amplifiers with Qualcomm New England, Boxborough, MA, USA, since 2015. He has authored or coauthored 18 peer-reviewed publications and holds one U.S. patent. His current research interests include RF/millimeter-wave and analog IC design.

Shita Guo (GSM’14–M’15) received the B.S. and M.S. degrees from the University of Science and Technology of China, Hefei, China, in 2006 and 2009, respectively, and the Ph.D. degree from Southern Methodist University, Dallas, TX, USA, in 2015, all in electrical engineering. He was with Sychip, Inc., Shanghai, China, from 2009 to 2011, where he was involved in RF integrated passive devices and wireless subsystems for mobile products. He has been with the High-Speed Interface Group, Texas Instruments Incorporated, Dallas, TX, USA, since 2014. He has authored or coauthored 18 peer-reviewed publications. His current research interests include RF/millimeter-wave ICs for wireless communication systems, and high-performance analog and mixed-signal ICs for high-speed serial data links.


Ping Gui (S’03–M’04–SM’09) received the Ph.D. degree in electrical and computer engineering from the University of Delaware, Newark, DE, USA. She joined the Lyle School of Engineering, Southern Methodist University (SMU), Dallas, TX, USA, in 2004, as a faculty member, where she is currently an Associate Professor of Electrical Engineering. Her current research interests include digital, analog, mixed-signal, and RF IC design for a variety of applications, including high-speed wireline transceivers, wide-band wireless communications using millimeter wave, high-speed ADCs, and circuits and systems for harsh and extreme environments. Dr. Gui was a recipient of the CERN Scientific Associate Award from 2008 to 2010, the IEEE Dallas Section Outstanding Service Award in 2011, and the SMU Ford Research Fellowship Award in 2015. She has been serving as the Technical Chair of the IEEE Solid-State Circuits Society Dallas Chapter since 2007. She has also been serving on the Technical Program Committee of the IEEE Radio-Frequency Integrated Circuits Symposium since 2015.

Daquan Huang (M’03–SM’06) joined the Department of Information and Electronics Engineering, Zhejiang University, Hangzhou, China, as a faculty member in 1985. He served as the Head of the Electronic Information Technology Laboratory with Zhejiang University from 1995 to 2002, where he has been the Vice Chair of the Department of Information and Electronics Engineering since 2000. He was with the University of California at Los Angeles, Los Angeles, CA, USA, as a Research Faculty Member from 2002 to 2007, where he pioneered CMOS millimeter-wave technologies focusing on the 60-GHz short range ultrahigh-speed communications applications and helped the High-Speed Electronics Laboratory. He invented various CMOS millimeter-wave technologies, including the on-chip transformer folded cascode circuit topology (Origami) and the on-chip digital controllable artificial dielectrics (DiCAD), which were recognized as groundbreaking technologies by DARPA. The Origami solves the fundamental issues in low-power, low-noise, and high-linearity actives operation, and the DiCAD addresses the fundamental passives needs for wideband frequency, phase, and impedance tuning. He joined the Group Technical Staff with Texas Instruments (TI) Incorporated, Dallas, TX, USA, in 2007. While focusing on the advanced CMOS RFICs for the next generation of wireless communications systems, he was involved extensively in digital RF processors (DRP™), which had been responsible for multibillion dollars of sales. He designed and led the effort in developing 10 20-GB/s FET switches, 14-GB/s LVDS switches, and multiple other wireline communications products for Apple, Cisco, and Huawei. He joined the newly formed research center of TI, Kilby Labs, Dallas, TX, USA, in 2009, where he led the effort in developing ultrahigh-speed short distance communications products, including the 60-GHz TRx and the state-of-the-art 40-GHz PLL with an rms jitter of 100 fs. He joined the 77-GHz radar product development in 45-nm CMOS from 2011 to 2013. He has been a Director of Samsung Research America, Richardson, TX, USA, since 2013. As a Principal Engineer, he leads the activities of CMOS millimeter-wave IC development for 5G communications, including 28-GHz CMOS front-end for cellular applications and 125-GHz transceiver SoC with RF beamformer for next-generation Wi-Fi applications. He has given over 30 technical presentations, lectures, and invited talks at technical conferences, has published over 100 journal and conference papers with over 700 citations, and holds 18 U.S. patents and several pending patents. His current research interests include RFIC and MMIC designs, microwave and antenna theory, optoelectronics, and neural networks. Dr. Huang was the recipient of the 2011 Kilby Labs Innovator Award for millimeter-wave transceiver and was elected to the top 30 TI innovators in 2013. He has served as a Reviewer for various IEEE journals.


Yanli Fan (M’08) received the M.S. degree in electrical engineering from the University of Maryland at College Park, College Park, MD, USA. She was with IBM, Armonk, NY, USA, and the Hittite Microwave Corporation, Chelmsford, MA, USA. She joined Texas Instruments Incorporated, Dallas, TX, USA, in 2002, where she is currently a Design Engineering Manager and Distinguished Member of the Technical Staff. She has authored many publications in major technical journals. She holds more than 16 patents. Her current research interests include high-speed clock data recovery and transceiver circuit design.


Mark Morgan (M’98) received the B.S.E.E. degree from the University of Wisconsin–Madison, Madison, WI, USA, and the M.S.E.E. degree from Marquette University, Milwaukee, WI, USA. He was a CTO of the INT Business Unit, managing a design team focused on advanced circuit technologies. He joined Kilby Labs, Texas Instruments (TI) Incorporated, Dallas, TX, USA, in 1997, as an Analog Design Engineer in HPA, where he focused on high speed interface designs. He has served as the Analog Design Branch Manager for the high speed interface product line. He is currently a Manager with Kilby Labs, TI, where he is directing the high voltage isolation technology program. He is a Distinguished Member of the Technical Staff. He holds 25 issued patents.


Ultra-Wideband Quasi-Circulator Implemented by Cascading Distributed Balun With Phase Cancelation Technique

Sida-Tang, Chih-Min Lin, Shih-Han Hung, Kai-Wen Cheng, and Yeong-Her Wang, Member, IEEE

Abstract— This paper presents the implementation of a broadband quasi-circulator through a cascading distributed balun using a 90-nm CMOS process. The isolation, |S31|, of the proposed three-port quasi-circulator is obtained through the phase cancelation technique, by connecting an additional distributed amplifier in parallel between ports 1 and 3. A thorough analysis based on an eight-port chain matrix and scattering matrix is presented for refining the circuit parameters in the initial design. Measured results show that the proposed quasi-circulator attains a broad operation bandwidth ranging from 10 to 67 GHz. Moreover, the quasi-circulator achieves an insertion gain of 0.5 to 4.8 dB and an isolation |S31| better than 23 dB. Consequently, the proposed quasi-circulator delivers wide bandwidth performance, good port-to-port isolations, good insertion gain, and high linearity based on the cascading distributed balun with phase cancelation technique.

Index Terms— Broadband, CMOS, distributed amplifier, phase cancelation, quasi-circulator.

I. INTRODUCTION

Recently, microwave and millimeter-wave communication systems that operate in the Ka- to F-band have been receiving considerable attention for their wide operational bandwidth, high-speed data transfer, low manufacturing cost, and good performance. For circuit implementation, CMOS integrated-circuit technology is among the candidates because of its low supply voltage, low cost, and high integration [1]. Active circulators are three-port nonreciprocal functional components suitable for integrated system-on-chip applications using monolithic microwave integrated circuit (MMIC) integration [2]–[7]. These circulators serve an important function in MMIC integration because they enable signals to flow in only one direction, such that good impedance matching is observed at the three ports. Thus, the transmitter and receiver can simultaneously transmit and receive RF signals with different

Manuscript received November 4, 2015; revised April 27, 2016; accepted May 14, 2016. This work was supported in part by the National Chip Implementation Center and National Nano Device Laboratories, National Applied Research Laboratories within the Ministry of Science and Technology, Taiwan, under Contract NSC NSC101-2221-E-006-141-MY3, Contract 102CE01, Contract 103CE03, and Contract CSIST-A7I-V102(104), and in part by the Foundation of Chen, Jieh-Chen Scholarship of Tainan, Taiwan. The authors are with the Advanced Optoelectronic Technology Center, Department of Electrical Engineering, Institute of Microelectronics, National Cheng Kung University, Tainan 701, Taiwan (e-mail: [email protected]; [email protected]; q1897127@mail.ncku.edu.tw; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2573279

frequencies without using a diplexer filter or RF switch. Moreover, active circulators have broad bandwidth, low insertion loss, and small chip dimension. However, active circulators suffer from signal transfer from port 3 to port 1. This problem of receiver signal leakage into the transmitter port can be solved by using quasi-circulators that support only two power flows in the same direction (port 1 to port 2 and port 2 to port 3) [6]–[16]. Numerous active quasi-circulator module configurations have been introduced in microwave and millimeter-wave system applications; one of these configurations separates the transmitted and received signals [11], [12]. A quasi-circulator can use a power coupler to improve isolation [6], but this coupler limits its bandwidth. Common-source/drain configurations in active quasi-circulators have been used to enhance the bandwidth at the cost of high insertion loss and noise figure (NF) [7]. A quasi-circulator with high insertion gain has been reported [8], but it has a narrow operation bandwidth. A quasi-circulator using phase compensation has achieved good port-to-port isolation and wideband operation [9]. However, its impedance matching network is complicated and occupies a large chip area. An active balun and current combiner have been used to achieve compact size and simple impedance matching [10]. However, these devices degrade the bandwidth. A quasi-circulator using CMOS technology reportedly achieves high gain and good isolation [12]. However, its operation bandwidth is still narrow and its power consumption is high. More recently, a 30-GHz quasi-circulator used the current-reuse technique to reduce power consumption [16]; however, the issue of narrow operation bandwidth remains unresolved.

Active distributed amplifiers [17]–[20] possess several benefits, including ultra-broad bandwidth, good gain flatness, and good insertion gain, which make them suitable for applications requiring wide bandwidth and good port-to-port isolation. A self-equalization technique has been proposed for the active quasi-circulator [21]. Its basic structure consists of a distributed balun cascaded with a distributed combiner in a GaAs process. The whole circuit achieves wideband operation without using additional phase shifters. However, the authors of [21] question whether the cascade structure can operate at high frequency as the number of stages increases, because of the gain reduction. Implementing such a structure at higher frequency and with ultra-wide bandwidth in 90-nm CMOS technology is therefore a major challenge. This paper presents a novel ultra-broadband quasi-circulator using the cascading distributed architecture implemented by

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 1. Circuit configuration of the proposed distributed quasi-circulator.

90-nm CMOS technology for RF front-end applications. The cascading distributed balun structure with low amplitude/phase error is used to facilitate phase cancelation between ports 1 and 3 for the proposed quasi-circulator design. Moreover, the quasi-circulator offers significant advantages of ultra-broad bandwidth, good insertion gain, and good port-to-port isolations with a miniature chip dimension of 0.51 mm². The remainder of this paper is organized as follows. Section II presents the design concept and detailed design analysis for circuit parameter optimization. Section III shows the implementation of the proposed quasi-circulator and the experimental results. Finally, Section IV provides a brief conclusion.

II. CIRCUIT DESIGN AND ANALYSIS

A. Circuit Description

The proposed configuration of the ultra-broadband quasi-circulator comprises an active balun constructed from a distributed amplifier with cascading structure and a phase inversion path for isolation enhancement, as shown in Fig. 1. The distributed balun can be regarded as a cascading distributed amplifier without forward drain-line termination at port 2. The incident signal from port 1 propagates down the gate line of the distributed balun with a sectional phase constant, and the voltage, Vgs1, across each gate capacitor of the field-effect transistor (FET) transconducts a current into the drain line. Careful selection of the FET dimensions and circuit parameters makes the phase velocities of the gate and drain lines approximately equal, so that the forward-traveling currents add in phase as they arrive at port 2. The inherent phase inversion property of the common-source FET provides the antiphase output at port 2. Furthermore, a drain current produced from the cross voltage, Vgs2, feeds into the drain line of the cascading stage and consequently into the forward combination at port 3. The double phase inversion results in the noninverting phase output at port 3. Thus, the characteristic of a balun can be accomplished with equal output power and antiphase between ports 2 and 3. In principle, the quasi-circulator is specified as |S21| = |S32| = 1 with the other scattering parameters

Fig. 2. (a) Simplified FET model. (b) Equivalent network of the distributed quasi-circulator.

being zero. This condition implies that the transmitting path of the distributed balun from port 1 to port 3, |S31|, is unwanted. To address this issue, a phase cancelation technique has been adopted in the quasi-circulator design. An additional distributed amplifier was connected in parallel between ports 1 and 3 and is denoted as the phase cancelation path in Fig. 1. This transmitting path has the phase inversion feature and is akin to the S21 path of the distributed balun. By fitting the circuit parameters, the phase difference between port 3 of the distributed balun and the output port of the phase cancelation path can be set to 180°. Finally, both output ports can be combined to arrive at a superior |S31| isolation of the distributed quasi-circulator. As regards the transmission path of |S32|, a signal excitation from port 2 is imposed on the cascading stage of the distributed balun to achieve an output combination of reverse traveling waves at port 3. Moreover, the unilateral characteristic of the FET hinders the incident signal of port 2 from traveling down to port 1. With the same consideration, extremely low |S12|, |S13|, and |S23| can also be achieved using the proposed distributed configuration.

B. Circuit Analysis

To analyze the proposed distributed quasi-circulator, the complicated schematic is simplified as in Fig. 2 for theoretical analysis. The simple unilateral FET model is shown in Fig. 2(a). Cgs and Cds are the gate-to-source and drain-to-source capacitances, respectively. Coupling between the gate and drain terminals is provided by the FET transconductance gm. Fig. 2(b) shows an equivalent schematic of the distributed quasi-circulator with a three-column by three-row configuration constructed from a terminal resistor, a number of inductors, and simplified FET models. The well-known parasitic capacitors, C1, C2, C3, and C4, of the FETs can be


absorbed into lossy artificial transmission lines with lumped inductors L1, L2, L3, and L4. Each artificial transmission line has an individual characteristic impedance defined as

$$Z_c = \sqrt{L/C}. \tag{1}$$

In our case, the parasitic capacitances are as follows: C1 = Cgs1 + Cgs3, C2 = Cds1 + Cgs2, C3 = Cds2, and C4 = Cds3. This indicates that the FET dimension dominates the parasitic capacitance and further affects the characteristic impedance of the artificial transmission line. Thus, the relation between the overall performance of the distributed quasi-circulator and the impedance Zc of the artificial transmission line should be considered. The equivalent network of the distributed quasi-circulator can be separated into several eight-port chain matrices, as shown in Fig. 3. These chain matrices can be derived from the relationship between the voltages and currents of the input and output ports and can be expressed as

$$
T_{\mathrm{IND}} =
\begin{bmatrix}
1 & 0 & 0 & 0 & \frac{j\omega L_1}{2} & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & \frac{j\omega L_2}{2} & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & \frac{j\omega L_3}{2} & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & \frac{j\omega L_4}{2} \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\tag{2}
$$

$$
T_{\mathrm{FET}} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
j\omega C_1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
g_{m1} & j\omega C_2 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & g_{m2} & j\omega C_3 & 0 & 0 & 0 & 1 & 0 \\
g_{m3} & 0 & 0 & j\omega C_4 & 0 & 0 & 0 & 1
\end{bmatrix},
\quad
T_{\mathrm{out}} =
\begin{bmatrix}
1 & 0 & 0 & 0 & R & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
\tag{3}
$$

In Fig. 3, the chain matrix T_Seg can be considered as a T-type segment of the artificial transmission line with transconductance gm. It can then be calculated using

$$T_{\mathrm{Seg}} = T_{\mathrm{IND}} \cdot T_{\mathrm{FET}} \cdot T_{\mathrm{IND}}. \tag{4}$$
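The product in (4) is easier to follow when written in 4 × 4 blocks. The following is a sketch only; the shorthand symbols $Z_h$, $Y_F$, and $G$ are introduced here for illustration and are not the paper's notation. Let $U$ be the 4 × 4 identity, $Z_h = \mathrm{diag}(j\omega L_1/2, \ldots, j\omega L_4/2)$, and $Y_F = \mathrm{diag}(j\omega C_1, \ldots, j\omega C_4) + G$, where $G$ holds $g_{m1}$ at position (2,1), $g_{m2}$ at (3,2), and $g_{m3}$ at (4,1), as read from (3). Then

$$
T_{\mathrm{IND}} = \begin{bmatrix} U & Z_h \\ 0 & U \end{bmatrix},\quad
T_{\mathrm{FET}} = \begin{bmatrix} U & 0 \\ Y_F & U \end{bmatrix}
\;\Longrightarrow\;
T_{\mathrm{Seg}} =
\begin{bmatrix}
U + Z_h Y_F & 2Z_h + Z_h Y_F Z_h \\
Y_F & U + Y_F Z_h
\end{bmatrix}.
$$

For example, the first diagonal entry of $U + Z_h Y_F$ is $1 - \omega^2 L_1 C_1/2$, and the (2,1) entry is $j\omega L_2 g_{m1}/2$, which shows how each transconductance couples one line into the next without any reverse coupling.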

Fig. 3. Definition of the chain matrix for the equivalent network of the distributed quasi-circulator.

An eight-port chain matrix with a large number of variables is cumbersome for numerical analysis. For simplicity, the variables are reduced by setting C1 = C2 = C3 = C4 = C and L1 = L2 = L3 = L4 = L. Subsequently, after manipulation by (4), the detailed chain matrix T_Seg can be computed as (5). Two new variables, the normalized frequency Ω and the characteristic impedance Zc, which absorb the circuit parameters C and L, appear in (5). Together they characterize the artificial transmission line in terms of its impedance and operating frequency. Zc is given by (1), and Ω is defined as

$$\Omega = \omega\sqrt{LC}. \tag{6}$$

Furthermore, the entire chain matrix of the 3 × 3 array distributed quasi-circulator is designated as T_DQC and is derived from (3)–(5) as

$$
T_{\mathrm{DQC}} = T_{\mathrm{Seg}} \cdot T_{\mathrm{Seg}} \cdot T_{\mathrm{Seg}} \cdot T_{\mathrm{out}} =
\begin{bmatrix} A & B \\ C & D \end{bmatrix}.
\tag{7}
$$

Comparing Figs. 2(b) and 3, ports 2, 3, 5, and 7 should be terminated by a short circuit along the entire eight-port chain matrix. An admittance matrix, Y_DQC, can be used to facilitate the matrix reduction. The conversion relationship between the admittance matrix and the chain matrix is given by

$$
Y_{\mathrm{DQC}} =
\begin{bmatrix}
DB^{-1} & C - DB^{-1}A \\
-B^{-1} & B^{-1}A
\end{bmatrix}.
\tag{8}
$$

After crossing out rows and columns 2, 3, 4, and 5 of Y_DQC, the reduced admittance matrix is denoted as Y_DQC−R, and the original ports 4, 6, and 8 of Fig. 3 are renumbered as ports 2, 3, and 4, respectively. According to the conversion relationship between the scattering matrix and the admittance matrix, the four-port scattering matrix can be obtained as

$$
S_{\mathrm{DQC}}^{4\text{-port}} = (U - Y_{\mathrm{DQC-R}} \cdot Z_0)(U + Y_{\mathrm{DQC-R}} \cdot Z_0)^{-1}
\tag{9}
$$

where U is the identity matrix.
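The chain of conversions in (4) and (7)–(9) can be sketched numerically. The NumPy code below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes equal L and C on all four lines, uses the standard two-port chain-to-admittance identity generalized to 4 × 4 blocks, and orders the admittance matrix as input-side ports (1, 3, 5, 7) followed by output-side ports (2, 4, 6, 8). All function names are hypothetical.

```python
import numpy as np

def t_seg(Om, Zc, gm):
    """One T-section, T_IND @ T_FET @ T_IND, with equal L and C on all four lines."""
    U, O = np.eye(4), np.zeros((4, 4))
    G = np.zeros((4, 4))
    G[1, 0], G[2, 1], G[3, 0] = gm      # gm1: line 1->2, gm2: line 2->3, gm3: line 1->4
    Zh = 0.5j * Om * Zc * U             # j*omega*L/2 written as j*Omega*Zc/2
    Yf = 1j * Om / Zc * U + G           # j*omega*C written as j*Omega/Zc, plus gm terms
    t_ind = np.block([[U, Zh], [O, U]])
    t_fet = np.block([[U, O], [Yf, U]])
    return t_ind @ t_fet @ t_ind

def chain_to_y(T):
    """Chain (ABCD) blocks to admittance matrix; port currents flow into the network."""
    n = T.shape[0] // 2
    A, B, C, D = T[:n, :n], T[:n, n:], T[n:, :n], T[n:, n:]
    Bi = np.linalg.inv(B)
    return np.block([[D @ Bi, C - D @ Bi @ A], [-Bi, Bi @ A]])

def y_to_s(Y, z0):
    """Admittance to scattering matrix for equal real port impedance z0."""
    U = np.eye(Y.shape[0])
    return (U - z0 * Y) @ np.linalg.inv(U + z0 * Y)

def dqc_four_port(Om, Zc, gm, R, z0=50.0):
    """Four-port S-matrix: three cascaded sections followed by the series R on line 1."""
    t_out = np.eye(8, dtype=complex)
    t_out[0, 4] = R
    T = np.linalg.matrix_power(t_seg(Om, Zc, gm), 3) @ t_out
    Y = chain_to_y(T)
    keep = [0, 5, 6, 7]                 # original ports 1, 4, 6, 8; the rest are shorted
    return y_to_s(Y[np.ix_(keep, keep)], z0)

# Illustrative values only; Omega = 1 exactly makes the B block singular in this sketch.
S = dqc_four_port(0.8, 70.0, (6.8e-3, 4.8e-3, 2.6e-3), 50.0)
print("|S21| =", abs(S[1, 0]), " |S12| =", abs(S[0, 1]))
```

Dropping rows and columns of the admittance matrix is equivalent to short-circuiting the corresponding ports, which is why the reduction is done in the admittance domain before converting to scattering parameters. The reverse term |S12| comes out numerically zero because the FET model is unilateral.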

The output of the phase cancelation path is clearly associated with port 4 of the four-port distributed quasi-circulator. Port 4 must be connected to port 3 to implement the phase cancelation technique. Assuming that a fully matched three-port


combiner has been interconnected with ports 3 and 4 of the four-port distributed quasi-circulator, the complete scattering matrix of the three-port combiner is briefly expressed as

$$
S_{\mathrm{PC}} =
\begin{bmatrix}
0 & -j/\sqrt{2} & -j/\sqrt{2} \\
-j/\sqrt{2} & 0 & 0 \\
-j/\sqrt{2} & 0 & 0
\end{bmatrix}.
\tag{10}
$$

After some manipulations, the scattering parameters of the three-port distributed quasi-circulator, including the isolation $S_{31}^{\mathrm{DQC}}$, can be derived as (11). In (11), the variables x, y, z, s, t, u, and v are polynomials in the normalized frequency Ω:

$$
\begin{aligned}
x &= \Omega^6 - 6\Omega^4 + 9\Omega^2 - 2 \\
y &= \Omega^5 - 4\Omega^3 + 3\Omega \\
z &= \Omega^7 - 8\Omega^5 + 19\Omega^3 - 12\Omega \\
s &= 3\Omega^5 - 16\Omega^3 + 19\Omega \\
t &= \Omega^8 - 10\Omega^6 + 36\Omega^4 - 56\Omega^2 + 35 \\
u &= \Omega^2 - 2 \\
v &= 3\Omega^3 - 8\Omega.
\end{aligned}
\tag{12}
$$

The unilateral characteristics of the FET cause the reversed transmitting paths to exhibit perfect zero transmission, expressed as $S_{12}^{\mathrm{DQC}} = S_{13}^{\mathrm{DQC}} = S_{23}^{\mathrm{DQC}} = 0$. At Ω = 1, the return losses and insertion gains/losses of (11) simplify to

$$
\begin{aligned}
S_{11}^{\mathrm{DQC}}\big|_{\Omega=1} &= \frac{R - Z_0}{R + Z_0} \\
S_{22}^{\mathrm{DQC}}\big|_{\Omega=1} &= -S_{33}^{\mathrm{DQC}}\big|_{\Omega=1} = 1 \\
\left|S_{21}^{\mathrm{DQC}}\right|_{\Omega=1} &= \frac{3 g_{m1} Z_c^2}{R + Z_0} \\
\left|S_{32}^{\mathrm{DQC}}\right|_{\Omega=1} &= \frac{3\sqrt{2}\, g_{m2} Z_c^2}{2 Z_0}.
\end{aligned}
\tag{13}
$$
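The reduction from (11) to (13) can be checked numerically. The sketch below is illustrative only: it encodes the polynomials of (12) and one reading of the S11, S21, and S32 entries of (11), and in particular it assumes a factor Ω² in the numerator of S32, which has no effect at Ω = 1. Function names and the numerical values are chosen here for the example.

```python
import math

def polys(Om):
    """Polynomials x, y, z, s, t, u, v of (12) in the normalized frequency."""
    x = Om**6 - 6*Om**4 + 9*Om**2 - 2
    y = Om**5 - 4*Om**3 + 3*Om
    z = Om**7 - 8*Om**5 + 19*Om**3 - 12*Om
    s = 3*Om**5 - 16*Om**3 + 19*Om
    t = Om**8 - 10*Om**6 + 36*Om**4 - 56*Om**2 + 35
    u = Om**2 - 2
    v = 3*Om**3 - 8*Om
    return x, y, z, s, t, u, v

def s_params(Om, Zc, R, Z0, gm1, gm2):
    """S11, S21, S32 as read from (11); the Om**2 factor in S32 is an assumed reading."""
    x, y, z, s, t, _, _ = polys(Om)
    d1 = 2*x*Zc*(R + Z0) + 1j*(z*Zc**2 - 4*y*R*Z0)   # gate-line denominator
    d2 = 2*x*Z0 + 1j*z*Zc                             # drain-line denominator
    s11 = 1 - (4*x*Zc*Z0 - 8j*y*R*Z0) / d1
    s21 = 8*(s*Zc - 6j*y*R)*gm1*Zc**2*Z0 / (d2*d1)
    s32 = -4j*math.sqrt(2)*t*gm2*Om**2*Zc**2*Z0 / d2**2
    return s11, s21, s32

# Example values (assumed for illustration): R = Z0 = 50 ohm, Zc = 70 ohm.
Zc, R, Z0, gm1, gm2 = 70.0, 50.0, 50.0, 6.8e-3, 4.8e-3
s11, s21, s32 = s_params(1.0, Zc, R, Z0, gm1, gm2)
print(abs(s11), abs(s21), 3*gm1*Zc**2/(R + Z0))
print(abs(s32), 3*math.sqrt(2)*gm2*Zc**2/(2*Z0))
```

At Ω = 1 the polynomials y and z vanish and x = 2, which is what collapses the general expressions into the compact forms of (13).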

Port 1 is terminated by resistor R for impedance matching and is perfectly matched when R is 50 Ω. However, ports 2 and 3 exhibit open- and short-circuited states, respectively, because of the terminal-less design. In fact, a terminal resistor could be added to the distributed architecture for port matching. Considering the design complexity, we accepted this tradeoff. From (13), the transmitting gain/loss



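$$
T_{\mathrm{Seg}} =
\begin{bmatrix}
1-\frac{\Omega^2}{2} & 0 & 0 & 0 & jZ_c\Omega\left(1-\frac{\Omega^2}{4}\right) & 0 & 0 & 0 \\
\frac{jg_{m1}Z_c\Omega}{2} & 1-\frac{\Omega^2}{2} & 0 & 0 & -\frac{g_{m1}Z_c^2\Omega^2}{4} & jZ_c\Omega\left(1-\frac{\Omega^2}{4}\right) & 0 & 0 \\
0 & \frac{jg_{m2}Z_c\Omega}{2} & 1-\frac{\Omega^2}{2} & 0 & 0 & -\frac{g_{m2}Z_c^2\Omega^2}{4} & jZ_c\Omega\left(1-\frac{\Omega^2}{4}\right) & 0 \\
\frac{jg_{m3}Z_c\Omega}{2} & 0 & 0 & 1-\frac{\Omega^2}{2} & -\frac{g_{m3}Z_c^2\Omega^2}{4} & 0 & 0 & jZ_c\Omega\left(1-\frac{\Omega^2}{4}\right) \\
\frac{j\Omega}{Z_c} & 0 & 0 & 0 & 1-\frac{\Omega^2}{2} & 0 & 0 & 0 \\
g_{m1} & \frac{j\Omega}{Z_c} & 0 & 0 & \frac{jg_{m1}Z_c\Omega}{2} & 1-\frac{\Omega^2}{2} & 0 & 0 \\
0 & g_{m2} & \frac{j\Omega}{Z_c} & 0 & 0 & \frac{jg_{m2}Z_c\Omega}{2} & 1-\frac{\Omega^2}{2} & 0 \\
g_{m3} & 0 & 0 & \frac{j\Omega}{Z_c} & \frac{jg_{m3}Z_c\Omega}{2} & 0 & 0 & 1-\frac{\Omega^2}{2}
\end{bmatrix}
\tag{5}
$$

$$
\begin{aligned}
S_{11}^{\mathrm{DQC}} &= 1 - \frac{4xZ_cZ_0 - j8yRZ_0}{2xZ_c(R+Z_0) + j\left(zZ_c^2 - 4yRZ_0\right)} \\
S_{21}^{\mathrm{DQC}} &= \frac{8(sZ_c - j6yR)\,g_{m1}Z_c^2Z_0}{(2xZ_0 + jzZ_c)\left[2xZ_c(R+Z_0) + j\left(zZ_c^2 - 4yRZ_0\right)\right]} \\
S_{22}^{\mathrm{DQC}} &= -S_{33}^{\mathrm{DQC}} = \frac{4xZ_0}{2xZ_0 + jzZ_c} - 1 \\
S_{31}^{\mathrm{DQC}} &= \frac{j16\sqrt{2}\,Z_cZ_0\left\{\left[(sZ_c - j6yR)\,g_{m1}Z_c\right]\left[(sZ_c - j6yZ_0)\,g_{m2}Z_c\right] + (2xZ_0 + jzZ_c)(6uR + jvZ_c)\,g_{m1}g_{m2}\Omega^2Z_c^2\right\}}{(2xZ_0 + jzZ_c)^2\left[2xZ_c(R+Z_0) + j\left(zZ_c^2 - 4yRZ_0\right)\right]} \\
&\quad - j\,\frac{4\sqrt{2}\,(sZ_c - j6yR)\,g_{m3}Z_c^2Z_0}{(2xZ_0 + jzZ_c)\left[2xZ_c(R+Z_0) + j\left(zZ_c^2 - 4yRZ_0\right)\right]} \\
S_{32}^{\mathrm{DQC}} &= \frac{-j4\sqrt{2}\,t\,g_{m2}\Omega^2Z_c^2Z_0}{(2xZ_0 + jzZ_c)^2}
\end{aligned}
\tag{11}
$$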


of $S_{21}^{\mathrm{DQC}}$ and $S_{32}^{\mathrm{DQC}}$ can be determined by regulating transconductances gm1 and gm2, respectively, with fixed R and Zc. Furthermore, the distributed quasi-circulator was designed to deliver equal output power between the transmitting paths $S_{21}^{\mathrm{DQC}}$ and $S_{32}^{\mathrm{DQC}}$. A criterion for achieving zero amplitude imbalance can thus be defined as follows:

$$
\left|\frac{S_{21}^{\mathrm{DQC}}}{S_{32}^{\mathrm{DQC}}}\right|_{\Omega=1} = 1.
\tag{14}
$$

By substituting $S_{21}^{\mathrm{DQC}}$ and $S_{32}^{\mathrm{DQC}}$ of (13), evaluated at the normalized frequency Ω = 1, into criterion (14), the relationship between gm1 and gm2 can be characterized in terms of the terminal resistor R and port impedance Z0 as

$$
g_{m2} = \frac{\sqrt{2}\,Z_0\,g_{m1}}{R + Z_0}.
\tag{15}
$$

Relation (15) varies only with the terminal resistor R when Z0 is fixed at 50 Ω. Because (15) does not involve the parasitic elements, the circuit design is more precise. Moreover, the optimized circuit parameters R and Zc should be surveyed over different normalized frequencies for zero amplitude imbalance, assuming that an insertion gain $S_{21}^{\mathrm{DQC}}$ of 0 dB is provided. According to $S_{21}^{\mathrm{DQC}}$ of (13), gm1 is derived by asserting $|S_{21}^{\mathrm{DQC}}| = 1$ and R = 50 Ω; it varies with the characteristic impedance Zc and can be expressed as

$$
g_{m1} = \frac{R + Z_0}{3Z_c^2}.
\tag{16}
$$
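As an illustrative check of (15) and (16), take R = Z0 = 50 Ω and Zc = 70 Ω (values assumed here for the example):

$$
g_{m1} = \frac{50 + 50}{3 \times 70^2} = \frac{100}{14700} \approx 6.80\ \mathrm{mA/V},
\qquad
g_{m2} = \frac{\sqrt{2} \times 50 \times 6.80\ \mathrm{mA/V}}{50 + 50} \approx 4.81\ \mathrm{mA/V}.
$$

These agree with the rounded values of gm1 and gm2 listed in (18).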

By (15) and (16), gm2 also depends on the characteristic impedance Zc. Based on relations (15) and (16) and on $S_{21}^{\mathrm{DQC}}$ and $S_{32}^{\mathrm{DQC}}$ of (11), Fig. 4 shows the theoretical amplitude imbalance as a function of normalized frequency Ω and characteristic impedance Zc of the artificial transmission line for different terminal resistances R. Owing to relation (15), the distributed quasi-circulator achieves an amplitude imbalance of 0 dB for any values of R and Zc at Ω = 1. Fig. 4(b) shows that a terminal resistor R of 50 Ω results in the smoothest amplitude imbalance, ranging from −1 to 1 dB for Zc of 45–100 Ω. Nevertheless, the operating bandwidth narrows as Zc increases. In addition, perfect input matching at port 1 is also obtained by setting the terminal resistor R to 50 Ω. Fig. 5 shows the theoretical insertion gains, $S_{21}^{\mathrm{DQC}}$ and $S_{32}^{\mathrm{DQC}}$, of the distributed quasi-circulator versus normalized frequency Ω for different Zc at a fixed R of 50 Ω. A small characteristic impedance Zc results in high insertion gains with a wider operating bandwidth. However, the band flatness should be considered in the distributed architecture. To guarantee a wide operating bandwidth with smooth flatness, the refinements of the circuit parameters can be summarized as follows:

$$
60\ \Omega \le Z_c \le 70\ \Omega, \qquad R \le 50\ \Omega.
\tag{17}
$$

Fig. 4. Theoretical amplitude imbalance contours versus normalized frequency Ω and characteristic impedance Zc of the artificial transmission line with terminal resistor R of (a) 35, (b) 50, and (c) 75 Ω.

From $S_{31}^{\mathrm{DQC}}$ of (11), the theoretical isolation of the distributed quasi-circulator can be calculated by sweeping transconductance gm3 from 0.1 to 4 mA/V at different characteristic impedance values Zc, as shown in Fig. 6. The values of gm1 and gm2 should satisfy (13), (15), and (16) and therefore also vary with Zc. The results suggest that high isolation is associated with large characteristic impedance Zc. Given the design criterion (17), a tradeoff between the operating bandwidth and isolation is inevitable for the distributed quasi-circulator. A transconductance gm3 in the range of 2.5 to 3 mA/V is the most effective for isolation improvement. Subsequently, the frequency response of the isolation for different Zc with a fixed gm3 of 2.6 mA/V is shown in Fig. 7. A Zc of 80 Ω presents the best isolation compared with Zc of 70 and 60 Ω. However, to balance the operating bandwidth with the isolation,


a moderate choice for the circuit parameter Zc is 70 Ω. Finally, the circuit parameters based on the design analysis for achieving wider operating bandwidth and superior isolation are summarized as

$$
g_{m1} \cong 6.8\ \mathrm{mA/V},\quad g_{m2} \cong 4.8\ \mathrm{mA/V},\quad g_{m3} \cong 2.6\ \mathrm{mA/V},\quad R = 50\ \Omega,\quad Z_c = 70\ \Omega.
\tag{18}
$$

III. CIRCUIT IMPLEMENTATION AND MEASURED RESULTS

A. Ultra-Broadband Quasi-Circulator Implementation

The proposed MMIC ultra-broadband quasi-circulator was fabricated using a TSMC standard 90-nm 1P9M CMOS process. Fig. 8 shows an image of the fabricated quasi-circulator. The chip size, including the testing pads, is 0.9 mm × 0.57 mm. Nine nMOSFETs with fT higher than 142 GHz were used to construct the proposed quasi-circulator. A distributed balun with cascading structure was selected to achieve wideband performance and smooth gain flatness. With the benefit of the reverse phase path constructed from the distributed amplifier, the isolation of the proposed quasi-circulator can be accomplished by phase cancelation. The practical FET dimensions and circuit parameters can be optimized for excellent performance with the aid of criterion (18), iterative electromagnetic simulation, and the TSMC design kit. In the practical design, the characteristic impedance Zc of the artificial transmission line takes different values for the distributed balun and the phase cancelation path. Altering the FET dimension and lumped inductor optimizes the characteristic impedance Zc. The detailed circuit parameters are described as follows. For M1, M2, and M3, a six-finger nMOS device with a gate width of 18.1 μm was used. A seven-finger nMOS device with a gate width of 14.5 μm was used for M4, M5, and M6 to ensure good amplitude imbalance with ultra-broad bandwidth. The devices M7, M8, and M9 of the phase cancelation path used a six-finger nMOS device with a gate width of 19.3 μm to achieve superior isolation from port 1 to port 3. The gate bias voltage (VGS), as shown in Fig. 1, is set to 1 V through the inductor Lg3 (0.2 nH) and the terminal resistor R (53.6 Ω). To maintain good amplitude imbalance and input matching for port 1 within a broad operation bandwidth, the resistor R was selected by design criterion (16). The drain voltage (VDS) was set to 1.5 V through inductors Ld1 (0.22 nH), Ld3 (0.23 nH), and Ld5 (0.21 nH). Four capacitors, C1, C4, C6, and C8, were the dc-blocking capacitors of ports 1, 2, and 3. Furthermore, four capacitors, C2, C3, C5, and C7, were the bypass capacitors of the VGS and VDS. The passive components, namely, inductors, capacitors, and resistors, were also simulated using ADS RF Momentum to estimate the parasitic effects. The devices and circuit parameters used by the proposed quasi-circulator are listed in Table I.

Fig. 5. Theoretical insertion gain as a function of normalized frequency Ω with different characteristic impedance Zc.

Fig. 6. Theoretical isolation versus transconductance gm3 at different characteristic impedance Zc.

Fig. 7. Theoretical isolation versus normalized frequency Ω with different characteristic impedance values Zc at a fixed gm3 of 2.6 mA/V.

B. Experimental Results

The proposed quasi-circulator was attached to carrier plates for testing. Owing to the limitation of the measurement system, the quasi-circulator was measured only over the range of dc to 67 GHz. Measurement signals were obtained through coplanar ground-signal-ground (G-S-G) probes on the on-wafer probe measurement system. The VDS and VGS were set to 1.5 and 1 V, respectively,

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. SIDA-TANG et al.: ULTRA-WIDEBAND QUASI-CIRCULATOR IMPLEMENTED BY CASCADING DISTRIBUTED BALUN


TABLE I. CIRCUIT PARAMETERS OF THE ACTIVE QUASI-CIRCULATOR

Fig. 10. Measured and simulated port-to-port isolations of the proposed quasi-circulator at RF bandwidth range of 10–100 GHz.

Fig. 8. Image of the fabricated quasi-circulator with chip dimensions, including the contact pads (0.9 mm × 0.57 mm).

Fig. 11. Measured and simulated return losses of the proposed quasi-circulator at RF bandwidth range of 10–100 GHz.

Fig. 9. Measured and simulated port-to-port insertion gain and isolations of the proposed quasi-circulator at RF bandwidth range of 10–100 GHz.

to achieve optimum performance with a dc power consumption of 67.8 mW. Fig. 9 shows the measured and simulated insertion gains, |S21| and |S32|, and the isolation, |S13|, of the quasi-circulator over the RF range of 10–100 GHz. The insertion gains |S21| and |S32| range from 0.5 to 4.8 dB and from 1.1 to 4.3 dB, respectively, within the RF bandwidth of 10–67 GHz. Over the same bandwidth, the isolation |S13| is excellent, ranging from 23.4 to 31.3 dB, mainly because of the unilateral characteristic of the FET.

Fig. 10 shows the measured and simulated isolations, |S12|, |S23|, and |S31|, of the proposed quasi-circulator over the RF range of 10–100 GHz. The distributed quasi-circulator attained isolations |S12| and |S23| from 20.5 to 27 dB and from 25.2 to 28 dB, respectively, within the RF bandwidth of 10–67 GHz. The use of the distributed amplifier led to good insertion gain and isolation (unidirectional characteristic) with ultra-broadband operation. The isolation |S31| also benefited significantly from the phase cancelation path, which produces an antiphase signal of equal amplitude to cancel the leakage. The measured isolation |S31| ranged from 25 to 30.6 dB. The deviation between simulation and experiment may be partly due to the bypass parasitics and parameter variation. Fig. 11 shows the measured and simulated return losses over the RF range of 10–100 GHz. The measured |S11|, |S22|, and |S33| were −4.9 to −12 dB, −1 to −15 dB, and −5.2 to −15.5 dB, respectively, over 10–67 GHz. The dc-blocking capacitor at each port raised the impedance in the lower band and thus degraded the return loss there.


IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES

TABLE II. COMPARISON OF REPORTED QUASI-CIRCULATORS

Fig. 12. Measured and simulated 1-dB compression point at RF frequency of 20, 40, and 60 GHz, respectively.

Fig. 13. Measured and simulated NF at RF bandwidth range of 10–100 GHz.

Fig. 12 shows the measured and simulated 1-dB compression points at RF frequencies of 20, 40, and 60 GHz. Ports 1 and 2 exhibited 1-dB compression points of 7 dBm, 6.5 dBm, and more than 6 dBm at RF frequencies of 20, 40, and 60 GHz, respectively, with VDS biased at 1.5 V. The measured and simulated NF as a function of RF frequency is shown in Fig. 13. The NF of ports 2 and 3 ranges from 7.9 to 12.8 dB in the range of 10–100 GHz. The low NF of the distributed structure gives the proposed quasi-circulator a good NF over an ultra-broad operation bandwidth. The deviation between the simulated and experimental results can be partly attributed to the process variation of CMOS, which shifts the matching point in the circuit. Another cause is the limited accuracy of the nonlinear large-signal transistor model, which results in large deviations. Table II compares the performance of the ultra-wideband quasi-circulator with that of other published work.

The proposed quasi-circulator has the widest bandwidth performance with better insertion gain, lower NF, and higher frequency of operation compared with other reported MMIC quasi-circulators.

IV. CONCLUSION

A monolithic quasi-circulator with an ultra-broad bandwidth ranging from 10 to 100 GHz was implemented using 90-nm CMOS technology. With the distributed architecture, the proposed quasi-circulator has a broadband performance, as well as good insertion gain with smooth flatness, high linearity, and low NF. Based on the measured results, the proposed quasi-circulator possesses the advantages of good insertion gain, ranging from 0.5 to 4.8 dB, and superior isolation |S31| ranging from 25 to 30.6 dB for an RF bandwidth between 10 and 100 GHz with a miniaturized chip dimension of 0.51 mm². Therefore, the proposed ultra-broadband quasi-circulator with


a distributed architecture is very suitable for RF front-end applications.

REFERENCES

[1] A. A. Abidi, “CMOS microwave and millimeter-wave ICs: The historical background,” in Proc. IEEE Int. Symp. Radio-Freq. Integr. Technol., Aug. 2014, pp. 1–5.
[2] S. Hara, T. Tokumitsu, and M. Aikawa, “Novel unilateral circuits for MMIC circulators,” IEEE Trans. Microw. Theory Techn., vol. 38, no. 10, pp. 1399–1406, Oct. 1990.
[3] C. Kalialakis, M. J. Cryan, P. S. Hall, and P. Gardner, “Analysis and design of integrated active circulator antennas,” IEEE Trans. Microw. Theory Techn., vol. 48, no. 6, pp. 1017–1023, Jun. 2000.
[4] A. Gasmi, B. Huyart, E. Bergeault, and L. Jallet, “Quasi-circulator module design using conventional MMIC components in the frequency range 0.45–7.2 GHz,” Electron. Lett., vol. 31, no. 15, pp. 1261–1262, Jul. 1995.
[5] G. Carchon and B. Nauwelaers, “Power and noise limitations of active circulators,” IEEE Trans. Microw. Theory Techn., vol. 48, no. 2, pp. 316–319, Feb. 2000.
[6] C. E. Saavedra and Y. Zheng, “Active quasi-circulator realisation with gain elements and slow-wave couplers,” IET Microw. Antennas, Propag., vol. 1, no. 5, pp. 1020–1023, Oct. 2007.
[7] S.-C. Shin, J.-Y. Huang, K.-Y. Lin, and H. Wang, “A 1.5–9.6 GHz monolithic active quasi-circulator in 0.18 μm CMOS technology,” IEEE Microw. Wireless Compon. Lett., vol. 18, no. 12, pp. 797–799, Dec. 2008.
[8] A. Gasmi, B. Huyart, E. Bergeault, and L. Jallet, “Noise and power optimization of a MMIC quasi-circulator,” IEEE Trans. Microw. Theory Techn., vol. 45, no. 9, pp. 1572–1577, Sep. 1997.
[9] S. W. Y. Mung and W. S. Chan, “Novel active quasi-circulator with phase compensation technique,” IEEE Microw. Wireless Compon. Lett., vol. 18, no. 12, pp. 800–802, Dec. 2008.
[10] Y. Zheng and C. E. Saavedra, “Active quasi-circulator MMIC using OTAs,” IEEE Microw. Wireless Compon. Lett., vol. 19, no. 4, pp. 218–220, Apr. 2009.
[11] S. K. Cheung, T. P. Halloran, W. H. Weedon, and C. P. Caldwell, “MMIC-based quadrature hybrid quasi-circulators for simultaneous transmit and receive,” IEEE Trans. Microw. Theory Techn., vol. 58, no. 3, pp. 489–497, Mar. 2010.
[12] H.-S. Wu, C.-W. Wang, and C.-K. C. Tzuang, “CMOS active quasi-circulator with dual transmission gains incorporating feedforward technique at K-band,” IEEE Trans. Microw. Theory Techn., vol. 58, no. 8, pp. 2084–2091, Aug. 2010.
[13] M. Palomba, A. Bentini, D. Palombini, W. Ciccognani, and E. Limiti, “A novel hybrid active quasi-circulator for L-band applications,” in Proc. 19th Int. Conf. Microw. Radar Wireless Commun., Warsaw, Poland, May 2012, pp. 41–44.
[14] D.-J. Huang, J.-L. Kuo, and H. Wang, “A 24-GHz low power and high isolation active quasi-circulator,” in IEEE MTT-S Microw. Symp. Dig., Montreal, QC, Canada, Jun. 2012, pp. 1–3.
[15] S. He, N. Akel, and C. E. Saavedra, “Active quasi-circulator with high port-to-port isolation and small area,” Electron. Lett., vol. 48, no. 14, pp. 848–850, Jul. 2012.
[16] C.-H. Chang, Y.-T. Lo, and J.-F. Kiang, “A 30 GHz active quasi-circulator with current-reuse technique in 0.18 μm CMOS technology,” IEEE Microw. Wireless Compon. Lett., vol. 20, no. 12, pp. 693–695, Dec. 2012.
[17] A. H. Baree and I. D. Robertson, “Monolithic MESFET distributed baluns based on the distributed amplifier gate-line termination technique,” IEEE Trans. Microw. Theory Techn., vol. 45, no. 2, pp. 188–195, Feb. 1997.
[18] M. Ferndahl and H.-O. Vickes, “The matrix balun—A transistor-based module for broadband applications,” IEEE Trans. Microw. Theory Techn., vol. 57, no. 1, pp. 53–60, Jan. 2009.
[19] S.-H. Hung, K.-W. Cheng, and Y.-H. Wang, “An ultra wideband quasi-circulator with distributed amplifiers using 90 nm CMOS technology,” IEEE Microw. Wireless Compon. Lett., vol. 23, no. 12, pp. 656–658, Dec. 2013.
[20] S.-H. Hung, K.-W. Cheng, and Y.-H. Wang, “An ultra-broadband subharmonic mixer with distributed amplifier using 90-nm CMOS technology,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 10, pp. 3650–3657, Oct. 2013.
[21] S. W. Y. Mung and W. S. Chan, “Self-equalization technique for distributed quasi-circulator,” Microw. Opt. Technol. Lett., vol. 51, no. 1, pp. 182–184, Jan. 2009.
[22] J.-Y. Hsieh, T. Wang, and S.-S. Lu, “A 1.5-mW, 2.4 GHz quasi-circulator with high transmitter-to-receiver isolation in CMOS technology,” IEEE Microw. Wireless Compon. Lett., vol. 24, no. 12, pp. 872–874, Dec. 2014.
[23] M. Porranzl, C. Wagner, H. Jaeger, and A. Stelzer, “An active quasi-circulator for 77 GHz automotive FMCW radar systems in SiGe technology,” IEEE Microw. Wireless Compon. Lett., vol. 25, no. 5, pp. 313–315, May 2015.
[24] J.-F. Chang, J.-C. Kao, Y.-H. Lin, and H. Wang, “Design and analysis of 24-GHz active isolator and quasi-circulator,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 8, pp. 2638–2649, Aug. 2015.

Sida-Tang was born in Huai’an, China, in 1987. He received the B.S. degree in electronic information and electrical engineering from the Dalian University of Technology, Dalian, China, in 2010, the M.S. degree in electronic and computer engineering from the Hong Kong University of Science and Technology, Hong Kong, in 2011, and is currently pursuing the Ph.D. degree at the Institute of Microelectronics, National Cheng Kung University, Tainan, Taiwan. He is currently involved in research on microwave millimeter-wave circuits, RF integrated circuits, and monolithic microwave integrated circuit design.

Chih-Min Lin, photograph and biography not available at the time of publication.

Shih-Han Hung, photograph and biography not available at the time of publication.

Kai-Wen Cheng, photograph and biography not available at the time of publication.

Yeong-Her Wang (M’89) was born in Tainan, Taiwan. He received the B.S., M.S., and Ph.D. degrees from National Cheng Kung University, Tainan, Taiwan, in 1978, 1980, and 1985, respectively, all in electrical engineering. He is currently a Distinguished Professor with the Institute of Microelectronics, Institute of Electro-Optical Science and Engineering, and the Department of Electrical Engineering, National Cheng Kung University. In 2007, he joined National Applied Research Laboratories, Taipei, Taiwan, as an Executive Vice President. He provides consultancy services to semiconductor companies for the development of new technologies and products. He has authored or coauthored over 370 internationally refereed SCI journal papers and 280 conference papers. He has also contributed four book chapters. He holds 110 Taiwan patents, 23 U.S. patents, and 15 patents from other countries. He has been the Mentor of more than 50 Ph.D. students. His current research interests include semiconductor materials, devices and physics, and monolithic microwave integrated circuit design and fabrication.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination. IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES


Variable 360° Vector-Sum Phase Shifter With Coarse and Fine Vector Scaling

Mohammad-Mahdi Mohsenpour, Member, IEEE, and Carlos E. Saavedra, Senior Member, IEEE

Abstract— A CMOS vector-sum phase shifter covering the full 360° range is presented in this paper. Broadband operational transconductance amplifiers with variable transconductance provide coarse scaling of the quadrature vector amplitudes. Fine scaling of the amplitudes is accomplished using a passive resistive network. Expressions are derived to predict the maximum bit resolution of the phase shifter from the scaling factor of the coarse and fine vector-scaling stages. The phase shifter was designed and fabricated using the standard 130-nm CMOS process and was tested on-wafer over the frequency range of 4.9–5.9 GHz. The phase shifter delivers root mean square (rms) phase and amplitude errors of 1.25° and 0.7 dB, respectively, at the midband frequency of 5.4 GHz. The input and output return losses are both below 17 dB over the band, and the insertion loss is better than 4 dB over the band. The circuit uses an area of 0.303 mm² excluding bonding pads and draws 28 mW from a 1.2 V supply.

Index Terms— Active phase shifter, active summing junction, clock and data recovery, CMOS, IEEE 802.11n, LTE, monolithic microwave integrated circuit (MMIC), operational transconductance amplifiers (OTAs), phased array, quadrature generation, radar, RFIC, root mean square (rms) error, WiMAX.

I. INTRODUCTION

There is continued interest in finding new methods to improve the resolution and accuracy of monolithic microwave integrated circuit (MMIC) phase shifters. That interest is motivated by the critical role that phase shifters have in multiple-input multiple-output radio links and phased arrays. Design advances over the past decade have led to significant improvements in the fractional bandwidth of the phase shifters and a reduction in the footprint area of the chips. MMIC phase shifters covering the full 360° have been reported using different techniques, such as delay lines [1], [2], signal reflection [3], high-pass/low-pass networks [4], all-pass networks [5], and vector summation [6]–[12]. In vector-sum phase shifters, there often appear unreachable phase angles (phase gaps) at the quadrant edges that limit the phase-step (bit) resolution of digital phase shifters. The objective of this paper is to explore the issue of phase gaps in vector-sum phase shifters and to propose a solution to mitigate them. The general approach taken here is to use a two-step vector-scaling procedure. First, a coarse scaling is carried

Manuscript received April 6, 2015; revised August 18, 2015, February 9, 2016, and April 25, 2016; accepted May 21, 2016. This work was supported by the Natural Science and Engineering Research Council of Canada. The authors are with the Department of Electrical and Computer Engineering, Queen’s University, Kingston, ON K7L 3N6, Canada (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMTT.2016.2574843

out in the current domain using operational transconductance amplifiers (OTAs), and subsequently, fine scaling is done on the signal vector in the voltage domain using a resistive network before the signal vectors are added together. A prototype phase shifter was designed for the 5.4-GHz band and was fabricated using the 130-nm CMOS technology. Experimental test results are presented, which validate the concept.

II. PHASE SHIFTER CONCEPT

Fig. 1 shows the block diagram of the proposed MMIC phase shifter, where the shaded area shows the on-chip circuitry. The phasor diagrams above the shaded area illustrate how a representative input signal is modified as it propagates through the phase shifter. The external 180° power splitter converts the RF input voltage signal, $v_{RF}$, into a differential waveform. A quadrature generator then produces four equal-amplitude orthogonal voltage signal vectors, $\pm v_I$ and $\pm v_Q$, for the I and Q paths, respectively. A pair of identical OTAs are used to scale the magnitude of the four voltage signal vectors and to convert them into current signals: $\pm i_{I,Q} = \pm G_m v_{I,Q}$, where $G_m$ is the transconductance gain of the OTAs, which is tuned through the analog control voltages $V_{tune,I}$ and $V_{tune,Q}$. Two single-pole double-throw (SPDT) switches are used to select which I-path vector (the 0° or the 180°) and which Q-path vector (the 90° or the 270°) will be summed together at the output to produce the desired output phase angle. The SPDT switches are controlled using two digital bits, $a_2 b_2$. While the minimum gain, $G_{min}$, of the OTAs can ostensibly be reduced to zero, the problem with doing so is that the phase response of the OTAs at zero gain can be quite different than at moderate to high gain levels, thereby compromising the root mean square (rms) phase and amplitude error performance of the circuit. As a result, there is a practical limit to how small $G_{min}$ should be and that value can be found by observing the phase response of the OTA as a function of its gain. Suppose now that $G_{min}$ has been established and that the highest gain setting of the OTAs is denoted by $G_{max}$, then the smallest output angle that the phase shifter would produce in quadrant I is

$$\theta_{min} = \tan^{-1}\left(\frac{G_{min,Q}}{G_{max,I}}\right)~\text{rad} \tag{1}$$

as shown in Fig. 2(a). It is straightforward to see that there will be a range of phase angles that the phase shifter cannot produce between quadrants I and IV and at every other quadrant boundary, as shown in Fig. 2(b). These unreachable output phase angle regions are the so-called phase gaps. The size of the gaps is $\theta_{gap} = 2\theta_{min}$ and they place a limit on the bit

0018-9480 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

resolution of digital vector-sum phase shifters, since the circuit cannot have a phase step smaller than $\theta_{gap}$. The relationship between the phase gap and the phase-step resolution of the phase shifter is $\theta_{gap} < 2\pi/2^n$, from which the maximum bit resolution is $n = \lfloor \log_2(2\pi/\theta_{gap}) \rfloor$, where $\lfloor\cdot\rfloor$ denotes the floor function. With the aid of (1), the expression for the maximum bit resolution, n, of the phase shifter as a function of the amplifier gain tuning range is

$$n = \left\lfloor \log_2\left[\frac{\pi}{\tan^{-1}\left(\dfrac{G_{min,Q}}{G_{max,I}}\right)}\right] \right\rfloor. \tag{2}$$

Fig. 1. Block diagram of the 360° vector-sum phase shifter proposed in this paper. Components outside the shaded area are off-chip.

Fig. 2. (a) Minimum output phase angle, θmin, that can be produced in quadrant I. (b) Phase gaps equal to 2θmin occur at the quadrant boundaries.

To reduce the size of θgap and thereby increase the bit resolution of the phase shifter, the proposed system in Fig. 1 employs the second vector-scaling step after the SPDT switches. This second scaling step is done with a resistive passive network. The final I and Q vectors are added using a summing junction to produce the desired phase-shifted signal.

III. RFIC DESIGN

This section provides design details of the phase shifter’s building blocks in sequence from left to right. All circuit components were designed for a center frequency of 5.4 GHz.
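As a numerical check on the phase-gap expressions (1) and (2) of Section II, the following minimal Python sketch evaluates θgap and the maximum bit resolution n for a given OTA gain range. The function names are illustrative and not from the paper; the example gains (5.5 and 32 mS) are the simulated OTA tuning limits reported in Section III-B, and identical I and Q amplifiers are assumed.

```python
import math


def phase_gap_rad(g_min: float, g_max: float) -> float:
    """Phase gap theta_gap = 2 * atan(G_min / G_max), from (1)."""
    return 2.0 * math.atan(g_min / g_max)


def max_bit_resolution(g_min: float, g_max: float) -> int:
    """Maximum bit resolution n = floor(log2(pi / atan(G_min / G_max))), from (2)."""
    theta_min = math.atan(g_min / g_max)
    return math.floor(math.log2(math.pi / theta_min))


# Assumed example: identical I and Q OTAs tunable from 5.5 mS to 32 mS.
g_min, g_max = 5.5e-3, 32e-3
gap = phase_gap_rad(g_min, g_max)
n = max_bit_resolution(g_min, g_max)
step_deg = 360.0 / 2 ** n

print(f"theta_gap = {gap:.2f} rad = {math.degrees(gap):.1f} deg")
print(f"max resolution n = {n} b, phase step = {step_deg:.1f} deg")
```

With these inputs the gap evaluates to about 0.34 rad (19.5°), which is smaller than the 22.5° step of a 4-b phase shifter but larger than the 11.25° step that 5 b would require.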

Fig. 3. Schematic of the all-pass quadrature signal generator.

A. Quadrature Signal Generator

The circuit shown in Fig. 3 is used to generate differential quadrature basis vectors for the I and Q signal paths. It is an all-pass network that yields signals with tight amplitude and phase balance over wide bandwidths with a low return loss at the input port [6]. Using the component values shown in Fig. 3, the simulation results predict phase and amplitude imbalances less than 1° and 0.35 dB, respectively, and an input return loss below 16 dB for the quadrature generator over a 1-GHz band centered at 5.4 GHz.

Fig. 4. Schematic of the tunable feedforward-regulated OTA for coarse vector scaling.

B. OTA Vector-Scaling Stage (Coarse Scaling)

The OTAs convert the incident voltage signals into currents. These signal currents are then scaled by varying the transconductance, $G_m$, of the OTAs. The OTA schematic is shown in Fig. 4 and is a variant of the circuit reported in [7] and [13]–[15]. Thus, only a basic description of the OTA is given here followed by the information relevant to the phase shifter design. The input signal feeds to M1/M2 and M5/M6 through C1/C2 and C5/C6, respectively. Transistors M3 and M4 are cross-coupled to provide feedforward regulation to the OTA for broadband operation and increased linearity. Tuning of $G_m$ is done by changing the gate voltage of M3/M4 at the node labeled Vtune in Fig. 4. Triple-well nMOS devices are used here to provide source-body isolation for all of the OTA’s devices and better isolation from the substrate. Capacitors C3 and C4 are for dc blocking, and resistors R1–R6 have a large value and are used for dc biasing. A key design goal for the OTA for the application at hand is for its gain, $G_m$, versus Vtune relationship to have a linear response, so that the vector scaling also exhibits a linear dependence on the tuning voltage. A simulation of $G_m$ versus control voltage, Vtune, at a frequency of 5.4 GHz is shown in Fig. 5. The magnitude of $G_m$ varies linearly from 5.5 to 32 mS as Vtune is swept from 0.45 to 0.85 V. Therefore, $\theta_{gap} = 2\tan^{-1}(5.5/32) = 0.34$ rad $= 19.5°$ and (2) predicts that the highest resolution that the phase shifter could produce is 4 b, which corresponds to a phase step of 22.5°. To improve

Fig. 5. Simulation results for amplitude and phase variations of the OTA versus control voltage.

the resolution of the phase shifter, the second scaling circuit is used to reduce the value of $G_m$. That circuit is described further below after the SPDT switches.

C. SPDT Switches

Two SPDT switches, connected to the OTAs’ outputs, choose between the four quadrants. To keep the insertion loss of the switches below 1 dB, two series nMOS transistors, M7 and M8, are used to provide a low channel resistance.


the output path, and only one switch is turned ON at any given time. If there are M identical resistors in the network and each has a value R/M, then the output voltage when the switch in the kth branch is activated is

$$\bar{v}_I = \left(\frac{kR}{M} \parallel Z_{in}\right) i_I \approx \left(\frac{kR}{M}\right) i_I \tag{3}$$

and the approximation holds if $R \ll Z_{in}$, which is easily satisfied if R is in the hundreds of ohms or a few kilohms, because the input terminals to the summing junction are nMOS gates. Recalling that $i_I = G_m v_{b,I}$, the overall scaling factor between the signals $\bar{v}_I$ and $v_{b,I}$ when the switch in branch k is activated is

$$A_v = k\left(\frac{R}{M}\right) G_m~\text{V/V} \tag{4}$$

and the expressions for the maximum and minimum vector-scaling factors are

$$A_{v,max} = R\,G_{max}\Big|_{k=M} \tag{5a}$$

$$A_{v,min} = \left(\frac{R}{M}\right) G_{min}\Big|_{k=1}. \tag{5b}$$

Using (2) and (5), the new bit resolution, $n'$, of the phase shifter due to the fine-scaling resistive scaling network is

$$n' = \left\lfloor \log_2\left[\frac{\pi}{\tan^{-1}\left(\dfrac{(R/M)\,G_{min}}{R\,G_{max}}\right)}\right] \right\rfloor \tag{6a}$$

$$\approx \left\lfloor \log_2\left(\frac{\pi M}{G_{min}/G_{max}}\right) \right\rfloor. \tag{6b}$$

Fig. 6. Schematic of series–shunt SPDT switch.
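To illustrate how the fine-scaling network changes the achievable resolution, the short Python sketch below evaluates (6a) and its small-angle approximation (6b) for several branch counts M. The values of M are hypothetical and chosen only for illustration; the gain limits (5.5 and 32 mS) are the simulated OTA range quoted in Section III-B, and the function names are not from the paper.

```python
import math


def fine_resolution_exact(g_min: float, g_max: float, m: int) -> int:
    """n' from (6a): floor(log2(pi / atan(((R/M) * G_min) / (R * G_max)))).

    R cancels in the ratio, so only G_min, G_max, and M are needed.
    """
    theta_min = math.atan((g_min / m) / g_max)
    return math.floor(math.log2(math.pi / theta_min))


def fine_resolution_approx(g_min: float, g_max: float, m: int) -> int:
    """n' from (6b): floor(log2(pi * M / (G_min / G_max))), using atan(x) ~ x."""
    return math.floor(math.log2(math.pi * m / (g_min / g_max)))


g_min, g_max = 5.5e-3, 32e-3          # OTA gain limits from Section III-B
branch_counts = (1, 2, 4, 8)          # hypothetical values of M

exact = [fine_resolution_exact(g_min, g_max, m) for m in branch_counts]
approx = [fine_resolution_approx(g_min, g_max, m) for m in branch_counts]

for m, ne, na in zip(branch_counts, exact, approx):
    print(f"M = {m}: n' = {ne} b (6a), {na} b (6b)")
```

Under these assumptions, M = 1 reproduces the 4-b coarse-scaling limit, and each doubling of M adds one bit, as expected from the logarithmic dependence on M in (6b).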

Increasing the size of M7 and M8, consequently, degrades the isolation of the switch in the OFF states. As shown in Fig. 6, two shunt transistors are utilized here to form a series–shunt SPDT switch and compensate for the isolation degradation caused by the larger parasitic capacitances of the series switches. Furthermore, deep n-well nMOS transistors, designed with the body-floating technique [16], are used to isolate the transistors’ bodies from the substrate and to lower the insertion loss of the switch. The simulated results show that the insertion loss of the SPDT switch is 0.65 μm) by the SiO2 compressive stress (∼150 MPa). After the whole process is completed, the deflection of the overall structure (SiO2 6.5 μm/PR 3 μm) is τSPDT ), where τSPDT is the time constant of SPDT. For low loss, two 40-μm/300-nm devices are cascaded in series. As shown in Fig. 9, the insertion loss can be bounded by 1.8 dB up to 10 GHz. In simulation, the switching time from receive mode to transmit mode is 80 ps, while switching from transmit mode to receive mode requires 26 ps. Since the SPDT switch is implemented with thick-oxide devices, a 2.5 V digital control bit is used to configure the chip to work in the Tx mode or in the Rx mode.

B. PCR Transmitter and QPSK Modulator

Fig. 8 describes the proposed PCR Tx and QPSK modulator. The I/Q channels are summed in the current domain. In the PCR mode, 3/5/7-b Barker codes modulate the 5.9-GHz IF clock to form the Tx pulse signal. Inductor peaking techniques extend the BW beyond 10 GHz and compensate for the SPDT loss at high frequency. In the communication mode, the Barker codes are replaced by QPSK data. Then, the PCR Tx becomes a conventional IF modulator.

Fig. 10. 7-b Barker code modulator and correlator.

Fig. 11. Die photo of proposed IF-correlation system.

C. IF Correlator and QPSK Demodulator

Since the pulse rate is up to 1.5 Gb/s, the demodulator/correlator input ports are designed with a BW of more than 10 GHz for step response settling (Ts > 6 τSPDT). Here, 6τ ensures that at least 7-b periods settle for the pulse signal, which leaves margin for accurate correlation and 5-b analog-to-digital converter (ADC) detection. Note that the input capacitor of the demodulator/correlator increases the time constant of the SPDT. The demodulator and analog correlator are reconfigurable, as shown in Fig. 8. First, the analog correlator is a block with IF inputs but low-frequency outputs. This allows signal processing at a later stage, for example after the ADC.

Second, the correlator is working as the second downconversion mixer if the template signal is pure IF clock. Therefore, we propose a two stage analog correlator in this paper. In the

LI et al.: 3-Gb/s RADAR SIGNAL PROCESSOR USING IF-CORRELATION TECHNIQUE

Fig. 12. (a) Tx test setup. (b) Rx single tone test. (c) Transceiver link test setup with two chips.

PCR mode, the I/Q clock signal is modulated by a Barker code, and the template signals are generated with the carrier frequency of 5.9 GHz. In the communication mode, the Barker code ports in the template generator are biased with VDD and GND and the circuit becomes a clock buffer. Then, the second stage becomes a conventional downconversion mixer. Fig. 10 shows the simulations of the modulator/correlator for a 7-b Barker code. The I/Q baseband data Tx-I and Tx-Q are the sequence +1, +1, +1, −1, −1, +1, −1. The repetition period is 50 ns, and the symbol period is 1 ns. The transmitted pulse Tx-IF has a 250-mVpk–pk signal swing, while the SPDT insertion loss is 1.8 dB at the IF frequency. Assuming that the link loss is compensated by a variable gain amplifier (VGA) and that the VGA provides 6 dB of additional gain, the received signal has a 500-mVpk–pk swing and is correlated with the upconverted template signals TEMP-I-IF and TEMP-Q-IF to generate OUT-I and OUT-Q. Each channel of the correlator output has a −160 to 160-mV differential signal range. This sets the ADC LSB to be around 16 mV. The integration BW is adaptive to the signal BW. For communication, the template generator is disabled and the correlator works as a demodulator. The integrator is configured with the smallest capacitor to increase the BW back to 1.5 GHz.
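The pulse-compression property that the correlator relies on can be seen in an idealized baseband sketch. The Python example below correlates the 7-b Barker sequence quoted above with a matching template; it is only a discrete, noise-free illustration and does not model the chip's IF carrier, analog integrator, or I/Q phase misalignment. The helper name `correlate` is an assumption introduced here.

```python
import math


def correlate(received, template):
    """Aperiodic cross-correlation of two sequences over all lags."""
    n, m = len(received), len(template)
    out = []
    for lag in range(-(m - 1), n):
        acc = 0
        for k in range(m):
            idx = lag + k
            if 0 <= idx < n:
                acc += received[idx] * template[k]
        out.append(acc)
    return out


# 7-b Barker code as given in the text: +1 +1 +1 -1 -1 +1 -1
BARKER7 = [1, 1, 1, -1, -1, 1, -1]

acf = correlate(BARKER7, BARKER7)
zero_lag = len(BARKER7) - 1                      # index of the aligned (zero) lag
peak = acf[zero_lag]
sidelobe = max(abs(v) for i, v in enumerate(acf) if i != zero_lag)
ratio_db = 20 * math.log10(peak / sidelobe)

print("autocorrelation:", acf)
print(f"peak = {peak}, max sidelobe = {sidelobe}, ratio = {ratio_db:.1f} dB")
```

The aligned lag sums all seven chips coherently while every other lag has magnitude at most one, which is why a short Barker template yields a sharp correlation peak for peak detection.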

IV. MEASUREMENTS

The proposed system circuits are implemented with 90-nm CMOS devices. The chip microphotograph is shown in Fig. 11 and has a measured area of 1.74 mm². The circuit operates with a 1.3 V supply for the transceiver and a 2.5 V supply for the SPDT switch. In the PCR mode, the proposed system consumes 79 mW, including 25 mW for the modulator, 26 mW for the correlator, and 28 mW for the current mode logic (CML) divider and clock buffers. In the communication mode, the correlator is reconfigured as a demodulator and consumes 20 mW, while the transceiver system consumes 69 mW. The additional 4-mW reduction comes from lowering the bias currents of the I/Q template generators, clock buffers, and CML divider circuits. Since the system works in TDD mode, the maximum power consumption is 54 mW in the PCR mode and 49 mW in the point-to-point communication mode. The system measurement setups are shown in Fig. 12(a)–(c). Two chips are configured in Tx mode and Rx mode. An Agilent 81134 pulse pattern generator (PPG) provides the Barker code and I/Q modulation signal. A signal generator (E4438C) supplies an 11.3- or 11.9-GHz clock. The system is evaluated in three steps. Fig. 12(a) describes the test setup of the Tx mode. The transmitted Barker code and QPSK

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 64, NO. 7, JULY 2016

Fig. 13. Received 7-b Barker code at 1 Gb/s.

Fig. 14. Total channel loss and IF amplifier gain.

Fig. 15. I/Q correlations of 7-b Barker code at 1 Gb/s.

signal are measured with an Agilent E4448A spectrum analyzer. Fig. 12(b) shows the baseband test setup of the Rx mode. The Rx gain is measured with a single-tone input. An Agilent MSO8104A real-time oscilloscope is used to evaluate the I/Q gain and phase mismatch. Fig. 12(c) describes the transceiver setup with two chips. Two 180° hybrid couplers and one IF amplifier are inserted to build the Tx/Rx link. One coupler covers 2–8 GHz (Narda model 4343), and the other covers 2–18 GHz (Krytar model 402180). Since this link introduces phase imbalance, the received I/Q signals have gain and phase error. In the QPSK transceiver mode, the template I/Q signal is biased with VDD or GND. In the PCR mode, another Agilent PPG 81134 generates the Rx template Barker codes (3/5/7-b length) at speeds from 200 Mb/s to 1.5 Gb/s.

A. Pulse Compression Radar Mode

For PCR measurements, the IF carrier frequency is tuned to 5.93 GHz. Fig. 13 shows the 7-b Barker code BPSK at 1 Gb/s. The I/Q channel data share the same code, but the signals are filtered by a low-pass 1.05-GHz filter (Mini-Circuits VLFX-1050+). As shown in Fig. 13, a signal greater than 500 mVpk–pk is received for the correlator. Fig. 14 shows the total channel loss and IF amplifier gain. The loss includes bonding wire, 5.5-cm PCB routing, hybrid coupler, connectors, and cables. As shown in Fig. 14, the total loss is 12 dB at 6 GHz. So the de-embedded output swing is around 200 mVpk–pk. The local upconverted template signal has a 400-mVpk–pk swing in simulation. The maximum output voltage from the correlator ranges between −160 and 160 mV, which means 16-mV ADC resolution is required. A 4-b binary-coded capacitor bank is designed for different data rates. Therefore, the integrating capacitor can be adapted to the signal BW. Due to the bonding wire, output PCB routing, hybrid coupler, and IF amplifier, dc wander does exist and could be eliminated through common-mode circuit techniques to prevent distortion in the transmitted Barker code.

Fig. 16. SLR measurement for a 7-b Barker code at 1 Gb/s in the presence of phase misalignment.

An Agilent 81134A PPG generates the template code for the Rx chip. Fig. 15 describes the 7-b Barker code correlation with β of 0° and 30°. Due to the mismatch of the I/Q conversion gain, the amplitude has around 3% error.

LI et al.: 3-Gb/s RADAR SIGNAL PROCESSOR USING IF-CORRELATION TECHNIQUE


Fig. 17. PCR mode with Barker code. (a)–(c) 200-Mb/s transmitted codes and (d)–(f) 200-Mb/s I/Q correlations. (g)–(i) 1.5-Gb/s transmitted codes and (j)–(l) 1.5-Gb/s I/Q correlations. (m)–(o) SLR performances.


Fig. 18. Modulated IF spectrum generated at 1.6 and 3 Gb/s.

IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, VOL. 64, NO. 7, JULY 2016

In addition, high-frequency ripple is present in the correlation result due to LO leakage. Fig. 16 describes the SLR measurement for a 7-b Barker code in the presence of phase misalignment. With the proposed bandpass I/Q correlator, the SLR amplitude drops by 8% in peak detection. The transmitted 3-, 5-, and 7-b Barker codes at 200 Mb/s and 1.5 Gb/s are shown in Fig. 17(a)–(c) and (g)–(i), respectively. The I/Q channel correlation results are shown in Fig. 17(d)–(f) and (j)–(l). With the proposed I/Q architecture, the SLR peak calibration in postsignal processing is relaxed. Fig. 17(m)–(o) shows the SLR performance with 3-, 5-, and 7-b code lengths at 200 Mb/s and 1.5 Gb/s. The amplitude and the phase are calculated according to (9). With a 1.5-Gb/s Barker code, the integration results are more sensitive to the template data switching. Due to the low gain of the Rx, a 600-mVpk–pk template swing is set in the measurements. A lower swing would reduce the ripples and switching effects.

B. Demodulator Mode

Two proposed baseband circuits are connected back-to-back, where one circuit is configured as the modulator and the other as the demodulator. An amplifier is inserted between the two circuits to model the VGA and provide a 450-mVpk–pk signal swing for the demodulator. As shown in Fig. 12(c), two hybrid couplers are inserted between the Tx/Rx chips and the IF amplifier. The proposed circuits are measured at data rates of 1.6 and 3 Gb/s with the IF carrier frequency at 5.65 GHz. Figs. 18 and 19 show the measured single-ended spectrum of the Tx and Rx with QPSK waveforms. The I/Q signals are generated by two $2^{31}-1$ pseudorandom binary sequence channels. The Tx spectrum shows additional loss from 6.1 to 7 GHz. The Rx spectrum has more loss at high frequency due to the link setup, PCB parasitic capacitance, bonding wire inductance, and SPDT ON-state resistance. An additional high-rejection filter to suppress the aliasing spectrum and clock

Fig. 19. Demodulated baseband spectrum at 1.6 and 3 Gb/s.

Fig. 20. Normalized small-signal conversion gain on the I and Q channels.

feedthrough is required if QAM modulation is supported in data communication mode. To determine the 3-dB BW of the Rx, a small-signal measurement is performed, as shown in Fig. 20. The template generator is configured to provide a sinusoidal IF clock, and the IF-to-baseband conversion gain is measured. Both channels behave consistently with a 3-dB BW of 1.4 GHz, but a 1-dB amplitude mismatch exists between the channels. Fig. 21 shows the I/Q mismatch of the demodulator. The single-ended 0.5-GHz demodulated I/Q signals are plotted when the IF signal is at 5.15 GHz and the LO is at 5.65 GHz. The phase error is 3.5°, of which 2.5° is attributed to the PCB routing and hybrid coupler: the wideband 180° hybrid coupler has a 0.5-dB amplitude error and a 1° phase error, and the PCB routing at the IF input and I/Q outputs accounts for the other 1.5°. The power consumption at a peak data rate of 3 Gb/s is 25 mW in modulator mode and 20 mW in demodulator mode.
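One way to extract the gain and phase mismatch from captured I and Q sinusoids, such as the 0.5-GHz traces described above, is a single-bin DFT at the known tone frequency. The Python sketch below shows one such estimator; it is not necessarily the procedure used with the oscilloscope in these measurements. The function names, the 8-GS/s sample rate, and the synthetic 1-dB / 3.5° test signal are assumptions chosen for illustration.

```python
import cmath
import math


def tone_phasor(samples, freq_hz, fs_hz):
    """Single-bin DFT: complex amplitude of the tone at freq_hz."""
    n = len(samples)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / fs_hz)
        for k, x in enumerate(samples)
    )
    return 2 * acc / n


def iq_mismatch(i_samples, q_samples, freq_hz, fs_hz):
    """Return (gain mismatch in dB, quadrature phase error in degrees)."""
    p_i = tone_phasor(i_samples, freq_hz, fs_hz)
    p_q = tone_phasor(q_samples, freq_hz, fs_hz)
    gain_db = 20 * math.log10(abs(p_q) / abs(p_i))
    # In ideal quadrature, Q lags I by exactly 90 degrees.
    phase_err_deg = math.degrees(cmath.phase(p_q / p_i)) + 90.0
    return gain_db, phase_err_deg


# Synthetic check: 160 samples span 10 full cycles, so the estimate is exact.
FS_HZ, F0_HZ, N_SAMPLES = 8e9, 0.5e9, 160
g = 10 ** (1.0 / 20)          # assumed 1-dB amplitude mismatch
phi = math.radians(3.5)       # assumed 3.5-degree quadrature error
i_sig = [math.cos(2 * math.pi * F0_HZ * k / FS_HZ) for k in range(N_SAMPLES)]
q_sig = [g * math.sin(2 * math.pi * F0_HZ * k / FS_HZ + phi) for k in range(N_SAMPLES)]
gain_db, phase_err_deg = iq_mismatch(i_sig, q_sig, F0_HZ, FS_HZ)
```

The record length is chosen to cover an integer number of tone cycles. Otherwise spectral leakage would bias the estimate, and a window or a least-squares sine fit would be a better choice.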


TABLE II PERFORMANCE SUMMARY AND COMPARISON

Fig. 21. Measured 500-MHz I and Q signals.

Since the hybrid couplers have large amplitude and phase errors over a wide BW, the error vector magnitude (EVM) in pass-through mode is evaluated with 100 MHz. Fig. 22 shows the QPSK 100-MS/s performance with the link setup of Fig. 12(c). The demodulated signal has a 60-mV swing for each channel. I/Q calibration is applied in MATLAB according to [18]. This EVM includes the noise from the Tx, the Rx, and the two IF LO signals. The wideband EVM is mainly limited by the channel performance of the PCB design. For a fully integrated system, the IF signal would be upconverted to 77/79 GHz on chip.

Fig. 22. QPSK 100-MS/s EVM performance.
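As a point of reference for the EVM figure, an RMS EVM for QPSK can be computed from demodulated symbols as sketched below in Python. The sketch assumes decision-directed detection and symbols already scaled to a unit-power constellation. It is not the MATLAB post-processing used for these measurements and omits the I/Q calibration of [18]. The names `QPSK`, `nearest_point`, and `evm_rms_percent` are hypothetical.

```python
import math

# Ideal unit-power QPSK constellation.
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]


def nearest_point(symbol):
    """Decision-directed reference: closest ideal constellation point."""
    return min(QPSK, key=lambda p: abs(symbol - p))


def evm_rms_percent(received):
    """RMS EVM in percent, normalized to the mean reference symbol power."""
    err_power = 0.0
    ref_power = 0.0
    for r in received:
        ideal = nearest_point(r)
        err_power += abs(r - ideal) ** 2
        ref_power += abs(ideal) ** 2
    return 100 * math.sqrt(err_power / ref_power)


if __name__ == "__main__":
    # Illustrative input: every symbol displaced by 0.05 along the I axis.
    rx = [p + 0.05 for p in QPSK]
    print(f"EVM = {evm_rms_percent(rx):.2f} %")
```

Normalizing to the mean reference power is one common convention; normalizing to the peak constellation power gives the same result for QPSK because all four points have equal magnitude.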

Then, 3 GHz is only a small fraction (about 4%) of the carrier frequency, and the channel can be assumed to be flat. Table II summarizes the performance of this work. The signal BW is reconfigurable from 200 MHz to 1.5 GHz for PCR. The QPSK modulation supports a data rate of up to 3 Gb/s. Compared with the previous work [13], the proposed double-downconversion system uses ac coupling between the VGA and baseband circuits, so an offset calibration algorithm is not required. The 5.65- and 5.93-GHz LO signals in the proposed frequency plan can be implemented with a wideband fractional-N frequency


synthesizer [19]. Compared with [12], [20], and [21], higher data rates have been demonstrated.

V. CONCLUSION

This paper presents a dual-mode radar signal processor using an IF-correlation technique. The baseband circuitry supports PCR mode and communication mode, which interfaces with time-division duplex frontends. The signal processing circuit includes an I/Q clock generator, a highly linear SPDT, a QPSK modulator, a template generator, and a reconfigurable QPSK demodulator/IF correlator. This chip is implemented with 90-nm CMOS devices and it consumes 54 mW with BW of 1.5 GHz for 10-cm range resolution. In communication mode, it consumes 49 mW at a peak data rate of 3 Gb/s. Since analog correlation reduces ADC sampling rate to 1/N in the PCR mode, significant power and area are saved by the proposed architecture, which leads to a low cost solution. Moreover, IF correlation removes the complex calibration and improves the detection latency. The emergence of dual-mode sensing and communication in mmWave bands could provide an enabling layer of network intelligence.

ACKNOWLEDGMENT

The authors would like to thank C. Levy for the discussion and H. Wang for the PCB assembling. The authors would also like to thank the California Institute for Telecommunications and Information Technology (Calit2), UCSD, for providing measurement equipment.

REFERENCES

[1] V. Jain and P. Heydari, Automotive Radar Sensors in Silicon Technologies. New York, NY, USA: Springer, 2012.
[2] J. Lee, Y.-A. Li, M.-H. Hung, and S.-J. Huang, “A fully-integrated 77-GHz FMCW radar transceiver in 65-nm CMOS technology,” IEEE J. Solid-State Circuits, vol. 45, no. 12, pp. 2746–2756, Dec. 2010.
[3] T. Mitomo, N. Ono, H. Hoshino, Y. Yoshihara, O. Watanabe, and I. Seto, “A 77 GHz 90 nm CMOS transceiver for FMCW radar applications,” IEEE J. Solid-State Circuits, vol. 45, no. 4, pp. 928–937, Apr. 2010.
[4] H. Jia et al., “A frequency doubling two-path phased-array FMCW radar transceiver in 65 nm CMOS,” in Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC), Xiamen, China, Nov. 2015, pp. 1–4.
[5] D. Guermandi et al., “A 79 GHz binary phase-modulated continuous-wave radar transceiver with TX-to-RX spillover cancellation in 28 nm CMOS,” in IEEE Int. Solid-State Circuits Conf. (ISSCC), Dig. Tech. Papers, San Francisco, CA, USA, Feb. 2015, pp. 1–3.
[6] J.-L. Kuo et al., “60-GHz four-element phased-array transmit/receive system-in-package using phase compensation techniques in 65-nm flip-chip CMOS process,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 3, pp. 743–756, Mar. 2012.
[7] H. J. Ng, R. Feger, and A. Stelzer, “A fully-integrated 77-GHz UWB pseudo-random noise radar transceiver with a programmable sequence generator in SiGe technology,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 8, pp. 2444–2455, Aug. 2014.
[8] T. Bryllert, V. Drakinskiy, K. B. Cooper, and J. Stake, “Integrated 200–240-GHz FMCW radar transceiver module,” IEEE Trans. Microw. Theory Techn., vol. 61, no. 10, pp. 3808–3815, Oct. 2013.
[9] T. Kijsanayotin, J. Li, and J. F. Buckwalter, “A 70 GHz bidirectional front-end for a half-duplex transceiver in 90-nm SiGe BiCMOS,” in Proc. IEEE Compound Semiconductor Integr. Circuit Symp. (CSICS), New Orleans, LA, USA, Oct. 2015, pp. 1–4.
[10] L. Kuang et al., “A fully integrated 60-GHz 5-Gb/s QPSK transceiver with T/R switch in 65-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 12, pp. 3131–3145, Dec. 2014.
[11] L. Kuang, B. Chi, L. Chen, M. Wei, X. Yu, and Z. Wang, “An integrated 60 GHz 5 Gb/s QPSK transmitter with on-chip T/R switch and fully-differential PLL frequency synthesizer in 65nm CMOS,” in Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC), Singapore, Nov. 2013, pp. 413–416.
[12] J. Li, H. Mukai, M. Parlak, M. Matsuo, and J. F. Buckwalter, “A 1 Gb/s reconfigurable pulse compression radar signal processor in 90 nm CMOS,” in Proc. IEEE Custom Integr. Circuits Conf., San Jose, CA, USA, Sep. 2013, pp. 1–4.
[13] J. Li, M. Parlak, H. Mukai, M. Matsuo, and J. F. Buckwalter, “A reconfigurable 50-Mb/s-1 Gb/s pulse compression radar signal processor with offset calibration in 90-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 63, no. 1, pp. 266–278, Jan. 2015.
[14] H. Darabi, H. J. Kim, J. Chiu, B. Ibrahim, and L. Serrano, “An IP2 improvement technique for zero-IF down-converters,” in IEEE Int. Solid State Circuits Conf.-Dig. Tech. Papers, San Francisco, CA, USA, Feb. 2006, pp. 1860–1869.
[15] J. He, Y.-Z. Xiong, and Y. P. Zhang, “Analysis and design of 60-GHz SPDT switch in 130-nm CMOS,” IEEE Trans. Microw. Theory Techn., vol. 60, no. 10, pp. 3113–3119, Oct. 2012.
[16] M. Uzunkol and G. Rebeiz, “A low-loss 50–70 GHz SPDT switch in 90 nm CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 10, pp. 2003–2007, Oct. 2010.
[17] T. M. Hancock and G. M. Rebeiz, “Design and analysis of a 70-ps SiGe differential RF switch,” IEEE Trans. Microw. Theory Techn., vol. 53, no. 7, pp. 2403–2410, Jul. 2005.
[18] P.-Y. Wu, A. K. Gupta, and J. F. Buckwalter, “A dual-band millimeter-wave direct-conversion transmitter with quadrature error correction,” IEEE Trans. Microw. Theory Techn., vol. 62, no. 12, pp. 3118–3130, Dec. 2014.
[19] Y. Sun et al., “A 2.74–5.37 GHz boosted-gain type-I PLL with