Reliability of High-Power Mechatronic Systems 2: Aerospace and Automotive Applications: Issues, Testing and Analysis [2] 1785482610, 9781785482618

This second volume of a series dedicated to the reliability of high-power mechatronic systems focuses specifically on is


English Pages 302 [293] Year 2017


Table of contents :
Cover
Reliability of High-Power Mechatronic Systems 2: Aerospace and Automotive Applications: Issues, Testing and Analysis
Copyright
Foreword 1
Foreword 2
Preface
1 Accelerated Life Testing
2 Highly Accelerated Testing
3 Reliability Study for Cuboid Aluminum Capacitors with Liquid Electrolyte
4 The Reliability of Components: A New Generation of Film Capacitors
5 Reliability and Qualification Tests for High-Power MOSFET Transistors
6 Fault Diagnosis in a DC/DC Converter for Electric Vehicles
7 Methodology and Physicochemical Characterization Techniques Used for Failure Analysis in Laboratories
8 Reliability Study of High-Power Mechatronic Components by Spectral Photoemission Microscopy
9
Index
Back Cover


Reliability of High-Power Mechatronic Systems 2

Series Editor Abdelkhalak El Hami

Reliability of High-Power Mechatronic Systems 2
Aerospace and Automotive Applications: Issues, Testing and Analysis

Edited by
Abdelkhalak El Hami
David Delaux
Henri Grzeskowiak

First published 2017 in Great Britain and the United States by ISTE Press Ltd and Elsevier Ltd

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address: ISTE Press Ltd 27-37 St George’s Road London SW19 4EU UK

Elsevier Ltd The Boulevard, Langford Lane Kidlington, Oxford, OX5 1GB UK

www.iste.co.uk

www.elsevier.com

Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. For information on all our publications visit our website at http://store.elsevier.com/ © ISTE Press Ltd 2017 The rights of Abdelkhalak El Hami, David Delaux and Henri Grzeskowiak to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988. British Library Cataloguing-in-Publication Data A CIP record for this book is available from the British Library Library of Congress Cataloging in Publication Data A catalog record for this book is available from the Library of Congress ISBN 978-1-78548-261-8 Printed and bound in the UK and US

Foreword 1

Predicting and then guaranteeing the reliability of an electronic system is a major challenge for manufacturers in the automotive, aerospace and defense sectors, as well as those of railway, telecommunications, nuclear and health, amongst others. Above all, however, it is important for us, the daily users of such equipment, who must have absolute confidence in the information being transmitted and the decisions made in real time. The increasing development of connected objects (autonomous vehicles, home automation, etc.) will lead to a drastic reduction of human intervention in favor of intervention by mechatronic systems. These systems can only be deployed if users have absolute confidence in the reliability of the equipment. This equipment can be broken down into its two major parts: firstly, the hardware, which is mainly composed of electronic boards (coupled with the mechanical systems), and secondly, the real-time software that operates the equipment and carries out the tasks expected of it. Predicting and ensuring the reliability of electronic equipment is a task that is both immense and without end. On the one hand, the number and diversity of components used to build these boards is very high; on the other hand, the new features of such innovative equipment require multiple tests of their inherent reliability and robustness. Faced with this immense task, a few large industrial companies (such as Thales Air Systems, Valéo, Safran, NXP) and SMEs (the likes of Areelis, MB Electronique, Ligeron, statXpert, Lescate, Serma, PAK), supported by private laboratories (including CEVAA, Analyses et Surface) and public entities (including LNE, GPM, LAMIPS, INSA Rouen), have embarked on a


process of setting up resources and skills dedicated to the reliability of high-power mechatronic components and systems. This association of complementary partners made its debut in the framework of the first program dedicated to reliability, AUDACE or “Analyse des caUses de DéfaillAnce des Composants des systèmes mécatronique Embarqués” (in English: analysis of the causes of failure of the components of embedded mechatronic systems). This initial project was a great success. It made it possible to create strong links between the various partners and to set up methods of analysis and measurement that perform extremely well. However, at the same time, it also highlighted the immense scale of the task and the diversity of components and technologies to be mastered. At the end of the first contract, the collective decided to continue the groundbreaking work through a second program: FIRST-MFP or “Fiabiliser et Renforcer des Systèmes Technologiques mécatroniques de forte puissance” (in English: improving the reliability and robustness of high-power mechatronic technological systems), in order to address the components specific to power electronics. Indeed, at these power levels (ranging from a few kW to several hundred kW), the electronics must be able to cope with stresses that could otherwise lead to fatigue failures not commonly encountered in low-power electronics. Numerical modeling, multi-physics testing and the consideration of multiple sources of uncertainty led to the development of this follow-up research program, FIRST-MFP. With the support of the competitiveness clusters Astech and MOV'EO, the Aéronautique Normande NAE, the regions of Normandy and Ile de France, as well as the Chambers of Commerce of Rouen and Versailles, this program was implemented and has since achieved exceptional results.
In order to share these results not only with the economic actors involved in the reliability of systems, but also with students in the fields of electronic, mechanical and materials research, it was decided to record all of the results of this program and publish them in book form. In fact, given the richness of the results, it was decided that two books would be better suited to the task. With this in mind, I would like to warmly thank Messrs. Abdelkhalak EL HAMI, David DELAUX and Henri GRZESKOWIAK for their remarkable work on these two volumes, as well as all the participants in the FIRST-MFP program who


spent many hours collating their results into a format that could be more easily presented in this production. As such, it should go without saying that the essential information presented here does not remain the property of a few, but rather is shared by numerous engineers, technicians, researchers and students. Volume 1 is devoted to the presentation of various issues and deals with the modeling and simulation aspects that are essential to predicting the reliability performance of future electronic systems. Volume 2 is the compilation of aggravated and accelerated tests carried out on different types of components and high-power subsystems. Together in these two volumes you will find information that is essential and indispensable for the innovation of future equipment that will be integrated into the cars, planes and helicopters of tomorrow. I would like to thank all the contributors of this program as well as the financiers (both national and regional) without whom this project could not have succeeded. It is my deepest wish that the solid alliance which came about as a result of these two programs, AUDACE and FIRST-MFP, continues its association in view of the many emerging technologies whose reliability must be evaluated.

Philippe EUDELINE

Foreword 2

The World of Harsh Environments and High Reliability Demands: Challenges and Solutions

The importance of quality and reliability to a system can hardly be disputed. Product failures in the field inevitably lead to losses in the form of repair costs, warranty claims, customer dissatisfaction, product recalls, loss of sale, and in extreme cases, loss of life. Along with continuously increasing electronic content in vehicles, airplanes, trains, appliances and other devices, electronic and mechanical systems are becoming more complex with added functions and capabilities. Needless to say that this trend is making the jobs of design and reliability engineers increasingly challenging, which is confirmed by the growing number of automotive safety recalls. These recalls are triggering an increasing number of changes for preventative measures with OEMs and government regulators producing a number of functional safety standards and other government and industry regulations, all demanding unprecedented levels of quality, reliability and safety in future electronic systems. Besides the human life aspect of safety recalls, these automotive campaigns cost millions or sometimes billions of dollars, which can eventually put a company out of business. The present book Reliability of High Power Mechatronic Systems, edited by A. EL HAMI, D. DELAUX and H. GRZESKOWIAK, is intended to expand our knowledge in the field of reliability in general and in Automotive and Aeronautical applications in particular. New developments in the automotive industry are focusing on three major directions: vehicle autonomy, connectivity and mobility. This brings forward


further challenges and the need for further advancements in the areas of software reliability, automotive vision systems, vehicle prognostics, driver behavior, cyber security, advanced driver assistance systems, sensor fusion, machine learning and other related fields. On top of that, the ever-increasing demand for “intelligent” safety features and improved comfort in vehicles has led to a corresponding boom in mechatronics. The mechatronic systems (fusion of mechanical, electronic and computer systems) presented in this book are revolutionizing the automotive industry. Application of these devices in the automotive, aerospace, defense and other industries, where products are expected to be subjected to harsh environments such as vibration, mechanical shock, high temperatures, thermal cycling, high humidity, corrosive atmosphere and dust, adds another layer of complexity to the product design and validation process. The goal of meeting the product specifications and the need to assess the future product’s reliability even before the hardware is built, brings forward the importance of understanding the physics of how devices work and especially how they fail. Physics of Failure (PoF) is a necessary approach to the design of critical components which often utilizes accelerated tests based on validated models of degradation. This understanding of failure modes and failure mechanisms is critical to a successful Design for Reliability (DfR) process as opposed to a more conventional test-analyze-and-fix approach which is still often practiced in many industries. DfR is the process of building reliability into the design using the best available science-based methods, which is quickly becoming a must in the age of relentless cost cutting and development cycle time reduction. In the quest to reduce carbon emissions and save energy, the production of hybrid and electric vehicles has been continuously growing, accelerating further development of power electronics systems. 
This, combined with the development of self-driven vehicles, will require more powerful advanced Integrated Circuits (ICs). The large packages and higher power dissipation of these advanced ICs present thermal and thermo-mechanical expansion-contraction fatigue challenges. The continuous trend of IC feature size reduction potentially presents a reverse trend in the reliability and longevity of these devices. Smaller and faster circuits with an increasing number of transistors cause higher current densities, lower voltage tolerances and higher electric fields, making ICs more susceptible to wear-out type failure mechanisms. In variable frequency motor drive applications, commonly used in hybrid and electric vehicles, Insulated Gate Bipolar Transistor


(IGBT) modules are widely used power semiconductor devices. The fast switching characteristics make IGBT-based converters more and more attractive for a variety of power electronics applications. The severe environmental conditions and the stringent requirements in terms of system availability and maintainability impose high reliability levels on single IGBT modules. An important requirement covered in this book (Volume 1: Simulation, Modeling and Optimization) is the ability to withstand power cycles. Hybrid and electric vehicles experience a large number of power cycles (up to a million) during their lifetime, with high voltage and/or high current and heavy transient loadings which cause temperature changes, leading to mechanical stresses that can result in a failure. Hence IGBTs are susceptible to thermo-mechanically activated failure mechanisms, in particular to the bond wire lift-off mechanism, leaving room for reliability improvement of IGBTs in a number of applications. The reliability of individual components was brought back into focus with the proliferation of functional safety standards, where reliability prediction based on the failure rates of the individual components is required to assess the Safety Integrity Level (SIL) or ASIL for Automotive standards. It is important to note that meeting these SIL requirements often requires the failure rates to be in the single- or double-digit FIT range (failures per billion device-hours), which is 1–2 orders of magnitude lower than what was expected a couple of decades ago. Despite continuous growth in automotive electronics, the consumer electronics industry is still the main driver of the IC market. This presents an additional challenge to the “harsh environment industries”, creating situations where it is often difficult to find automotive grade parts suitable to withstand high temperatures, vibration and other environmental stresses.
This forces design engineers to search for new solutions, often adding air or liquid cooling of high power dissipating ICs, hence increasing the complexity of the system and making it difficult to accelerate the testing of these systems. In some of the applications the maximum operating temperatures are now approaching maximum allowable temperatures for the operation of silicon circuits, thus making test acceleration more difficult and sometimes impossible. Therefore, degradation analysis and prognostics may become the way of product testing and validation in the future. Overall, the limitation of physical testing will lead to more reliance on modeling, requiring better understanding of internal design, failure modes and failure mechanisms, many of which are covered in volume 2 of this book (Volume 2: Issues, Testing and Analysis).
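
As a rough illustration of the FIT figures mentioned above, the following Python sketch converts a failure rate expressed in FIT into a mean time between failures and into a probability of failure over a service life. It assumes a constant failure rate (exponential model), which is a simplification; the function names and the 8,000-hour service life are hypothetical and chosen only for illustration.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def fit_to_mtbf_hours(fit):
    """Mean time between failures (hours) for a constant failure rate given in FIT."""
    return HOURS_PER_FIT_UNIT / fit


def failure_probability(fit, hours):
    """Probability that one device fails within `hours` (exponential model, an assumption)."""
    rate_per_hour = fit / HOURS_PER_FIT_UNIT
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-rate_per_hour * hours)


if __name__ == "__main__":
    service_life_hours = 8000  # hypothetical operating life, for illustration only
    for fit in (5, 50, 500):
        mtbf = fit_to_mtbf_hours(fit)
        p_fail = failure_probability(fit, service_life_hours)
        print(f"{fit:>4} FIT -> MTBF {mtbf:.2e} h, P(failure in {service_life_hours} h) = {p_fail:.2e}")
```

Under these assumptions, moving from a three-digit to a single-digit FIT value lowers the per-device failure probability over the same service life by about two orders of magnitude, which is the gap the foreword describes.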


However, these apparent difficulties and the ever growing list of engineering challenges also make the job of a reliability professional more motivating and exciting. They also facilitate the growing influence of reliability professionals on the decision-making process during a product development cycle. Yet despite its obvious importance and the continuous expansion of engineering knowledge, quality and reliability education is paradoxically lacking in today’s engineering curriculum. Very few engineering schools offer degree programs or even a sufficient variety of courses in quality or reliability methods. Therefore, the majority of reliability and quality practitioners receive their professional training from their colleagues on the job, professional seminars, journal publications and technical manuscripts, like this one. We hope that the readers will find this book helpful in exploring the expanding field of mechatronics and power devices, understanding how they work and how they fail, and ultimately helping them meet the numerous reliability and design challenges their industries will face for years to come.

Andre KLEYNER

Preface

In the perpetual search to improve industrial competitiveness, developing methods and tools for product design appears to be a strategic necessity, given the crucial need for cost reduction. Nevertheless, a decrease in the cost of design should not impair the reliability of the new systems proposed, which also needs to progress significantly. This book seeks to propose new methods that simultaneously allow for a quicker design of future breakthrough mechatronic devices at a lower cost, to be employed in the automotive and aerospace industries, all the while guaranteeing their increased reliability. On the basis of applications for new, innovative products (“high power components and systems”), the reliability of these critical elements is further validated numerically through new multi-physical and probabilistic models that could ultimately lead to new design standards and reliability forecasting. As such, this book subscribes to the field of embedded mechatronics, which can be understood as a key element in the competitiveness of companies located in the automotive and aeronautical sectors. This technology combines mechanics, electronics, software and control-command. The combination of these technologies results in mechatronic systems. A system is a complex set of functions subject to randomness (triggering systematic errors, bit flips, hardware failures), which provides a defined


service regardless of its internal state, the state of its environment and the level of stress applied. The functional structure of systems (software and hardware) has become de facto complex and variable. Preventing and eliminating mistakes is an expected part of the development and verification processes. The potential causes of failure are manifold. They relate to hardware, software and development environments. Non-consistency, combinations of latent or dormant errors, depending on the state of the system and the complexity of the applications, make analysis difficult. The processing of errors (detection and recovery), at the cost of increasing the complexity levels of the system, brings about a better likelihood of good behavior of the system. The evaluation of the reliability performance (part of the RAMS performance of a product) of complex embedded systems requires the development of new approaches. In systems that integrate software, the reliable structure of functions depends on the software. The search for event sequences leading to system failure must therefore involve both software and hardware. The method should contribute to the qualitative and quantitative analysis of the safety of these systems and microsystems. The modes of failure of critical components and mechatronic systems, to date, remain largely uncontrolled. To improve the current level of knowledge we will focus on the study of five mechatronic systems representative of the industrial participants: two inverters, two converters and one tuner. To increase the competitiveness of their mechatronic devices, automotive and aeronautical equipment manufacturers need to innovate both the design and assembly processes in order to reduce product development time. In addition, these innovative products must combine excellent functional and operational performances, including that of reliability, in order to fully meet global market expectations. 
Expectations for reliability among car manufacturers will certainly increase with the strong market penetration of electric or hybrid vehicles, projected to reach 10% of the market by 2020–2025. Added to these expectations of operational reliability is the need to quickly remove the risks of immaturity associated with product innovations.


This need is strongly linked to the minimization of vehicle development times. In the field of aerospace, the requirements primarily relate to the forecasting and control of costs resulting from failures which occur during the commissioning, warranty period and operation of the aircraft. In future, new contracts for the sale of aeronautical equipment will increasingly focus on sales at the time of operation. Although the aerospace sector has relatively low production volumes when compared to that of the automobile sector (in terms of the number of units per product type), the financial stakes are higher and, in fact, aeronautical manufacturers oversize their mechatronic components so as to give a little leeway in the event of misunderstanding the true nature of the problem at hand. Better prediction of the associated failures and risks would help to better address the three main challenges facing the aerospace industry: – improvements on reliability as represented by a decrease in the rate of removal and a reduction in the maintenance cost as represented by the Direct Maintenance Cost (DMC); – improving the detection of reliability problems in order to avoid the problems inherent to detections being made too late in the life cycle of the mechatronic systems, and the associated risk of having to repeat some of the equipment design in the case of structural weakness; – preparing for the future, with expert mastery over reliability, service life and equipment replacement. The intensification of competition in all industrial sectors, particularly in the automotive sector, leads to ever-increasing and increasingly complex technological content of products and thus to the acceptance of an increase in risk-taking in terms of reliability. In the automotive world, several vehicle launches have been compromised as a result of such a problem. 
Although recall decisions are becoming increasingly commonplace within the industry, they are nevertheless a relatively recent phenomenon, as evidenced by the events of the 1970s concerning Ford and General Motors, an era in which manufacturers were still very reluctant to recall vehicles suspected of having major defects. The cases of the Ford Pinto and the Chevrolet Malibu, whose tanks had serious design flaws that might have caused vehicle fires in the event of a collision, even at very low speeds, finally seem to be a thing of the


past. The decision not to proceed with massive recalls after the discovery of defects followed a detailed cost-benefit analysis [SCH 91]. This weighed the potential technical cost of the recall against the potential cost of judicial damages. The terms of such arbitrations have now changed due to better consumer information and greater severity of court decisions that favor victim compensation, which in turn can lead to severe penalties against the unscrupulous manufacturer. These changes, combined with a surge in vehicle malfunctions (in particular due to the massive introduction of electronics), have led to a certain trivialization of recall campaigns. For example:
– in 2013, during the month of April alone, some 3.39 million Japanese cars, from brands including Toyota, Nissan, Honda and Mazda, were recalled worldwide because of a potentially faulty passenger airbag;
– the American manufacturer Chrysler also announced the recall of 30,000 models of SUV dating from 2012;
– furthermore, the Japanese car manufacturer Mitsubishi announced the recall of 4,000 of its electric and hybrid vehicles.
The cost of such recall campaigns is difficult to estimate. As an indication, the firm Volkswagen has calculated that the recall cost of 384,000 vehicles equipped with DSG gearboxes could be around $1,500 per vehicle, giving a total of roughly $600 million, and this is without even taking into account the harm to the group’s reputation. Faced with the potential costs of such campaigns and their consequences, the precautionary solution is to increase investment upstream, in the production of systems and components, in order to improve their reliability and, as a result, the reliability of the cars. A mechatronic system is a physical action controlled by a smart IT “black box”. A simple and well known example is the Electronic Stability Program (ESP) or EBV (Elektronische Bremsen Verteilung), which is offered by all car manufacturers.
The range of components that fall within its scope of application is broad: – low and high power autonomous actuators; – various types of sensors (pressure, temperature, imaging, etc.); – energy conversion, storage and management;


– active and passive components; – control laws and embedded software; – communication systems, including wireless technologies etc. The undeniable challenge for these two major strategic industrial sectors is thus: – for the automobile market, according to a study published in the Grandes Ecoles Magazine no. 54, “the world market for carbon-free vehicles is rapidly expanding. With 4.5 million vehicles to be created by 2025, France is expected to generate €12 billion a year, and reduce its CO2 emissions by 3%, and its imports of fossil fuels by 4 million tons of oil. The experts at Mov’eo estimate that components and systems will be directly impacted by the FIRST-MFP research program and will constitute 10% of the cost of an electric vehicle, or €1.2 billion per year; – concerning the aeronautical market, according to Safran estimates, the global market for aeronautical electronics could, in the long term, reach 4 to 5 billion dollars. The firm decision details this impact through the study of the weight of electronics in aircraft: “Electronics represents 6% of the cost of an A320 civil aircraft (€3.7 million) and 10% of the cost of an aircraft like the A380 (€20 million), and the average annual growth of electronics would be more than 6%.” A low-end estimate of the impact from the FIRST-MFP program on aeronautical electronics is 5%, that is 15 to 20 million euros. These two volumes are dedicated to the Reliability of High Power Mechatronic Systems. Volume II is dedicated to Automotive and Aerospace Applications – Experimentation and Failure Analysis. Chapter 1 discusses accelerated life testing. In the case of very reliable materials, failures are rare events, and it is often difficult to find a sample of the product that includes instances of failure merely by analyzing feedback from experience (or performing dedicated reliability tests) in order to estimate the reliability parameters. Success often varies wildly as a function of the size of this sample. 
Thus, the need to understand the behavior of a long-life product before it is commissioned frequently leads to so-called accelerated life testing during the development, qualification or even production phases. This chapter presents the methodology of accelerated life testing, as well as the methods and tools used to exploit accelerated life tests, and gives a selection of examples.


Chapter 2 presents highly accelerated testing. The idea of highly accelerated testing is already a few decades old, and belongs to the family of pro-active testing strategies: it aims to maximally exploit the potential of the technology available at any given moment. Approaching this point means approaching the international state of the art, or in other words what we refer to as product excellence. This chapter describes the methodology of highly accelerated testing at a theoretical level. Readers may refer to the Appendix “HALT-HASS methodology”¹ for a detailed description of how the HALT and HASS approaches are executed in a laboratory that offers this type of service. The chapter concludes by comparing accelerated life testing and highly accelerated testing. Chapter 3 discusses the reliability of the components themselves: a new generation of cuboid aluminum capacitors with liquid electrolyte. In many areas, the market for high-power electronic embedded systems requires them to be highly compact and reliable. To strike the right compromise, manufacturers use component technologies that are increasingly compact with well-defined lifetimes in the conditions specified by the supplier. However, in most cases, the reliability of the component depends on the operational profile of the system. In our case study, the technology that was considered is that of aluminum capacitors with liquid electrolyte, built in a compact cuboid case. Properly understanding the reliability of this technology is a necessary step toward ensuring that the high-power electronic system operates properly. To this end, we conduct a reliability study that begins by studying this technology in order to identify the parameters that should be monitored when performing aging tests on this component. The data provided by this study are then used to establish a deterioration model as a function of the operational conditions.
Chapter 4 discusses the reliability of the components of a new generation of film capacitors. The reliability of high-power mechatronic systems fundamentally depends on the reliability of film capacitors. This chapter provides the reader with an overview of the various technologies at play in film capacitors and presents the intrinsic and extrinsic parameters that affect the reliability performance of these components. In order to evaluate this performance, a section on accelerated life tests and highly accelerated tests presents a methodology for estimating the expected and experimental reliability. A case study of an accelerated life test is presented to provide an example of a detailed and illustrated scientific approach. The final section engages in a constructive discussion of the prospects of estimating the reliability of these kinds of component.

¹ Appendices to this book can be found at www.iste.co.uk/elhami/mechatronic2.zip.


Chapter 5 presents reliability and qualification tests for high-power MOSFET transistors. Silicon (Si) and silicon carbide (SiC) MOSFET transistors have multiple weak points, resulting in multiple failure modes and mechanisms. To study the reliability of these transistors using the nondestructive technique described in Chapter 8, applying an accelerated aging process to these components is essential. This chapter begins by describing the various failure mechanisms of Si and SiC MOSFETs. To allow the most suitable aging test in terms of simplicity and speed to be chosen, this chapter then focuses on the reliability and qualification tests that can be applied by manufacturers. Finally, we present the results of accelerated qualification tests applied to Si and SiC MOSFET transistors. These components are analyzed in Chapter 8 using the technique of spectral photoemission microscopy.

Chapter 6 explores the process of diagnosing faults in DC/DC converters. This chapter discusses fault detection and identification (FDI) in a buck DC/DC converter. A robust FDI methodology is proposed for contexts with bounded errors, taking into account the uncertainties in the parameters of a basic converter model. We begin by presenting one method of varying the parameters of a basic DC/DC converter over an interval (linear parameter varying [LPV]). In extreme environments, the components of the systems can undergo changes, and so our proposed LPV model must account for these changes. This chapter presents an interval predictor for detecting and identifying multiple defects in LPV systems, developed in recent work. The proposed methodology is illustrated with the results of simulations.

Chapter 7 is dedicated to construction and characteristics analysis. The failure of an electronic component may be described in terms of the deterioration of its electrical characteristics and is caused by modifications in the physical and chemical properties of its constituents and/or the emergence of defects. This chapter begins by discussing an experimental approach that allows these modifications to be analyzed throughout the life of a component. A description of the physicochemical techniques commonly used in laboratories is then given.

Finally, Chapter 8 presents failure analysis by spectral photoemission microscopy. Spectral photoemission microscopy is a semidestructive method of failure analysis that aims to locate defects and identify failure modes. Previous generations of equipment used for this type of analysis, such as spectrometers, filters or prisms, had inherent limitations. In this chapter, we present a spectral photoemission system that allows the photoemission spectra to be rapidly derived by means of a diffraction grating placed in the


optical path of a photoemission microscope. Using the information obtained from this system, we can identify a spectral signature that is specific to each defect and can be correlated with each failure mechanism. The method of calibration of this system is presented and the results obtained with different transistor technologies are compared.

Abdelkhalak EL HAMI David DELAUX Henri GRZESKOWIAK July 2017

1 Accelerated Life Testing

In the case of very reliable materials, failures are rare events, and it is often difficult to find a sample of the product that includes instances of failure merely by analyzing feedback from experience (or performing dedicated reliability tests) in order to estimate the reliability parameters. Success often varies wildly as a function of the size of this sample. Thus, the need to understand the behavior of a long-life product before it is commissioned frequently leads to so-called accelerated life testing during the development, qualification or even production phases. This chapter presents the methodology of accelerated life testing, as well as the methods and tools used to exploit accelerated life tests, and gives a selection of examples.

1.1. Introduction

This chapter begins by giving an overview of the different types of test. This overview is further expanded in one of the appendices¹, which additionally discusses virtual testing. The principles, general ideas and implementation of accelerated life testing are then presented, with emphasis on accelerated life tests based on physical models. However, we also define experimental and statistical models and give a wide selection of bibliographical references. Finally, we present the methods and tools used to take advantage of accelerated life testing.

Chapter written by Laurent DENIS, Henri GRZESKOWIAK, Daniel TRIAS and David DELAUX. 1 Appendices to this book can be found at www.iste.co.uk/elhami/mechatronic2.zip.


1.2. Types of test

To justify the compliance of a product definition with the specified needs, and to justify its producibility, we can use the following three types of demonstration:
– calculations;
– simulations;
– tests.

Note that there exists a fourth type based on similarity analysis (which is not discussed in this book).

1.2.1. Calculations

This is the manual approach to simulation: when analytical formulations are available in a given situation, we should not hesitate to use them to evaluate the loads that arise from applying the functional or environmental stresses associated with the product. These calculations, provided that they are performed in domains that have been validated by experience, can be used to demonstrate the suitability of the product design.

1.2.2. Simulations

The term “simulation” encompasses all demonstrations that use computers to evaluate the expected performance of a product given a specific product definition. The description of this definition is typically given in computational terms and assumes a specific usage environment (for example evaluating the aerodynamic performance of a fighter plane at a certain altitude, the dynamic behavior of a circuit board under the effect of sinusoidal vibrations, etc.).

Models can be used to represent:
– either the product itself:
  - partial or full representation of the product;
  - simplifications of the actual product;
  - representation types (3D, 2D, 1D, etc.);


  - physical and chemical constitution of the product (materials, chemical composition);
  - discretization of the product (geometry, mesh, sampling);
– or the phenomena experienced by the product:
  - nature of its interactions with the environment;
  - models describing the spatial distribution of phenomena (uniform, point-based, variation laws, etc.);
  - models describing the time-evolution of phenomena (steady state, harmonic, transient, evolution laws, initial conditions);
  - the boundary conditions of the product;
– or the behavior of the product when subjected to stresses: constitutive laws and related equations (linear, nonlinear, laminar or turbulent flow, deterministic or probabilistic, etc.).

1.2.3. Tests

The term “test” refers to any demonstration based on specimens that are compliant, to varying degrees, with the definition that is being qualified, implemented in conditions that are more or less representative of the operating conditions of the product.

Depending on the representativeness of the specimens with respect to the definition that is being qualified, we can distinguish the following classes, among others:
– fine-tuning tests using models, which demonstrate the suitability of a certain technological choice, specify the configuration or certain parameters, or identify potential design flaws at the earliest possible opportunity;
– qualification tests using prototypes that are representative of the definition being qualified.

Depending on the level of the tested component within the product tree, we can distinguish:
– system tests, which consider the entire system or some major part of the system;
– product tests, which consider one of the system components at any level in the product tree.


Depending on the breadth of the functional area being tested, we can distinguish:
– basic tests, which are used to verify one or several functional characteristics (for example reliability tests, tests of servo-motor commands as a function of inertial characteristics, etc.);
– global tests, which are used to verify the set of all functional characteristics or a major subset of these characteristics (for example functional verification tests, etc.).

If the objective of the test is specifically to demonstrate that the product operates properly in certain environmental conditions, it is called an environmental test, which is a subclass of basic tests. Different kinds of test can overlap, and so a product test can be a basic qualification test (since it only considers one element), and a system test can be a global qualification test (since it considers the system as a whole).

1.2.4. Links between the three types of demonstration

The three types of demonstration are not exclusive, but rather complementary. For example, when validating a computer-based model, the calculations can be enhanced by conducting a few basic tests focusing on the characteristics that were identified as most critical.

In extremely broad terms, we should remember that:
– calculations should be used in areas that have been validated by experience and when the goal is to show that the rules derived from past experience are properly applied;
– simulations should be used when credible computer-based models (validated by experience) are technically possible and when the operating conditions are difficult to reproduce experimentally (for example behavior of a multitarget air-to-air missile system in extremely harsh electronic warfare conditions, technical behavior or worst-case simulations of an electronic system, etc.);
– tests should be used for new areas of application and/or products (new design and/or new components and/or implementation procedures) and/or when other types of demonstration are inadequate (i.e. insufficient, etc.).

Tests are especially recommended for so-called revolutionary products (innovations in the design or technology or usage conditions, etc.), whereas


calculations and simulations are more relevant for validating evolutionary products (low innovation) that strongly rely on feedback from experience.

REMARK 1.1.– Hybrid simulations combine simulations and tests with real materials.

REMARK 1.2.– There is another type of demonstration, by analogy (for example demonstration of fungus resistance by analogy with other products using the same materials). This is not considered in this book.

REMARK 1.3.– Tests are also used to validate mathematical models, enabling studies to be performed in areas where testing is difficult.

REMARK 1.4.– A more detailed presentation of each type of test is given in the appendix.

1.3. Overview of accelerated life testing

In the case of very reliable materials, failures are rare events, and it can be impossible to find a sample of the product that includes instances of failure merely by analyzing feedback from experience in order to estimate the reliability parameters. Success often varies depending on this product sample. Thus, the need to understand the behavior of a long-life product before it is commissioned frequently leads to so-called accelerated life testing during the development, qualification or production phases.

The principle of accelerated life testing is to subject the product to environmental and operating stresses that are more intense than those expected during normal operation. The goal is to estimate the behavior characteristics (reliability laws, operational performance, etc.) of the product in normal usage conditions from these accelerated usage conditions, within time frames compatible with the scheduling constraints of the development phase. Extrapolating the service life in normal conditions from these accelerated (or intensified) conditions may be achieved using a relation known as the acceleration relation (see [NEL 90, OCO 03, CAR 98, VAS 01]).

The following conditions are necessary for accelerated life testing:
– the applied stresses and their values must be chosen in such a way that the stress levels remain below the technology limit values;


– the physical deterioration phenomena must be understood (or at least modeled); the accelerated failure mechanisms should be representative of the failure mechanisms in normal usage conditions;
– the analytical model governing the relation between the speed of deterioration and the stress amplitudes must be assumed (or modeled);
– the values of the parameters of these models must be known (model and parameter estimation).

Among other things, accelerated life tests allow us to:
– accelerate damage mechanisms:
  - reduce the time required to estimate some behavioral characteristics of the product in normal usage conditions (and rapidly determine the operational reliability of the product, see Figure 1.1);
  - measure the effect of environmental and operating stresses on the product over the course of its life cycle;
– guarantee design margins.

Figure 1.1. Evolution of the failure rate as a function of test duration for the two approaches: accelerated tests vs. normal life. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

There are several different classes of accelerated life models (ALMs) (see Figure 1.2):
1) experimental models determined by experimental designs [DEM 95, DEM 97, DEM 99];
2) physical models defined using the underlying physics of the deterioration models (chemical, mechanical, etc.) [BON 77, LAL 84, LAL 97, LAL 99, BON 15, AST 09, AST 10, BOI 00, DEL 06, DEW 86, DAS 98];


3) statistical models characterized by parametric, semiparametric and nonparametric approaches [VIG 00, PER 06, TRI 06, GUE 01, GUE 05, TEB 03, NEL 90, BAG 95b, BAG 97, BAG 01, BOM 73, DEV 98, FOR 61, VAL 16, VAS 01, ZHA 02, OWE 98, BAS 82].

Figure 1.2. Different types of accelerated life tests [GUE 05]

Experimental models are based on design of experiments (DOE). A DOE that leads to a reduction in the number of experiments needed to achieve a given objective gives an accelerated path to obtaining the desired knowledge. An example implementation of an experimental design is given in the Appendix “Example implementation of an experimental design to optimize a highly accelerated life testing protocol”. The bibliographical references given above are also useful.

1.3.1. Statistical models

Parametric estimation of an ALM involves finding a statistical law to characterize the reliability; in the case of fatigue, the most appropriate law for characterizing reliability is the normal distribution. Consequently, certain validity conditions need to be satisfied in order for the results of a parametric model to be reliable. The semiparametric approach does not assume any specific distribution.

Non-parametric models are not based on statistical distributions. They can therefore be used even if the validity conditions of parametric models are not satisfied. Non-parametric models are more robust than parametric models, which in other words means that they can be used in a greater number of situations. Parametric models, on the other hand, are typically more powerful than their non-parametric equivalents, and so are more likely to be successful.


The two examples given in section 1.7 use statistical models. Applications to electrolytic and film capacitors are presented in Chapters 3 and 4 of Volume 2, respectively. The bibliographical references given above may also prove useful.

1.3.2. Physical models

In this chapter, we shall limit ourselves to the description of accelerated life tests based on physical deterioration models. The bibliographical references given above will continue to be relevant in the following.

1.4. Principles, methodology, implementation of accelerated life testing

The goal of this section is to explain and clarify the nature of accelerated life testing, its objectives, how it should be defined beforehand and which types of test should be performed in order to guarantee that a system is reliable in a given application. This chapter is not intended to be exhaustive, and we shall only focus on four mathematical models for accelerated life testing, namely Arrhenius, Coffin–Manson, Norris–Landzberg and Peck. However, we shall give minimally sufficient explanations of the process of guaranteeing the reliability of a system.

1.4.1. Definition and important concepts

The reliability of a system is the probability that an entity fulfills its specified mission over a given period of time and given certain usage conditions. Reliability analysis can be conducted at several different levels, either globally (considering the system as a whole) or partially (considering the constituent components of the system). This leads to two different notions of mathematical probability:
– R(t): Reliability (or survival) function; this is the probability that no failure has occurred up to time t;
– F(t): Failure (or distribution) function; this is the probability that a failure has occurred by time t, so that F(t) = 1 – R(t).


Figure 1.3. Graphical representation of Reliability and failure functions

This gives a curve of the form shown in Figure 1.3.
– R(t) leads to another important idea: the failure rate λ(t);
– λ(t) is the probability of failure (number of failures) per unit time.

If the failures occur randomly, then λ(t) is constant, which results in an exponential reliability law:

$$R(t) = e^{-\lambda t}$$

Figure 1.4 shows a graphical representation of constant λ(t):

Figure 1.4. Graphical representation of probability of failure with random occurrence (random failures)
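As a minimal illustration of the exponential law above, the following Python sketch evaluates R(t) and F(t) = 1 – R(t) for a constant failure rate. The function name `reliability` and the value of `lam` are hypothetical and chosen only for the example; they do not come from the text.

```python
import math


def reliability(t_hours: float, failure_rate_per_hour: float) -> float:
    """Exponential law R(t) = exp(-lambda * t), valid for a constant failure rate."""
    return math.exp(-failure_rate_per_hour * t_hours)


# Hypothetical constant failure rate (failures per hour), for illustration only.
lam = 2e-6

for t in (1_000.0, 100_000.0, 1.0 / lam):
    r = reliability(t, lam)
    print(f"t = {t:>9.0f} h   R(t) = {r:.4f}   F(t) = {1.0 - r:.4f}")
```

The last time point, t = 1/λ, is the instant at which the survival probability has dropped to exp(–1), i.e. roughly 37% of products are still working.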

There are two other cases as follows: – λ(t) is decreasing: we say that there are early failures (or infant mortality);


Figure 1.5. Graphical representation of probability of failure when infant mortality occurs

– λ(t) is increasing: we say that there are wear-out failures.


Figure 1.6. Graphical representation of probability of failure with wear-out failures

A graphical representation of λ(t) known as the “bathtub curve” is commonly used to visualize the failure rate of electronic products.

Figure 1.7. Graphical representation of the so-called bathtub curve (failure rate versus time, with three successive regions: early failures, random failures, end of life)


When λ(t) is constant, we can define the mean time to failure (MTTF) as:

$$\text{MTTF} = \frac{1}{\lambda}$$

The MTTF and the mean time between failures (MTBF) are usually given in units of hours or years. The concepts of MTTF and MTBF are not equivalent to the concept of product lifetime. Indeed, when λ(t) is constant, $R(t) = e^{-\lambda t}$. At time t = MTTF = 1/λ, we obtain $R(t) = e^{-1} \approx 0.37$, or in other words 63% of products will have failed. Note that the MTTF should not be confused with the MTBF, which is only applicable to repairable systems.

λ(t) is typically expressed in units of failure in time (FIT), corresponding to one failure per billion hours ($10^{-9}$/h), or in units of parts per million (ppm)/year. Some references use units of failures per million hours ($10^{-6}$/h).

1.4.2. Evaluating the predicted reliability of a system by performing tests

The predicted reliability of circuit boards or electronic systems can be evaluated by performing tests. In industry jargon, these tests are known by various names, such as reliability tests, endurance tests and durability tests. However, they are typically not examples of reliability tests in the sense defined in Appendix A1, but rather accelerated life tests.

These methods are commonly used by component manufacturers. They consist of performing tests in accelerated conditions (to reduce the duration of the tests) that are considered to be representative of a certain lifetime in certain other conditions. These methods can also be used for electronic circuits.

If we assume constant λ, which corresponds to the case of an exponential mortality distribution, the χ² distribution allows us to estimate the failure rate. The goal is usually to achieve zero failures, but other options are possible (the number of failures sets the degrees of freedom). The choice of confidence level, which affects the evaluation of λ, is made as a function of the risk function and often depends on the application (safety-related or not). Indeed, if we choose a confidence


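level of 90%, there is only a 10% chance that the actual λ is greater than the calculated value. The confidence level therefore strongly affects the final value of λ.

DEFECT RATE

$$\lambda\ [\text{FIT}] = \frac{\chi^2(\alpha,\ 2r + 2) \times 10^{9}}{2 \times T_a \times D_u \times AF}$$

where:
– χ²(α, 2r + 2): χ² coefficient for confidence level α and 2r + 2 degrees of freedom, r being the number of failures (denoted n in Table 1.1);
– Ta: sample size;
– Du: stress duration (h);
– AF: acceleration factor.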

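Chi-square (χ²) distribution functions, by confidence level (α)

| Failures (n) | d.f. (2n + 2) | α = 60% | α = 90% |
|---|---|---|---|
| 0 | 2 | 1.833 | 4.605 |
| 1 | 4 | 4.045 | 7.779 |
| 2 | 6 | 6.211 | 10.645 |
| 3 | 8 | 8.351 | 13.362 |
| 4 | 10 | 10.473 | 15.987 |
| 5 | 12 | 12.584 | 18.549 |
| 6 | 14 | 14.685 | 21.064 |
| 7 | 16 | 16.780 | 23.542 |
| 8 | 18 | 18.868 | 25.989 |
| 9 | 20 | 20.951 | 28.412 |
| 10 | 22 | 23.031 | 30.813 |
| 11 | 24 | 25.106 | 33.196 |
| 12 | 26 | 27.179 | 35.563 |
| 13 | 28 | 29.249 | 37.916 |
| 14 | 30 | 31.316 | 40.256 |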

Table 1.1. Confidence interval associated with failure rate estimation from tests
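The defect-rate formula can be sketched in a few lines of Python. This is only an illustrative sketch: the χ² coefficients are copied from the first rows of Table 1.1, while the function name `failure_rate_fit` and the test campaign (77 parts, 1,000 h, AF = 50, zero failures) are hypothetical values chosen for the example.

```python
# Chi-square coefficients copied from Table 1.1, keyed by (confidence level, failures).
CHI2 = {
    (0.60, 0): 1.833, (0.90, 0): 4.605,
    (0.60, 1): 4.045, (0.90, 1): 7.779,
    (0.60, 2): 6.211, (0.90, 2): 10.645,
}


def failure_rate_fit(confidence, failures, samples, hours, af):
    """Upper bound on lambda in FIT: chi2 * 1e9 / (2 * Ta * Du * AF)."""
    chi2 = CHI2[(confidence, failures)]
    return chi2 * 1e9 / (2 * samples * hours * af)


# Hypothetical campaign: 77 parts, 1,000 h under stress, AF = 50, no failure observed.
fit_60 = failure_rate_fit(0.60, 0, samples=77, hours=1_000, af=50)
fit_90 = failure_rate_fit(0.90, 0, samples=77, hours=1_000, af=50)

print(f"lambda at 60% confidence: {fit_60:.0f} FIT")
print(f"lambda at 90% confidence: {fit_90:.0f} FIT")
```

With the same test data, moving from 60% to 90% confidence multiplies the estimated λ by 4.605/1.833, roughly 2.5, which illustrates how strongly the confidence level weighs on the result.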

In the automotive industry, most component manufacturers use confidence levels of 60% or 90%. Some component manufacturer sites allow this value to be adjusted by the user (online calculators). To obtain a low λ in line with the objectives, the denominator in the formula needs to be as large as possible (number of cumulative equivalent operating hours). We can vary:
– Ta: the number of samples to be tested;


– Du: the duration of the test;
– acceleration factor (AF): this parameter depends on the relation between the test conditions and the usage conditions. The test conditions are limited by the specifications of the component.

1.4.3. Accelerated life tests (based on a physical model): example of temperature acceleration

In most cases, the lifetime is dictated by the application (e.g. 15 years for cars, 30 years for rail vehicles). However, leaving the system to operate for 10 years in order to verify whether it meets the constraints placed upon its life profile is not a viable strategy. In order to reduce the duration of testing, we can increase the applied stress (without exceeding the specifications of the weakest components to avoid creating unwanted failures) in order to create an AF and therefore “age” the system more rapidly.

1.4.4. Evaluating the predicted reliability of a system for a given lifetime and with given environmental constraints

So far, we have seen that performing tests allows us to calculate the predicted failure rate (used by component manufacturers), but that this method requires extremely high numbers of parts. It is relatively easy for component manufacturers to increase the number of parts, since they can simply aggregate monitoring tests during the manufacturing processes in order to obtain a total cumulative time of several millions of hours. But this process is more difficult to implement with electronic circuit boards, and even more difficult to execute at the system level due to the cost of the materials and monitoring processes, as well as the required duration of the tests.

For systems with high reliability targets, the difficulty of evaluating the reliability by means of testing increases even further. But does this mean that we should simply do nothing? Even if testing does not tell us about the objective failure rate, tests can nonetheless be used to validate specific technological choices and as reassurance.

Of course, any such tests should naturally be integrated into a comprehensive reliability process. Among other things, this process should include a study of the mission profile (analysis of the stresses that will be experienced by the system). Based on this, we can define reliability tests


according to this mission profile. Some of these tests can be accelerated, as mentioned above, using the Arrhenius law. The most commonly used accelerated life tests rely on:
– thermal cycles;
– temperature;
– temperature and humidity.

Thermal cycling

Thermal cycling targets failure mechanisms related to thermomechanical stress (deformations resulting from differences in the expansion coefficients of materials such as PCB, solders, bondings, etc.). To calculate the AF for thermal cycling, the Coffin–Manson model is often used:

$$AF = \left(\frac{\Delta T_{test}}{\Delta T_{use}}\right)^{m}$$

where:
– AF = acceleration factor;
– ΔT test = thermal gradient applied during the test;
– ΔT use = thermal gradient experienced by the system during operation;
– m = fatigue or Coffin–Manson exponent (depends on the material, 2–3 for solders).

Depending on the definition of the mission profile, the formula for calculating the thermal cycling period is:

number of thermal cycles = (Fu × NJ × DVA)/AF

where:
– Fu = assumed number of cycles per day (e.g. for a car: the engine is stopped 6 times per day => 6 thermal cycles per day);
– NJ = number of days in the year;
– DVA = lifetime dictated by the application (years).

Other more complex formulas derived from Coffin–Manson are also used (for example Norris–Landzberg):


ACCELERATION IN THERMAL CYCLING

$$AF = \left(\frac{DT_s}{DT_u}\right)^{P} \times \left(\frac{F_u}{F_s}\right)^{q} \times AF_1$$

where:
– AF = acceleration factor;
– AF1 = acceleration factor given by the Arrhenius model;
– DTs = thermal gradient applied during the test;
– DTu = thermal gradient experienced by the system during operation;
– Fu = number of cycles per day during operation;
– Fs = number of cycles per day under stress.

The standard values for q and P are, respectively, 1/3 and 2 (tin/lead solders). The test duration may then be calculated as:

Test duration (hours) = lifetime dictated by the application (hours)/AF

The lifetime dictated by the application is obtained by analyzing the mission profile.

1.4.5. Damp heat

Damp heat tests are used to reveal failure mechanisms related to humidity with and without condensation (corrosion) and interactions between humidity and voltage (electrochemical migration). The model that is most commonly used to calculate the AF is Peck’s model:

HUMIDITY ACCELERATION (PECK / BELL LAB MODEL)


$$AF = \left(\frac{RH_s}{RH_u}\right)^{P} \times \exp\left[\frac{E_a}{k}\left(\frac{1}{T_u} - \frac{1}{T_s}\right)\right]$$

where:
– P = 2.66: “humidity” exponent;
– Ea = 0.8 eV;
– RHu: humidity during operation;
– RHs: humidity during the test;
– Tu, Ts: temperatures during operation and during the test (in kelvin);
– k = Boltzmann constant.

Once the AF has been determined, the duration of the test can be calculated:

Test duration (hours) = lifetime dictated by the application (hours)/AF

The lifetime dictated by the application depends on the mission profile.

1.4.6. Temperature

These tests target the failure mechanisms relating to active and passive components (e.g. chip failures). They can be performed either with powered-on or powered-off products (search for data retention issues for memory systems). One of the most commonly used tests is the LIFE TEST (or high-temperature operating life). This test operates the system at a higher ambient temperature than the ambient temperature during usual operation. It uses the Arrhenius aging model to calculate the AF:

$$AF = \exp\left[\frac{E_a}{K}\left(\frac{1}{T_u} - \frac{1}{T_s}\right)\right]$$

where:
– Ea: activation energy of the failure mechanism (in eV). The usual default value is 0.7 eV;
– K: Boltzmann constant = $8.6 \times 10^{-5}$ eV/K;
– Tu: operating temperature of the product, in kelvin (according to the mission profile);
– Ts: temperature under stress, in kelvin.


The test duration is derived from the following formula:

Test duration (hours) = lifetime dictated by the application (hours)/AF.

1.4.7. Accelerated life testing in practice

Given the assumptions made when calculating the AFs (in particular the assumptions on the activation energy Ea), the results obtained from the calculations give an indication of the duration required for the tests. Ideally, we should choose a margin and maintain it for as long as possible (time and budgetary constraints allowing).

Typically, in these reduced-time tests with low sample numbers, we do not expect to observe any instances of failure. If failure does occur, extensive failure analysis should be performed in order to understand the failure mechanisms at play, situate the failures on the bathtub curve (early failures, random failures, wear-out failures, etc.), and undertake the necessary improvements.

Depending on the mission profile, other tests can also be defined (durability against vibrations, combined vibration and thermal cycling tests, temperature humidity and thermal cycling tests, etc.). The combination of all of these tests, including both accelerated and non-accelerated tests, gives the “qualification strategy” of the product.

1.4.8. Reliability assessment to find wear-related failure mechanisms

Other reliability laws, such as the Weibull distribution, allow us to cover each of the different cases portrayed by the bathtub curve. This distribution is typically used for wear-out failure mechanisms, such as aging in mechanical parts and solder joints (for example, qualification of lead-free solders). Each test targets a specific failure mechanism. These tests require large sample numbers (minimum 15 but ideally 30) and need to achieve a minimum 50% failure rate in the products. They do however yield relatively accurate results.

$$F(t) = 1 - \exp\left(-\left(\frac{t}{\eta}\right)^{\beta}\right)$$

$$\lambda(t) = \frac{\beta}{\eta} \times \left(\frac{t}{\eta}\right)^{\beta - 1}$$


The η parameter, or scale factor, represents the duration before 63% of products fail. The β parameter, or shape factor, describes the kinetics of the failure mechanism:
– β < 1 => decreasing failure rate => early failures;
– β = 1 => exponential distribution => useful life (constant λ);
– β > 1 => increasing failure rate => end of life (wear-out failures).

The log-normal distribution can also be used to describe wear-out failures such as electromigration in semiconductor tracks.
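The three regimes governed by β can be checked numerically with the short Python sketch below, which evaluates the Weibull F(t) and λ(t) given above. The scale factor `eta` and the sampling times are hypothetical values chosen only for illustration.

```python
import math


def weibull_cdf(t, eta, beta):
    """F(t) = 1 - exp(-(t / eta) ** beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def weibull_hazard(t, eta, beta):
    """lambda(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


eta = 10_000.0  # hypothetical scale factor, in hours
times = (1_000.0, 5_000.0, 20_000.0)

for beta in (0.5, 1.0, 3.0):
    rates = [weibull_hazard(t, eta, beta) for t in times]
    if rates[0] > rates[-1]:
        trend = "decreasing (early failures)"
    elif rates[0] < rates[-1]:
        trend = "increasing (wear-out failures)"
    else:
        trend = "constant (random failures)"
    print(f"beta = {beta}: failure rate is {trend}")

# Whatever the value of beta, F(eta) = 1 - exp(-1), i.e. about 63% of products have failed.
print(f"F(eta) = {weibull_cdf(eta, eta, 3.0):.3f}")
```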


Figure 1.8. Weibull: Example of λ(t) curves obtained with different values of β (source: SERMA). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.4.9. Conclusion

This chapter presents the principle of reliability assessment based on the execution of tests, and explains how these tests should be conducted. Furthermore, we discussed how specific types of reliability tests may be more or less suitable for a given need, depending on the considered failure mechanisms. The choice of tests depends on the objectives (estimating a failure rate, estimating the end of life, validating technologies, etc.). Prior analysis of the mission profile is always an indispensable prerequisite.


1.5. Methods and tools for exploiting accelerated life tests

There are increasingly many standards available for designing accelerated life tests to validate the lifetime of various components (especially electronic components) in assembled systems for end-user usage scenarios and predefined operating conditions. The advantage of these standards is that they allow us to compare the results of any given test to a chart that determines whether the results pass the threshold of predefined conditions.

The type, technology, quality (manufacturing processes, suppliers, etc.), usage and external conditions are all examples of coactive and mostly mutually independent variables that prevent us from directly applying these standards, especially in the case of mechatronic systems with “next-generation” components, which can have multiple competing mechanisms of age-induced failure. Furthermore, each standard or guideline has its own charts, so we must distinguish between them in order to determine which is the most appropriate in any given case.

To counterbalance the standardized aspect of guideline-based approaches and more realistically portray the actual situation in the field, it is often advisable to undertake custom accelerated life testing strategies. This can, however, create difficulties with the interpretation or interpretability of the results if the strategy is poorly defined.

A list of generic methods for taking advantage of the results of accelerated life testing is given below. We wish to emphasize that the trained eye and judgment of an engineer are highly recommended at every stage of the analysis, since the principles and models are sometimes based on unverifiable or unrealistic hypotheses.

1.5.1. SDA or survival data analysis

1.5.1.1. Homogeneity of the sample

1.5.1.1.1. Sample

The objective is to study a data set collected from a population that cannot be exhaustively examined. In our context, the data is obtained by testing products that will be subsequently installed and operated in large numbers. The goal is therefore to collect enough information from the tests to extrapolate the results to all similar products that are soon to be released. Clearly, the richer the data collected beforehand, the smaller the uncertainty


in the number of undesirable events identified during the study that will occur in practice. The assumption that the products must be similar leads to the notion of representativeness.

1.5.1.1.2. Representativeness

As a simple parallel, it is difficult to justify a model for the occurrence of one failure mode rather than another based only on the occurrence of one single instance of failure. This is the first aspect of representativeness in the chosen sample: the sample must be sufficiently large to test a range of possible models. In general, any fewer than 5 points can lead to highly inaccurate results.

Another aspect that we must consider is the usage (i.e. the usage conditions): if the product is intended to be used by a population that can be divided into segments (classification) with different stresses in each segment, the test design needs to reflect the proportions of these different types of end usage.

One further important point is the repeatability of the results: the chosen sample must take into account any variability in the production processes experienced by the product. For example, if the prototypes used in the tests are manufactured/qualified under different conditions than the conditions used for future mass production, we need to know how to estimate the difference in order to recalibrate the results.

1.5.2. Data types

Characterization of events:
– time values: the time after which the undesirable event is revealed and/or the time at which the study ends;
– numerical values: the number of events recorded over a given period of time.

Censored or uncensored data: Does the event represent a failure or incomplete information (end of testing, etc.)?

Failure mode/component: Ideally, we should be able to distinguish each failure mode or mechanism beforehand and possess the ability to identify the subelement responsible for the failure (if applicable) with the goal of

Accelerated Life Testing

21

performing a study for each subset of components and aggregating all information together after testing. This has the advantage of giving a model that is closer to reality, e.g. an assembly whose failure modes compete with each other to ultimately lead to a concrete failure. The distributions most commonly used for survival data are presented below. 1.5.2.1. Survival models 1.5.2.1.1. Exponential distribution The exponential distribution has a single parameter, λ, called the failure rate, with units of inverse time.

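\[ R(t) = \exp(-\lambda t), \qquad \mathrm{MTTF} = \frac{1}{\lambda} \]

\[ \mathrm{Median} = \frac{\ln 2}{\lambda} = \ln(2)\,\mathrm{MTTF} \approx 0.693\,\mathrm{MTTF} \]

\[ B_{p\%} = \frac{1}{\lambda}\,\ln\left(\frac{1}{1-p}\right), \qquad \text{scale factor} = \frac{1}{\lambda} \]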
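As a minimal sketch of the exponential formulas above, the following Python functions evaluate R(t), the MTTF, the median and B_p%. The function names and the numerical failure rate are illustrative assumptions, not taken from the source.

```python
import math

def exp_reliability(t, lam):
    """R(t) = exp(-lambda * t)."""
    return math.exp(-lam * t)

def exp_mttf(lam):
    """MTTF = 1 / lambda (also the scale factor)."""
    return 1.0 / lam

def exp_median(lam):
    """Median = ln(2) / lambda."""
    return math.log(2.0) / lam

def exp_b_life(p, lam):
    """B_p%: time by which a fraction p (0 < p < 1) of the units has failed."""
    return math.log(1.0 / (1.0 - p)) / lam

lam = 1e-4  # hypothetical failure rate, in failures per hour
print(exp_mttf(lam), exp_median(lam), exp_b_life(0.10, lam))
```

Evaluating the reliability at B_10% returns 0.9 by construction, which is a convenient sanity check on the parameterization.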

Figure 1.9. Survival model in the case of an exponential distribution. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


1.5.2.1.2. Weibull distribution

This is a reliability distribution with two parameters: the scale factor η in units of time (e.g. hours) and the dimensionless shape factor β.

\[ R(t) = \exp\left(-\left(\frac{t}{\eta}\right)^{\beta}\right) \]

\[ \mathrm{MTTF} = \eta\,\Gamma\left(1 + \frac{1}{\beta}\right) \]

\[ \mathrm{Median} = \eta\,(\ln 2)^{1/\beta} \]

\[ B_{p\%} = \eta\left[\ln\left(\frac{1}{1-p}\right)\right]^{1/\beta} \]

\[ \text{scale factor} = \eta \]
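The Weibull quantities above can be sketched in the same way. The function names and the parameter values below are assumptions chosen for illustration only.

```python
import math

def weibull_reliability(t, eta, beta):
    """R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))

def weibull_mttf(eta, beta):
    """MTTF = eta * Gamma(1 + 1 / beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

def weibull_median(eta, beta):
    """Median = eta * ln(2) ** (1 / beta)."""
    return eta * math.log(2.0) ** (1.0 / beta)

def weibull_b_life(p, eta, beta):
    """B_p% = eta * ln(1 / (1 - p)) ** (1 / beta)."""
    return eta * math.log(1.0 / (1.0 - p)) ** (1.0 / beta)

eta, beta = 1000.0, 1.8  # hypothetical scale (hours) and shape
print(weibull_mttf(eta, beta), weibull_median(eta, beta), weibull_b_life(0.10, eta, beta))
```

With β = 1 these expressions reduce to the exponential case with λ = 1/η, which gives a quick consistency check between the two families.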

Figure 1.10. Survival model in the case of a Weibull distribution. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


1.5.2.1.3. Log-normal distribution

The log-normal distribution has parameters that are defined based on the corresponding normal distribution:

If $X \sim N(m, s)$, then $T = \exp(X) \sim LN(m, s)$.
If $T \sim LN(m, s)$, then $X = \ln(T) \sim N(m, s)$.

\[ R(t) = \int_{t}^{+\infty} \frac{1}{u\,s\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{\ln(u) - m}{s}\right)^{2}\right) du \]

\[ \mathrm{MTTF} = \exp\left(m + \frac{s^{2}}{2}\right) = \exp(m)\,\exp\left(\frac{s^{2}}{2}\right) \]

\[ \mathrm{Median} = \exp(m) \]

\[ B_{p\%} = \exp\left(m + s\,\Phi^{-1}(p)\right) = \exp(m)\,\exp\left(s\,\Phi^{-1}(p)\right) \]

Scale factor = $\exp(m)$: if $X \sim LN(m, s)$, then $k X \sim LN(m + \ln(k), s)$.
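The log-normal reliability has no closed form, but it can be written with the standard normal distribution function Φ as R(t) = 1 − Φ((ln t − m)/s). The sketch below uses Python's `statistics.NormalDist` for Φ and its inverse; the helper names and parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1

def lognormal_reliability(t, m, s):
    """R(t) = 1 - Phi((ln t - m) / s)."""
    return 1.0 - _STD_NORMAL.cdf((math.log(t) - m) / s)

def lognormal_mttf(m, s):
    """MTTF = exp(m + s**2 / 2)."""
    return math.exp(m + s * s / 2.0)

def lognormal_b_life(p, m, s):
    """B_p% = exp(m + s * Phi^{-1}(p))."""
    return math.exp(m + s * _STD_NORMAL.inv_cdf(p))

def rescale(m, s, k):
    """Parameters of k*T when T ~ LN(m, s): only m changes."""
    return m + math.log(k), s

m, s = 7.0, 0.5  # hypothetical parameters (ln-hours)
print(lognormal_mttf(m, s), lognormal_b_life(0.10, m, s))
```

The `rescale` helper makes the scale-factor property explicit: multiplying the lifetime by k shifts m by ln(k) and leaves s unchanged.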

Figure 1.11. Survival model in the case of a log-normal distribution. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


Figure 1.12. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.5.3. Model fitting

1.5.3.1. Case with multiple competing lifetime distributions

For any given data set, there are multiple possible coexisting models. Some of these models must be definitively ruled out, but the others can be viewed as existing in competition with each other under certain assumptions.

Three complementary methods are available for comparing models:
– by P-value: this is the probability that the tested hypothesis H0 (the model fits the data points) is wrongly rejected. This is obtained by applying the modified Kolmogorov–Smirnov test. The AVadequat value is its complement, 1 – P-value;
– by the difference between the theoretical and empirical survival functions: MGRAPH is a graphical criterion based on the conventional way of calculating the normalized correlation coefficients. Higher values of the criterion indicate better fits;
– by the likelihood value. This is the best-known method.

In order to apply these methods in such a way that they are complementary, they are combined by assigning relative weights. One example is shown below. The competing model fits are ranked by multiplying each of the three values by its weight, summing the weighted values and then ordering the results.
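The weighted combination described above can be sketched as follows. This is only an illustration: the weights, the criterion names and the scores are hypothetical, and the sketch assumes that each criterion has already been converted to a score where a higher value means a better fit.

```python
def rank_models(scores, weights):
    """Rank candidate models by the weighted sum of their criterion scores.

    scores:  {model name: {criterion name: score}}, higher score = better fit
    weights: {criterion name: relative weight}
    """
    totals = {
        model: sum(weights[name] * value for name, value in criteria.items())
        for model, criteria in scores.items()
    }
    # Order from best (largest weighted sum) to worst.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical weights and scores, for illustration only.
weights = {"ks": 0.4, "graph": 0.1, "likelihood": 0.5}
scores = {
    "exponential": {"ks": 0.20, "graph": 0.60, "likelihood": 0.10},
    "weibull": {"ks": 0.90, "graph": 0.95, "likelihood": 0.80},
    "lognormal": {"ks": 0.85, "graph": 0.90, "likelihood": 0.70},
}
ranking = rank_models(scores, weights)
print(ranking)
```

In practice the three criteria live on different scales, so the conversion to comparable scores is itself a modeling choice that should be stated alongside the weights.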


NOTE.– The choice of the best estimation method still depends on the representativeness of the available data; it can, however, be useful to combine the majority of the candidate methods to measure the convergence in their results. The same is true for the choice of method to calculate the empirical distribution function when searching for a suitable interpolation of the modeled variable.

Figure 1.13. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 1.14. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


1.5.3.2. Principle of the goodness-of-fit test

We can apply two goodness-of-fit tests, namely the modified Kolmogorov–Smirnov test and the chi-squared test.

1.5.3.2.1. Modified Kolmogorov–Smirnov test

The parameters are unknown and must therefore be estimated from the data set. The goal is to define a function $S_N(t)$ describing the cumulative fraction of failures recorded up to time t, which is constant between two consecutive recorded events. Here, the variable that we use, D, is given by the maximum of the absolute differences between $S_N(t)$ and the adjusted distribution function $F(t)$:

\[ D = \max_t \left| S_N(t) - F(t) \right| \]

The test calculates the probability that D is larger than a threshold value listed in a (Kolmogorov) table, which depends on the sample size and the choice of confidence level. The higher the value of this probability, the greater the difference between the model and the data set.

REMARK 1.5.– This test can be used with small samples.

1.5.3.2.2. Chi-squared test

This test is based on grouping the data into batches, i.e. into time intervals like a histogram. This test does not calculate the maximum of the differences, but instead considers the arithmetic mean of the squares of the differences between the number of observed versus modeled events in each batch. The test returns the probability (P-value) that the obtained value is greater than a reference value (chi-squared table): the probability that the model does not fit the data set. Sturges' rule, which gives the number of classes k for a sample of size N as $k = 1 + 3.322 \log_{10}(N)$, is used to group the data in time.

REMARK 1.6.– This test is less powerful than the previous test regardless of sample size.

1.5.3.3. Parameter configuration and confidence interval

The confidence intervals of functions of the parameters (reliability function, failure rate, MTTF, etc.) are obtained using Fisher matrices and the Delta method.
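To make the Fisher matrix and Delta method combination concrete, here is a minimal sketch for the simplest case: an exponential distribution fitted to a complete (uncensored) sample. The assumptions are ours, not the source's: the Fisher information for λ is n/λ², the function of interest is MTTF = 1/λ, and the interval is a two-sided normal approximation. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist

def exp_mttf_interval(failure_times, confidence=0.90):
    """Point estimate and approximate two-sided interval for the MTTF.

    Assumes an exponential model and a complete sample (no censoring).
    """
    n = len(failure_times)
    mttf_hat = sum(failure_times) / n          # maximum likelihood estimate of 1/lambda
    lam_hat = 1.0 / mttf_hat
    var_lam = lam_hat ** 2 / n                 # inverse of the Fisher information n / lambda^2
    gradient = -1.0 / lam_hat ** 2             # derivative of g(lambda) = 1 / lambda
    se_mttf = math.sqrt(gradient ** 2 * var_lam)  # Delta method standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mttf_hat, mttf_hat - z * se_mttf, mttf_hat + z * se_mttf

times = [100.0, 200.0, 300.0, 400.0]  # hypothetical failure times, in hours
print(exp_mttf_interval(times))
```

For small samples the lower bound of this symmetric interval can become negative; applying the Delta method to ln(MTTF) instead is a common way to avoid that, at the cost of an asymmetric interval.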


Figure 1.15. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 1.16. Dialog window of the Reliasoft software program

1.5.4. Deterioration

1.5.4.1. Theory

It is sometimes extremely difficult to establish an accurate representation of the failure distribution, and in industrial environments it can sometimes prove impossible. Indeed, a sufficient quantity of survival data is required in order to establish a good estimate.

Deterioration analysis is one way of overcoming this obstacle. It involves measuring a variable that is judged to be representative of the failure phenomenon in question: when its value reaches a certain threshold, the product is considered to have failed. The objective is therefore to predict the moment that this threshold is most likely to be crossed based on an array of prior measurements.

This variable, the deterioration factor, must of course be directly involved in the failure mechanism, whether the measurement is taken directly on the object itself (non-destructive testing) or not (destructive testing). A selection of performance characteristics are used to accurately represent the phenomenon (e.g. measurements of the drift in the resistance of a dielectric material).

It still remains necessary to collect deterioration data. Multiple units are required to represent the variability in the results since, just like in classical SDA, their confidence levels are intrinsically connected to the size of the test sample. Extrapolating the deterioration up to product failure then becomes relatively simple, and standard SDA can be used to supplement the approach.
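The extrapolation step can be sketched as follows, assuming (for illustration only) that each unit's deterioration follows a straight line in time. The measurement values, unit names and threshold are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def crossing_time(a, b, threshold):
    """Time at which the fitted line a * x + b reaches the threshold (a != 0)."""
    return (threshold - b) / a

# Hypothetical drift measurements (in %) taken at the same inspection times.
hours = [0.0, 500.0, 1000.0, 1500.0]
drift_by_unit = {
    "unit A": [0.0, 1.1, 2.0, 3.1],
    "unit B": [0.0, 0.8, 1.7, 2.4],
}
threshold = 10.0  # hypothetical failure threshold on the drift, in %

# One extrapolated failure time per unit.
pseudo_failure_times = {
    name: crossing_time(*fit_line(hours, drift), threshold)
    for name, drift in drift_by_unit.items()
}
print(pseudo_failure_times)
```

The extrapolated times, one per unit, can then be treated as a sample of failure times and analyzed with the survival models presented earlier.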

Figure 1.17. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.5.4.2. Common parametric distributions

The most basic deterioration models are as follows:
– Linear: $y = a x + b$
– Exponential: $y = b\,e^{a x}$
– Power: $y = b\,x^{a}$
– Logarithmic: $y = a \ln(x) + b$
– Gompertz: $y = a + b^{c x}$
– Lloyd–Lipow: $y = a - b/x$

where y is the factor, x is time and a–c are model parameters.

1.5.4.3. Stochastic trajectory models

The following models are the most commonly used.

1.5.4.3.1. Gamma process


This is a stochastic process used to model deterioration that grows over time as a function of aging. The increase in the average deterioration over time can be either linear or nonlinear, but is always monotonic. Gamma-type processes with non-zero initial values are perfectly conceivable.
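A stationary Gamma process can be simulated by accumulating independent Gamma-distributed increments. In the sketch below, the increment over a time step dt is assumed to have shape `shape_rate * dt` and a fixed scale; these parameter names and values are illustrative assumptions.

```python
import random

def simulate_gamma_process(shape_rate, scale, dt, n_steps, x0=0.0, seed=None):
    """Simulate one deterioration path with Gamma(shape_rate * dt, scale) increments.

    Every increment is non-negative, so the path never decreases.
    """
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        increment = rng.gammavariate(shape_rate * dt, scale)
        path.append(path[-1] + increment)
    return path

# Hypothetical parameters: mean growth of shape_rate * scale = 1.0 per unit time.
path = simulate_gamma_process(shape_rate=2.0, scale=0.5, dt=1.0, n_steps=50, seed=1)
print(path[-1])
```

A non-zero initial value is obtained simply by passing `x0`, and a nonlinear mean trend can be obtained by making the shape of each increment depend on the time step.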

Figure 1.18. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.5.4.3.2. Wiener process

Also known as a Gaussian process, this process is applicable to deterioration trends that increase on average but for which the probability that the deterioration decreases over a given interval may be non-zero.
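A Wiener process with drift can be simulated from independent normal increments with mean `drift * dt` and standard deviation `sigma * sqrt(dt)`. The parameter names and values below are assumptions used only to illustrate the behavior described above.

```python
import math
import random

def simulate_wiener(drift, sigma, dt, n_steps, x0=0.0, seed=None):
    """Simulate one path of a Wiener process with linear drift."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        # Independent Gaussian increment over one time step.
        increment = rng.gauss(drift * dt, sigma * math.sqrt(dt))
        path.append(path[-1] + increment)
    return path

# Hypothetical parameters: the path rises on average but can locally decrease.
noisy_path = simulate_wiener(drift=0.1, sigma=1.0, dt=1.0, n_steps=200, seed=3)
print(noisy_path[-1])
```

Setting `sigma` to zero recovers a purely deterministic linear trend, which isolates the role of the noise term in producing the occasional decreases.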

Figure 1.19. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.5.4.3.3. Geometric Brownian process

For monotonic deterioration processes – i.e. without any decreasing phases – the Gamma process is often employed, which allows systematically positive deterioration to accumulate. One alternative is to consider a transformation of the Gaussian process that makes every value positive: the geometric Brownian process. This process is defined using standard Brownian motion, which is a Wiener process with linear trend m = 0. This can be easily simulated since the increments follow independent normal distributions.

1.5.4.3.4. Multistate Markov or semi-Markov models

The application of Markov chains to model deterioration allows us to refine our knowledge of the state of a component beyond the simple distinction between “working” and “failed”. The theory of Markov processes is widely used to model the time evolution of deteriorations that are not quantitative in nature but which take values from a finite set of ordered options (“new state” → “deteriorated state” → “very deteriorated state” → “failure state”). Transitions from one state to another

Figure 1.20. Dialog window of the Reliasoft software program. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.5.5. ALM or accelerated life models

1.5.5.1. Theory

ALMs assume that the reliability distribution depends on the operating and environmental conditions, known as stresses, such as temperature, tension, humidity, vibrations and electrical current.

1.5.5.2. Definition

We write $X = (X_1, X_2, \ldots, X_p)$ to denote the values taken by the stresses that we are analyzing. The ALM gives a relation between the reliability at a given so-called nominal value of the stress vector $X_{nom}$ and a so-called accelerated value of the stress vector $X_{acc}$. If we write $R_{nom}(\cdot)$ for the reliability under stress $X_{nom}$ and $R_{acc}(\cdot)$ for the reliability under stress $X_{acc}$, then there exists a numerical factor written $AF(X_{nom}, X_{acc})$ such that

\[ R_{acc}(t) = R_{nom}\left(AF(X_{nom}, X_{acc})\,t\right) \]


In terms of the reliability, this means that time $t$ under stress $X_{acc}$ is equivalent to time $t_{nom} = AF(X_{nom}, X_{acc})\,t$ under stress $X_{nom}$. Thus, for a distribution with scale parameter α and shape parameter β,

\[ R_{acc}(t) = R_{nom}\left(t, \frac{\alpha}{AF}, \beta\right) \quad \text{and therefore} \quad \alpha_{acc} = \frac{\alpha}{AF} \]

In an environment that is accelerated by a certain factor relative to another environment, the MTTF, the median and the quartiles are divided by the same factor.

1.5.5.3. Scale factor distributions

If the assumed reliability distribution has a scale factor, there are relations between the parameters: standard ALMs specify how the stress changes the parameters of the reliability distribution when comparing two different stress levels. In any such case, the acceleration of time included in the model is expressed by a transformation of the scale factor between stress levels. The general form is

\[ R(t) = R_0\left[\left(\frac{t}{\alpha}\right)^{\beta}\right] \]

where:
– $R_0$ is a standardized survival function, which can potentially take other parameters (also known as shape factors);
– $\alpha > 0$ and $\beta > 0$ are two parameters that can be either known or unknown;
– the scale parameter α has the following property: if T is a random lifetime with reliability distribution $R(t, \alpha, \beta)$, a change in scale by a factor of k has the following effect on the reliability distribution:

\[ T \sim R(t, \alpha, \beta) \Rightarrow k\,T \sim R(t, k\alpha, \beta) \]

Thus, applying a multiplicative factor k to the lifetime T does not change the type of the reliability distribution and does not change its shape factor, only changing the scale factor by the same multiplicative factor.
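The scale property can be checked numerically. The sketch below takes a Weibull distribution as the standardized survival function (an assumption for illustration) and verifies that evaluating the nominal reliability at AF·t gives the same result as dividing the scale parameter by AF. The numerical values are hypothetical.

```python
import math

def weibull_reliability(t, alpha, beta):
    """Weibull reliability with scale alpha and shape beta."""
    return math.exp(-((t / alpha) ** beta))

def accelerated_reliability(t, alpha_nom, beta, af):
    """R_acc(t) = R_nom(AF * t), with R_nom a Weibull of scale alpha_nom."""
    return weibull_reliability(af * t, alpha_nom, beta)

alpha_nom, beta, af = 1000.0, 1.8, 4.0  # hypothetical scale (hours), shape and AF
for t in (50.0, 200.0, 500.0):
    via_time_scaling = accelerated_reliability(t, alpha_nom, beta, af)
    via_scale_parameter = weibull_reliability(t, alpha_nom / af, beta)
    # Both routes give the same reliability: alpha_acc = alpha_nom / AF.
    assert math.isclose(via_time_scaling, via_scale_parameter)
```

The shape parameter is left untouched in both routes, which is what allows an accelerated test to be mapped back to nominal conditions by rescaling time alone.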


1.5.5.4. Advantage of distributions with scale factors

The MTTF, the median and the quartiles are proportional to the scale factor, so multiplying the scale factor by a number k multiplies each of these quantities by the same number k. The most commonly used families in the context of reliability, namely exponential distributions, Weibull distributions and log-normal distributions, are all reliability distributions with scale factors.

1.5.5.5. Common laws

Thermal effect of temperature stress: the Arrhenius law. The most common acceleration law for describing the effect of temperature is the Arrhenius law. This law depends on a parameter called the activation energy (Ea).

1.5.5.6. Arrhenius law

\[ AF(T_0, T_1) = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_0} - \frac{1}{T_1}\right)\right] \]

where:
– $T_0$: temperature of base level in Kelvin;
– $T_1$: temperature of accelerated level in Kelvin;
– $E_a$: activation energy parameter;
– $k$: Boltzmann constant = $8.6 \times 10^{-5}$ eV/K; $1/k = 11{,}605$ K/eV.

Typical values for Ea vary between 0.3 and 1 eV. There exist “default” values for some electronic applications that determine the value of Ea for each failure mode. If data corresponding to failure are observed at different stress levels, the parameter Ea is viewed as an unknown parameter that must be statistically estimated from the data.

1.5.5.7. Eyring law

If an Arrhenius acceleration model cannot provide a good fit to the data, another law can be used to model the effect of temperature. This law is known as the Eyring law, which also depends on a parameter called the activation energy (Ea).

\[ AF(T_0, T_1) = \frac{T_1}{T_0} \exp\left[\frac{E_a}{k}\left(\frac{1}{T_0} - \frac{1}{T_1}\right)\right] \]

where:
– $T_0$: temperature of base level in Kelvin;
– $T_1$: temperature of accelerated level in Kelvin;
– $E_a$: activation energy parameter;
– $k$: Boltzmann constant = $8.6 \times 10^{-5}$ eV/K; $1/k = 11{,}605$ K/eV.

The Arrhenius and Eyring laws are very similar. To discern the difference between their results, we need to observe their effects over a wide range of temperatures.

1.5.5.8. Inverse power law for acceleration

The inverse power law for acceleration is a general equation that describes the effect of a stress S, which might be a voltage, an electric current or the amplitude of mechanical or thermal cycling.

\[ AF(S_0, S_1) = \left(\frac{S_1}{S_0}\right)^{\alpha} \]

where:
– $S_0$: stress value of base level;
– $S_1$: stress value of accelerated level;
– $\alpha$: acceleration parameter.

1.5.5.9. Cycling effect: Coffin–Manson law

The Coffin–Manson law models the effect of cycling on reliability. The environment is described by a periodic temperature or load profile (for fatigue) with given frequency and amplitude.

\[ AF(T_0, T_1) = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_0} - \frac{1}{T_1}\right)\right] \cdot \left[\frac{\Delta T_1}{\Delta T_0}\right]^{p} \cdot \left[\frac{F_0}{F_1}\right]^{q} \]

where:
– $T_0$: temperature of base level in Kelvin;
– $T_1$: temperature of accelerated level in Kelvin;
– $\Delta T_0$: amplitude of base cycling;
– $\Delta T_1$: amplitude of accelerated cycling;
– $F_0$, $F_1$: frequencies of base and accelerated cycling;
– $k$: Boltzmann constant = $8.6 \times 10^{-5}$ eV/K;
– $E_a$: activation energy parameter;
– $p$: effect parameter of the amplitude difference;
– $q$: effect parameter of the frequency difference.
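The acceleration laws introduced so far (Arrhenius, Eyring, inverse power and Coffin–Manson) can be sketched as plain functions. The function names, the more precise value used for the Boltzmann constant (8.617 × 10⁻⁵ eV/K) and the example temperatures and parameters are assumptions for illustration.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K (rounds to 8.6e-5)

def af_arrhenius(t0_k, t1_k, ea_ev):
    """Arrhenius acceleration factor between base T0 and accelerated T1 (Kelvin)."""
    return math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t0_k - 1.0 / t1_k))

def af_eyring(t0_k, t1_k, ea_ev):
    """Eyring law: Arrhenius factor multiplied by T1 / T0."""
    return (t1_k / t0_k) * af_arrhenius(t0_k, t1_k, ea_ev)

def af_inverse_power(s0, s1, alpha):
    """Inverse power law: (S1 / S0) ** alpha."""
    return (s1 / s0) ** alpha

def af_coffin_manson(t0_k, t1_k, ea_ev, dt0, dt1, p, f0, f1, q):
    """Coffin-Manson law with Arrhenius, amplitude and frequency terms."""
    return (af_arrhenius(t0_k, t1_k, ea_ev)
            * af_inverse_power(dt0, dt1, p)
            * (f0 / f1) ** q)

# Hypothetical example: 55 degC in use, 125 degC in test, Ea = 0.7 eV.
t_use, t_test = 55.0 + 273.15, 125.0 + 273.15
print(af_arrhenius(t_use, t_test, 0.7), af_eyring(t_use, t_test, 0.7))
```

Writing the Coffin–Manson factor as a product of the simpler laws makes it easy to drop a term (set p or q to zero, or use equal temperatures) and recover the variants mentioned below.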

There are also other Coffin–Manson laws, e.g. without an Arrhenius-type component when the stress is given by mechanical cycling, with/without frequency components, or which count durations in terms of cycles instead of operating hours. In the latter case, the Coffin–Manson law reduces to an inverse power law that measures the stress in terms of the cycling amplitude.

1.5.5.10. Joint effect of temperature and humidity: Hallberg & Peck

The model used here to describe the joint effect of temperature and humidity is a model derived from an Arrhenius law for the temperature and an inverse power law for the relative humidity. This model is known as the Hallberg and Peck model.

Hallberg and Peck law

\[ AF([T_0, RH_0], [T_1, RH_1]) = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_0} - \frac{1}{T_1}\right)\right] \cdot \left[\frac{RH_1}{RH_0}\right]^{n} \]

where:
– $T_0$: temperature of base level in Kelvin;
– $T_1$: temperature of accelerated level in Kelvin;
– $RH_0$: relative humidity (in %) of base level;
– $RH_1$: relative humidity (in %) of accelerated level;
– $k$: Boltzmann constant = $8.6 \times 10^{-5}$ eV/K;
– $E_a$: activation energy parameter;
– $n$: effect parameter of humidity.
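A minimal sketch of the Hallberg and Peck factor follows, together with the conversion of a test duration into an equivalent duration at the base level. The parameter values (Ea, n, temperatures, humidities, test duration) are hypothetical and would need to be estimated for the failure mode of interest.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K (rounds to 8.6e-5)

def af_hallberg_peck(t0_k, rh0, t1_k, rh1, ea_ev, n):
    """Arrhenius term for temperature times a power term for relative humidity."""
    thermal = math.exp(ea_ev / K_BOLTZMANN_EV * (1.0 / t0_k - 1.0 / t1_k))
    humidity = (rh1 / rh0) ** n
    return thermal * humidity

def equivalent_base_hours(test_hours, af):
    """Duration at the base level equivalent to test_hours at the accelerated level."""
    return test_hours * af

# Hypothetical example: 40 degC / 60 %RH in use, 85 degC / 85 %RH in test.
af = af_hallberg_peck(40.0 + 273.15, 60.0, 85.0 + 273.15, 85.0, ea_ev=0.9, n=3.0)
print(af, equivalent_base_hours(1000.0, af))
```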


1.5.5.11. Other “multistress” laws

The model used to describe the joint effect of multiple stresses is typically derived from the product of several AFs, each of which is determined by one single stress. For example, the combined effect of temperature and current density on electromigration within electronic components can be modeled using a Black model (Proc. Annual Reliability Physics Symposium, 1967), which couples together an Arrhenius law and a power law.

\[ AF([T_0, \Delta J_0], [T_1, \Delta J_1]) = \exp\left[\frac{E_a}{k}\left(\frac{1}{T_0} - \frac{1}{T_1}\right)\right] \cdot \left[\frac{\Delta J_1}{\Delta J_0}\right]^{n} \]

The idea is as follows: a multiple-stress acceleration law is constructed by considering the specific effect of each stress (Arrhenius, power, exponential accelerations) in the form of a transformation of the stress function:
– $V = 1/T$ for Arrhenius acceleration;
– $V = \ln(X)$ for power law acceleration;
– $V = X$ for exponential acceleration;

then constructing the general AF as follows:

\[ AF\left[(X_1, \ldots, X_p)_{base}, (X_1, \ldots, X_p)_{acc}\right] = \exp\left[\alpha_1 (V_{1\,base} - V_{1\,acc})\right] \exp\left[\alpha_2 (V_{2\,base} - V_{2\,acc})\right] \cdots \exp\left[\alpha_p (V_{p\,base} - V_{p\,acc})\right] \]
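As a sketch of how this general form contains the earlier laws, the function below multiplies the exponential terms for an arbitrary number of transformed stresses. The example shows one way to recover an Arrhenius term (V = 1/T with coefficient Ea/k) and a power term (V = ln X with coefficient −n, the sign being our choice so that the factor equals (X1/X0)^n). All names and values are illustrative assumptions.

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K (rounds to 8.6e-5)

def log_linear_af(alphas, v_base, v_acc):
    """AF = product over stresses of exp(alpha_j * (V_j_base - V_j_acc))."""
    exponent = sum(a * (vb - va) for a, vb, va in zip(alphas, v_base, v_acc))
    return math.exp(exponent)

# Hypothetical temperature + current-density example (Black-type model).
t0, t1 = 330.0, 400.0   # Kelvin
j0, j1 = 1.0, 2.0       # current density, arbitrary units
ea, n = 0.7, 2.0        # activation energy (eV) and power-law exponent

af = log_linear_af(
    alphas=[ea / K_BOLTZMANN_EV, -n],
    v_base=[1.0 / t0, math.log(j0)],
    v_acc=[1.0 / t1, math.log(j1)],
)

# Same factor computed directly from the product of the two laws.
direct = math.exp(ea / K_BOLTZMANN_EV * (1.0 / t0 - 1.0 / t1)) * (j1 / j0) ** n
assert math.isclose(af, direct)
```

Because the logarithm of the AF is linear in the transformed stresses, the coefficients can be estimated by ordinary regression techniques when failure data are available at several stress levels.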

This general formulation is called the log-linear model of the AF and includes all of the above models.

1.6. Phases in the construction of a reliability validation plan

The SIA guideline [SIA 16] for estimating and validating the reliability of automobiles – distributed free of cost by SIA – presents a methodology for constructing a reliability validation plan. This methodology consists of five phases:
– Phase 1:
- Risk analysis (PRA, FMECA, FTA, etc.).
– Phase 2:
- Identification of physical failure mechanisms, damaging factors, and associated simulation methods.
– Phase 3:
- Recovery and exploitation of available data.
- Feedback from customer experience (operational reliability), experimental experience (experimental reliability) and modeling of damage mechanisms and the customer stress profiles associated with damaging factors.

[Content of Figure 1.21] 5 PHASES IN THE CONSTRUCTION OF A RELIABILITY VALIDATION PLAN
The proposed methodological framework for quantifying the reliability consists of 5 phases and 2 prerequisites. This framework provides a valid way of evaluating the reliability. It can be used as a complement to the ISO 26262 standard for validating safety-related features.
– Prerequisite 1: Reference or specification of customer reliability objectives.
– Prerequisite 2: Database (mission profile, material resistances).
– Provides feedback to Phase 3 “feedback from experience”.
– Phase 1: Risk analysis (PRA, FMECA, FTA, etc.).
– Phase 2: Identification of physical failure mechanisms, damaging factors, and associated simulation methods.
– Phase 3: Recovery and exploitation of available data. Feedback from customer experience (operational reliability), experimental experience (experimental reliability) and modeling of damage mechanisms; customer stress profiles associated with damaging factors.
– Phase 4: Dimensioning of the validation plan. Number of parts, number of cycles, stress levels, acceptance criteria, confidence levels. Definition of coherent target reliability objectives for the customer.
– Phase 5: Estimation of the predicted reliability in customer environments. Model verification and recalibration.

Figure 1.21. Phases in the construction of a Reliability Validation plan (source [SIA 16]). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


– Phase 4:
- Dimensioning of the validation plan.
- Number of parts, number of cycles, stress levels, acceptance criteria, confidence levels.
- Definition of coherent test objectives.
– Phase 5:
- Estimation of the predicted reliability in customer environments.
- Model verification and recalibration.

Figure 1.22. Phase 4 of the method recommended by SIA for constructing a reliability validation plan (source [SIA 16]). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

1.7. Examples

1.7.1. Example 1

– Stage one: “accelerated” reliability tests

A test is said to be “accelerated” if:
– its usage rate is greater than the normal usage rate wherever possible (usage rate acceleration);
– the stress(es) placed upon it during the test is (are) greater than the “normal usage” stress(es) (overstress acceleration).

After conducting repeated tests in the test conditions, we observe that the results can be fitted to a Weibull distribution with parameters β = 1.8, η = 440,000 cycles and γ = 0. How many kilometers of normal usage are equivalent to 440,000 cycles in accelerated test conditions? This question can be answered by analyzing the number of failed units.

– Stage two: calculation of number of failed units based on vehicle populations and their ages

The number of failed units of a product subjected to wear (β > 1, here 1.8) identified from experience depends on t, and therefore on λ(t), and is a function of the decomposition at time t of the population of vehicles equipped with this product into brackets P1, P2, ..., Pi, ..., Pn corresponding to usage periods t = 1, t = 2, t = 3, ..., t = n.

Consider the following example. Suppose that the time t is counted in quarters from time 0, the launch date of the product (since for this product γ = 0), and assume that t = 5 (5th quarter since launch). The calculated “number of failed units” can be written VRT5 (T for “theoretical”), and is given by:

\[ VRT_5 = \lambda(1)P(1) + \lambda(2)P(2) + \lambda(3)P(3) + \lambda(4)P(4) + \lambda(5)P(5) \]

where P(i) is the number of affected products of age i (at the considered time of t = 5). We chose a Weibull distribution, so:

\[ \lambda(t) = \frac{\beta}{\eta^{\beta}}\,t^{\beta - 1} \]

The number of failed units at time N can therefore be written as:

\[ VRT_N = \frac{\beta}{\eta^{\beta}} \sum_{i=1}^{N} i^{\beta - 1} P(i) \]
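The calculation of the theoretical number of failed units can be sketched as follows. The population figures and the value of η (expressed here in quarters) are hypothetical; only β = 1.8 comes from the example. The last helper simply inverts the relation K = β/η^β, which is useful once K has been estimated from field data.

```python
def hazard_weight_sum(population_by_age, beta):
    """Sum over ages i = 1..N of i ** (beta - 1) * P(i), i.e. VRT_N / K."""
    return sum(i ** (beta - 1.0) * p
               for i, p in enumerate(population_by_age, start=1))

def theoretical_failures(population_by_age, beta, eta):
    """VRT_N = (beta / eta ** beta) * sum of i ** (beta - 1) * P(i)."""
    k = beta / eta ** beta
    return k * hazard_weight_sum(population_by_age, beta)

def eta_from_k(k, beta):
    """Invert K = beta / eta ** beta to recover eta."""
    return (beta / k) ** (1.0 / beta)

# Hypothetical fleet at t = 5: P(1) is the number of products aged 1 quarter, etc.
population_by_age = [5000, 4000, 3000, 2000, 1000]
beta = 1.8
eta_quarters = 40.0  # hypothetical scale parameter, in quarters

print(theoretical_failures(population_by_age, beta, eta_quarters))
```

Since β is fixed, the sum returned by `hazard_weight_sum` depends only on the sales records, so the unknown η enters the prediction solely through the multiplicative constant K.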


Observing that, in every case (for physical reasons), β = 1.8, we simply need to statistically fit η to the data describing the actual number of failed units (recalling that γ = 0).

– Stage three: analyzing the number of failed units

1) Smoothing of the number of failed units (after-sales data)

This smoothing removes seasonal influences and the effects of inventory levels. Assuming that the feedback increases exponentially, after taking the logarithm we observe that the sales can indeed be fitted with a straight line – after the sixth month.

2) Evaluation of the age and quantity of vehicles at times t, t – 1, t – 2, t – 3 equipped with the considered product. We can do this using the sales records.

3) Theoretical calculation of the number of failed units according to the method explained above.

Here, we give a diagram showing the decomposition and we recall the theory behind the calculation, assuming that the production per quarter is given by the following (example) figures:

At time t = 5, the theoretical number of failed units is:

These steps allow us to calculate the VRi, the actual sales at time t = i expressed in quarters, and also to calculate VRTi / K. Note that $K = \beta / \eta^{\beta}$, since we do not know η, and we will need to accurately estimate it (knowing that β = 1.8).

We can therefore easily calculate the […] by analyzing sales and after-sales data, from which we obtain the VR.

4) Estimation of η

We propose to use the method of simple linear regression. We shall construct the graph with VR along the x-axis and VRT/K along the y-axis. The value of $K = \beta / \eta^{\beta}$ is given by the slope of the line found by the regression. We now simply need to check the validity of the fit. Incidentally, in this case the calculated correlation coefficient is equal to 0.99.

5) Results and conclusion

We have thus shown that the distribution of equipment failures resulting from actual usage of the considered vehicle in the field can be perfectly fitted by a Weibull distribution with parameters:

β = 1.8, η = 110 months, γ = 0

The consequences of this conclusion are very rich. Indeed, based on the accelerated tests, we estimated the parameters of the failure distribution as:

β = 1.8, η = 440,000 cycles, γ = 0

The conclusion is very simple: 440,000 accelerated test cycles are equivalent to 110 months of actual usage in vehicles in this case. The values of β found in accelerated testing and real usage are the same.

– Stage four: execution of accelerated tests on an improved version of the equipment


Suppose that the results of accelerated tests on an improved version of the equipment yield a Weibull distribution with the following parameters:

β = 1.8, η = 800,000 cycles, γ = 0

Then, we can conclude that the ratio of the accelerated η, namely 800/400 = 2, predicts that the average lifespan will be multiplied by 2, so we predict that the η in actual usage will be equal to 110 × 2 = 220 months.

1.7.2. Development of a climate corrosion test for an automobile heat exchanger

So-called “corrosion” from the climate is often underestimated and poorly understood. The scientific basis for accounting for the environment in accelerated tests often follows the laws of solid mechanics. However, “corrosion” involves complex and non-proportional physical and chemical phenomena. We will begin by defining corrosion, then continue by analyzing the context of this issue within the automotive industry. Finally, we will propose a methodology based on an automotive application.

ISO 8044 defines corrosion as a physical and chemical interaction between a material and its environment. This interaction causes the properties of the material to deteriorate. Several families of materials are affected, including metals, plastics and all other organic materials, such as graphite. Humidity, atmospheric pollutants, temperature, liquids or applied stress contribute to the mechanism of this corrosion.

The automotive industry suffers particularly strongly from this aggressive environmental effect. Table 1.2 shows an excerpt of vehicle recalls in the United States from the NHTSA government website.

Table 1.2 shows that every manufacturer, and thus every automotive supplier, is affected by the problem of corrosion regardless of the type of system (mechanical, electrical, electronic, mechatronic). As we can see, the human and financial consequences can be very serious. We will present a methodology for taking climate corrosion into account.

Manufacturers | Systems | Issues | Recall campaign in the United States
Volvo truck [REC 17a] | Suspension | Breakage of the suspension ball joints due to corrosion – loss of vehicle direction | 13 October 2015
Mercedes [REC 17b] | Air bags | Corrosion of the air bag control unit causing the air bags to trigger unexpectedly | 27 October 2015
Ford [REC 17c] | Fuel system | Fuel tank leak due to corrosion – risk of ignition | 27 October 2015
Hyundai [REC 17d] | Wiper system | Corrosion of the wiper system. Loss of visibility | 29 February 2016
Hyundai [REC 17e] | Electrical system | Corrosion of the connector of an electrical sensor. Possible exacerbated short-circuit | 24 March 2016

Table 1.2. Cases of automobile corrosion

[Content of Figure 1.23]
1. Establish the mission profile
2. Deduce the damaging factors
3. Analysis of failure modes
4. Define the accelerated stress to reproduce the same failure
5. Construct the Test-Vehicle equivalence relation
6. Integrate the reliability objectives into the accelerated test

Figure 1.23. Methodology for developing an accelerated test in the context of corrosion. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


We will apply this methodology to the case of a heat exchanger (evaporator) integrated within the air conditioning circuit of the passenger compartment.

– Step 1: Establish the mission profile by developing an understanding of the environment involved in the corrosion mechanisms.

The study [PHI 16] gives a description of the search for influencing factors in the field:
– pollutants originating from the environment near the exchanger;
– pollutants originating from the outside environment (pollution, sand, automobile cleaning products, smoke, atmospheric particulate matter, volcano emissions, open quarry dust, bacterial growth, etc.);
– relative humidity and temperature.

Figure 1.24. (source [PHI 16]). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

– Step 2: Selection of damaging factors according to [PHI 09]:
- pollutants originating from the environment near the exchanger (copper, iron, and sulfur);
- pollutants originating from the outside environment (chloride and sulfur);
- temperatures above 30°C;
- relative humidity greater than 80%.

The international standard ISO 9223 lists the damaging factors (sulfur and chloride levels) present in the environment (Figure 1.25) in the United States region.

Figure 1.25. Corrosion map (ISO 9223). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Note that the coasts are particularly susceptible to corrosion, since they combine the factors of temperature/humidity and atmospheric pollutants. An identical phenomenon exists in India, as shown in Figure 1.27.

The concept of severity is primarily determined by the concentrations of certain chemical compounds. The General Guidelines [AST 09] for the Consideration of the Environmental Climate published by ASTE (Association des Sciences et des Techniques de l’Environnement) give global atmospheric data on the maximum concentrations of different chemical compounds.

Figure 1.26. Concentration of chemical compound in the atmosphere at the ground level (source [PHI 09]). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


Table 1.3 shows an excerpt of the values recorded near industrial sources emitting chemical pollutants in countries with hot and rainy climates.

Chemical compounds | Maximum concentration (mg/m3) | Average concentration (mg/m3)
Sulfur dioxide | 10.000 | 5.000
Hydrogen sulfide | 10.000 | 3.000
Chlorine | 1.000 | 0.300
Hydrogen chloride | 5.000 | 1.000
Hydrogen fluoride | 2.000 | 0.100
Ammonia | 35.000 | 10.000
Ozone | 0.300 | 0.100
Nitrogen oxides | 9.000 | 3.000

Table 1.3. Maximum and average concentration of principal chemical compounds in the atmosphere at the ground level

Table 1.3 shows the most severe values worldwide. The outdoor temperature and the humidity are factors that exacerbate corrosion. Figure 1.27 shows the extreme values of the humidity (% relative humidity) and the outdoor temperature.

Figure 1.27. Extreme values of humidity and temperature in some world regions (source [PHI 09]). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


According to [PHI 09], Europe has very high relative humidity levels (between 90 and 100%) but relatively low ambient temperatures (15–20 °C). The zones that are most conducive to corrosion are located in South America, for example in Brazil, where the combination of temperature and humidity values is the most favorable for corrosion.

– Step 3: Analysis of failure modes. The heat exchanger that we are studying is a brazed assembly of plates and spacers made from an aluminum alloy. At the JAPIA 2009 conference, the authors of the study [PHI 09] presented the failure modes that can be found on used vehicles. Spessel [SPE 03] explains that the failure mechanism illustrated in Figure 1.30 is a type of corrosion known as “intergranular”. This type of corrosion occurs when the intermetallics present in the material at the joints between the grains in the microstructure begin to dissolve.

[Figure panels: used vehicle, 27 months, 36,000 km; used vehicle, 27 months, 66,500 km; external exchanger; internal exchanger]

Figure 1.28. Effect of corrosion on external exchanger (source [PHI 09])

– Step 4: Define the accelerated stress to reproduce the same failure. The reliability test is considered to be valid if it satisfies the following criteria:
– the corrosion failure is the same in the test as on the vehicle;
– the pollutants at the surface of the evaporator are the same as those found in the field;
– the test is able to discriminate between different designs.


For example, a very low pH solution will give a very high-intensity test. However, this solution will not be able to distinguish the impact of different surface treatments applied to the surface of the product.

The stresses applied during the reliability test are determined from:
– the mission profile;
– the actual temperature and humidity recorded in meteorological data;
– the chemical and metallic pollutants that contribute to metal corrosion mechanisms.

As an illustration of the importance of metallic pollutants from the atmosphere (mostly created by a small fan placed in front of the exchanger), see Figures 1.29 and 1.30.

Figure 1.29. (source [SPE 03])

Figure 1.30. (source [SPE 03])


Figures 1.31 and 1.32 clearly show that there is a relation between the product lifetime and the copper concentration. The pollutants that should be added to the laboratory tests are not necessarily the same as those that are found on the vehicle. Reliability and materials experts will need to determine the chemical products that result in the final composition detected on actual damaged parts. Philippe and Casenave [PHI 09] suggest an accelerated reliability test to reproduce the corrosion on the exchanger with the pollutants shown in Table 1.4.

| Chemical compound | Concentration |
|---|---|
| NaCl | 5 g/L |
| CuCl2 | 0.042 g/L |
| H2SO3 | 0.3–0.6 mL/L, to obtain a pH of 3.5 ± 0.2 |

Table 1.4. (source [PHI 09])

What temperature and humidity conditions should the accelerated test have? As illustrated in Figure 1.27, the temperature and humidity levels experienced by the vehicle (in hot and humid climates like those found in Brazil) are 30 °C and over 80% relative humidity. Philippe and Casenave [PHI 09] suggest accelerating the corrosion phenomenon by increasing the temperature to 49 °C in a humid phase followed by a drying phase of 20 min, also at 49 °C. The test cycle is represented in Table 1.5.

| Phase | Time (min) | Temperature (°C) | Humidity (% RH) |
|---|---|---|---|
| Spraying | 20 | 49 ± 2 | – |
| Humidity | 30 | 49 ± 2 | 95 |
| Drying | 20 | 49 ± 2 | – |

Table 1.5. (source [PHI 09])
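As a rough illustration of what the cycle in Table 1.5 implies in practice, the sketch below adds up the three phase durations and counts how many complete cycles fit into a given number of test days. It assumes that cycles run back to back with no idle time, which the source does not state; the names `PHASES_MIN`, `cycle_minutes` and `cycles_in_days` are hypothetical.

```python
# Phase durations in minutes, taken from Table 1.5.
PHASES_MIN = {"spraying": 20, "humidity": 30, "drying": 20}


def cycle_minutes(phases=PHASES_MIN):
    """Total duration of one spraying/humidity/drying cycle, in minutes."""
    return sum(phases.values())


def cycles_in_days(days, phases=PHASES_MIN):
    """Number of complete cycles in `days` test days.

    Assumption (not from the source): the chamber cycles continuously,
    24 hours a day, with no pause between cycles.
    """
    return (days * 24 * 60) // cycle_minutes(phases)


if __name__ == "__main__":
    print("One cycle lasts", cycle_minutes(), "min")
    print("Complete cycles in 1 day:", cycles_in_days(1))
```

Under this continuous-cycling assumption, one cycle lasts 70 min, so a test day contains 20 complete cycles.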

Does this reproduce the same failure mode? Figure 1.35 shows that it does.


[Figure panels: used vehicle, 27 months, 36,000 km; used vehicle, 27 months, 66,500 km; external exchanger; internal exchanger; accelerated test after 62 days; accelerated test after 77 days]

Figure 1.31. (source [PHI 09]). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

– Step 5: Construct the equivalence relation between the accelerated test and normal vehicle usage. The study [PHI 10] performs a statistical comparison of the defective parts of vehicles in the United States and on the test bench. The statistical model used by the study was a two-parameter Weibull distribution. A survey of actual vehicles revealed that 1,715 parts were defective from a population of 223,400 parts. The Weibull model is characterized by a shape parameter β of 2.4 (corrosion mechanisms are characterized by a relatively slow wear phenomenon) with an MTTF (a parameter that is viewed here as the average time to first failure) of 15 years. As for the data on the test bench (population of defective parts representative of the defective parts on actual vehicles), the parameters of the Weibull distribution were determined as a β of 8 and an MTTF of 21 days.
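The figures quoted in step 5 can be explored with a short sketch. It assumes that the quoted MTTF is the mean of a two-parameter Weibull distribution, in which case the scale parameter is η = MTTF / Γ(1 + 1/β); this reading, the conversion of 365.25 days per year, and all variable names are assumptions made for illustration and are not taken from [PHI 10].

```python
import math


def weibull_scale_from_mean(mean_life, beta):
    """Scale parameter eta of a two-parameter Weibull with the given mean.

    Uses mean = eta * Gamma(1 + 1/beta).
    """
    return mean_life / math.gamma(1.0 + 1.0 / beta)


# Values quoted in the text for the field survey and the test bench.
field_beta, field_mttf_years = 2.4, 15.0
bench_beta, bench_mttf_days = 8.0, 21.0

eta_field_years = weibull_scale_from_mean(field_mttf_years, field_beta)
eta_bench_days = weibull_scale_from_mean(bench_mttf_days, bench_beta)

# Fraction of surveyed field parts that were found defective.
field_defect_fraction = 1715 / 223400

# Ratio of the two mean lives (assumption: 365.25 days per year).
acceleration_factor = field_mttf_years * 365.25 / bench_mttf_days

if __name__ == "__main__":
    print(f"eta (field): {eta_field_years:.1f} years")
    print(f"eta (bench): {eta_bench_days:.1f} days")
    print(f"defective fraction in the field: {field_defect_fraction:.2%}")
    print(f"ratio of mean lives: {acceleration_factor:.0f}")
```

With these inputs the scale parameters come out at roughly 16.9 years and 22.3 days, and the ratio of mean lives is roughly 261. Because the two shape parameters differ (2.4 against 8), this ratio is only an equivalence between mean lives and not a constant factor that applies to every quantile.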


Thus, the equivalence criterion is that 21 test days represent 15 years of the life of actual vehicles.

– Step 6: Integrate the reliability objectives into the accelerated test. In accordance with the international standard SAE J2842, the target objective is 95% reliability (R95) with a confidence level of 50% (C50) after 15 years of vehicle life. If 21 test days represent 15 years of vehicle life, how many test days do we need to perform on six parts in order to validate the reliability of R95C50? One approach using a binomial distribution was proposed by Yang [YAN 07] based on the following relation:

$$T = t \cdot \left[ \frac{\ln(1-\alpha)}{n \cdot \ln(R)} \right]^{1/\beta}$$

where:
– T is the test time that must be performed in order to achieve the required reliability. This is what we seek to establish;
– t is the test time representing the lifetime on actual vehicles, which in our case is equal to 21 days;
– α is the confidence level, which here is equal to 50%;
– R is the reliability requirement of 95%;
– n is the number of tested parts, which here is equal to 6;
– β is the shape parameter of the Weibull distribution. This parameter is equal to 8 according to the analysis performed in step 5.

Numerical calculation is as follows:

$$T = 21 \cdot \left[ \frac{\ln(1-0.5)}{6 \cdot \ln(0.95)} \right]^{1/8} \approx 23 \text{ days}$$
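The calculation can be checked numerically. The sketch below reads the relation attributed to Yang [YAN 07] as the usual zero-failure demonstration test for a Weibull life distribution, T = t · [ln(1 − α) / (n · ln R)]^(1/β); this reading is an assumption, chosen because it reproduces the 23 days quoted in the text. The function name `zero_failure_test_time` is hypothetical.

```python
import math


def zero_failure_test_time(t_equiv, reliability, confidence, n_parts, beta):
    """Test time each of n_parts must survive without failure.

    t_equiv     : test time equivalent to the target field life (here, days)
    reliability : reliability to demonstrate at that life (e.g. 0.95)
    confidence  : confidence level of the demonstration (e.g. 0.50)
    n_parts     : number of parts on test
    beta        : Weibull shape parameter observed on the test bench
    """
    ratio = math.log(1.0 - confidence) / (n_parts * math.log(reliability))
    return t_equiv * ratio ** (1.0 / beta)


# Values from the text: t = 21 days, R95, C50, six parts, beta = 8.
T = zero_failure_test_time(21, 0.95, 0.50, 6, 8)

if __name__ == "__main__":
    print(f"Required test time: {T:.1f} days")
```

With a shape parameter as high as 8, the required time is not very sensitive to the sample size: in this sketch, doubling the number of parts to 12 shortens the test by less than two days.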

The six test parts must therefore survive 23 days in the accelerated corrosion test without failure in order to demonstrate the reliability target of R95C50.

1.8. Standards

Internationally, the standard IEC 62506 defines the following approaches, among others:
– Type B: quantitative accelerated tests: to predict the failure distribution under normal usage;


– Type C: accelerated tests with quantification of failures over time: allows us to predict the failure distribution under normal usage.

NOTE.– Test types B and C can reduce the necessary testing time. Type B tests must be performed for specific failure mechanisms and are applicable in general to accelerated life testing. Type C requires research into the usage or the specific conditions of the hypothesis before performing the test. This type of test can be applied to the acceleration failure rate. See presentation of the IEC 62506 standard in Appendix A1.

1.9. Conclusion

Accelerated life testing gives an answer to our need to validate innovative components; it allows us to characterize the lifetimes of these components and recalibrate the parameters of their deterioration models.

1.10. Bibliography

[AFN 16] AFNOR, Démonstration de la tenue aux environnements – Conception et réalisation des essais en environnement, NFX 50-144-1 to 6, 2016.
[AST 06] ASTE, HA-ESS guideline, Guide available at: www.aste.asso.fr, January 2006.
[AST 09] ASTE, Guide d’application de la démarche de personnalisation en environnement mécanique, PRO NORMDEF 0101 Edition 01 – July, Ministère de la Défense, website (retrieved on 05/20/2017): http://www.aste.asso.fr/fr/pag-488138Guide-climatique-et-mecanique.html, 2009.
[AST 10] ASTE, Guide d’application de la démarche de personnalisation en environnement climatique, PR ASTE 01-02, available at: www.aste.asso.fr, 2010.
[BAG 95a] BAGDONAVICIUS V., NIKULIN M., “On accelerated testing of systems”, European Journal of Diagnosis, Safety and Automation, vol. 5, no. 3, pp. 307–316, 1995.
[BAG 95b] BAGDONAVICIUS V., NIKULIN M., “Semi-parametrics Models in Accelerated Life Testing”, Queen’s Papers in Pure and Applied Mathematics, Queen’s University, Ontario, p. 70, 1995.
[BAG 97] BAGDONAVICIUS V., NIKULIN M., “Transfer functional and semi-parametric regression”, Biometrika, vol. 84, no. 2, pp. 365–378, 1997.


[BAG 00] BAGDONAVICIUS V., GERVILLE-REACHE L., NIKOULINA V. et al., “Expériences accélérées: analyse statistique du modèle standard de vie accélérée”, Revue de Statistique Appliquée, vol. XLVIII, pp. 5–38, 2000.
[BAG 01] BAGDONAVICIUS V., NIKULIN M., “Mathematical models in the theory of accelerated experiments”, World Scientific Publishing Co., pp. 271–303, 2001.
[BAS 82] BASU A.P., EBRAHIMI N., “Non-parametric accelerated life testing”, IEEE Transactions on Reliability, vol. 31, no. 5, pp. 432–435, 1982.
[BOI 00] BOITEUX D., Essais d’endurance de systèmes embarqués, PSA, October 2000.
[BOM 73] BOMPAS-SMITH J.H., Mechanical Survival: the use of reliability data, McGraw Hill, London, 1973.
[BON 15] BONATO M., DELAUX D., “Synthesis and Validation of Accelerated Vibration Durability Tests”, Reliability and Maintainability Symposium, 2015 Annual, pp. 1–6, 2015.
[BON 77] BONNET D., LALANNE C., Choix des essais – Analyse de l’environnement mécanique vibratoire réel en vue de l’élaboration de spécifications d’essais, Journées ASTE, 1977.
[CAR 98] CARUSO H., DASGUPTA A., “A Fundamental Overview of Accelerated Testing Analytical Models”, Reliability and Maintainability Symposium, 1998 Annual, pp. 389–393, 1998.
[CEE 09] CEEES, Publication n°9: “Reliability for a Mature Product from the beginning of useful life”, CEEES, 2009.
[DEL 06] DELAUX D., “Reliability validation of engine cooling modules with a tailoring tests of Vibration, Thermal Shock and Pressure Pulsation”, Revue Essai & Simulation – #785 special edition, 2006.
[DEM 95] DEMONSANT J., Un exemple de plan d’expériences numériques: optimisation du profil de denture des engrenages d’une boite de vitesses, ASTELAB, 1995.
[DEM 97] DEMONSANT J., Un plan d’expériences pour valider la robustesse d’une conception – ASTELAB, 1997.
[DEM 99] DEMONSANT J., Le processus de l’ingenierie robuste – ASTELAB, 1999.
[DEV 98] DEVARAJAN K., EBRAHIMI N., “Non-parametric approach to accelerated life testing under multiple stress”, Naval Research Logistics, vol. 45, no. 6, pp. 629–644, 1998.
[DEW 86] DE WINNE J., “Equivalence of fatigue damage caused by vibrations”, IES Proceedings, 32nd Annual Technical Meeting, Dallas and Fort Worth, pp. 227–234, 1986.


[DOY 91] DOYLE R.L., Mechanical Reliability, RAMS Tutorial Notes, 1991.
[FOR 61] FORD D.G., GRAFF D.G., PAYNE A.O., “Some statistical aspects of fatigue life variation, Fatigue aircraft structure”, Proc. 2nd ICAF Symposium, pp. 179–208, 1961.
[GUE 01] GUERIN F., DUMON B., HAMBLI R., “Determining the shape parameter of a Weibull distribution from mechanical – damage model”, Reliability and Maintainability Symposium, 2001 Annual, pp. 156–160, 2001.
[GUE 05] GUERIN F., TEBBI O., DUMON B., “Estimation de la fiabilité par les essais”, Mecanique & Industries, vol. 6, pp. 155–167, 2005.
[HOA 03] HOANG P., Handbook of Reliability Engineering, Springer, Berlin, 2003.
[KEC 98] KECECIOGLU D., JIANG M., SUN F.-B., “A unified approach to random fatigue reliability quantification under random loading”, Proceedings of IEEE Reliability and Maintainability, pp. 308–313, 1998.
[KIM 06] KIM Y.B., NOGUCHI H., AMAGAI M., “Vibration fatigue reliability of BGAIC package Pb-free solder and PbSn Solder”, Microelectronics Reliability, vol. 46, pp. 459–466, 2006.
[LAL 84] LALANNE C., “Maximax response and fatigue damage spectra”, The Journal of the Environmental Sciences, Part I, vol. XXVII, no. 4, September/October 1984, Part II, vol. XXVII, no. 5, July/August 1984.
[LAL 99] LALANNE C., Vibrations et chocs mécaniques: Dommage par Fatigue, Hermes, Paris, 1999.
[LAL 02] LALANNE C., Mechanical Vibration and Shock, Volume 2: Mechanical Shock, Hermes Penton, 2002.
[LEM 96] LEMAITRE J., CHABOCHE J.L., Mécanique des matériaux solides, Dunod, Paris, 1996.
[LIT 79] LITTLE R.E., EKVALL J.C., “Statistical analysis of fatigue data”, American Society for Testing and Materials, STP744, 1979.
[NEL 90] NELSON W., Accelerated Testing: Statistical Models, Test Plans and Data Analysis, Wiley Series in Probability and Mathematical Statistics, New York, 1990.
[OCO 03] O’CONNOR P., “Testing for reliability”, Quality and Reliability Engineering International, vol. 19, pp. 73–84, 2003.
[OWE 98] OWEN W.J., PADGETT W.J., “Accelerated test models for system strength based on Birnbaum-Saunders distributions”, Lifetime Data Analysis, vol. 5, pp. 133–147, 1998.
[PER 06] PERROUD G., Tenue en service d’un composant à partir du retour d’expérience et suivi en série, PSA PEUGEOT CITROEN Site de Sochaux Belchamp – Voujeaucourt – ASTELAB, 2006.


[PHA 89] PHADKE M.S., Quality engineering using robust design, Prentice-Hall, Upper Saddle River, 1989.
[PHI 09] PHILIPPE M., CASENAVE C., Japan Auto Parts Industries Association (JAPIA) symposium – HEREL, 14 September 2009.
[PHI 10] PHILIPPE M., Reliability Approach for Test Development, SIA conference, 18 June 2010.
[PHI 16] PHILIPPE M., Impact of car cleaners on brazed aluminium exchangers corrosion resistance, IQPC conference, February 2016.
[REC 17a] RECALL, Volvo truck – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=15V651, 2017.
[REC 17b] RECALL, Mercedes-Benz – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=15V711, 2017.
[REC 17c] RECALL, Ford – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=15V712, 2017.
[REC 17d] RECALL, Hyundai – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=16V117, 2017.
[REC 17e] RECALL, Volkswagen – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=16V171, 2017.
[SHY 99] SHYUR H.-J., ELSAYED E.A., LUXHOJ J.T., “A general model for Accelerated Life Testing with time-dependent covariates”, Naval Research Logistics, vol. 46, pp. 303–321, 1999.
[SPE 03] SPESSEL C., PHILIPPE M., Reliability of Evaporators regarding corrosion, 2004-01-0216, SAE, 2003.
[SIA 16] SIA, Guide d’aide à l’estimation et à la validation de la fiabilité automobile, SIA, 2016.
[SHI 72] SHIGLEY J.E., Mechanical Engineering Design, McGraw Hill, London, 1972.
[TEB 03] TEBBI O., GUERIN F., DUMON B., “Reliability testing of Mechanical products – Application of statistical accelerated life testing models”, 9th International Conference on Applications of Statistics and Probability in Civil Engineering, University of California, Berkeley, July 6–9 2003.
[TRI 06] TRIBOULET F., Fiabilité prévisionnelle des composants de châssis automobile – Calcul probabiliste du dommage en fatigue, ASTEFORUM, 2003, 2006.
[VAS 01] VASSILIOUS P., METTAS A., “Understanding accelerated life-testing analysis”, Annual Reliability and Maintainability Symposium, Tutorial Notes, pp. 1–21, 2001.


[VIG 00] VIGIER M.G., Methode predictive des defaillances en service d’un produit nouveau, SIA, 2000.
[YAN 07] YANG G., Life Cycle Reliability Engineering, John Wiley & Sons, New York, 2007.
[ZHA 02] ZHANG C., “Mechanical component lifetime estimation based on accelerated life testing with singularity extrapolation”, Mechanical Systems and Signal Processing, vol. 16, pp. 705–718, 2002.

2 Highly Accelerated Testing

The idea of highly accelerated testing is already a few decades old, and belongs to the family of proactive testing strategies: it aims to maximally exploit the potential of the technology available at any given moment. Approaching this point means that we are approaching the international state of the art, or in other words what we refer to as product excellence. This chapter describes the methodology of highly accelerated testing at a theoretical level. Readers may refer to the Appendix “HALT-HASS methodology” for a detailed description of how the highly accelerated life tests (HALT) and highly accelerated stress screening (HASS) approaches are executed in a laboratory that offers this type of service¹. The chapter concludes by comparing accelerated life testing and highly accelerated testing.

Chapter written by Henri GRZESKOWIAK, Tony LHOMMEAU and David DELAUX.
¹ Appendices to this book can be found at www.iste.co.uk/elhami/mechatronic2.zip.

2.1. Introduction to highly accelerated testing

A highly accelerated test is a short test during which the applied stresses are gradually increased well beyond the specification values. The primary objective of these types of test is to explore the operation and destruction limits of a product in order to increase them by undertaking suitable improvements until the limits imposed by the underlying technology or its components are reached.


2.1.1. History

The origin of highly accelerated testing goes back more than 40 years, when Mr. Hewlett and Mr. Packard, the founders of the company HP, asked their engineers if they could significantly improve the operational reliability of their workstations. For those among us who are old enough to remember such times, the annual maintenance cost of a workstation typically represented 10% of its purchase value due to poor reliability in the field.

The method adopted by engineers of HP became known as “STRIFE”, a contraction of “stress” and “life”. It involved applying stimuli (electrical: ON/OFF cycles, clock frequency variations, power voltage variations, etc.; or mechanical: vibrations, shocks, static loads, etc.) exceeding the actual usage conditions of the devices. When one of these stimuli produced a defect, an attempt was made to correct the defect by improving either the design or the technology, or by restricting the usage conditions. The testing process was then resumed until the technological limitations intrinsic to the product (i.e. the technologies of its components) were effectively reached. These limitations were considered to be reached when large numbers of defects developed that could clearly not be corrected within reasonable time or resource constraints.

The modern conception of highly accelerated tests is closely linked to STRIFE, as they similarly aim to exploit the state of the art of the implemented technologies as much as possible. Therefore, they are not intended to replace other types of more traditional testing (development tests, reliability growth tests, accelerated life tests, qualification or design approval tests, etc.), but instead complement these tests by helping to reveal every assignable cause of defects, with the goal of correcting them (where by assignable we mean a defect that can be explained in terms of a deviation from the state of the art).
The protocol of a series of highly accelerated tests often depends on the customer (e.g. AIRBUS places special requirements on its suppliers) or the manufacturer, e.g. wishing to improve margins (design or production). Such tests usually require a prior analysis that identifies areas of performance with insufficient margins for which there is sufficient motivation to improve them.

ASTEFORUM 2002, 2004 and ASTELAB 2003 hosted sessions dedicated to highly accelerated tests, with discussions and information sharing on every issue relevant to these kinds of test. The ASTE committee on “Operational environment and product dependability”, also known as the GTR25 (work and discussion Group) of the IMdR-ISdF (Risk Management Institute, formerly the RAMS Institute), was behind Project 4/99 “Recommendations for the industrial usage of highly accelerated tests” conducted under the supervision of the RAMS Institute (now Risk Management Institute), which in particular served as the basis upon which recommendation Bureau de Normalisation Aéronautique et Espace (BNAE) RG0029 “Highly accelerated testing” was prepared. In 2006, an ad hoc group of the ASTE commission published a leaflet called “Application of Highly Accelerated Environmental Stress Screening to Electronic Equipment”, which is still available in the list of publications sold by ASTE (also available in English). At the international level, the IEC 62506 standard gives definitions for HALT and HASS approaches (see the Appendix).

2.1.2. General approach

There are a number of factors that motivate companies to improve their performance by increasing their responsiveness to market demands:
– a demanding customer base in terms of product quality and durability;
– a growing number of contractor specifications, especially in terms of quality and reliability;
– a pressing need to reduce new product development cycles to remain competitive;
– reviews to reduce development, production and after-sales costs to remain competitive on pricing.

As well as these basic factors, which are relevant to all business activities (retail, transport, energy, space, military, etc.), there are other challenges that are specific to the arms and space industries. In these two sectors, the (military or space) standards used over the last few decades by suppliers are increasingly being abandoned by contract-givers. Instead, emphasis is being placed on the common practice followed by civil fields and on proof of expertise, regardless of the chosen approach or resources. In addition to this development, especially in electronics, military-grade components are gradually being abandoned, as well as sealed-housing components, which now represent less than 1% of the total production volume.
This creates a new challenge for the military and space sectors; in the absence of applicable reference standards, these industries must now develop more customized methods for their products, and need to use industry-grade components beyond their specifications while continuing to guarantee that operational safety and durability criteria are met. In this context, the traditional approaches used by manufacturers to accelerate performance maturation and increase the reliability of new products are of little use.

The various conventional forms of development tests are no longer suitable for the new state of the market: prohibitive timescales, slow to reveal design flaws (which are sometimes only found after delivery to the customer), incomplete maturation of reliability in late development stages, etc. The classical “reliability growth” strategies, which are based on the implementation of tests over long periods (often several months) with low stresses placed on the materials, are no longer capable of meeting the challenges required of them, simply because they take too long.

As for the burn-in processes performed on electronic devices (or their components) directly after production, they are often not aggressive enough to allow manufacturing defects to be rapidly revealed. Indeed, the burn-in profiles that are used are often derived from the traditional profiles used by the company for other types of product, and thus involve applying stresses to post-production materials over long periods of time. Ultimately, burn-in operations can heavily reduce the profitability of production operations, while yielding inadequate returns.

Thus, there seems to be a need to move toward other approaches, such as highly accelerated testing. This approach, which has been explored in the United States for the past 20 years, has clearly proven to be effective, judging by the articles published by a range of different American companies that use it. What does this approach involve?
First and foremost, the idea is to encourage development teams to “think differently”: rather than thinking in terms of compliance with a specification (which is often not representative of the actual life profile of the product) and remaining within the confines of conventional validation tests, they should instead attempt to push the product to its limits (often to the point of failure) using environmental stresses or various other stimuli at levels well beyond those described by the specification. The objective is to take maximum advantage of the currently available technologies in order to approach the state of the art by eliminating every assignable defect. This requires the development team to explore any available margins, then improve these margins by undertaking suitable measures on the product design itself, its manufacturing processes or its conditions of usage.

In short, properly executed highly accelerated testing allows us to significantly increase the robustness of a new product immediately after the production of the earliest prototypes at the beginning of the development phase, thus accelerating the early maturation of the product. Furthermore, knowledge of the available margins on a “mature” product allows us to configure its future burn-in profile in a more customized manner by increasing the severity of the applied stresses to the optimal level, which leads to substantial improvements in the effectiveness of burn-in operations.

Ultimately, an intelligent application of highly accelerated testing should allow us to achieve the following objectives simultaneously:
– accelerate the maturation of a product in development and thus be able to deliver mature products from the beginning of the series;
– improve the reliability and durability of the product in the field by increasing its robustness and eliminating defects in procedures;
– reduce turn-around times and development costs;
– equip burn-in processes with the ability to more effectively reveal manufacturing defects;
– substantially reduce the number of changes post-qualification;
– better meet customer expectations;
– improve the company’s brand image.

Figure 2.1 illustrates the accelerated maturation of the reliability performance that can be obtained at the end of development by executing highly accelerated tests, relative to the traditional approach.

[Figure: reliability level versus time over the development and production phases; curves: with highly accelerated testing, traditional approach]

Figure 2.1. Comparison of reliability growth during development and production between highly accelerated tests and the traditional approach


2.1.3. Robustness and reliability

The failure of any product, regardless of its complexity, always happens once some form of stress (mechanical, electrical, climate, etc.) exceeds the resistance of the product to this stress. If we consider a sample of specimens of the same product in operation, we see that both this stress and the resistance to the stress are statistical in nature. This statistical character can be explained by:

– variability in the stresses associated with different life profiles and within the same life profile (e.g. variation in the applied loads, climate variations, variations in the voltage of the power supply, etc.);
– variability of the “stress resistance” of each specimen of the product, due to variability in the properties of the materials themselves and the inherent variability of production processes.

Thus, if we consider Figure 2.2, and we voluntarily limit ourselves to the case of a single stress, we see that there exist three statistical distributions: one for the stress, one for the product stress resistance at the beginning of its usage and the third for the stress resistance after a certain period of operation, once the material properties have gradually begun to lose their original qualities.

[Figure: probability distributions plotted against stress and resistance to stress; curves: stress (S), resistance to stress at the beginning of life (R1), resistance to stress after aging (R2)]

Figure 2.2. Stress/strength approach highlighting the aging effect

We can easily see in this figure that the operational reliability of the product in the field decreases as the intersection of the S and R1 (or R2) curves increases (shaded area; in fact, the failure zone is a subregion of the shaded area and does not cover it completely). Thus, when the applied stress cannot be reduced, which is usually the case, maintaining high operational reliability throughout the entire product life profile requires:
– a sufficient margin between the centers of the S and R1 distributions;
– as small a spread as possible for the R1 distribution;
– the leftward drift over time of the R1 distribution toward the R2 distribution to be as slow as possible.

The first condition, which is related to the size of the margins, is the one that we are most interested in here, since it directly influences the robustness of the product. The larger this margin, the smaller the failure probability of the product over its lifetime, despite the fact that this margin necessarily decreases over time. The second condition, which is related to the spread of the R1 distribution, depends on the mastery of manufacturing processes. This mastery can be acquired by implementing designs of experiments (DOEs) prior to industrialization and by performing statistical production control. The third condition, related to the development of the properties of the materials and assemblies at every level, depends on choosing materials that are suitable for their usage profiles and on the care that is taken when assembling these materials.

Note also that, in terms of the operational reliability of a product, achieving sufficient operating margins makes it possible to simultaneously:
– reduce the failure rate of the product throughout its useful lifetime: this is a direct consequence of reducing the intersection area between the stress distribution (S) and the original stress resistance distribution (R1);
– extend the “useful lifetime” of the product: this follows from increasing the time required for the intersection area between the stress distribution (S) and the stress resistance distribution (R2) after gradual drift due to cumulative damage to become significant.
In other words, as shown in Figure 2.4, increasing the operating margins of the product helps to both lower and extend in time the flat section of the “bathtub curve” describing the evolution of the failure rate of the product over its life profile.
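The stress/resistance reasoning above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the stress S and the resistance R are independent and normally distributed, in which case the probability that the stress exceeds the resistance has a closed form. The normality assumption, the numerical values and the function name `failure_probability` are hypothetical and do not come from the source.

```python
import math


def failure_probability(mu_s, sd_s, mu_r, sd_r):
    """P(R < S) for independent, normally distributed stress S and resistance R.

    R - S is normal with mean (mu_r - mu_s) and standard deviation
    sqrt(sd_r**2 + sd_s**2), so P(R - S < 0) = Phi(-z).
    """
    z = (mu_r - mu_s) / math.hypot(sd_r, sd_s)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Illustrative values in arbitrary stress units (assumptions, not source data).
stress = (100.0, 10.0)       # mean and standard deviation of S
r1 = (160.0, 15.0)           # resistance at the beginning of life (R1)
r2 = (140.0, 15.0)           # resistance after aging (R2), drifted to the left
r1_tight = (160.0, 8.0)      # same mean as R1, smaller manufacturing spread

p_new = failure_probability(*stress, *r1)
p_aged = failure_probability(*stress, *r2)
p_tight = failure_probability(*stress, *r1_tight)

if __name__ == "__main__":
    print(f"new product:     {p_new:.2e}")
    print(f"aged product:    {p_aged:.2e}")
    print(f"tighter spread:  {p_tight:.2e}")
```

In this sketch the failure probability rises when the resistance distribution drifts toward the stress distribution, and falls when the spread of the resistance is reduced, which mirrors the three conditions listed above.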


REMARK.– We can also improve the realism of the product resistance curve in Figure 2.2 by adding a second “failure mode” to represent the possibility of latent defects in the product. Figures 2.3(a) and 2.3(b) below show the benefit offered by applying Highly Accelerated Environmental Stress Screening (HA-ESS) to the product, which will increase the likelihood of triggering unexpected types of latent defect (conventional burn-in is used to cover normal types of defect). Stress screening consumes some of the life potential of the product (but not too much), causing the resistance distributions to shift toward the left, but acting more strongly upon the distribution that represents latent defects. This phenomenon is known as the fuse effect. Products with this defect can therefore be identified more easily.


Figure 2.3. a) Classical ESS doesn’t precipitate all the failures of the latent failures distribution (source: [AST 06]); b) HASS is more likely to reveal all the failures of the latent failures distribution


2.1.4. Types of products for which highly accelerated testing is relevant

Even though most recent works and papers seem to only consider applications of highly accelerated testing in the specific cases of electronic or perhaps electromechanical devices, it would be a mistake to consider that highly accelerated tests cannot apply to mechanical devices. Indeed, given that, by definition, highly accelerated tests aim to push a product to its limits by applying increasingly high stresses to explore its operating margins and improve them if necessary, there is no immediate reason to restrict ourselves to the cases of electronics or electromechanical objects.

A few brief historical reminders can further support this view. Nobody would dispute that the concept of “robustness” itself was formalized and used for many decades as a purely mechanical concept, most commonly when considering the margin between the resistance limit of a material or an assembly with respect to a given stress (e.g. tension, torsion, bending, etc.) and the maximum level of this stress applied during operational usage of the material or assembly. Mechanical engineers have long used the concept of “safety margin”, which is simply the ratio between the two average values of S (stress) and R (stress resistance). Thus, and given the statistical properties associated with these parameters, robustness is a function of both the “safety margin” and the statistical spread of the stress and the stress resistance of the material.

Therefore, in a purely mechanical context, highly accelerated tests can be conducted in static mechanics by subjecting a material or an assembly to a static stress of increasing strength until breakage occurs. In dynamics, highly accelerated tests can be performed by subjecting the material or assembly to repetitive stress cycles (e.g. tension/compression cycles, repeated shocks, etc.) in order to generate cumulative damage leading up to breakage.
In the latter case, the severity can be increased using different criteria: the magnitude of the stresses (typically used in simple cases where classical principles of fatigue-related damage can be effectively applied), the frequency and duration of the stresses, etc. In light of this, we can say without exaggerating in the slightest that the concept of highly accelerated testing has long been understood and applied in mechanics (the testing of aircraft wings until breakage during development gives one concrete example).

The concept of “robustness” is however much more recent in electronics. It was initially approached by way of derating coefficients, which are applied to families of electronic components. Thus, although the majority of the literature on highly accelerated testing over the past two decades focuses much more on applications to electronics or electromechanics than on mechanical applications (e.g. HALT, STRIFE, etc.), this should be seen as the result of the delay in introducing the concept of “robustness” in electronics, whereas this concept has long been familiar in mechanical engineering.

In conclusion, highly accelerated tests during design/development are applicable to every category of materials, provided that the types of stress used (mechanical, climatic, electrical, etc.) are the most relevant ones with respect to the expected modes of failure of these materials.

Figure 2.4. (source: [INS 01]). Axes: operational failure rate versus usage period of the product. Curve labels: with ESS; without ESS; conventional approach; with margin improvement procedure; useful lifetime.

Figure 2.5. (source: [INS 01]). Diagram labels: i-th stress; j-th stress; specified domain; margin potential; technological resistance limit; assignable defect (to be identified and resolved); stresses gradually increased. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Concept | Objective/characteristics | Justification of concepts and era in which they appeared
Burn-in | Accelerate chemical reactions. Eliminate defects that cause early failures. | Predominant role of the reliability of components in the product reliability (1960s/1970s).
Thermal cycling | Eliminate defects that cause early failures in components. Eliminate some types of failure during “useful life”. Accelerate chemical reactions. | Component packaging issues (1980s).
STRIFE | Rapidly improve the robustness of products. Achieve product maturation before it begins production. | Increased need for product robustness (1970s/1980s).
HALT | Rapidly improve the robustness of products. Reduce market introduction turn-around times. Achieve product maturation before it begins production. Reduce warranty costs. Determine destruction limits for HASS specifications. | Increased need for robustness. Need for better and faster development according to a formalized procedure (1980s).
HASS | Use high stress levels (sized using HALT) to rapidly reveal early defects. Manage the defect detection process. | Conventional burn-in insufficiently effective. Need to reduce testing times (1980s).
HASA | Same as HASS + statistical control (by Hewlett-Packard) on samples drawn from production. | Ensure absence of quality drift (1980s/1990s).

Table 2.1. Summary of different concepts (history, objectives and/or characteristics) (source: [INS 01])


Figure 2.6. Flowchart of a highly accelerated test (source: [INS 01]). Flowchart elements: choice of stress; starting level; increase the level; “Failure?”; “Sufficient margin?”; failure analysis; “Technological limit?”; “Resolution possible?”; apply resolution or repair; end of testing; end of testing (*). (*) Decision taken on the basis of the margin revealed by the test.
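The decision logic of the flowchart in Figure 2.6 can be sketched as a simple loop. The sketch below is one possible reading of the flowchart, not a reproduction of [INS 01]: the function name, the representation of weaknesses as (failure level, fixable) pairs and the numerical values are all hypothetical, and the "technological limit" is modeled simply as a stress level.

```python
def highly_accelerated_test(start, step, target, tech_limit, weaknesses):
    """Step-stress loop, as one possible reading of the flowchart.

    weaknesses: list of (failure_level, fixable) pairs (hypothetical model).
    Returns (outcome, final_level, levels_at_which_fixes_were_applied).
    """
    remaining = sorted(weaknesses)
    level = start
    fixes = []
    while True:
        failing = [w for w in remaining if w[0] <= level]
        if failing:                       # "Failure?" -> yes: failure analysis
            weakness = failing[0]
            if level >= tech_limit:       # "Technological limit?" -> yes
                return ("technological limit reached", level, fixes)
            if not weakness[1]:           # "Resolution possible?" -> no
                return ("no resolution possible", level, fixes)
            remaining.remove(weakness)    # apply resolution or repair
            fixes.append(level)
            continue                      # retest at the same level
        if level >= target:               # "Sufficient margin?" -> yes
            return ("sufficient margin", level, fixes)
        level += step                     # increase the level

# Hypothetical example: two fixable weaknesses below the target level.
result = highly_accelerated_test(20, 10, 80, 100, [(50, True), (70, True)])
```

In the two outcomes marked (*) in the figure, the level reached when the loop stops is the margin on which the final decision would be based.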

2.1.5. Example in the aerospace sector

2.1.5.1. Effect of aging on the safe operating area

Suppose that an aerospace supplier wishes to increase the robustness of an on-board airliner device that converts energy between 115 V AC, 230 V AC, 270 V DC and 540 V DC, with currents ranging from 15 to 600 A. The environmental conditions are severe in terms of the temperature (constant and cycles), air pressure, humidity and vibrations. The durability requirements are high:
– 25–35 years of service lifetime;
– 80,000 flight cycles;
– 100,000 h of flight.

The operational safety experts must address the following tasks:
– evaluate the robustness of the system with respect to the combined stresses (temperature/humidity/voltage/pressure);
– make the fault trees of the converters reliable;
– estimate the reliability of the converter:
  - choice of technologies;
  - justification of margins;
  - definition of tests for estimating the reliability.

Figure 2.7. Mastering the reliability of aerospace equipment (source: Safran). Steps shown in the diagram: functional study (definition of the functional domain, as well as the stresses that limit the lifetimes of the components); physics of failure (execution of the failure mode study (FMECA) to characterize their influencing factors); definition of testing conditions (evaluation of the robustness of the component, determination of the boundary between the functional domain and the failure domain); validation of hypotheses (physical analysis of parts after failure).

The test program consists of three phases:
– one phase to characterize the robustness in terms of the influencing factors: voltage, current, temperature, pressure and humidity;
– one aging phase for test vehicles (TV) in thermal storage, passive thermal cycling and active thermal cycling;
– one phase to recharacterize the robustness in terms of the influencing factors (voltage, current and temperature) after allowing the aging phase to act.

The phase 1 tests are presented in Table 2.2. These tests were performed on “test vehicles”, which are simplified but representative versions of actual equipment.

Table 2.2. Tests dedicated to characterization of robustness influencing factors

The second phase characterizes the robustness of the TV as a function of six types of aging. After aging, the robustness of the TV is measured to evaluate the impact of this aging.

Table 2.3. Characterization of the robustness of the test vehicle after application of different types of aging

The hypotheses shall be validated based on:
– failure analysis to assess the electronic signature of failures (using an oscilloscope, etc.);
– physical failure analysis, especially when the origin of the failure appears to be related to an assembly/non-semiconductor material failure.

2.1.6. Types of defects triggered by HALT tests

Below, we give a list of a few examples of failures that were triggered by HALT tests. These tests were performed on multiple different electronic, mechatronic and micromechanical products.

2.1.6.1. Example 1: interface card

The first example of equipment under test (EUT) is an interface card.


The EUT is fixed on a few spacers (tightened to 1 N·m), which are themselves fixed on an array of aluminum bars; these bars are fixed to the table and tightened to 15 N·m. An air duct is used to apply a slight airflow directly to the EUT.

Figure 2.8. Overview of the card in testing conditions. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 2.9. Relation between the RMS of the acceleration on the table and the response of the EUT. Vibration stress test (regulation by air thermocouple, 5 kHz, test of July 10, 2013). Axes: vibration of the table (gRMS) versus vibration of the EUT (gRMS). Curves: accel1 Ch1-Y; accel1 Ch2-X; accel1 Ch3-Z; Accel 3-Z-interface connector; Accel 4-Z-PCB. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Time | Temperature | gRms | Test | Note
10:02:00 | 20 | 0 | OK | NTR
10:03:00 | 20 | 5 | OK | NTR
10:20:21 | 20 | 10 | OK | NTR
10:30:00 | 20 | 15 | OK | NTR
10:35:00 | 20 | 0 | Not OK | Loss of transformer
10:47:30 | 20 | 5 | Not OK | Loss of transformer

Table 2.4. Incident record table

Figure 2.10. Acceleration spectral density (ASD) at some points of the EUT. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Conclusion of example 1 The test was stopped after breakage of the interface card: tracks broken and transformer disconnected.


2.1.6.2. Example 2: electronic circuit “gauge controller”

Figure 2.11. Evolution of the RMS values of the vibration and temperature. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

The following combined test was applied:
– initial temperature: 20 °C;
– the thermal profile ranged from +80 °C to −45 °C with transitions of 60 °C/min, with dwell times of 15 min at +80 °C and 10 min at −45 °C;
– this thermal profile was applied in parallel to vibrations with RMS values of 0, 5, 20, 30, 34 and 45 gRms, with durations of 10 min per application.

Four vibration cycles were performed.
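The timing implied by the profile above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the order of the segments within one thermal cycle (ramp down, cold dwell, ramp up, hot dwell) and the first ramp from 20 °C to +80 °C are assumptions, since the source only gives the extreme temperatures, the transition rate and the dwell times.

```python
def ramp_minutes(t_from, t_to, rate_c_per_min=60.0):
    """Duration of a temperature transition at the stated rate."""
    return abs(t_to - t_from) / rate_c_per_min

# Values taken from the test description.
START_C, HOT_C, COLD_C = 20.0, 80.0, -45.0
DWELL_HOT_MIN, DWELL_COLD_MIN = 15.0, 10.0
VIBRATION_LEVELS_GRMS = [0, 5, 20, 30, 34, 45]
MINUTES_PER_LEVEL = 10.0

# Assumed first transition: from the initial temperature to the hot dwell.
initial_ramp_min = ramp_minutes(START_C, HOT_C)

# Assumed segment order for one full thermal cycle.
thermal_cycle_min = (ramp_minutes(HOT_C, COLD_C) + DWELL_COLD_MIN
                     + ramp_minutes(COLD_C, HOT_C) + DWELL_HOT_MIN)

# One pass through all vibration levels at 10 min per level.
vibration_sequence_min = MINUTES_PER_LEVEL * len(VIBRATION_LEVELS_GRMS)
```

With these assumptions, each 125 °C transition takes a little over 2 min, so one thermal cycle lasts about 29 min, while one pass through the six vibration levels lasts 60 min.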

Figure 2.12. EUT (Equipment Under Test) For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Conclusion of example 2 A capacitor detached after breakage of a connection and a solder.


2.1.6.3. Example 3: EUT embedded rangefinder.

Figure 2.13. Overview during HALT test. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

– Graph of applied temperature:

Figure 2.14. Evolution of the applied temperature. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


Table 2.5. Incident record table (NTR: nothing to report)

Figure 2.15. Graph of the vibration spectrum at various points of the EUT. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Conclusion of example 3 Unit stopped functioning, slim card: out; breakage of capacitor connections, fuse came loose from its casing.


These three examples illustrate the defects observed after performing HALT tests. The question that we must ask after a failure is triggered in a HALT-style test is the following: is the fault assignable?

Figure 2.16. Photos of the EUT. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


Figure 2.17. Findings at the end of testing. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Since there is no widely established definition of this term, we will use the following definition in this book: “Assignable defect: Defect or flaw inherent to the design of the product or its manufacturing processes whose correction, which is assumed to be economically feasible, takes the form of an improvement of the operating and/or destruction margins of the product (e.g. component leads too long, loose screws, unsuitable fixtures, poorly calibrated components, etc.). Any such defect, which is always the result of a deviation from the state of the art, is by definition not due to the resistance limits imposed by the technologies used in the product”. The first thought that comes to mind when examining the failures observed in the three examples presented above is that the deviation from the state of the art is not obvious in any of them. If it were, the decision to correct would be de facto immediate. One example that illustrates this situation is the following: suppose that, at the end of the development of a new missile program, test runs reveal errors in the launch timing. The project manager responds by calling a meeting of all persons responsible for the missile components one Friday afternoon to devise a “highly accelerated” test program over the course of the


weekend with the aim of pinpointing the flaw(s) in the components. After this meeting, the engineer responsible for the specifications of environmental tests increases the severity of the qualification tests (already very severe) by factors of up to two. These tests are half-sine impact tests applied to electronic devices: it is observed that the casing contains accumulated static stresses that are released under impact, causing the surface of one of the sides of the casing to snap, as when the bottom of a tin is pressed. The new equilibrium position of the casing created an electrical short circuit with the leads of some components, which were slightly too long. This reveals two deviations relative to the state of the art:
– a static stress stored in the sides of the casing;
– component leads that are slightly too long.

After these tests, which overall had only revealed a few small deviations such as those described above, the executive responsible for this company concluded: we are confident in the robustness of our components. We simply have a residual design problem, which we shall promptly resolve. So, they did.

Let us now return to our three examples: given that the defects do not reveal an obvious deviation from the state of the art, we need to perform more extensive analysis. Since this requires some resources, we will need to balance the cost of this analysis against the cost of correcting the defect. One possible such analysis is presented in the Appendix “Comparison of HALT and ALT testing”. In the specific example considered in the appendix, the defect triggered by the HALT test was found to be non-assignable.

2.1.6.4. Types of defects found by HALT tests

Typical defects triggered by HALT:
– poor-quality welds/solders;
– failures of connecting components (sockets, connectors, etc.);
– component failures;
– flexion (due to vibration) of component wires;
– unsuitable components;
– components that are poorly placed on the PCB;
– problems with manufacturing tolerances;
– detection of changes within the components during production;
– detection of changes during manufacturing processes.

The affected components are typically:
– connectors;
– sandwich-mounted circuits;
– ball bearings;
– ruby settings;
– hydraulic hoses;
– screws;
– terminals;
– coils;
– ferrite cores;
– integrated circuit pads;
– multilayer printed circuits;
– transformers;
– capacitors;
– cooling radiators;
– ungrouped or unfixed bulky components.

Other examples of defects found by HALT are presented below. The bonds of chips in integrated circuits have a resonant frequency of between 8 and 10 kHz. The repetitive shocks (RS) table excites at frequencies of up to 30 kHz, and so can reveal chip bonding defects, which is not possible with an electrodynamic vibration generator. Surface-mounted components such as resistors and capacitors have resonant frequencies between 5 and 10 kHz, which is also within the range covered by the RS table.

Figure 2.18. Effectiveness of stresses for triggering defects. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

2.1.7. Analysis of tests by a HALT machine with pneumatic hammers: special features and inherent precautions

Two test campaigns with a Qualmark HALT machine (10 pneumatic hammers) were performed by a HALT service provider. The analysis presented below is based on the measurements that were taken.

Figure 2.19. Overview during the second campaign of tests; the screws represent the four points of entry of the equipment from the first campaign of tests. Labels in the figure: measurement points 1 to 5 and axes X, Y, Z. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

The most important findings can be summarized as follows:

1) The shock response spectrum along the Z-axis in the range of 100–2,000 Hz is 10 times stronger, up to 10 kHz, than along the other two transverse axes, as shown in Figure 2.21. This simplifies any subsequent modeling work, since the excitation is essentially unidirectional. Also, the SRS along Z at 45 gRMS reaches 1,000 g at 1,000 Hz and 150 g at 100 Hz, which is very severe. The same is true in the range of up to 2 kHz. If a mechanical system resonates in this range (almost guaranteed to happen), we might begin to approach the elastic limit. Having said that, the test medium was not loaded during these tests, and this level will likely decrease in the presence of load.

2) The excitation generated by the repeated impacts is strongest between 10 and 30 kHz (as shown in Figure 2.21), which is not surprising for metal-on-metal impacts; however, the levels are not likely to cause structural damage, except in special cases: for example, if the wire of a hybrid component with a diameter of around 10 µm has a resonant frequency between 10 and 30 kHz, it could conceivably break due to mechanical fatigue as a result of HALT or HASS stimuli.

3) The coherence function is very low between the reference point along Z and each of the measurement points P1 to P4; this is less true for P5, for good reason, since it is located opposite the reference point (placed on the table but below and opposite the point P5). This means that the responses at P1 to P4 are not totally related to the vibration at the reference point, which is intuitively easy to understand: the vibrations at P1 to P4 come from all 10 excitations and the individual paths followed by each of them (see, for example, the coherence function between 3Z and the fixed reference point in Figure 2.23). In other words, the simplified notion of transfer function (which assumes an input and an output that is linearly related to this input) is no longer applicable in this context, since there is no causal relation between an input and an output. Recommendation for a possible model: directly record the measured and filtered time signals at the four corners of the equipment.
The more the acceleration distributions deviate from the normal distribution, the more this is justified: the kurtosis is very high and can change over the course of a given application (see Table 2.6). The crest factor itself is very high, with values exceeding 15: this means that for a given RMS value (in the present context) of 40 gRMS, the crest value can reach 600 g, compared to 120 g for vibrations following a random normal distribution with the same RMS.

4) As we can see in Figure 2.24, the normal probability plot that we obtained deviates substantially from the ideal line; if the curve were a straight line, it would be possible to assume that the distribution of instantaneous values is normal, but we are very far away from this. In these conditions, we must take the necessary precautions when talking about the RMS; namely, we should not be too quick to compare it to the RMS of a normally distributed random variable. Indeed, the slope of the normal probability plot gives the RMS value and, in the case of a normal distribution, the slope is constant. Here, we can see that the slope constantly changes, and it would be an error to assume that there is a unique RMS value (equivalently, a unique value of this slope).

Figure 2.20. Readings showing the evolution over time at different measurement points. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

We observe that there is consistently a large deviation from this straight line, which shows that we deviate strongly from the case of a normal distribution. The shape of the curve shows that the tail ends of the distribution are extremely large, too large for a normal distribution. This is not particularly surprising given the time characteristics of the signals, which consist of a sequence of transients, introducing the tendency to lead to extreme values in the distribution. This degrades our ability to characterize the damage in the case of extreme responses (image of maximum stress) and damage due to fatigue if we wished to describe the damage based on the Acceleration Spectral Density (ASD). Furthermore, the signals are often non-stationary, which prevents us from characterizing an ASD. The errors that would be introduced by doing so anyway would be extremely high: in one example, the effect of damage calculation based on an ASD (called a spectral approach) was more than 200% for the maximum stress and a deviation from several hundred to over 1,000 for the fatigue compared to the most suitable approach in such a situation, which is the time-based approach.

5) The sensitivity of the vibration level to temperature does not seem to be very high, as shown by Figure 2.25.

In summary, the RS pneumatic hammer testing medium is presented by its advocates as being extremely simple (in principle: hammers hitting a table creating stimuli in the equipment), but analyzing the vibration data that it generates is highly complex.

Table 2.6. Kurtosis (flattening) and crest factor
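The two indicators discussed in finding 3 can be computed directly from a sampled time signal. The sketch below is illustrative only: it compares a synthetic Gaussian signal with a hypothetical train of decaying transients standing in for a repetitive-shock response. Neither signal comes from the measurements described here, and the crest factor of 3 used for the Gaussian case is the usual 3-sigma convention, which is consistent with the 120 g quoted above but is an assumption.

```python
import math
import random

def rms(x):
    """Root mean square of a sampled signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def crest_factor(x):
    """Ratio of the largest absolute value to the RMS value."""
    return max(abs(v) for v in x) / rms(x)

def kurtosis(x):
    """Fourth standardized moment (equal to 3 for a normal distribution)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

def peak_value(rms_level, crest):
    """Crest (peak) value implied by an RMS level and a crest factor."""
    return crest * rms_level

rng = random.Random(0)  # fixed seed so that the example is repeatable
n = 20000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical impulsive signal: one decaying transient every 200 samples,
# plus a small amount of background noise.
impulsive = [math.exp(-(i % 200) / 5.0) * math.cos(1.3 * (i % 200))
             + 0.01 * rng.gauss(0.0, 1.0) for i in range(n)]
```

Here `peak_value(40.0, 15.0)` reproduces the 600 g quoted in the text for a crest factor of 15 at 40 gRMS, and `peak_value(40.0, 3.0)` gives the 120 g of the Gaussian case under the 3-sigma assumption.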

Figure 2.21. Comparison of extreme response spectra in all 3 directions: Z is predominant. Curves: shock spectra of the 20deg-45gRMS envelopes along X, Y and Z. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 2.22. Comparison of extreme response spectra in all three directions: Z is predominant. Curves: shock spectra of the 20deg-45gRMS envelopes along X, Y and Z. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 2.23. Coherence function between a central point on the table and point 3 of the table (along Z). Plotted quantity: coherence between point 3Z and the reference point (20deg-45gRMS_ch16_Reference_+Z). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 2.24. Normal probability plot of the acceleration measured at P4Z (measurement point: 20deg-45gRMS_ch12_point4_+Z; max value: 18.2738). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 2.25. Influence of temperature: 20, 50 and 150 °C. Curves: shock response spectra M50deg-35gRMS_ch3_point1_+Z, 20deg-35gRMS_ch3_point1_+Z and P150deg-35gRMS_ch3_point1_+Z. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

2.2. Comparison of HALT versus ALT testing by fatigue

The last few sections attempted to explain the essence of highly accelerated testing. Chapter 1 discussed accelerated life testing. The reader will no doubt have understood that the nature and objectives of these two types of testing are clearly distinct. Thus, it seems problematic to draw parallels between them. However, designers are rapidly faced with a dilemma. How should the relevance of a failure revealed by a highly accelerated test be interpreted? In other words, if a failure appears in a HALT test, what is the risk that the defect will occur over the actual lifetime of the product? Do we need to modify the design, the manufacturing process, the usage conditions, etc.?

Design modifications or changing the supplier have significant technical and business ramifications, and so the authors wish to propose a methodology to support the reader’s decision making. This methodology is based on a damage equivalence approach. Even though this approach might at first glance seem rough or simplistic, it is in fact an innovative method that will provide designers with an indication of the risk, and assist them in making an informed decision.

The method is based on four analysis stages:
1) Calculation of the fatigue damage spectrum (FDS) of vibration tests by repetitive shock stimuli, also known as HALT [AST 06].
2) Calculation of the FDS of vibration tests that are representative of the real environment experienced by the product.
3) Comparison of these two FDSs, focusing on the frequency band of the defect triggered by the HALT test.
4) Application of a fatigue criterion to determine the criticality of the failure observed in the HALT test.

We will begin by defining the notion of FDS, and then illustrate our method in the example case of a high-power automobile system.

2.2.1. The fatigue damage spectrum

The guidelines for applying the customization approach in mechanical environments [AST 09], established by the Association of Sciences and Environmental Technology (ASTE) in July 2009, give an in-depth explanation of the method and relevant tools. The fatigue damage equivalence method was developed and implemented in France in the 1970s. Originally developed at CEA, CESTA (Atomic Energy Commission, Center for Scientific and Technical Studies Aquitaine), it then spread to many other institutions.

The method involves finding the characteristics of the vibration that produces the largest observed stress on a linear system with one degree of freedom throughout the full period over which the environmental stress is applied, as well as the resulting fatigue damage. Thus, in this case, the equivalence between real/specification environments is based on the two main damage mechanisms of mechanical systems:
– exceeding a stress limit (elasticity limit, breaking limit);
– fatigue damage created by the accumulation of stress cycles over the full period of application of the environmental stress.

At the specification drafting or comparison phase, it is unlikely that the dynamic characteristics of the material are known, or that calculating these parameters is possible. The comparison is therefore conducted not on the basis of the actual structure, but rather on a simple mechanical model, a linear system with one degree of freedom (Figure 2.26), whose natural frequency f0 is varied over a sufficiently large domain to cover all resonant frequencies of the future structure. This simplified system does not attempt to represent the actual structure, although as a rough approximation it often gives an initial idea of the responses of the structure.

Figure 2.26. Mechanical model of a system with one degree of freedom

It is simply a reference system that allows the effects of several environments on a relatively simple system to be compared against mechanical damage criteria. We then assume that any two vibrations that produce the same effect on this “benchmark” will have the same severity on the actual structure in question, which in general is not limited to a single degree of freedom, and may not be linear. Various studies have shown that this hypothesis is not unrealistic ([LAL 02] for shocks, or for fatigue resistance in the case of vibrations [DEW 86]).

The calculation of the FDS makes the following assumptions:
– the Wöhler curve follows Basquin’s law, which gives an analytical representation of the linear part of the Wöhler curve. This curve relates the number of cycles N required to break a given material test specimen to the amplitude σ of the sinusoidal stress applied to this material:

$N \sigma^b = C$   [2.1]

where b and C are characteristic constants of the material;
– the damage follows the definition given by Miner;
– the damage is linearly cumulative (Miner’s rule).

We define the “fatigue damage spectrum” as the curve representing the variation in the damage D as a function of f0 for fixed ξ and b (b = inverse of the slope of the Wöhler curve) [BON 77, LAL 84]. Let $\ddot{x}(t)$ be a vibration defined by an acceleration as a function of time applied to a linear system with one degree of freedom (f0, Q: quality factor) for a duration T. We assume that:

1) The material from which the system is made has a Wöhler curve that can be described analytically using a Basquin-type law (Figure 2.27).

Figure 2.27. Representation of a Basquin fatigue law

2) The stress-strain relationship is linear, of the form

$\sigma = K z$   [2.2]

3) Miner’s law may be applied (linearly cumulative damage). By definition:

$D = \sum_i \frac{n_i}{N_i} = \frac{K^b}{C} \sum_i n_i z_i^b$   [2.3]
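Equation [2.3] follows from substituting [2.2] into [2.1]: for a cycle of amplitude $z_i$, the stress is $K z_i$, so that $N_i = C / (K z_i)^b$. The minimal sketch below checks this equivalence numerically. The function names and all numerical values are hypothetical and chosen only so that the result is easy to verify by hand.

```python
def cycles_to_failure(sigma, b, C):
    """Basquin's law [2.1]: N * sigma**b = C, solved for N."""
    return C / sigma ** b

def miner_damage(histogram, K, b, C):
    """Miner's sum D = sum(n_i / N_i), with sigma_i = K * z_i from [2.2].

    histogram: list of (n_i, z_i) pairs, i.e. number of cycles counted at
    each relative displacement amplitude.
    """
    return sum(n / cycles_to_failure(K * z, b, C) for n, z in histogram)

def miner_damage_closed_form(histogram, K, b, C):
    """Right-hand side of [2.3]: (K**b / C) * sum(n_i * z_i**b)."""
    return (K ** b / C) * sum(n * z ** b for n, z in histogram)

# Hypothetical values: K = 2 and z = 1.5 give sigma = 3; with b = 2 and
# C = 900, Basquin's law gives N = 100 cycles to failure.
single_level = [(100, 1.5)]
mixed_levels = [(50, 1.5), (10, 3.0)]
```

With these values, applying 100 cycles at the single level consumes the whole fatigue life (D = 1), and both forms of the sum give the same damage for the mixed histogram.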

Highly Accelerated Testing

91

where ni and zi are given by the histogram of the peaks of the relative displacement response z(t). Cases where the load history (order in which the high stresses and low stresses are applied) affects the outcome are not considered in this book. The histogram of peaks of the responses z(t) can be determined by counting with the rainflow method. This method allows us to identify the domains where the relative displacement varies, their average values and thus the amplitude and average of each cycle. Figure 2.28 presents the method for calculating a FDS [BON 15] by applying rainflow counting to a time signal. 0.015

[Figure 2.28 flowchart: excitation signal recorded by the equipment carrier; mechanical models with 1 degree of freedom with natural frequencies f01, f02, ..., f0n; model responses; counting of stress cycles; cumulative damage (Miner's law) giving D1, D2, ..., Dn; fatigue damage spectrum, plotted as damage D against natural frequency f0]

Figure 2.28. Rainflow method to calculate the FDS [BOI 00]
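The chain of Figure 2.28 can be sketched in a few functions. This is a simplified illustration under stated assumptions, not the implementation used by the authors: the one-degree-of-freedom response is integrated with a basic semi-implicit Euler step (dedicated FDS tools generally use more accurate recursive filters), the cycle counting follows the three-point rainflow procedure of ASTM E1049, the mean of each cycle is ignored, and the excitation is a hypothetical sine wave.

```python
import math


def sdof_response(accel, dt, f0, Q):
    """Relative displacement z(t) of a one-degree-of-freedom system.

    Integrates z'' + 2*xi*w0*z' + w0**2 * z = -accel(t) with a
    semi-implicit Euler step; dt must be much smaller than 1 / f0.
    """
    w0 = 2.0 * math.pi * f0
    xi = 1.0 / (2.0 * Q)
    z = v = 0.0
    response = []
    for a in accel:
        v += dt * (-a - 2.0 * xi * w0 * v - w0 * w0 * z)
        z += dt * v
        response.append(z)
    return response


def reversals(series):
    """Keep the first point, the last point and every local extremum."""
    pts = [series[0]]
    for x in series[1:]:
        if x == pts[-1]:
            continue                      # skip flat segments
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (x - pts[-1]) > 0:
            pts[-1] = x                   # same direction: extend the run
        else:
            pts.append(x)
    return pts


def rainflow(series):
    """Return (range, count) pairs; count is 0.5 or 1.0 cycle."""
    stack, cycles = [], []
    for point in reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])
            y = abs(stack[-2] - stack[-3])
            if x < y:
                break
            if len(stack) == 3:
                cycles.append((y, 0.5))   # range contains the start point
                stack.pop(0)
            else:
                cycles.append((y, 1.0))   # closed cycle
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    for p, q in zip(stack, stack[1:]):
        cycles.append((abs(q - p), 0.5))  # leftover half cycles
    return cycles


def fatigue_damage_spectrum(accel, dt, freqs, Q, K, b, C):
    """Miner damage at each natural frequency (amplitude = range / 2)."""
    spectrum = []
    for f0 in freqs:
        z = sdof_response(accel, dt, f0, Q)
        damage = sum(n * (K * rng / 2.0) ** b for rng, n in rainflow(z)) / C
        spectrum.append(damage)
    return spectrum


# Hypothetical input: 0.5 s of a 50 Hz sine at 10 m/s^2, sampled at 10 kHz.
dt = 1.0e-4
accel = [10.0 * math.sin(2.0 * math.pi * 50.0 * k * dt) for k in range(5000)]
fds = fatigue_damage_spectrum(accel, dt, [20.0, 50.0, 100.0],
                              Q=10, K=2e10, b=10, C=1e70)
```

With this input the damage is largest for the model tuned to 50 Hz, the excitation frequency. Because the models are linear, doubling the input amplitude multiplies every point of the spectrum by 2^b.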


The idea is to calculate the time responses of the N mechanical models with one degree of freedom, which have natural frequencies ranging from f01 to f0N. The stress domains are counted using the rainflow method at each natural frequency. The sum of the damage associated with each domain yields the damage Di at the natural frequency f0i. The set of N values of Di gives the FDS. The average damage experienced by the system with natural frequency f0 is therefore given by [LAL 02]:

[2.4]

We will illustrate the methodology by considering a case study of a converter/inverter system for a high-power automotive application.

2.2.2. Automobile case study: breakage of a converter/inverter

2.2.2.1. Description of test

To study the reliability of the system, a highly accelerated test was performed in the form of a HALT test (Figure 2.29).

Figure 2.29. HALT test of an automobile converter/inverter. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


As illustrated in Figure 2.30, this test involves applying a vibration stimulus in the form of repetitive impacts from several pneumatic cylinders on a table upon which the equipment is placed.

Figure 2.30. Pneumatic cylinders of a Qualmark HALT holding system. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

The overall excitation level is gradually increased from 5 to 40 gRms, as shown in Figure 2.31.

Figure 2.31. Control program for the HALT test. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


The standard HALT procedure is described in the European document [CEE 09] and recommends a vibrational stimulus of 1–10 gRms in the frequency band of 2 to 2,000 Hz or more. In our case, the equipment is tested until the first instance of failure is observed. A control voltage/current is also applied to the EUT, as we can see in Figure 2.32.

Figure 2.32. Control of the bench and the EUT. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

In parallel with the severe vibratory load, and while the EUT is operating, a climatic environment is applied (Figure 2.33), with temperatures ranging from −70 to +140 °C.

Figure 2.33. Temperature range of the test. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Measurements were taken at several points on the circuit by a 3D laser vibrometer, which allowed the critical frequency band of 250–400 Hz (resonant frequency band of the capacitor) to be identified.

Figure 2.34. 3D laser vibrometer taking dynamic measurements of the EUT. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


In this test, a film capacitor (Figure 2.35) broke during the 45 gRms phase of the HALT test.

Figure 2.35. Film capacitor failure at 45 gRms of the HALT test. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

45 gRms (Figure 2.36) is 1,000 times higher than the actual excitation levels on the vehicle.

Figure 2.36. Example of HALT time signal. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

We might judge this failure as a robustness limit that does not pose a risk to the vehicle. Alternatively, the designer might decide that the test reveals a defect that could occur on an actual vehicle. In this case, he or she might decide to modify the design (materials, manufacturing processes, suppliers, etc.) and conduct longer and more expensive vibration reliability tests (Figure 2.37) with accelerated levels that are representative of the actual vehicle in the sense of damage equivalence (e.g. ALT defined by the method of “test customization” [AFN 16]).

Figure 2.37. Vibratory specifications representative of the vehicle: life profile test

The objective in this context is thus to judge the criticality of the failure that occurred during the HALT test.

It might seem tempting to apply a Fourier transform (computed with the fast Fourier transform, FFT) to the signal measured during the HALT test (Figure 2.36) and to superimpose the result on the spectral density shown in Figure 2.37 as a way of evaluating the severity level. However, the HALT signal has a very high kurtosis (statistical flattening coefficient) [DEL 06], which reflects its extremely transient nature. The conditions for applying the FFT (a stationary, ergodic and stochastic signal) are therefore not satisfied. Calculating the FDS is therefore a necessary step.

2.2.2.2. Estimation of the Basquin coefficient

The brazed joints are marked by red circles as shown in Figure 2.35. They have weights of less than 6 g, and are assumed to be made up of SAC305 (tin, 3% silver and 0.5% copper). Figure 2.38 [KIM 06] presents a fatigue study of brazed joints made from SAC305.


Figure 2.38. Fatigue curve of SAC305 brazed joints. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

If we consider a Basquin-type relation of the form $N \sigma^{b} = C$ [2.1], we can read the following points on the graph:

– N1 = $10^{4}$ cycles at σ1 = 4 MPa;

– N2 = $10^{7}$ cycles at σ2 = 2 MPa.

This gives a Basquin coefficient of b = log(N2/N1)/log(σ1/σ2) ≈ 10. We shall now determine the value of the constant C:

$C = 10^{4} \cdot (4 \times 10^{6})^{10} \approx 10^{70}$

What is the coefficient of proportionality in equation [2.2]?

In order to calculate the FDS, we need to know the coefficient of proportionality between the stress and the strain. This coefficient is determined iteratively until a damage value close to 1 is found in the critical frequency band that we identified earlier. In our case, we need to find a value of K that gives a damage value close to 1 in the frequency band 250–400 Hz. This yields the value: $K = 2 \times 10^{10}$ N·m$^{-3}$.
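The two-point fit above can be checked numerically with the short sketch below. The function names are assumptions introduced for illustration. The second function relies only on equation [2.3]: since the damage is proportional to K^b, a trial value of K can be rescaled in one step to reach a target damage, which is equivalent to the iterative search described in the text as long as the linear model holds. The trial values passed to it are hypothetical.

```python
import math


def basquin_from_two_points(N1, s1, N2, s2):
    """Return (b, C) such that N * s**b = C passes through both points."""
    b = math.log(N2 / N1) / math.log(s1 / s2)
    C = N1 * s1 ** b
    return b, C


def rescale_K(K_trial, D_trial, b, D_target=1.0):
    """Rescale a trial K so that the damage moves from D_trial to D_target.

    Valid because damage is proportional to K**b in equation [2.3].
    """
    return K_trial * (D_target / D_trial) ** (1.0 / b)


# Points read from the fatigue curve: stresses in Pa, lives in cycles.
b_exact, C_exact = basquin_from_two_points(1e4, 4e6, 1e7, 2e6)
b = round(b_exact)             # about 9.97, rounded to 10 in the text
C = 1e4 * (4e6) ** b           # about 1.05e70, quoted as 1e70 in the text

# Hypothetical trial: K = 1e10 N/m^3 gave a damage of 1/1024 in the band.
K = rescale_K(1e10, 1.0 / 1024.0, b)
```

The rounding of b from 9.97 to 10 and of C from 1.05 × 10^70 to 10^70 is negligible compared with the orders of magnitude handled in the rest of the study.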


2.2.2.3. Superposition of fatigue damage spectra

The methodology proposed here superimposes the FDS of the HALT test and the FDS of the ALT reliability test.

During the HALT test, the electronic component was dislodged at an acceleration of 45 gRms. The acceleration was measured at a point on the electronic circuit near the component during the test. The measurement was taken at 20 gRms and is assumed to represent the stress input of the capacitor. We will assume that the two severity levels of 20 gRms and 45 gRms are proportional. A coefficient of 2.25 (= 45/20) is applied to the damage spectrum established at 20 gRms in order to obtain the FDS at 45 gRms.

The FDS was calculated with the following parameters: b = 10; $K = 2 \times 10^{10}$ N·m$^{-3}$; $C = 10^{70}$. The quality factor is fixed at Q = 10 (i.e. a damping ratio of ξ = 5%), and the FDS is evaluated from 1 Hz to 10 kHz with 48 points per octave.

Figure 2.39 below presents the result obtained by superimposing the FDSs of the HALT tests at 20 gRms and at 45 gRms (failure point) and the life profile.

Figure 2.39. FDSs of HALT tests at 20 gRms, 45 gRms and life profile. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip
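The calculation settings quoted above can be reproduced with the following sketch. The helper names are assumptions. The last function reads the 20 gRms to 45 gRms coefficient as an amplitude ratio: under the linear-system hypothesis, every response peak scales with the excitation level, so by equation [2.3] the damage scales with that ratio raised to the power b. This reading is an interpretation, since the text does not spell out how the coefficient is applied.

```python
import math


def octave_grid(f_min, f_max, points_per_octave):
    """Natural frequencies spaced evenly on a log2 scale, f_min included."""
    # Small epsilon guards against round-off when f_max is an exact octave.
    steps = math.floor(points_per_octave * math.log2(f_max / f_min) + 1e-9)
    return [f_min * 2.0 ** (k / points_per_octave) for k in range(steps + 1)]


def damping_ratio(Q):
    """Damping ratio xi of a one-degree-of-freedom system: xi = 1 / (2 Q)."""
    return 1.0 / (2.0 * Q)


def scale_damage(D_ref, level_ref, level_new, b):
    """Damage at a new excitation level, assuming proportional signals."""
    return D_ref * (level_new / level_ref) ** b


freqs = octave_grid(1.0, 10_000.0, 48)        # 1 Hz to 10 kHz
xi = damping_ratio(10)                        # Q = 10 gives xi = 0.05
gain = scale_damage(1.0, 20.0, 45.0, b=10)    # 2.25**10, roughly 3.3e3
```

Read this way, a 2.25-fold increase in level raises the damage by more than three orders of magnitude, which illustrates how strongly the FDS depends on b.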


Note that the “life profile” specification corresponds to the FDS of the reliability test that is assumed to be representative of the lifetime of the vehicle, which is 15 years and covers the following two situations: transport phase and usage phase.

The transport phase is assumed to last 50 h, and the usage phase is assumed to last 8,550 h (i.e. 570 h of usage per year). Miner’s summation rule was used to obtain a damage spectrum equivalent to the lifetime of the vehicle.

As shown in Figure 2.40, in the case of our capacitor, which failed at 45 gRms and which has predominantly dynamic behavior in the frequency band 250–400 Hz, the FDS of the HALT test at 45 gRms is $10^{11}$ times more severe than the test representing the vehicle’s life (life profile).

Figure 2.40. Comparison of FDS at 45 gRms and of the life profile. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip
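The Miner summation over the life profile can be sketched as a duration-weighted sum of per-phase damage spectra. The sketch assumes that each phase is stationary, so that its damage grows in proportion to its duration. The damage-per-hour values are hypothetical placeholders; only the durations (50 h of transport, 15 years at 570 h per year of usage) come from the text.

```python
def life_profile_fds(phases):
    """Sum the damage of several phases, frequency by frequency.

    phases: list of (duration_h, damage_per_hour) pairs, where
    damage_per_hour is a list with one value per natural frequency.
    """
    n_freqs = len(phases[0][1])
    total = [0.0] * n_freqs
    for duration_h, rate in phases:
        for i, d in enumerate(rate):
            total[i] += duration_h * d    # Miner: damage adds linearly
    return total


usage_hours = 15 * 570                    # 8,550 h over the vehicle's life

# Hypothetical damage rates at two natural frequencies.
transport = (50, [1.0e-6, 2.0e-6])
usage = (usage_hours, [1.0e-8, 1.0e-8])
life_fds = life_profile_fds([transport, usage])
```

Dividing a HALT spectrum by `life_fds`, frequency by frequency, gives the severity ratio that is compared with the fatigue criterion in the next section.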

2.2.2.4. Fatigue criterion

The damage ratio in the frequency band that we are interested in, which is $10^{11}$, is not physically meaningful unless we choose a criterion of the order of 10 with which to judge the severity.

We shall propose a fixed upper bound as a fatigue criterion in order to derive a first estimate of the risk of failure during the HALT test. This criterion is based on a “stress–resistance” approach (Figure 2.41), which is explored in the references [AST 09, DEL 06, LAL 02, AFN 16, PIE 13].

Figure 2.41. Representation of the stress–resistance approach. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

The idea of this approach is relatively straightforward. First, we assume that the statistical variability of the vibration environment is known, represented by the red curve. Second, we assume that the statistical distribution of the resistance of our product is also known, represented by the blue curve. The surface of intersection of the two probability densities yields the probability that there is a failure. In other words, if a product with a “weak” design (lower values of the blue curve) is placed in a relatively “severe” environment (higher values of the red curve), there is a high chance that a failure will occur.

The two probability densities can have various shapes (exponential, normal, log-normal, Weibull, etc.). In general, we will attempt to express the statistical parameters of these distributions (mean and standard deviation) in terms of the “coefficient of variation” (CV). The CV is a dimensionless value describing the ratio of the standard deviation to the mean value. Thus, every distribution has its own CV. For the resistance distribution, we shall write it as CVr (blue curve), and for the environmental distribution, we shall denote it as CVe (red curve).

Bearing in mind this concept of “stress–resistance”, we propose the following fatigue criterion:

Criterion = $(\text{Env} \times \text{Prod})^{b}$   [2.5]

where:

– Env: value of CVe;

– Prod: value of CVr;

– b: Basquin coefficient.
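Criterion [2.5] and the decision rule built on it in this section can be expressed in a few lines. The function names are assumptions. The numerical inputs are the upper-bound values proposed by the authors (CVe = CVr = 2, b = 10) and the HALT-to-life-profile damage ratio of 10^11 read from Figure 2.40.

```python
def fatigue_criterion(cv_env, cv_prod, b):
    """Criterion = (Env x Prod)**b, equation [2.5]."""
    return (cv_env * cv_prod) ** b


def failure_is_critical(fds_halt, fds_life, criterion):
    """A HALT failure is a probable field risk if the ratio is below the criterion."""
    return fds_halt / fds_life < criterion


criterion = fatigue_criterion(2, 2, 10)                 # 4**10 = 1_048_576
critical = failure_is_critical(1e11, 1.0, criterion)    # ratio of 1e11
```

Here the ratio exceeds the criterion by about five orders of magnitude, so the failure is classed as a robustness limit rather than a field risk.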


To obtain an upper bound for the fatigue criterion, the authors propose the following values, which are considered to be all-encompassing, covering virtually every possible situation:

– Env: value of CVe = 2;

– Prod: value of CVr = 2;

– b: Basquin coefficient = 10.

If we have more precise knowledge of CVr or CVe, we can adjust the criterion accordingly. In our application, the value of the criterion is therefore:

Criterion = $(\text{Env} \times \text{Prod})^{b} = (2 \times 2)^{10} = 4^{10} \approx 10^{6}$

This criterion can be used for any other application (electronics, mechanics, mechatronics) in order to judge the criticality of a failure revealed by a HALT test. In other words, we should consider a failure (caused by the accumulation of fatigue damage) triggered during a HALT test to represent a probable risk to the vehicle (in its actual life) if the ratio of the HALT FDS and the life profile FDS is lower than $(\text{Env} \times \text{Prod})^{b} = (2 \times 2)^{10} = 4^{10}$.

As we can see in Figure 2.40, the ratio is

$FDS_{HALT} / FDS_{Life\ profile} = 10^{11}$

This ratio is significantly higher than our fatigue criterion of $4^{10}$. This would lead the designer to decide not to initiate design changes, judging the triggered failure as corresponding to a robustness limit that is not critical to actual vehicle usage.

2.2.2.5. Conclusion

This method offers an innovative and accessible way for any company to evaluate the criticality of a failure triggered by a highly accelerated test, enabling designers to make the best possible technical and economic decisions. The approach is powerful by virtue of its simplicity, and can always be adjusted as needed. However, in the words of the French philosopher Paul Valéry [VAL 60]: “What is simple is false and what is not is useless”.


2.3. Comparison of accelerated life tests and highly accelerated tests

Highly accelerated testing and accelerated life testing should not be confused. The objective of accelerated life testing is to predict the lifetime of a product in its operational usage conditions by subjecting it to stresses that are more severe than the values that it is expected to experience over the course of its life profile. Any such tests are necessarily based on analytical models for accelerating the failure modes of the product, which is not the case for highly accelerated testing.

Theory

Accelerated life tests: A product is subjected to amplified functional and/or environmental stresses in order to accelerate failure mechanisms and reduce the time required to estimate certain behavioral characteristics of the product under normal usage conditions.

Highly accelerated tests:
– Apply incremental levels of environmental, mechanical, electrical stress, increasing until a failure occurs
– Analyze the failure (is it assignable?)
– Undertake the necessary corrective measures (design, technology, or usage conditions)
– Resume stimuli at the last point and continue by increments

Prerequisites

Accelerated life tests: Knowledge of the analytical model connecting deterioration to the amplitude of applied stresses.

Highly accelerated tests:
– The applied levels are higher than those of the life profile
– The analysis must be able to show whether the failure is the consequence of a latent flaw (which should then be corrected) or whether the technological limit has been reached (end of testing)

Expected results

Accelerated life tests: Estimation of the behavioral characteristics of a product (reliability, lifetime, etc.) in normal usage conditions; this should be achievable within a time frame compatible with the time constraints of the product development cycle.

Highly accelerated tests:
– The product will reach the limitations inherent to the technologies that it uses
– Product maturation is achieved before the start of the production cycle
– The operating margins are improved and mastered
– The operational reliability of the product is improved (but these tests do not quantify this)
– The cost of returned units during the warranty period is reduced

Pay attention to

Accelerated life tests: The triggered failure mechanisms should be representative of those involved in normal product usage. Any potential interaction phenomena between two or more influencing factors must be taken into account.

Highly accelerated tests:
– The applied stresses must be less than the technological limits of the components of the product
– These tests must be performed before qualification of the design and manufacturing processes


2.4. Standards

At the international level, the IEC 62506 standard discusses highly accelerated tests of type A, which aim to identify potential design flaws as well as any flaws in the manufacturing processes, potentially inducing these flaws at much higher levels than the operational limits. The objective of this type of testing is not to quantify the reliability of the product, but rather to induce or accelerate global performance issues with the product during testing that are likely to occur in the field during the useful life of the product, leading to product failure. These failures can be prevented by improving the design of the product or its manufacturing processes. A more robust product is a product that is more reliable in the field even in the presence of extreme or repetitive stresses, as described by the design specifications. The standard also defines the HALT and HASS approaches (see Appendix).

In the United States, the following website: http://www.iest.org/Standards-RPs/Recommended-Practices/IEST-RP-PR001 offers the following document for sale:

– IEST-RP-PR001: Management and Technical Guidelines for the Environmental Stress Screening (ESS) process. This document was last updated in 2016.

Additionally, the following document can be used as a reference:

– MIL-HDBK-781: RELIABILITY TESTING HANDBOOK. A version of this document dating from 1996 may be downloaded from the website: everyspec.com › Library › MIL-HDBK › MIL-HDBK-0700-079.

– MIL-HDBK-781 is currently being revised by Working Group 781 of the Product Reliability Division of IEST (WG-PR781).

For highly accelerated tests, IEST has developed the following recommended practice: “HALT and HASS Recommended Practice: A Stepping Stone for the Industry”.

This Recommended Practice (RP) published by IEST defines and describes highly accelerated tests (HALT) and accelerated stress procedures (HASS). IEST-RP-PR003.1 discusses the underlying philosophy of these tests, generic examples of tests, the differences between standard testing equipment and the equipment used for highly accelerated tests, aspects relative to mounting, alternative approaches, supplementary environments and lessons learned.

In France, the following two documents are relevant:

– RGAéro 0029: this document was developed within the framework of the Standardization Office for Aeronautics and Space (BNAE); title: “Guidelines for defining and conducting highly accelerated tests”. This recommendation has one particularity compared to the IEC 62506 standard: it describes the methodological principles of highly accelerated testing for products with a highly diverse range of technologies, whereas the HALT tests described in IEC 62506 are limited to electronic equipment;

– the “Handbook for burn-in of electronic devices – benefits of the highly accelerated approach” developed by ASTE; this document is the application of recommendation RG0029 to the concept of burn-in.

2.5. Bibliography

[AFN 16] AFNOR, Démonstration de la tenue aux environnements Conception et réalisation des essais en environnement, NFX 50-144-1 to 6, 2016.
[AST 06] ASTE, Guide “HA-ESS guideline”, www.aste.asso.fr, January 2006.
[AST 09] ASTE, Guide d’application de la démarche de personnalisation en environnement mécanique, PRO NORMDEF 0101 Edition, Ministère de la Défense, available at: http://www.aste.asso.fr/fr/pag-488138-Guide-climatique-et-mecanique html, July 2009.
[AST 10] ASTE, Guide d’application de la démarche de personnalisation en environnement-climatique, PR ASTE 01-02, 2010.
[BAG 00] BAGDONAVICIUS V., GERVILLE-REACHE L., NIKOULINA V. et al., “Expériences accélérées: analyse statistique du modèle standard de vie accélérée”, Revue de Statistique Appliquée, vol. XLVIII, pp. 5–38, September 2000.
[BAG 01] BAGDONAVICIUS V., NIKULIN M., Mathematical Models in the Theory of Accelerated Experiments, World Scientific Publishing, 2001.
[BAG 95a] BAGDONAVICIUS V., NIKULIN M., “On accelerated testing of systems”, European Journal of Diagnosis, Safety and Automation, vol. 5, no. 3, pp. 307–316, 1995.
[BAG 95b] BAGDONAVICIUS V., NIKULIN M., “Semi-parametrics Models in Accelerated Life Testing”, Queen’s Papers in Pure and Applied Mathematics, Queen’s University, Kingston, Ontario, Canada, p. 70, 1995.
[BAG 97] BAGDONAVICIUS V., NIKULIN M., “Transfer functional and semi-parametric regression”, Biometrika, vol. 84, no. 2, pp. 365–378, 1997.
[BAS 82] BASU A.P., EBRAHIMI N., “Non-parametric Accelerated Life Testing”, IEEE Transactions in Reliability, vol. 31, no. 5, pp. 432–435, 1982.
[BOI 00] BOITEUX D., “Essais d’endurance de systèmes embarqués”, PSA, October 2000.
[BOM 73] BOMPAS-SMITH J.H., Mechanical Survival: The Use of Reliability Data, McGraw Hill, 1973.
[BON 15] BONATO M., DELAUX D., “Synthesis and Validation of Accelerated Vibration Durability Tests”, RAMS, 2015.
[BON 77] BONNET D., LALANNE C., “Choix des essais – Analyse de l’environnement mécanique vibratoire réel en vue de l’élaboration de spécifications d’essais”, Journées ASTE, 1977.
[BNA 05] BNAE, Guide pour la définition et la conduite des essais aggravés, available at: http://shop.bnae.asso.fr/products/result/page:4, 2005.
[CAR 98] CARUSO H., DASGUPTA A., “A fundamental Overview of Accelerated Testing Analytical Models”, Proceedings Annual Reliability and Maintainability Symposium, pp. 389–393, 1998.
[CEE 09] CEEES, Reliability for a Mature Product from the beginning of useful life, 2009.
[DEW 86] DE WINNE J., “Equivalence of fatigue damage caused by vibrations”, IES Proceedings, pp. 227–234, 1986.
[DEL 06] DELAUX D., “Reliability validation of engine cooling modules with a tailoring tests of Vibration, Thermal Shock and Pressure Pulsation”, Revue Essai & Simulation, 2006.
[DEM 95] DEMONSANT J., Un exemple de plan d’expériences numériques: optimisation du profil de denture des engrenages d’une boite de vitesses, ASTELAB, 1995.
[DEM 97] DEMONSANT J., Un plan d’expériences pour valider la robustesse d’une conception, ASTELAB, 1997.
[DEM 99] DEMONSANT J., Le processus de l’ingenierie robuste, ASTELAB, 1999.
[DEV 98] DEVARAJAN K., EBRAHIMI N., “Non-parametric Approach to Accelerated Life Testing under Multiple Stress”, Naval Research Logistics, vol. 45, no. 6, pp. 629–644, 1998.

[DOY 91] DOYLE R.L., Mechanical Reliability, RAMS Tutorial Notes, 1991.
[FOR 61] FORD D.G., GRAFF D.G., PAYNE A.O., “Some statistical aspects of fatigue life variation, Fatigue aircraft structure”, Proceedings of 2nd ICAF Symposium, Paris, pp. 179–208, 1961.
[GUE 01] GUERIN F., DUMON B., HAMBLI R., “Determining the shape parameter of a Weibull distribution from Mechanical-Damage Model”, Annual Reliability and Maintainability Symposium, Proceedings, pp. 156–160, pp. 372–376, 2001.
[GUE 05] GUERIN F., TEBBI O., DUMON B., “Estimation de la fiabilité par les essais”, Mecanique & Industries, vol. 6, pp. 155–167, 2005.
[GUE 06] GUERIN F., BARREAU M., CHARKI A., Déverminage par échantillonnage, ASTELAB, 2006.
[INS 01] Institut de Sureté de Fonctionnement, Recommandations dans l’usage industriel des essais hautement accélérés, March 2001.
[KIM 06] KIM Y.B., NOGUCHI H., AMAGAI M., “Vibration fatigue reliability of BGAIC package Pb-free solder and Pb Sn Solder”, Microelectronics Reliability, vol. 46, pp. 459–466, 2006.
[KEC 98] KECECIOGLU D., JIANG M., SUN F.-B., “A unified approach to random fatigue reliability quantification under random loading”, Proceedings of IEEE Reliability and Maintenability, pp. 308–313, 1998.
[LAL 84] LALANNE C., “Maximax response and fatigue damage spectra”, The Journal of the Environmental Sciences, Part I, vol. XXVII, no. 4, July/August 1984, Part II, vol. XXVII, no. 5, September/October 1984.
[LAL 99] LALANNE C., Vibrations et chocs mécaniques: Dommage par fatigue, Hermes, Paris, 1999.
[LAL 02] LALANNE C., Mechanical Vibration and Shock, Volume 2: Mechanical Shock, Hermes Penton, 2002.
[LEM 96] LEMAITRE J., CHABOCHE J.L., Mécanique des matériaux solides, Dunod, Paris, 1996.
[LIT 79] LITTLE R.E., EKVALL J.C., “Statistical Analysis of Fatigue Data”, American Society for Testing and Materials, STP744, 1979.
[OCO 03] O’CONNOR P., “Testing for reliability”, Quality and Reliability Engineering International, vol. 19, pp. 73–84, 2003.
[PHA 03] PHAN H., Handbook of Reliability Engineering, Springer, Berlin, 2003.
[PHA 89] PHADKE M.S., Quality Engineering using Robust Design, Prentice-Hall, Upper Saddle River, 1989.

[PIE 13] PIERRAT L., DELAUX D., “Analytical improvement of the stress-strength method by considering a realistic strength distribution”, Applied Reliability Symposium, Berlin, 2013.
[REC 17a] RECALL Volvo truck – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=15V651, 2017.
[REC 17b] RECALL Mercedes-Benz – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=15V711, 2017.
[REC 17c] RECALL Ford – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=15V712, 2017.
[REC 17d] RECALL Hyundai – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=16V117, 2017.
[REC 17e] RECALL Volkswagen – NHTSA website – retrieved on 31 January 2017, available at: https://www.nhtsa.gov/recalls?nhtsaId=16V171, 2017.
[SHY 99] SHYUR H.-J., ELSAYED E.A., LUXHOJ J.T., “A general model for Accelerated Life Testing with time-dependent covariates”, Naval Research Logistics, vol. 46, pp. 303–321, 1999.
[SIA 16] SIA GUIDELINES, Guide d’aide à l’estimation et à la validation de la fiabilité automobile, SIA, 2016.
[TEB 03] TEBBI O., GUÉRIN F., DUMON B., “Reliability testing of mechanical products-application of statistical accelerated life testing models”, 9th International Conference on Applications of Statistics and Probability in Civil Engineering, University of California, Berkeley, July 6–9 2003.
[VAL 60] VALERY P., Œuvres II, Gallimard, coll. “Bibliothèque de la Pléiade”, p. 864, 1960.
[YAN 07] YANG G., Life Cycle Reliability Engineering, John Wiley & Sons, New York, 2007.

3 Reliability Study for Cuboid Aluminum Capacitors with Liquid Electrolyte

In many areas, the market for high-power electronic embedded systems requires them to be both highly compact and reliable. To strike the right compromise, manufacturers use component technologies that are increasingly compact, with well-defined lifetimes in the conditions specified by the supplier. However, in most cases, the reliability of the component depends on the operational profile of the system. In our case study, the technology considered is that of aluminum capacitors with liquid electrolyte, built in a compact cuboid case. Proper understanding of the reliability of this technology is a necessary step toward ensuring that the high-power electronic system operates properly. To this end, we conduct a reliability study that begins by studying the technology in order to identify the parameters that should be monitored when performing aging tests on this component. The data provided by this study are then used to establish a deterioration model as a function of the operational conditions.

Chapter written by Chadia LACHKAR, Moncef KADI, Jean-Paterne KOUADIO, Jean-François GOUPY, Philippe EUDELINE, Sébastien BOILEAU and Tarik AIT-YOUNES.

3.1. Introduction and objectives

This type of capacitor is widely used in embedded mechatronic systems, especially high-power electronic subsystems, because of the large area of their aluminum electrodes and their thin layer of alumina dielectric. Compared to other electrolytic capacitors available on the market, these capacitors offer a good compromise between high capacitance and low operating voltage. However, they are renowned for their high failure rate, which is most notably a consequence of their high sensitivity to extreme temperatures and the high numbers of charge/discharge cycles experienced in operational conditions [KIU 83]. In general, the poor reliability of these capacitors is the result of deterioration in the dielectric or electrolyte leakage. One possible solution to this problem involves using a cuboid-shaped coiled structure that is more waterproof and which is better able to withstand the internal pressure of the electrolyte. This technology makes it possible to improve the reliability of the component while decreasing its bulk, giving the system a more compact structure.

However, in a hot and dry environment, these capacitors often develop failures. The criticality of the aeronautical applications in which they are used requires an extremely good understanding of how to estimate their lifetimes. We therefore performed aging tests on these capacitors in order to measure the evolution of their electrical and dimensional parameters. In the context of the considered application, these parameters are thought to represent good indicators of capacitor lifetime. The planned tests were designed to reflect the profile of the system being studied, allowing us to emulate the operational conditions, namely the thermal and electrical operating environment of the capacitors. The lifetime is then estimated from the correlation between their electrical and dimensional parameters. These parameters depend on the amount of electrolyte leakage induced by the internal pressure of the capacitor, which results in a sudden drop in the capacitance or an increase in the equivalent series resistance of the capacitor, thus reducing the lifetime of the components and therefore the system as a whole [GAS 96].

This chapter is organized as follows: we begin by giving a precise description of the characteristics of this type of aluminum electrode capacitor.
We then discuss their reliability. Finally, we present the aging tests and the deterioration model in the last section of this chapter.

3.2. Characteristics of aluminum capacitors with liquid electrolyte

3.2.1. Basic principle [PER 02]

Electrolytic capacitors with aluminum electrodes consist of a cathode and an anode made up of aluminum foil, electrolyte-soaked paper and a layer of alumina (Al₂O₃) dielectric deposited on the anode foil (Figure 3.1). The oxide layer is thin (a few nanometers to a few hundred nanometers), which gives a better capacitance/volume ratio. The constitution of this type of capacitor is as follows:

Figure 3.1. Structure of an aluminum-electrode capacitor with liquid electrolyte

– Dielectric: The aluminum oxide (alumina) layer is formed by means of an electrochemical process. To do this, a voltage is applied to the aluminum foil representing the anode submerged in an acidic solution. This voltage is known as the formation voltage. The dielectric is formed by the following process:

$2\,\mathrm{Al} + 3\,\mathrm{H_2O} \rightarrow \mathrm{Al_2O_3} + 3\,\mathrm{H_2}\ (\mathrm{gas})$   [3.1]

The anode foil and the cathode with electrolyte-soaked paper are rolled up (Figure 3.2) in an aluminum case.

Figure 3.2. Capacitor coil winding [PER 02]


– Electrolyte: The choice of electrolyte depends on the nominal voltage, the chemical stability of the electrolyte and the pressure of the vapor that it produces in a high-temperature environment. It consists of a conductive solute dissolved in a solvent. In our case study, the solvent is gamma-butyrolactone.

3.2.2. Cuboid-shaped geometry

The aluminum electrolytic capacitors presented in the literature typically have cylindrical shapes. However, the capacitors that we shall study here are cuboid-shaped, which makes them less bulky and thus allows more compact structures to be achieved. Figure 3.3 shows one such electrolytic capacitor, manufactured by F&T (cf. notice in the Appendix¹).

Figure 3.3. Electrolytic capacitor fitted in a cuboid-shaped case

3.2.3. Equivalent electrical circuit [IMA 07]

Several equivalent electrical circuits that can be used for electrolytic capacitors have been proposed in the literature. We will use the simplified equivalent circuit shown in Figure 3.4:

Figure 3.4. Equivalent circuit for an electrolytic capacitor

1 Appendices to this book can be found at www.iste.co.uk/elhami/mechatronic2.


– CAK: ideal anode-cathode capacitance (main element of the capacitor);
– Rp: parallel resistance (leakage) due to the alumina layers;
– R1: resistance of the aluminum electrodes and connecting terminals;
– R2: resistance of the electrolyte;
– L: equivalent series inductance of the connections and windings.

The capacitance CAK is a function of the permittivity of the dielectric, ε, the surface area of the aluminum foil, S, and the thickness of the dielectric material, e (equation [3.2]). It mainly depends on the temperature and the voltage.

$C_{AK} = \varepsilon \cdot \dfrac{S}{e}$   [3.2]

The resistance Rp varies as a function of the voltage applied to the capacitor, the temperature and the polarization time. It represents the leakage current of the capacitor. The inductance L, on the other hand, varies as a function of the frequency and the temperature, whereas the resistance R1 varies primarily as a function of the temperature and exhibits little frequency-dependent variation.

3.2.4. Electrical characteristics [IMA 07]

The above electrical model can be simplified, leading to the following equivalent series model (Figure 3.5):

Figure 3.5. Equivalent series circuit

– C: equivalent series capacitance;
– ESR: equivalent series resistance, representing all losses in the capacitor. This is in the order of a few tens of milliohms. Its value depends on the temperature of the capacitor and the frequency;
– ESL: equivalent series inductance, identical to L, with a value of a few nH.

Placing these three quantities in series gives an expression for the equivalent impedance, where ω is the angular frequency:

$Z = ESR + j \left( ESL \cdot \omega - \dfrac{1}{C \cdot \omega} \right)$   [3.3]

The equivalent capacitance and the equivalent resistance depend on the frequency according to the following formulas:

$C = C_{AK} \cdot \left( 1 + \dfrac{1}{R_p^2 \cdot C_{AK}^2 \cdot \omega^2} \right)$   [3.4]

$ESR = R_1 + R_2 + \dfrac{R_p}{1 + R_p^2 \cdot C_{AK}^2 \cdot \omega^2}$   [3.5]

We can use these two quantities to define another characteristic quantity of the capacitor, the dissipation factor:

$DF = ESR \cdot C \cdot \omega$   [3.6]

– Leakage current: in our study, the capacitors are used to store electrostatic energy under a DC voltage. This parameter is a useful indicator when evaluating the state of the capacitor. The main electrical characteristics that we shall use to evaluate the deterioration state of the capacitor are the capacitance C, the equivalent series resistance ESR, the dissipation factor DF and the leakage current. 3.2.5. Physical characteristics Although the characteristics listed above allow us to define the state of health of a capacitor, they are not enough to establish a model of its deterioration and lifetime [COU 15]. Indeed, the electrical characteristics of a capacitor can drop abruptly enough that we cannot consider a continuous deterioration model a priori. We must therefore measure certain observable physical characteristics, namely the mass and the dimensions of the capacitor’s case.
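As a numerical cross-check of the series model in section 3.2.4, the following sketch evaluates the equivalent series quantities of the circuit in Figure 3.4 at a given frequency. It assumes the standard result for a capacitance CAK in parallel with Rp, placed in series with R1, R2 and L, namely ESR = R1 + R2 + Rp/(1 + (Rp·CAK·ω)²) and C = CAK·(1 + 1/(Rp·CAK·ω)²). The function name and all element values are hypothetical and serve only as an illustration; they are not measurements from this study.

```python
import math


def series_equivalent(c_ak, r_p, r_1, r_2, esl, freq_hz):
    """Return (ESR, C, Z, DF) of the equivalent series model at freq_hz.

    Assumed circuit: (C_AK in parallel with R_p) in series with R_1, R_2 and ESL.
    Units: farads, ohms, henries, hertz.
    """
    omega = 2.0 * math.pi * freq_hz
    x = (r_p * c_ak * omega) ** 2
    esr = r_1 + r_2 + r_p / (1.0 + x)      # equivalent series resistance
    c_eq = c_ak * (1.0 + 1.0 / x)          # equivalent series capacitance
    z = complex(esr, esl * omega - 1.0 / (c_eq * omega))  # series impedance
    df = esr * c_eq * omega                # dissipation factor
    return esr, c_eq, z, df


# Hypothetical element values, chosen only for illustration.
esr, c_eq, z, df = series_equivalent(
    c_ak=5000e-6, r_p=1e5, r_1=0.01, r_2=0.02, esl=5e-9, freq_hz=100.0
)
print(f"ESR = {esr * 1e3:.2f} mOhm, C = {c_eq * 1e6:.1f} uF, DF = {df:.4f}")
```

With a large leakage resistance, the Rp term contributes very little at 100 Hz, so ESR stays close to R1 + R2 and C stays close to CAK, which is why the series model is a convenient simplification at the measurement frequency.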


3.3. Parametric characterization

3.3.1. Measuring C, ESR and DF

The electrical characterization of the capacitors is conducted in accordance with the IEC standard 60384-4 [IEC 07]. In our study, the capacitance C and the equivalent series resistance ESR are measured using an LCR meter. After choosing the equivalent circuit and the AC voltage to apply, a calibration is performed at 100 Hz and at room temperature, which is the chosen reference frequency for measuring these two electrical parameters. After measuring C and ESR, we can then determine the last parameter, the dissipation factor DF (equation [3.6]).

3.3.2. Measuring the leakage current

The leakage current can be determined by means of a fairly simple experimental procedure. We place the capacitor in series with a resistor at the nominal polarization voltage of the capacitor. The purpose of the resistor is both to limit the current and measure it (Figure 3.6). After a transient phase that typically lasts less than 5 min, depending on the value of the resistor, the current tends toward a constant value. The leakage current (which varies as a function of the temperature) is then measured at 25 °C. If its value is greater than or equal to the maximum value stated by the manufacturer, the capacitor needs to be replaced.

Figure 3.6. Measuring the leakage current

REMARK.– The nominal voltage of the capacitor is the maximum operating voltage recommended by the manufacturer. Overvoltage can cause the dielectric of the capacitor to deteriorate.
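The leakage-current procedure of section 3.3.2 can be sketched as follows, assuming that the steady-state current is obtained from the voltage measured across the series resistor (Ohm's law). The function names and the numerical values are hypothetical; the replacement rule (leakage current greater than or equal to the manufacturer's maximum) is the one stated in the text.

```python
import math


def leakage_current(v_resistor, r_series):
    """Steady-state leakage current (A) from the voltage across the series resistor."""
    if r_series <= 0:
        raise ValueError("series resistance must be positive")
    return v_resistor / r_series


def needs_replacement(i_leak, i_max):
    """Replacement rule from the text: leakage current >= manufacturer's maximum."""
    return i_leak >= i_max


# Hypothetical reading: 0.5 V across a 1 kOhm resistor after the transient phase.
i_leak = leakage_current(v_resistor=0.5, r_series=1000.0)
print(f"leakage current = {i_leak * 1e6:.0f} uA")
print("replace capacitor:", needs_replacement(i_leak, i_max=1e-3))
```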

3.3.3. Measuring the dimensions of capacitor’s case

The length L and the width W of the capacitor case are measured using a caliper (Figure 3.7).

Figure 3.7. Dimensions of the capacitor’s case

3.3.4. Other measurements and observations

The mass of the capacitor is measured using scales with a precision of 0.0001 mg. X-ray microtomography is also used to non-destructively inspect the component to detect any internal deterioration. The images thus generated are ordered along the x, y and z axes. Finally, we monitor whether electrolyte leakage (yellow liquid) becomes visible at the terminals of the capacitor (Figure 3.8).

Figure 3.8. Electrolyte leakage at the capacitor terminals


3.3.5. Use of measurements The measurements of the characteristics described above are taken as part of aging tests designed to emulate the thermal and electrical operating conditions of the final system. The estimate of the capacitor lifetime is based on any evidence of a significant change in one or several of these characteristics. 3.4. Reliability analysis [PIE 15] In order to estimate the expected reliability of a component subject to deterioration, we must first define a model structure, then characterize its parameters by performing time-constrained tests, which must therefore necessarily be “accelerated”. Given that the expected lifetime of an electrolytic capacitor may be lower than that of the wider system within which it operates (which is why they are viewed as “critical”), regular preventative maintenance should be considered. The goal of such a renewal policy is to prevent deterioration from reaching a stage of advancement that concretely manifests in sudden catastrophic failure, the consequences of which would be harmful to other nearby components (e.g. excessive swelling, electrolyte leakage, possibly an explosion). In these conditions, any desirable model of the expected reliability cannot be based on a sudden failure criterion, but instead should use a parametric drift criterion limited by the maximum threshold deemed admissible. 3.4.1. Specific objectives Speaking in completely general terms, when an electrolytic capacitor is subject to a combination of functional electrical stresses (voltage, current, frequency, etc.) and environmental stresses (temperature, humidity, vibrations, etc.), this can result in a variety of deterioration mechanisms and failure modes. In the context of expected reliability, a useful model must have a minimally complex structure, since uncertainty must be associated with each deterministic parameter in such a way as to allow an expression of the failure probability to be derived. 
In the case of the system that houses the capacitors studied here, we can limit ourselves to considering two major types of physical stress, which typically act simultaneously:


– thermal constraint: temperature of the active parts, which determines the deterioration of the electrolyte under the action of thermodynamic effects;
– electrical constraint: the direct voltage applied to the electrode terminals, which determines the deterioration of the dielectric complex.

Depending on the mission profile, the influence of each of these constraints must be considered both from the perspective of statics (amplitudes of the permanent regimes of the temperature and the DC voltage) and dynamics (time-based temperature cycles and wave spectrum of the voltage). Given the topology of the system considered here, climate stresses such as humidity can be neglected. The same is true for mechanical stresses (vibrations and shocks) applied to the circuit board, due to the morphology of the capacitors (cuboid shape), which are glued onto the substrate.

3.4.2. Functional mission profile

The capacitors considered in this study provide capacitive energy storage for a power supply board. The assembly operates in a pulse mode, in which the functional electrical stresses applied to the storage capacitors are both permanent and transient, characterized by a periodic cycle modeled in the idealized form of “on-off” intervals:
– during the “OFF” phase, which is the longer of the two (a few milliseconds), the capacitors are charged by the main power supply of the system via a converter;
– during the “ON” phase, they are transiently discharged very rapidly (a few hundred microseconds), delivering a high-amplitude pulse current.

This storage function needs to achieve a value of 10⁻² F (10,000 µF), which is achieved by placing two cuboid-shaped capacitors of 5,000 µF in parallel.
The purpose of this redundant configuration is to achieve a sufficient dielectric strength margin (at constant volume, the nominal voltage is inversely proportional to the capacitance) as well as extra capacitance compared to the required storage value in order to compensate for some of the reduction caused by gradual deterioration in the characteristics of the capacitor.


3.4.3. Environmental mission profile

[Figure 3.9 labels: temperature levels of 73, 83, 88, 93, 98 and 103 °C; durations of 470 h, 130 h, 900 h, 2,760 h, 1,400 h and 3,100 h]

Figure 3.9. Temperature profile of the capacitor mounted within the system in a hot and dry environment. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip
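As a quick consistency check, the six duration labels of Figure 3.9 (470, 130, 900, 2,760, 1,400 and 3,100 h) can be summed and compared with the length of one year. The sketch below only uses these labels; the variable names are hypothetical, and no pairing between durations and temperature levels is assumed.

```python
# Duration labels of Figure 3.9, in hours (order as printed, not paired with temperatures).
durations_h = [470, 130, 900, 2760, 1400, 3100]

total_h = sum(durations_h)
hours_per_year = 365 * 24

print(f"total = {total_h} h, one year = {hours_per_year} h")
# Share of the year represented by each duration label.
shares = [d / total_h for d in durations_h]
print([f"{s:.1%}" for s in shares])
```

The labels add up to exactly one year of operation, which agrees with the statement that the profile covers a period of 1 year.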

Figure 3.9 displays the temperature profile of the capacitor within its system, over the period of 1 year. As we can see, according to its mission profile, the capacitor experiences temperatures ranging from 70 to 105 °C within the wider system. It is important to note that the maximum admissible temperature for the capacitor is close to 105 °C, which might accelerate failure during operation.

3.4.4. Deterioration mechanisms

The two major types of considered constraint lead to deterioration mechanisms whose effects are approximately observable by monitoring the evolution of certain physical parameters. It is important to select parameters that are easily measurable and sufficiently sensitive to deterioration.


In decreasing order of importance, the conventional criteria include: – the series resistance of the capacitor; – the value of the capacitance; – the leakage current. Thermodynamic deterioration in the electrolyte caused by temperature distinctively affects the series resistance and, to a lesser extent, the value of the capacitance. By contrast, the leakage current is not very sensitive to this deterioration (except at large time scales beyond the lifetime considered here) but varies randomly and is highly sensitive to the temperature (it is therefore a poor indicator in terms of signal-to-noise ratio). These observations are mostly relevant in conventional usages, such as filtering, which most notably causes thermal deterioration in the electrolyte, thus leading the series resistance to gradually increase. The influence of the voltage profile, on the other hand, which is not well understood and is not quoted by manufacturers, can prove to be decisive. In the device studied here, the specific voltage profile applied to the capacitors (permanent charging periodically interrupted by short discharges) can slow down thermal deterioration. In fact, the resulting polarization–depolarization process has a tendency to regenerate the dielectric complex and partially erase any cumulative prior deterioration (attenuating the memory effect). In these conditions, the criteria listed above are not necessarily relevant, since the operational “ON” phase is not predominant. After conducting the first series of tests, a preliminary analysis of the results tended to confirm this hypothesis. A priori, we can imagine an approach that relates the evolution over time of the deterioration to a dimensional measurement that depends on swelling, then chooses a limit value that can be used to determine the operational lifetime of the capacitor. 
This aims to strike a compromise between maximizing the preventative replacement interval and guaranteeing a sufficient safety margin in terms of the admissible deformation threshold.

3.5. Aging tests on components

The multifactorial nature of the mission profile described in the previous section allows us to design better aging tests as a function of the parameters or factors that characterize the operational environment. The severity and the contribution of each factor, in this case a failure over multiple operating phases, accelerate the aging of the component before the lifetime stated by the manufacturer is reached. This can cause the entire system to malfunction. In the case studied here, we shall focus on the factor of temperature. Temperature intervals are defined for each operating phase. There are two distinct sources of heat that cause the internal temperature of the capacitor to increase: the temperature imposed by the system’s operating environment and the temperature gradient emitted by the power dissipated within the component itself, as well as nearby components (Joule effect). In general, the self-heating of the electrolytic capacitor is related to the ripple of the applied current and the equivalent series resistance ESR at a given frequency, described by the well-known formula, where P is the power dissipated in the capacitor and Irms is the ripple current:

$P = ESR \times I_{rms}^2$   [3.7]

The ripple current reveals two essential factors, depending on the overall scenario in which the electrolytic capacitor is operating. The first factor is the applied electrical voltage Ua that stores electrostatic energy by charging the capacitor. This voltage must not exceed the nominal value specified by the manufacturer. The second factor is the periodicity of the charge/discharge cycle of the capacitor, represented by the frequency, f = 1/T, and the duty cycle DC of the pulse signal, which in particular makes it possible to control the period required to discharge the energy needed by an electronic subassembly in the form of a current pulse over a time interval tdc = T × DC. For the rest of the cycle, the capacitor recharges in order to recover the energy expended by the current pulse. The root mean square value of the charge/discharge current is the ripple of the applied current. Its value depends on several parameters: the period, the duty cycle and the applied voltage. The voltage profile at the terminals of the capacitor during a charge/discharge cycle is shown in Figure 3.10. Analyzing the mission profile in this way is an indispensable step when defining the factors that reflect the operational conditions. These factors will be used as parameters to characterize the reliability of the system. The objective of this chapter is to propose a methodology and deterioration model adapted to the technology and mode of operation of the studied capacitors.
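The quantities introduced in this section can be illustrated with a short sketch. It assumes the usual Joule-loss relation P = ESR × Irms² for the self-heating, the frequency f = 1/T and the discharge interval tdc = T × DC. The function names and all numerical values are hypothetical; the period and duty cycle are merely chosen to match the orders of magnitude quoted for the mission profile (a few milliseconds and a few hundred microseconds).

```python
import math


def pulse_timing(period_s, duty_cycle):
    """Return (frequency in Hz, discharge interval t_dc in s) of the pulse cycle."""
    if period_s <= 0 or not 0.0 < duty_cycle < 1.0:
        raise ValueError("period must be positive and duty cycle between 0 and 1")
    return 1.0 / period_s, period_s * duty_cycle


def self_heating_power(esr_ohm, i_rms_a):
    """Power (W) dissipated in the capacitor, assuming P = ESR * I_rms^2."""
    return esr_ohm * i_rms_a ** 2


# Hypothetical operating point (illustration only).
freq_hz, t_dc_s = pulse_timing(period_s=5e-3, duty_cycle=0.05)
power_w = self_heating_power(esr_ohm=0.02, i_rms_a=10.0)
print(f"f = {freq_hz:.0f} Hz, t_dc = {t_dc_s * 1e6:.0f} us, P = {power_w:.1f} W")
```

Because the dissipated power grows with the square of the ripple current, a change in duty cycle or period that raises Irms has a stronger effect on self-heating than the same relative change in ESR.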


Figure 3.10. Charge and discharge profile of the capacitor over a half-period

3.5.1. Test series

The distribution and the variability of the factors listed below were used to define a concise testing schedule consisting of two distinct types of test:
– Thermal tests: the primary objective of these tests is to verify the endurance of the components under high-temperature conditions. However, it is also essential to test the behavior of the components under the thermal cycle generated by powering the system on and off.
– Thermoelectric tests: these tests combine the electrical “charge/discharge cycle” profile of the components with the thermal stresses of the operating environment.

Table 3.1 lists three different factors:
– Temperature:
  - Tmax: the maximum testing temperature,
  - Tmin: the minimum testing temperature,
  - ∆T: the gradient of the applied temperature between the ambient temperature, which is 25 °C, and the system temperature when it is powered off under operational conditions;
– Voltage: two distinct values were used for the voltage, the rated voltage Ur, and a second voltage Ua representing 86% of Ur;
– Duty cycle and ripple of the current: by imposing the same voltage and the same charge value, the value of the ripple current Irms varies as a function of the value of the duty cycle and the period of the applied pulse. The formula [3.8] is valid in both charge and discharge phases. Two different duty cycles can occur, a maximum duty cycle DCmax with a long period and a minimum duty cycle DCmin with a short period.

$I \times \Delta t = C \times \Delta U$   [3.8]

Table 3.1 summarizes the schedule of the aging tests performed on the capacitors.

Type of test    Test  Temperature  Voltage  Duty cycle  Test duration (hours)  Number of aged components
Thermal         ST1   Tmax         -        -           2500                   5
Thermal         ST2   Tmin         -        -           5200                   5
Thermal         CT    25 °C + ∆T   -        -           3050                   5
Thermoelectric  TE1   Tmax         Ua       DCmax       3050                   5
Thermoelectric  TE2   Tmin         Ua       DCmax       6200                   5
Thermoelectric  TE3   Tmax         Ua       DCmin       3700                   5
Thermoelectric  TE4   Tmax         Ur       DCmax       4070                   5

Table 3.1. Aging test schedule
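The test durations of Table 3.1 can be totaled to give the overall testing effort. The sketch below uses only the durations and the number of components listed in the table; the variable names are hypothetical.

```python
# Test durations from Table 3.1, in hours.
test_duration_h = {
    "ST1": 2500, "ST2": 5200, "CT": 3050,                 # thermal tests
    "TE1": 3050, "TE2": 6200, "TE3": 3700, "TE4": 4070,   # thermoelectric tests
}
components_per_test = 5

cumulative_h = sum(test_duration_h.values())
component_hours = cumulative_h * components_per_test

print(f"cumulative test duration: {cumulative_h} h")
print(f"cumulative component-hours: {component_hours} h")
```

The cumulative duration of 27,770 h is consistent with the figure of roughly 28,000 hours quoted in the conclusion of this chapter.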

During the tests, the deterioration process of the electrolytic capacitors was monitored by measuring the parameters listed earlier, in the second section of this chapter. These measured characteristics were then quantified in order to derive a reliability model.


For each aging test, the measurement results are given in the form of a graph showing the evolution of the following quantities as a function of time: – capacitance; – width of capacitor’s case; – mass of the component. For each test, stresses were applied to five capacitors whose characteristics were then calculated at 25°C by taking repeated measurements at predetermined times in the testing schedules. Each component is assigned an index: the name of the test plus the digit 1, 2, 3, 4, or 5. 3.5.2. Thermal tests To perform these tests, three climate chambers were used, allowing three different thermal environments to be modeled.

Figure 3.11. Photo of climate chambers used for the thermal tests

3.5.2.1. Evolution of the capacitance

Figure 3.12 shows the evolution of the capacitance values at 100 Hz during the two thermal aging tests ST1 and CT. The measurements are shown on the interval [Cn – 20%, Cn], where Cn is the nominal value of the capacitance of the aged capacitors. The evolution of the capacitance values visible on the two graphs shows that the capacitance did not experience any significant drift. We can conclude that the capacitance value is not affected by the applied thermal stress, whether permanent (storage) or periodic (cycling).

Figure 3.12. Evolution of the capacitance value of the capacitors during aging tests ST1 a) and CT b). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

3.5.2.2. Evolution of the Equivalent Series Resistance, ESR The evolution of the ESR is not shown, since it did not change in any of the three test scenarios. 3.5.2.3. Evolution of the leakage current Similarly, the leakage current had the same evolution profile as the ESR in all three tests. 3.5.2.4. Evolution of the case width The width of the aluminum case of the capacitor increases as the applied thermal stress increases. The difference reached 5% of the initial value in ST1 over 2,500 hours at the maximum stress level but did not exceed 3.5% in ST2 over 6,000 hours. In CT, the variation of this quantity remained negligible and did not exceed 2% by the end of the test (3,000 hours). 3.5.2.5. Evolution of the mass of the component The curves in Figures 3.13 and 3.14 have the same visual appearance. In other words, the capacitor lost more mass in ST1 than in ST2. The mass loss of the capacitors in CT is not shown, since it was negligible.


Figure 3.13. Evolution of the width of the capacitor’s case during aging tests ST1 a) and ST2 b). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

Figure 3.14. Evolution of the capacitor weight during aging tests ST1 a) and ST2 b). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

3.5.3. Thermoelectric tests

To combine both the thermal and electrical aspects of the operational profile, we must set up two installations in order to emulate the operational stresses experienced by the electrolytic capacitors. The thermal factor is integrated into the tests using the climate chambers, similarly to the thermal tests in the previous section. The electrical profile applied to the components within their respective systems on the other hand is generated by a test bench, which recreates the same charge/discharge cycle as illustrated in Figure 3.10. Figure 3.15 shows the test bench used to achieve this.

Figure 3.15. Test bench used for electrothermal aging tests

3.5.3.1. Evolution of the capacitance

The capacitance only experienced drift in TE4. It showed hardly any change in the other tests: TE1, TE2 and TE3.

Figure 3.16. Evolution of the capacitance of the capacitors during TE1 and TE4 aging tests. For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip

3.5.3.2. Evolution of the Equivalent Series Resistance, ESR

The value of the ESR did not change significantly.

3.5.3.3. Evolution of the leakage current The value of the leakage current did not change significantly in any of the four test scenarios. 3.5.3.4. Evolution of the width of the capacitor’s case Figure 3.17 shows the relative variation in the width of the capacitor’s case. This quantity reached 14% in the TE4 over 4,000 hours, the most severe scenario. The width variation in the TE1 and TE2 tests only reached 6% over 3,000 hours and 6,200 hours respectively, whereas the width increase in the TE3 test remained below 3% over 3,700 hours. X-ray images (Figure 3.18) were taken of one of the capacitors in the TE4 test (namely TE4-1) to show the appearance of the internal swelling of the case. We can clearly observe that the case detaches, which causes the coil to loosen relative to the initial configuration.

Figure 3.17. Evolution of the width of the capacitor’s case during aging tests TE1 a), TE2 b), TE3 c) and TE4 d). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


Figure 3.18. Microtomographic snapshot of the capacitor TE4-1 a) before and b) after aging (4,000 h)

3.5.3.5. Evolution of the weight of the component The capacitors used in the thermoelectric tests lost an amount of weight ranging from 30 to 90 mg. Similarly to the width of capacitor’s case, the weight decreased more in TE4 than in TE3.

Figure 3.19. Evolution of the weight of the capacitors during aging tests TE1 a), TE2 b), TE3 c) and TE4 d). For a color version of this figure, see www.iste.co.uk/elhami/mechatronic2.zip


3.5.4. Summary of test results Out of all of these tests, the capacitance value only experienced drift in TE4, which represented an environment combining the factors of maximum operating temperature value, maximum ripple current and nominal capacitor voltage. In the other tests, this quantity did not change as a function of aging time. This means that applying the maximum level of one single type of constraint is not sufficient to accelerate component aging. The ESR and the leakage current had identical evolution profiles with no drift. This held true in all seven test scenarios. In other words, these two quantities are not affected by even the most severe thermal and electrical conditions considered in our study. The width of the capacitor’s case and the capacitor mass had a similar evolution profile. The width increased in value, whereas the weight decreased. This means that both quantities have the same degree of sensitivity to thermal and electrical constraints. 3.6. Analysis and modeling We will first begin by analyzing the effect of temperature on the measured parameters of the capacitors. We will then use a regression model to describe the evolution of these parameters as a function of time and of the constraints. 3.6.1. Influence of temperature The influence of temperature can be seen in the shape of the curves of the capacitor’s width and weight. This only applies to the thermal storage tests. The increase in the width of the capacitor’s case is caused by the internal pressure on the case arising from the gases produced by the chemical reactions occurring in the electrolyte under the effect of the applied temperature. This phenomenon is accompanied by the “ductility” of the aluminum from which the case is made, which results in an irreversible deformation of the case.


The case width model describing the evolution profile can be written as a rational function [MUL 15]:

$w(t) = \dfrac{w_1 \times t + w_2}{w_3 \times t + 1}$   [3.9]

– w = width of case in mm;
– t = aging time in hours;
– wi = constants depending on the component and the test conditions.

To validate this model, we used the capacitors with the largest variation in case width. For ST1 and ST2, we shall consider ST1-4 and ST2-4.

Parameter  ST1-4              ST2-4
w1         8.52 × 10⁻³ mm/h   6.09 × 10⁻³ mm/h
w2         16.62 mm           16.49 mm
w3         4.75 × 10⁻⁴ h⁻¹    3.53 × 10⁻⁴ h⁻¹
RMSE       0.044 mm           0.023 mm

Table 3.2. Parameters of the deterioration model for the case width of capacitors ST1-4 and ST2-4

The model and the experiments are compared below. Note that the measurements and the model are in excellent agreement, with a root-mean-square error (RMSE) of 0.044 mm for ST1-4 and 0.023 mm for ST2-4.

Figure 3.20. Comparison of measurements of the width of the capacitors ST1-4 and ST2-4 to the proposed model
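The width model can be evaluated numerically with the parameters of Table 3.2. The sketch below assumes the rational form w(t) = (w1·t + w2)/(w3·t + 1), which is a reconstruction consistent with the units of w1 (mm/h), w2 (mm) and w3 (h⁻¹); the function names and the swelling threshold used in the example are hypothetical. Inverting the model gives the time needed to reach a given width, t = (w_limit − w2)/(w1 − w3·w_limit), which is one possible way of turning an admissible deformation threshold into a replacement interval.

```python
import math

# Parameters from Table 3.2 (w1 in mm/h, w2 in mm, w3 in 1/h).
ST1_4 = {"w1": 8.52e-3, "w2": 16.62, "w3": 4.75e-4}
ST2_4 = {"w1": 6.09e-3, "w2": 16.49, "w3": 3.53e-4}


def case_width(t_h, w1, w2, w3):
    """Case width (mm) after t_h hours, assuming w(t) = (w1*t + w2) / (w3*t + 1)."""
    return (w1 * t_h + w2) / (w3 * t_h + 1.0)


def time_to_width(w_limit, w1, w2, w3):
    """Aging time (h) at which the modeled width reaches w_limit (mm)."""
    if w_limit <= w2:
        return 0.0                      # threshold already met at t = 0
    denom = w1 - w3 * w_limit
    if denom <= 0.0:
        return math.inf                 # threshold above the asymptote w1/w3
    return (w_limit - w2) / denom


w_end = case_width(2500.0, **ST1_4)
growth = w_end / ST1_4["w2"] - 1.0
print(f"ST1-4 after 2500 h: {w_end:.2f} mm ({growth:.1%} increase)")

# Hypothetical admissible swelling threshold of +4% on the initial width.
t_limit = time_to_width(1.04 * ST1_4["w2"], **ST1_4)
print(f"time to reach +4%: {t_limit:.0f} h")
```

Under this assumed form, the modeled increase for ST1-4 at 2,500 hours is between 4% and 5%, in line with the swelling reported for ST1 in section 3.5.2.4.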

Regarding the weight, we shall use a model similar to the model used to describe the evolution of the case width. Thus:

$M(t) = \dfrac{m_1 \times t + m_2}{m_3 \times t + 1}$   [3.10]

– M = weight of the component in mg;
– t = aging time in hours;
– mi = constants depending on the component and the test conditions.

Again, we can see how effective this model is by comparing it to the experimental data (see Figure 3.21).

Parameter  ST1-4              ST2-4
m1         33.85 mg/h         16.57 mg/h
m2         33,416 mg          33,251 mg
m3         2.75 × 10⁻³ h⁻¹    4.99 × 10⁻⁴ h⁻¹
RMSE       1.29 mg            0.37 mg

Table 3.3. Parameters of the deterioration model for the weight of capacitors ST1-4 and ST2-4

Figure 3.21. Comparison of measurements of the weight of the capacitors ST1-4 and ST2-4 to the proposed model

3.6.2. Thermoelectric influences

Turning our attention to the electrical tests, we can usefully exploit the evolution of the capacitance value in TE4. The influence of the thermal stress and the electrical stress may be described by a model based on a power law as a function of the aging time, as shown below [NAI 16]:

$C(t) = -c_1 \times t^{c_2} + c_3$   [3.11]

– C = capacitance value in µF;
– t = aging time in hours;
– ci = constants depending on the component and the test conditions.

Parameter  TE4-5
c1         2.219 µF/h
c2         0.58
c3         4,816 µF
RMSE       19.44 µF

Table 3.4. Parameters of the deterioration model of capacitance of capacitor TE4-5

Figure 3.22. Comparison of the measured capacitance of TE4-5 with the value predicted by the model
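The power-law model can be used to estimate when the capacitance crosses a drift threshold. The sketch below assumes the form C(t) = c3 − c1·t^c2 with the TE4-5 parameters of Table 3.4, and inverts it as t = ((c3 − C_limit)/c1)^(1/c2). The function names are hypothetical, and the threshold of 20% below a nominal value of 5,000 µF is only an illustrative choice, not a criterion stated by the authors. Any time obtained beyond the tested duration is an extrapolation of the fitted curve.

```python
import math

# TE4-5 parameters from Table 3.4 (capacitance in uF, time in hours).
TE4_5 = {"c1": 2.219, "c2": 0.58, "c3": 4816.0}


def capacitance(t_h, c1, c2, c3):
    """Modeled capacitance (uF) after t_h hours, assuming C(t) = c3 - c1 * t**c2."""
    return c3 - c1 * t_h ** c2


def time_to_capacitance(c_limit, c1, c2, c3):
    """Aging time (h) at which the modeled capacitance falls to c_limit (uF)."""
    if c_limit >= c3:
        return 0.0                      # threshold already met at t = 0
    return ((c3 - c_limit) / c1) ** (1.0 / c2)


c_4000 = capacitance(4000.0, **TE4_5)
print(f"modeled capacitance after 4000 h: {c_4000:.0f} uF")

# Hypothetical drift criterion: 20% below an assumed nominal value of 5000 uF.
t_limit = time_to_capacitance(0.8 * 5000.0, **TE4_5)
print(f"extrapolated time to reach 4000 uF: {t_limit:.0f} h")
```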

Since the same chemical process is responsible for causing the case to swell and the capacitor to lose weight, the models describing the evolution of the values of the width and the mass are the same. The only difference is in the values of the parameters as a function of the intensity of the applied stresses. In this section, we will model the evolution of the width and the weight from TE1 and TE4 using the component that experienced the most deterioration. Note that the model and the experimental data are once again highly consistent.

Parameter  TE1-1               TE4-5
w1         15.76 × 10⁻³ mm/h   6.09 × 10⁻³ mm/h
w2         16.53 mm            16.36 mm
w3         11.25 × 10⁻⁴ h⁻¹    8.04 × 10⁻⁴ h⁻¹
RMSE       0.046 mm            0.084 mm

Table 3.5. Parameters of the deterioration model for the width of capacitors TE1-1 and TE4-5


Figure 3.23. Comparison of measurements of the case width of capacitors TE1-1 and TE4-5 to the proposed model

Parameter  TE1-1              TE4-5
m1         41.21 mg/h         30.44 mg/h
m2         33,221 mg          32,818 mg
m3         1.24 × 10⁻³ h⁻¹    9.30 × 10⁻⁴ h⁻¹
RMSE       1.737 mg           4.974 mg

Table 3.6. Parameters of the deterioration model for the weight of capacitors TE1-1 and TE4-5

Figure 3.24. Comparison of measurements of the weight of capacitors TE1-1 and TE4-5 to the proposed model

3.7. Conclusion and continuation

In this chapter, we presented a comprehensive approach to studying the operational reliability of cuboid-shaped aluminum capacitors with liquid electrolyte. Clarifying the characteristics of the technology is an essential step that must be completed before embarking upon a reliability study of any component. Reliability studies often begin by designing capacitor aging test scenarios that are representative of the operational profile of the high-power electronic system in which the capacitor is installed. Over the course of our tests, we monitored the state of health of the capacitors in order to identify the most significant parameters, which were found to be the capacitance, the width of the capacitor’s case and the weight. Deterioration models were proposed for these characteristics based on experimental data gathered from tests with a cumulative duration of 28,000 hours. These tests emphasized the effect of combined electrical and thermal stresses. The models thus developed can now serve as a basis for calculating the lifetime of components by defining a criterion for the maximum level of deterioration deemed tolerable in any of the three modeled parameters. The calculated lifetime can then be used to identify a model for the failure rate of this capacitor technology as well as its parameters and can also be used to determine the activation energy of this technology. This will allow the data listed in the FIDES guide to be updated for any system in which these capacitors are mounted.

3.8. Appendix: notice aluminum electrolytic capacitor


3.9. Bibliography

[COU 15] COUSSEAU R., PATIN N., FORGEZ C. et al., “Improved electrical model of aluminum electrolytic capacitor with anomalous diffusion for health monitoring”, Mathematics and Computers in Simulation, vol. 131, no. 6, pp. 268–282, January 2015.

[GAS 96] GASPERI M.L., “Life prediction model for aluminum electrolytic capacitors”, Conference Record of the 1996 IEEE, vol. 3, pp. 1347–1351, October 1996.

[IEC 07] IEC 60384-4:2007, Fixed capacitors for use in electronic equipment - Part 4: Sectional specification - Aluminium electrolytic capacitors with solid and nonsolid electrolyte, International Electrotechnical Commission, 2007.

[IMA 07] IMAM A.M., DIVAN D.M., HARLEY R.G. et al., “Real-time condition monitoring of the electrolytic capacitors for power electronics applications”, Twenty-Second Annual IEEE Applied Power Electronics Conference and Exposition, IEEE, pp. 1057–1061, 2007.

[KIU 83] KIUCHI K., YANAGIBASHI M., “Operating life of aluminum electrolytic capacitor”, Telecommunications Energy Conference, INTELEC 83, Fifth International, pp. 535–540, IEEE, 1983.

[MUL 15] MULDER J., VEGTER H., ARETZ H. et al., “Accurate determination of flow curves using the bulge test with optical measuring systems”, Journal of Materials Processing Technology, vol. 226, pp. 169–187, 2015.

[NAI 16] NAIKAN V.N.A., RATHORE A., “Accelerated temperature and voltage life tests on aluminium electrolytic capacitors: a DOE approach”, International Journal of Quality & Reliability Management, vol. 33, no. 1, pp. 120–139, 2016.

[PER 02] PERISSE F., Etude et analyse des modes de défaillances des condensateurs électrolytiques à l’aluminium et des thyristors, appliquées au système de protection du LHC (Large Hadron Collider), PhD Thesis, University Claude Bernard Lyon 1, 2002.

[PIE 15] PIERRAT L., GRZESKOWIAK H., Etude d’un convertisseur DC/DC, Livrable L1.3, WorkPackage 1, Projet FiRST-MFP, 2015.

4 The Reliability of Components: A New Generation of Film Capacitors

4.1. Introduction

The reliability of high-power mechatronic systems fundamentally depends on the reliability of film capacitors. Film capacitors are important components within the complex assemblies of high-power mechatronic systems, regardless of the field of application, whether in the automotive, railway or aerospace industries. On average, they represent 40% of the components in electronic circuits [CHO 99]. These electronic components must meet stringent reliability requirements in environments with extreme stresses, such as high atmospheric pressure, high applied voltage, critical ambient temperatures or severe humidity levels [MAK 14]. According to Gu [GU 08], passive components represent 75% of the electronic components used in the avionics sector, half of which are capacitors.

This chapter provides the reader with an overview of the various technologies at play in film capacitors and presents the intrinsic and extrinsic parameters that affect the reliability performance of these components. In order to evaluate this performance, a section on accelerated life tests and highly accelerated tests presents a methodology for estimating the expected and experimental reliability. A case study of an accelerated life test is presented to provide an example of a detailed and illustrated scientific approach. The final section discusses the prospects for estimating the reliability of these kinds of components.

Chapter written by Henri GRZESKOWIAK, Daniel TRIAS and David DELAUX.

4.2. The reliability of components: a new generation of film capacitors. Types of film

4.2.1. Polypropylene

Polypropylene (PP) is the most commonly used polymer for constructing plastic film capacitors and is well suited to high-precision applications because of its good dielectric characteristics (low dielectric absorption, high dielectric strength), its stable capacitance as a function of temperature and the excellent self-healing ability of its metalized variant. These capacitors supersede the previous generation of capacitors based on polystyrene. The main disadvantage of polypropylene is its low melting temperature, which makes it unsuitable for surface mounting.

Areas of application include: pulse regimes, AC voltage, high-power electronics, high-current circuits, high-precision circuits and audio.

4.2.2. Polyethylene terephthalate

Together with PP films, polyethylene terephthalate (PET) films are the most widely used polymer films for capacitors, notably because of their compact size and their permittivity εr = 3.3, which is higher than that of any other polymer used in film capacitors. Like PP films, these films also have good dielectric characteristics. However, these characteristics vary more strongly as a function of temperature and frequency. The high loss tangent (tan δ = 50 × 10⁻⁴) and its variation with frequency and temperature mean that these capacitors perform poorly in high-precision and high-power applications. However, since this polymer has a higher melting temperature, it can be surface mounted, unlike PP films.

Areas of application include: low currents, filtering, high-power electronics, decoupling and audio.
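To illustrate why the loss tangent matters in high-power use, the following minimal sketch compares the power dissipated in the dielectric of a PP and a PET capacitor using the standard relation ESR = tan δ / (2πfC). The tan δ values at 1 kHz are those given in this section (2 × 10⁻⁴ for PP, 50 × 10⁻⁴ for PET); the capacitance, the ripple current and the function names are hypothetical choices made for illustration, and losses in the electrodes and terminations are ignored.

```python
import math

# Loss tangents at 1 kHz quoted in the text (dimensionless)
TAN_DELTA = {"PP": 2e-4, "PET": 50e-4}

def dielectric_esr(tan_delta, capacitance_f, frequency_hz):
    """Equivalent series resistance due to dielectric loss: tan(delta) / (2*pi*f*C)."""
    return tan_delta / (2.0 * math.pi * frequency_hz * capacitance_f)

def dissipated_power(tan_delta, capacitance_f, frequency_hz, i_rms_a):
    """Power (W) dissipated in the dielectric for a sinusoidal RMS current."""
    return i_rms_a ** 2 * dielectric_esr(tan_delta, capacitance_f, frequency_hz)

# Hypothetical operating point (assumption): 10 uF, 1 kHz, 5 A RMS
C_F, F_HZ, I_RMS = 10e-6, 1e3, 5.0
power = {name: dissipated_power(td, C_F, F_HZ, I_RMS) for name, td in TAN_DELTA.items()}
for name, p in power.items():
    print(f"{name}: {p:.3f} W")
print(f"PET/PP loss ratio: {power['PET'] / power['PP']:.0f}")
```

For a given capacitance, frequency and current, the ratio of the dielectric losses depends only on the ratio of the loss tangents, which is 25 for the values used here.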


4.2.3. Polyethylene naphthalate (PEN)

This polymer has similar properties to PET, with the exception of a higher melting temperature, which allows it to be used in high-temperature applications. It is mainly used for manufacturer-side surface mounting.

Areas of application include: high operating temperatures.

4.2.4. Polyphenylene sulfide

This polymer has been used since 2000 as a replacement for polycarbonate capacitors. Its high melting point and good stability as a function of temperature and frequency make it suitable for high-temperature and high-frequency AC applications. Polyphenylene sulfide (PPS) based capacitors are mostly used for surface mounting.

Areas of application include: high-precision capacitors, high-stability applications, AC applications and applications at all temperatures.

Property | PP | PET | PEN | PPS | PTFE
εr (permittivity) | 2.2 | 3.3 | 3.0 | 3.0 | 2.1
tan δ at 1 kHz (× 10⁻⁴) | 2 | 50 | 40 | 6 | 2
Ri·C (s) | >100,000 | >10,000 | >10,000 | >10,000 | >100,000

Dielectric absorption

0.01–0.1%

0.2–0.5%

1–1.2%

0.05–0.1%