Handbook of US Consumer Economics [1 ed.] 0128135247, 9780128135242

Handbook of U.S. Consumer Economics presents a deep understanding on key, current topics and a primer on the landscape o


English Pages 550 [440] Year 2019

Table of contents :
Cover
Handbook of US Consumer Economics
Copyright
Contributors
Preface
1 - Empirical analysis of the US consumer: fact, fiction, and the future
1. Big(ger) data: new sources and new questions
2. Consumer spending and the aggregate economy
3. Household finance
4. Responding to shocks
5. Spending over the life cycle
6. Measurement issues
7. International perspectives
8. Concluding thoughts
References
2 - Handbook of the consumer chapter: trends in household debt and credit
1. Overview
2. Data
3. Decomposing the borrowing cycle
4. Trends in borrower characteristics
5. Trends in other debt
6. Perspectives on current household debt
6.1 Change in debt composition
6.2 Implications of the change in debt composition
6.3 Delinquencies
7. Conclusion
References
3 - Trends in household portfolio composition
1. Introduction
2. Survey of consumer finances data and comparison to aggregates
2.1 Wealth measurement: comparing the Survey of Consumer Finances to macroaggregates
2.2 The Survey of Consumer Finances and other household finance research
3. Composition of average household portfolios
4. Household portfolios across the asset distribution
4.1 Across time
5. Asset concentration
6. Cohorts
6.1 Interpreting the cohort figures
6.2 Median household assets
6.3 Ownership of risky assets: business, equity, and housing
6.4 Business ownership
6.5 Equity ownership
6.6 Home-ownership
6.7 Mortgage holding
6.7.1 The risky asset share
6.8 Business share
6.9 Equity share
6.10 Housing share
6.11 Combined risk asset share
7. Financial vulnerability, shocks, and the health of the household balance sheet
7.1 Risk from asset price shocks
7.2 Financial vulnerability: income and asset price shocks
7.3 Trends in vulnerability across time
8. Conclusion
References
4 - Household debt and recession in Brazil
1. Introduction
2. Aggregate view
3. Novel data set on Brazilian household debt
4. Characteristics of the household debt boom
4.1 Composition of household debt
4.2 Government-controlled banks and a tale of two booms
4.3 Credit growth across the income distribution
5. Potential causes of the household debt boom
5.1 Macroeconomic context
5.2 Institutional reforms and domestic programs
5.3 International factors
6. Concluding remarks
References
5 - Rationality in the consumer credit market: choosing between alternative and mainstream credit
1. Introduction
2. Pawn credit as an alternative to regular bank credit
3. Background: how pawnbroking works
4. Data and summary statistics
4.1 Data
4.2 Summary statistics
5. Main results
5.1 Empirical implementation
5.2 Access to mainstream credit for all Swedes versus alternative credit users
5.2.1 Awareness about creditworthiness and immigrant status
6. Conclusion
References
6 - How do consumers respond to real income shocks?
1. Introduction
2. JPMCI research on consumer spending responses to income and price changes
2.1 Healthcare spending and tax refunds
2.2 Consumer spending around job loss and the expiration of unemployment insurance benefits
2.3 Consumer spending and the decline of gas prices between 2014 and 2015
2.4 Consumption, investment, and mortgage resets
2.5 Income fluctuations around mortgage defaults
3. Conclusion
Appendix
References
7 - Spending to and through retirement
1. Introduction
2. Description of data sources
2.1 CE Survey
2.2 HRS/CAMS Survey
2.3 Chase data
3. Literature review
3.1 The phased retirement hypothesis
3.2 Quantitative evidence of spending reductions in retirement
3.3 The retirement transition period
4. The life cycle of spending
4.1 Data Refinement
4.2 The J.P. Morgan expenditure model
4.3 Generational view
4.4 Investable wealth levels
4.5 Accounting for long-term care costs
4.6 Implications of the life cycle of spending for plan providers and employers
4.7 Implications of the life cycle of spending for firms that provide financial planning
5. Shifting into retirement
5.1 The retirement transition period data filter
5.2 Defining “retirement”
5.3 At what age do people retire?
5.4 Average spending the year before versus the year after retirement
5.5 Average spending the year before versus the year after retirement: households with at least $500,000 in investable wealth
5.6 Beyond averages: distribution of changes in spending the year before versus the year after retirement
5.7 Evidence of a retirement spending surge
5.8 Spending volatility
5.9 Spending volatility beyond the transition phase
6. Implications
6.1 Key takeaways for employers
6.2 Ideas for firms that develop retirement plans for individuals
7. Suggestions for further research
8. Closing
References
8 - Are millennials different?
1. Introduction
2. Definitions of generations and a review of research on age, generations, and economic decisions
3. A comparison of demographics by generation
4. Comparison of income and balance sheets by generation
4.1 Income
4.2 Debt
4.3 Assets and net worth
5. Comparison of consumption behavior by generation
5.1 Household spending in the CE survey by age and generation
5.2 An empirical assessment of generational consumption patterns
5.3 Do millennials have a unique consumption basket?
6. Case study I: vehicle purchases
6.1 Case study II: spending on food and housing
7. Conclusion
Data appendix
References
9 - China's consumer spending e-commerce: facts and evidence from JD's festival online sales
1. Introduction
2. Overall development of China's e-commerce
2.1 China's main online retailers
2.2 JD's E-Commerce data
3. Patterns and key features of China's e-commerce
3.1 Online consumer spending, by festival
3.2 Online consumer spending, by product
3.3 Online consumer spending, by age cohort
3.4 Online consumer spending, by region
4. E-commerce spending and regional income
4.1 Trends and patterns against regional income
4.2 Estimation results
5. Concluding remarks
Appendices
References
10 - Consumer expectations and the macroeconomy
1. Disentangling preferences and expectations
2. Survey data on subjective expectations
3. Quantitative and probabilistic question formats
4. The information content of probabilistic questions
5. Are expectations data predictive?
6. Integrating subjective expectations data into economic models of behavior
7. Summary and directions for future research
References
11 - Macro forecasting using alternative data
1. The importance of macroeconomic measurement and prediction
2. Important economic data releases and prediction
3. Macro Data are Noisy
3.1 The revision problem in traditional data
3.2 Increased noise in times of low growth
4. Our goal: real-time macro data with less noise
4.1 Nowcasting
4.2 The pyramid-like framework of nowcasting
5. Alternative data
5.1 The microfoundations of macro: alternative data in the context of the Lucas and Romer critique
6. A framework for alternative data
7. Predicting data releases with search data
7.1 Why curate? The Google Flu story
7.2 Modeling differences rather than levels
7.3 Housing, retail, and auto sectors with alternative data
8. Modeling case study: Nonfarm payrolls
8.1 Interpretable versus Blackbox or top-down versus bottom-up models via Kuhn
8.2 The practical reason: modeling noise in small data sets
8.3 The five keys: clean data, internal consistency, shrinkage, bootstrapping, and ensembling
8.4 The model overconfidence metric
8.5 Discussion of case study results
9. Live production results
9.1 Prediction in practice: the main mistakes
9.2 Public benefits of microfoundations of macro
9.3 Two main contributions: accurate measurement and more detail
9.4 Mitigating data colonialism?
Acknowledgments
References
12 - Regional price parities in the United States
1. Introduction
2. Price levels for CPI areas
3. Regional price parities for states and metropolitan areas
4. Selected results
4.1 Regional price parities for states
4.2 Adjusted personal incomes for metropolitan and nonmetropolitan portions of states
5. Concluding remarks
Acknowledgments
Disclaimer
References
13 - Measuring prices and real household consumption of medical goods: service-based versus disease-based approaches
1. Introduction
2. Basic theoretical framework
2.1 Utility and health
2.2 Medical price and cost of living indexes
2.3 Decomposing nominal expenditures
3. The current service-based approach to medical measurement
3.1 Current methods
3.2 What current price indexes measure and do not measure
4. The disease-based approach
4.1 Single-disease price indexes
4.2 General disease-based price indexes
5. BLS experimental disease-based price indexes
6. Data
7. Results
7.1 Trends in utilization by disease
7.2 Disease-based price indexes
7.3 Decomposition of nominal expenditures
8. Discussion and future work
8.1 Limitations of current disease-based measures
8.2 Quality adjustment
9. Conclusion
References
14 - A brief history of the supplemental poverty measure
1. Introduction
2. History and background
2.1 The official poverty measure
2.2 National Academy of Sciences panel on poverty measurement and research that led to the SPM
2.3 The interagency technical working group to develop a supplemental poverty measure and early research
3. Construction of the SPM thresholds
3.1 Equivalence scale
3.2 Threshold estimation
3.3 Geographic adjustments
4. Resources
5. Additions to income: noncash benefits
5.1 Supplemental Nutrition Assistance Program
5.2 National School Lunch Program
5.3 Supplementary nutrition program for women, infants, and children
5.4 Low-Income Home Energy Assistance Program
5.5 Housing assistance
6. Subtractions from resources: necessary expenses
6.1 Taxes
6.2 Work-related expenses
6.3 Child support paid
6.4 Medical expenses
7. SPM estimation
8. SPM poverty rates
9. Changes to the SPM since 2011
10. Extensions of the SPM
11. Ongoing research on the SPM
12. Future directions for SPM
13. Summary
References
Index
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
Z
Back Cover


Handbook of US Consumer Economics

Edited by Andrew Haughwout Federal Reserve Bank of New York, New York, NY, United States

Benjamin Mandel J.P. Morgan Asset Management, New York, NY, United States

Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

Copyright © 2019 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-813524-2

For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Candice Janco
Acquisition Editor: J. Scott Bentley
Editorial Project Manager: Barbara Makinster
Production Project Manager: Paul Prasad Chandramohan
Cover Designer: Mark Rogers
Typeset by TNQ Technologies

Contributors

Sumit Agarwal, National University of Singapore, Singapore
Bettina H. Aten, Bureau of Economic Analysis, Suitland, MD, United States
Marieke Bos, Swedish House of Finance at the Stockholm School of Economics, Sweden; Visiting Scholar at the Federal Reserve Bank of Philadelphia, PA, United States
Ralph Bradley, Retired, Littleton, NH, United States
Jesse Bricker, Research and Statistics, Federal Reserve Board, Washington, DC, United States
Sharon Carson, Retirement Solutions, J.P. Morgan Asset Management, New York, NY, United States
Liana E. Fox, Research Economist, Social, Economic and Housing Statistics Division, U.S. Census Bureau, Washington, United States
Gabriel Garber, Research Department, Central Bank of Brazil, Brasilia, DF, Brazil
Thesia I. Garner, Research Economist, Division of Price and Index Number Research, Bureau of Labor Statistics, U.S. Department of Labor, Washington, DC, United States
Andrew Haughwout, Federal Reserve Bank of New York, New York, NY, United States
Apurv Jain, Harvard Business School and Microsoft Research, Boston, MA, United States
JPMorgan Chase Institute, JPMorgan Chase & Co, Washington, DC, United States
Christopher J. Kurz, Division of Research and Statistics, Federal Reserve Board of Governors, Washington, DC, United States
Donghoon Lee, Federal Reserve Bank of New York, New York, NY, United States
Geng Li, Division of Research and Statistics, Federal Reserve Board of Governors, Washington, DC, United States
Benjamin R. Mandel, Multi-Asset Solutions, JP Morgan Asset Management, New York, NY, United States
Joseph Marlo, Retirement Solutions, J.P. Morgan Asset Management, New York, NY, United States
Brett Matsumoto, Division of Price and Index Number Research, Bureau of Labor Statistics, Washington, DC, United States


Atif Mian, Princeton University, United States
Kevin B. Moore, Research and Statistics, Federal Reserve Board, Washington, DC, United States
Je Oh, Retirement Solutions, J.P. Morgan Asset Management, New York, NY, United States
Jacopo Ponticelli, Northwestern University, United States
Katherine Roy, Retirement Solutions, J.P. Morgan Asset Management, New York, NY, United States
Joelle Scally, Federal Reserve Bank of New York, New York, NY, United States
Amir Sufi, University of Chicago Booth School of Business, United States
Lauren Thomas, Federal Reserve Bank of New York, New York, NY, United States
Jeffrey Thompson, New England Public Policy Center, Research Department, Federal Reserve Bank of Boston, Boston, MA, United States
Wei Tian, School of Economics, Peking University, Beijing, China
Giorgio Topa, Research and Statistics, Federal Reserve Bank of New York, New York, NY, United States
Wilbert van der Klaauw, Federal Reserve Bank of New York, New York, NY, United States
Daniel J. Vine, Division of Research and Statistics, Federal Reserve Board of Governors, Washington, DC, United States
Yang Yang, BBD & Department of Political Science, University of California San Diego, San Diego, CA, United States
Miaojie Yu, China Center for Economic Research (CCER) and National School of Development, Peking University, Beijing, China

Preface

June 2019 marked a full decade since the conclusion of the 2007 recession, an event which most observers agree originated in the household sector, as sharply declining home prices and rising mortgage defaults led to unprecedented pressures in world financial markets. In spite of their large size and importance in the overall economy, consumer behavior in housing markets, more generally, had not been at the center of research activity. For many years prior to the 2000s, residential real estate finance in particular had been treated by academics as a relatively quiet and uneventful little corner of the financial markets, where defaults were rare and usually tied to factors idiosyncratic to the individual borrower. The housing market collapse and the financial crisis that followed changed all that, of course. As consumption began to fall, suddenly questions about consumer decision-making and behavior were front and center in the national debate, not just among academics and market professionals, but among the public more broadly. In 2007, as the recession began, “subprime” was named the word of the year by the American Dialect Society, emphasizing the place that mortgage lending to borrowers with questionable credit histories had come to occupy in the public mind. “Subprime” followed a series of much less wonky words of the year, including “plutoed” and “truthiness” in the prior 2 years. In 2010, “google” was named word of the decade. These shifts in the reach and perceived relevance of consumer-related research precipitated two lower-frequency trends already underway: the rising prevalence of consumer topics in the broader field of economic research and the use of empirical techniques to address those topics. One measure of these trends is the incidence of consumer research in the American Economic Association’s EconLit, a bibliographic database of economic literature containing roughly 1.5 million articles for the period 1991–2018.
As shown in Figure 1, articles on household behavior and family economics (JEL codes D10-19) rose as a share of total articles, going from 2% in aggregate in the early 1990s to 6% in recent years. Looking at more detailed categories reveals that this change was dominated by two subcategories, “Consumer Economics: Empirical Analysis” and “Household Saving, Personal Finance.” We discuss this in greater depth, as well as other key innovations in empirical economic research, in Chapter 1 of this volume. Underpinning these trends, new empirical methodologies, vast increases in computing power, and especially new data sources for empirical studies of

[FIGURE 1 Share of total EconLit articles added in each period for JEL code D1: Household Behavior and Family Economics. Bar chart; vertical axis: share of total articles, 0% to 3%; periods: 1991-1995, 1996-2000, 2001-2005, 2006-2010, 2011-2015; categories: D10: General; D11: Consumer Economics: Theory; D12: Consumer Economics: Empirical Analysis; D13: Household Production and Intrahousehold Allocation; D14: Household Saving; Personal Finance; D15: Intertemporal Household Choice; Life Cycle Models and Saving; D18: Consumer Protection. Source: American Economic Association, author calculations.]

consumers have all become increasingly widely available, allowing a greater number of researchers to address a greater number of questions in a serious way. This confluence of factors (improved tools and an increased urgency for better understanding of behavior) is the underlying motivation for the Handbook of US Consumer Economics.

Our intention as editors was to bring together a wide range of examples of some of the new trends in consumer economics. To do so, we reached out to an unusually diverse set of authors from different backgrounds in academia, business, and government. We received an enthusiastic response, with papers submitted by authors from institutions ranging from the Census Bureau to Microsoft, located not just in the United States, but from Singapore to Brazil.

As part of the process of selecting the chapters for the volume, most of the authors joined a January 2018 conference hosted by the JPMC Institute in Washington, DC. The day-long conference produced a lively discussion of each of the individual chapters, with presenters and commenters from very different backgrounds sharing ideas and feedback. During the day, and as the chapters took form over the subsequent months, several important common themes began to emerge, which are summarized in the introduction to the volume. It also became clear that the authors were mainly interested in a volume that would consist primarily of new research (deep dives into a set of specific subjects) rather than a set of literature reviews. As such, it is critical to note that the Handbook does not pretend to be comprehensive but rather reflects new approaches to a set of questions old and new.


The resulting volume will, we hope, represent a valuable contribution to the reader’s understanding of important facts about the American consumer. At a minimum, the variety of authors, subjects, methods, and, perhaps especially, new data sets described in the chapters that follow underscore the breadth of the subject and will serve to provoke further work and a deepening understanding.

Chapter 1

Empirical analysis of the US consumer: fact, fiction, and the future

Andrew Haughwout¹,*, Benjamin R. Mandel²
¹ Federal Reserve Bank of New York, New York, NY, United States
² Multi-Asset Solutions, JP Morgan Asset Management, New York, NY, United States

A few nights ago a drunken man – there are lots of them everywhere nowadays – was crawling on his hands and knees under the bright light at Broadway and Thirty-fifth street. He told an inquiring policeman he had lost his watch at Twenty-third street and was looking for it. The policeman asked why he didn’t go to Twenty-third street to look. The man replied, ‘The light is better here.’
Kingston Daily Freeman, February 21, 1925

* The views expressed here are those of the authors, and do not necessarily reflect those of the Federal Reserve Bank of New York or the Federal Reserve System. Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00001-9 Copyright © 2019 Elsevier Inc. All rights reserved.

The term “streetlight effect” is used to describe an observational bias in research, given the tension that often exists between the observational demands of a research question and the measurements at hand to answer it. The fact that direct experiments are relatively uncommon in economics is a reflection of this tension. One might argue further that US consumption, with few detailed data sources compared to economic variables like employment and income, is particularly prone to these issues.

Today, the interplay of methodological advances and data availability is driving a broader empirical shift in the economics profession that is very much in operation with reference to the US consumer. As the area circumscribed by “good light” expands, a more holistic view is emerging of consumer motivations, constraints and, ultimately, behavior, and the observational biases inherent to this line of research are gradually diminishing.

The objective of the Handbook of US Consumer Economics (“Handbook”) is to amass a deep and data-driven perspective on consumers. With consumption accounting for over two-thirds of US GDP, this perspective has first-order relevance for understanding how the overall economy is evolving and its underlying sources of growth. Consumer expenditure, to be sure, is a key manifestation of consumer behavior and an area of focus. But the Handbook’s main emphasis across the different chapters is a broad understanding of the underlying drivers of spending: evolving financial health, the nature of (and reaction to) economic shocks, the implications of age and the life cycle, and a broader perspective on household outcomes and their measurement. What emerges is a set of empirical findings about the consumer that is likely to be equally enlightening from the perspective of researcher, policymaker, or practitioner.

The biggest challenge in summarizing the vast universe of literature on this area and its subtopics is organizational. One potential rubric to do so would be to sort the topics by frequency. At the high frequency end of the spectrum, empirical research has documented consumer responses to an array of changes in their economic environment: for the most part, real income shocks (e.g., unemployment, tax rebates, health issues, relative price changes) and wealth effects (e.g., real estate, financial assets). At intermediate frequencies are consumer behaviors over the course of a business cycle, their relationship to other cyclical phenomena, and their contribution to aggregate fluctuations. And at lower frequencies, the consumer life cycle defines a set of intra- and intergenerational patterns. One would also include in this latter group slow-moving secular changes in preferences, financial markets, or policy. Another useful rubric would be to sort topics according to their distributional implications: heterogeneous responses to economic shocks, consumption inequality, intergenerational differences, and differentiation in price deflators, to name a few. Our approach in the Handbook is to consolidate these rubrics into a set of five themes that capture the key clusters common to both.
The themes are (1) consumer spending and the aggregate economy; (2) household finance; (3) responding to shocks; (4) spending over the life cycle; and (5) measurement issues. The objective of this chapter is to elucidate these themes as they pertain to current empirical research and, specifically, how the chapters in the Handbook fit in. In prelude, it will also describe the broader trends in data used for consumer-related research. Given the ambitious goal of highlighting salient research in consumer economics writ large, it is also worth noting that the selection of topics, and even the definition of consumer economics as a subfield, is nonexhaustive. One might argue, for instance, that behavioral aspects of consumer behavior should constitute its own thematic section of the Handbook. While these insights are currently incorporated at least implicitly in the context of other topics, a more focused treatment would not have been unreasonable in light of the growing empirical literature on bounded rationality and consumers’ decision-making under uncertainty. We also deploy a narrower definition of consumer research than might be desirable for someone steeped, for example, in the discipline of family and consumer sciences (also known as home economics). Notwithstanding these definitional issues, hopefully the reader emerges at the other end of this volume with a broader understanding of the US consumer, the key questions being asked, and the empirical tools being used to answer them.

The remainder of this chapter is organized as follows. Section 1 discusses the evolution of data sources used in the analysis of the US consumer, which is followed by an overview of the thematic sections of the Handbook in Sections 2–6, respectively. Section 7 brings in a selection of international perspectives. Section 8 concludes with lessons learned and suggestions for future research.

1. Big(ger) data: new sources and new questions

An important motivation for the Handbook is that the analysis of consumption has become increasingly powerful over the past two decades, owing to parallel developments in data availability and empirical methods. New surveys and the proliferation of “big” data have widened the range of source materials, forming a more holistic perspective of household finance and spending. The evolution of research with respect to the consumer mirrors a broader shift in the economics discipline in the direction of applied research. As illustrated by Hamermesh (2013) for Top-3 economics journal publications, research methodology has shifted from being predominantly theoretical in nature in the 1960s through 1980s to more data-oriented in the 1990s and 2000s.¹ Moreover, there has been a compositional shift in the types of empirical papers, from those using publicly available sources (most often government surveys and macroeconomic time series) to data more proprietary in nature. Indeed, according to that study, over 60% of top publications in 1983 had a theoretical emphasis, while the vast majority of the remainder was empirical employing public data. By 2011, theoretical contributions had fallen to less than 30%, with the bulk of the remainder divided between empirical (public data) and empirical (proprietary data). Papers with their own data set now account for roughly a third of Top-3 publications. Einav and Levin (2014) show a similar trend for publications in The American Economic Review, with just under half of empirical papers published in 2014 requesting an exemption from the journal’s data availability policy, with those divided roughly equally between private sector and administrative data sources. One can draw rough comparisons between this overall trend and the evolution of research data sets oriented toward the US consumer.
1. Hamermesh’s data set includes publications in American Economic Review (AER), Journal of Political Economy (JPE), and Quarterly Journal of Economics (QJE).

An important nexus of consumer-related economic theory and data in the postwar period has been the concept of consumption smoothing. The deceptively simple idea of transferring consumption intertemporally from high- to low-income periods is the cornerstone implication of expected utility theory. It is based on the ideas of the permanent income hypothesis (PIH), developed by Friedman (1957); of the economics of the life cycle, developed by Modigliani and Brumberg (1954); and their formalization in Hall (1978). The evidence offered by Hall was based on aggregate consumption time series from the US National Income and Product Accounts. In that sense, Hall was not only a seminal theoretical paper but an archetype of applied work on the consumer at that time.

In what is a close thematic antecedent to the Handbook, Angus Deaton’s Understanding Consumption (1992) devotes a considerable amount of real estate to empirical evidence on the major questions of the time: What is the cross-sectional relationship between saving and growth? How does one reconcile contradictory evidence on the life-cycle hypothesis? Does consumption track income too closely? In addressing these questions, the majority of data sources were of the aggregate sort used by Hall and numerous other authors in the 1970s. These questions also hark back to earlier research in the 1960s associated with the literature on balanced macroeconomic growth (e.g., Tobin, 1967) and the estimation of aggregate consumption functions (see summary in Ackley, 1961).

The next phase in empirical studies on the consumer leveraged government-produced public data sources and is characterized by richer, more representative, samples of the US population. Survey programs at the US Bureau of Labor Statistics (BLS) and Census Bureau are good illustrations of the wave of data that broadened out in the 1980s and afterward. The Consumer Expenditure Survey, for instance, was first administered in 1888 (to only 3000 respondents). It served as a basis for analysis of family living conditions, one of the oldest BLS functions, and also provided rudimentary weights for consumer price indexes. The survey was administered roughly every 10 years over the next century, and then settled into its modern incarnation in 1980, when it began being administered annually and with sample selection done under contract by the Census Bureau.
The research dimension of the survey expanded in the 1990s with the first publication of CES microdata in 1993 and the establishment of a separate CE research branch in 1999. Analogous developments were taking place for population census data, labor market surveys, and others, culminating in the creation of the Integrated Public Use Microdata Series (IPUMS) databases beginning in the early 1990s. The Survey of Consumer Finances, Current Population Survey and its annual sidecar publications, Panel Study of Income Dynamics, and Survey of Income and Program Participation are also of this ilk.

Developments in data availability, which by the late 1980s included both aggregate time series and more widely available survey microdata, corresponded to new types of research questions about the consumer being asked and answered. Blundell (1988) surveys both the micro- and macrodimensions of research on consumer behavior and describes the prominent linkages between theory and data being explored at that time. These included the optimality and impact of tax proposals, the effect of credit constraints, the implications of real interest rate changes and uncertainty on savings behavior, and the selection of appropriate cost of living indexes. It is also noteworthy that the increasing availability of detailed data offered new empirical
perspectives on the research questions of the 1960s and 1970s. Attanasio and Weber (1995), for example, use the CES microdata to demonstrate more resilient links between theories of intertemporal optimization and nondurables expenditures than goods or aggregate data alone. Blundell et al. (1993) use repeated cross-sections of microdata to evaluate estimates of consumer demand relative to measures derived from aggregate data.

Far from being anachronistic, survey and aggregate data are still in active use today and, in fact, are still being augmented by new sources. For example, the Survey of Consumer Expectations (see Armantier et al., 2017) is a nationally representative, Internet-based rotating panel of approximately 1300 households in which respondents field forward-looking questions about inflation, labor markets, and household finance. These measures are designed to provide direct gauges of expectations and, hence, to better understand economic decision-making under uncertainty.

This brings us to the most recent epoch in the evolution of empirical studies about the consumer: big data. While no single definition completely circumscribes the data sets being referred to as “big,” one can think of it in terms of the differences between today’s data and the microdata available since the 1990s. In a survey article about the repercussions of big data for economic research, Einav and Levin (2014) describe big data as “...
now available faster, [with] greater coverage and scope, and includes new types of observations and measurements that previously were not available.” One might also add that due to the methods of collection of these data sets, whether over the course of online provision of goods and services (e.g., retailers, search engines) or as a by-product of the recordkeeping of firms (e.g., banks and credit card companies) or the administrative records of the government (e.g., tax, social security, and public health authorities), these data are increasingly proprietary, or otherwise restricted, in nature. Hence, both the expansion in the scope of data available and its proprietary nature are consistent with the rise in empirical studies and composition of data sets described in Hamermesh (2013) for economic research as a whole. Einav and Levin (2014) describe two transition paths in economic literature owing to big data: (1) from representative surveys to universal administrative data and (2) the nascent utilization of private sector data. The former involves large, generally government-administered, data such as those of the Internal Revenue Service, Social Security Administration, and Centers for Medicare and Medicaid Services. The latter includes predominantly private data collected by retailers, financial institutions, and technology companies. One of the key advantages of universal coverage in administrative data sets is the ability to analyze changes in populations and their distributions over time, for example, as Piketty (2014) and an array of coauthors do for the wealth and income distributions employing tax data (e.g., Atkinson et al., 2011; Piketty and Saez, 2014). While oftentimes less comprehensive, new types of timely information in private sector big data can be powerful complements to

traditional sources of information about the economy. The richness of Google search data can be informative about unemployment trends (D’Amuri and Marcucci, 2012) or other indicators that are often not available in real time from statistical agencies (Choi and Varian, 2012). In another prominent example, the Billion Prices Project described in Cavallo and Rigobon (2016) uses a large volume of online retail price data to construct alternative estimates of consumer price indexes. Varian (2014) provides a primer on methodology.

Some contemporary data sets lie in between universal administrative data and private sector sources. For instance, the Federal Reserve Bank of New York’s Consumer Credit Panel is a quarterly panel data set based on a nationally representative sample of individual credit reports (see Lee and van der Klaauw, 2010). The data set provides a time series view of households that is holistic across various types of consumer debt and also allows for aggregation to national-level statistics.

The existence of both survey and administrative data is suggestive of the benefits that combining them could provide. In an excellent example of this synthesis, Meyer et al. (2015) use comparisons of survey and administrative data for nine government transfer programs to measure the extent of downward bias in the survey reporting of transfer income, concluding that it has been growing over time.

Cognizant of the changes in both the tools of analysis and research questions about the US consumer, our objective in the Handbook is to represent a broad spectrum of current empirical sources. Several of the chapters build on a foundation of detailed survey microdata: Survey of Consumer Finances (Chapter 3); Consumer Expenditure Survey and Panel Study of Income Dynamics (Chapter 8, Chapter 14); Survey of Consumer Expectations (Chapter 10); Consumer Price Index and Public Use Microdata Sample (Chapter 12); Medical Expenditure Panel Survey (Chapter 13).
Others use extracts from more universal administrative data: Consumer Credit Panel (Chapter 2); Banco Central do Brasil credit registry (Chapter 4); credit bureau and pawnbroker association data from Sweden (Chapter 5). Others, still, use private sector “big” data sources: Chase Bank account and card spending (Chapter 6, Chapter 7); JingDong transaction-level e-commerce (Chapter 9); Bing search (Chapter 11).

What is remarkable is that, notwithstanding the increasing diversity of sources, all are still highly relevant in the contemporary analysis of the consumer. In many cases, multiple data sets are combined to focus on the research question at hand. And increasingly rich sources of information are also generally used in tandem with, or with reference to, the national-level government statistics used in macroeconomic analysis up to a century ago. In other words, the different perspectives offered by developments in data availability, quality, and breadth appear to be better complements than substitutes.

2. Consumer spending and the aggregate economy

Another way to parse the trends in empirical analysis of the consumer is to separate them into broad themes. We begin our thematic exploration of the US consumers through the lens of their macroeconomic importance. US consumption in the postwar period is a defining feature of the country’s economic history, with reference to both its pace of growth by historical standards and its growth relative to other economies. As illustrated in Fig. 1.1, the trend pace of US real consumption growth, at just over 2% since the 1950s, was a step up relative to the prewar period. It has also been relatively stable compared to steeper declines among other developed economies in recent decades. A key driver of strong and stable consumption growth over that period was that overall demand was becoming more consumption intensive. Personal consumption expenditures account for just over two-thirds of US nominal GDP today, a share that has been roughly flat since 2000 but which rose steadily in prior decades, starting out at 60% in the 1960s and early 1970s. The rising US consumption share, in turn, is the result of several coincident factors, including the evolution of household balance sheets, life-cycle spending patterns of the baby boom generation, and also the more general features of the macro environment (such as deepening globalization and trend declines in macroeconomic volatility).

Several Handbook chapters speak directly to the interplay between consumption and macroeconomic growth. Jain (Chapter 11) provides a practitioners’ guide to using big data for macroeconomic forecasting, with the idea of not only predicting the outcomes of important macroeconomic statistics but

[Figure 1.1 chart: real consumption per capita growth (%, 20-year moving average), 1890 to 2010, on a scale from -4% to 8%, for Australia, Belgium, Canada, Switzerland, Germany, Denmark, Spain, Finland, France, the UK, Italy, the Netherlands, Norway, Portugal, Sweden, and the USA.]

FIGURE 1.1 US real consumption growth in historical perspective. Source: Author calculations based on Jordà, O., Schularick, M., Taylor, A.M., 2017. Macrofinancial history and the new business cycle facts. In: Eichenbaum, M., Parker, J.A. (Eds.), NBER Macroeconomics Annual 2016, vol. 31. University of Chicago Press, Chicago.

also improving on their accuracy. Chapter 10 describes the advantages and pitfalls of constructing direct measures of consumer expectations, which are, in turn, key determinants of real and nominal aggregate outcomes. In a survey article titled simply “Consumption,” Attanasio (1999) examines the dynamic decision framework that has become the workhorse vehicle for the understanding of consumer behavior, highlighting the importance of consistency among the economic phenomena tied into the consumer’s optimization problem as well as the use of microlevel empirical work as a foundation for understanding aggregate outcomes. In providing distinctly microperspectives on broader macrodata and consumer expectations, these chapters are indicative of the progress made along both of these dimensions over the past two decades.

3. Household finance

A key driver of robust and steady postwar consumption growth has been the evolution of US household balance sheets and, specifically, the rapid rise in household net worth. A stylized example of this trend and its associated wealth effects is the baby boom generation, which experienced a massive expansion (indeed, a tripling) in household net worth between 1989 and 2016 (Fig. 1.2). This expansion was driven by the combination of relatively high wage growth, robust saving rates, and asset price appreciation, and was augmented by a secular increase in leverage. It manifested primarily in the rapid

[Figure 1.2 chart: median baby boomer net worth (2016 USD, thousands), showing debt, assets, and net worth at three-year intervals from 1989 to 2016.]

FIGURE 1.2 The net worth of the median baby boomer household. Source: Survey of Consumer Finances, author calculations.

accumulation of real assets, such as primary and secondary residences, which account for the bulk of the median household’s assets. Real estate, in turn, tends to have higher pass-through into consumer expenditures. Case et al. (2005, 2013) estimate the elasticity of (real) consumption to housing wealth to be about 7%, more than twice the elasticity with respect to financial assets, which weighed in at 3% in comparable specifications. These estimates are ordinally consistent with the coefficients in the Federal Reserve’s FRB/US macroeconomic model of the US economy. Financial assets also contributed to household balance sheet expansion and consumption growth, albeit from a lower starting level, registering a 3.7-fold rise on baby boomer balance sheets over that period.

Exceptional growth was not confined to the asset side of US household balance sheets, as household liabilities burgeoned alongside them in recent decades. By way of comparison, the 151% rise in the median baby boomer household’s assets since 1989 corresponded to a 69% rise in household debt. As a result, the rise of household net worth over that period belies an expansion in the overall financial footprint of households. Of course, household liabilities are not a monolith, and changes in consumer credit in recent decades have not been monotonic. In Chapter 2, Haughwout et al. use the Consumer Credit Panel to detail the evolution of household liabilities since the early 2000s, developments around the time of the Global Financial Crisis, and some of the changes since then. In Chapter 5, Agarwal and Bos examine household choices between mainstream and alternative sources of credit.

Relating back to our earlier discussion of new data sources, the increasing availability of detailed information on household liabilities has generally outpaced that of household assets. Housing and financial wealth measures are usually inferred from highly aggregated and/or low-frequency statistics.
For example, measures of housing wealth are available in great geographic detail in the Census of Population and Housing, but only every 10 years. The Federal Reserve’s Survey of Consumer Finances provides a breakdown of broad segments of the household balance sheet every 3 years, but only at the national level and with limited information on the distribution across the population. Aggregate measures of financial wealth are provided in the Federal Reserve Flow of Funds data and, again, in the Survey of Consumer Finances, with more granular albeit narrower data available elsewhere for specific asset holdings like mutual funds. Perhaps one of the most detailed and representative treatments of the asset side of household balance sheets to date is provided by Bricker et al. in Chapter 3 using microdata from the Survey of Consumer Finances.

Finally, the distribution of household balance sheets across the US population is related to the extensive empirical literature on inequality. Attanasio and Pistaferri (2016) summarize recent research on the evolution of US consumption inequality, its inherent measurement challenges, and the relationship with income inequality over time. Notwithstanding some consumption items that have been converging over time between low- and high-consumption households (notably, durables and leisure consumption), they
find that extant measures of consumption inequality track those of income reasonably well. Using alternative measures of consumption, Aguiar and Bils (2015) come to a similar conclusion. Both of these papers stand in contrast to earlier results by Krueger and Perri (2006), who used Consumer Expenditure Survey data to argue that consumption inequality actually rose much more slowly than that of income. Given the important role played by household balance sheets in consumption outcomes, the exploration of consumption inequality through the lens of the changing distribution of household assets and liabilities is a promising area for future research.

4. Responding to shocks

In addition to explaining the lower-frequency trends in US consumption over recent decades, the household balance sheet is also integral to the observation that macroeconomic volatility has trended downward over time, beginning in the 1980s. The standard deviation of real personal consumption expenditure growth today is roughly a third of what it was in the 1960s and 1970s, even as the variance of real income has been relatively stationary (Fig. 1.3). Implicit in these observations is that economic shocks are increasingly affecting different aspects of household economic decisions. Namely, the shocks underlying aggregate consumption volatility appear to have gradually migrated away from spending and toward household saving rates and balance sheets. Falling consumption volatility, in turn, is one component of the Great Moderation period beginning in the mid-1980s, though not an entirely

[Figure 1.3 chart: standard deviation of 3m/3m growth (%AR), on a scale from 0 to 8, for disposable personal income and personal consumption expenditures, 1963 to 2012.]

FIGURE 1.3 Volatility of household income and real consumption. Source: Bureau of Economic Analysis, author calculations.

uncontroversial one. In some sense, household consumption was along for the ride as output and inflation volatility fell around that time. According to the taxonomy of plausible causes for the Great Moderation outlined in Ahmed et al. (2004) and Bernanke (2004), namely structural change, macroeconomic policies, and good luck, only certain elements of structural change pertained directly to changes in household behavior; those elements were primarily related to financial innovations that, among other things, eased liquidity constraints for households. Other drivers focus on the more efficient management of inventories, trends in deregulation, the broad economy’s shift away from manufacturing into services, rising globalization, and innovations in monetary policy. So while consumption, as a function of lower income volatility and declining economic uncertainty, benefited from the decline in aggregate volatility, it was not the predominant driver of the broader trend.

There is also some discord as to the extent to which US consumption took part in the Great Moderation at all. Gorbachev (2011) finds that after accounting for predictable variation in personal consumption arising from variation in real interest rates, preferences, and income shocks, consumption volatility actually rose between the 1970s and early 2000s. A key finding in that paper was that high household-level consumption volatility could coexist with lower aggregate volatility given a decline in the covariance of consumption growth across households. In a similar finding, using a representative longitudinal survey of US households, Dynan et al. (2012) find that income became more volatile between 1971 and 2008, with the widening of the percent change distribution across households concentrated in its tails.
Davis and Kahn (2008) cite microlevel volatility statistics to argue that there was no decline in consumption volatility and individual earnings uncertainty during the Great Moderation period or, at a minimum, that households experienced a much smaller decline in volatility than that experienced by firms. The contrasting results between micro- and macroperspectives suggest that the empirical vantage point from which volatility observations are taken matters quite a lot. The current state of the art in this respect is the evaluation of individual household responses to specific identified shocks. An indicative example is the raft of papers written about household spending in response to fiscal stimulus measures (e.g., Parker et al., 2013; Mian and Sufi, 2012; Sahm et al., 2012). In this same spirit, Greig and Hamoudi (Chapter 6) use bank administrative data to measure household spending responses to an array of real income shocks: tax refunds, job loss, changes in gasoline prices, among others.
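
The coexistence of high household-level volatility with lower aggregate volatility, noted above, follows from a simple variance identity. As a minimal sketch, assume N households whose consumption growth rates share a common variance and a common pairwise correlation; both are simplifying assumptions made here for illustration and are not taken from the studies cited, and the function name and numbers are hypothetical. The variance of average consumption growth is then sigma2/N + (1 - 1/N) * rho * sigma2, so a fall in the cross-household correlation can lower aggregate volatility even when household-level volatility rises.

```python
def aggregate_variance(sigma2: float, rho: float, n: int) -> float:
    """Variance of the equal-weighted mean of n growth rates.

    Assumes (for illustration only) that every household has the same
    variance sigma2 and every pair has the same correlation rho.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    idiosyncratic = sigma2 / n                 # own-variance terms
    common = (1.0 - 1.0 / n) * rho * sigma2    # pairwise covariance terms
    return idiosyncratic + common


# Hypothetical numbers: household-level variance rises from 4.0 to 5.0
# while the cross-household correlation falls from 0.30 to 0.10.
early = aggregate_variance(sigma2=4.0, rho=0.30, n=10_000)
late = aggregate_variance(sigma2=5.0, rho=0.10, n=10_000)
print(f"early aggregate variance: {early:.3f}")
print(f"late aggregate variance:  {late:.3f}")
```

With these made-up inputs the aggregate variance falls even though each household's own variance is higher, which is the mechanism the micro-level studies point to.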

5. Spending over the life cycle

As mentioned above in the context of evolving empirical studies of the consumer, by now there is a long tradition of explaining discrepancies between empirical reality and simple versions of the PIH and life-cycle
theories. Carroll and Summers (1989) document that the theories are inconsistent with the “grossest features of cross-country and cross-section data on consumption and income,” in a way that favors a narrower definition of consumption smoothing: over periods of several years rather than decades. They focus on the close empirical links between consumption and income, a violation of the PIH implication that optimal (smoothed) consumption should follow a random walk. Indeed, a very crude look at spending levels across age groups today, as reported in the BLS Consumer Expenditure Survey (CES), shows the hump-shaped profile over the life cycle (Fig. 1.4). The fact that the consumption profile follows the hump shape of income across age groups quite closely suggests that consumers are either unwilling or unable to spread consumption effectively over time. More striking is that this nexus between consumption and income occurs even for predictable changes in income such as those arising from life-cycle employment patterns.

Several fixes have been proposed in order to bridge the gap between theory and data. Campbell and Mankiw (1989) propose an extension of the simple model, from a single, fully rational, and forward-looking representative consumer to include a second type of consumer who makes decisions according to “rules of thumb.” This modification helps to explain two empirical violations of the theory: (1) once again, that expected changes in income are associated with expected changes in consumption and (2) real interest rates are not closely related to expected changes in consumption, which means that forward-looking consumers do not adjust their consumption in response to changing interest rates. Attanasio and Browning (1995) demonstrate that by

[Figure 1.4 chart: expenditures relative to population average (1984-2017), on a scale from 40% to 140%, by age group: 25-, 25-34, 35-44, 45-54, 55-64, 65-74, 75+.]

FIGURE 1.4 Consumer expenditures by age group. Source: Consumer Expenditure Survey, author calculations.

allowing demographics to affect preferences and by relaxing the assumption of certainty equivalence, idiosyncratic age effects and precautionary savings can generate hump-shaped consumption profiles without having to appeal to more mechanical explanations like rules of thumb, myopia, or liquidity constraints. This line of reasoning is consistent with the evolution of the consumption-age profile in Fig. 1.4, insofar as only a small degree of flattening in the curves has occurred in recent years. Whatever the degree of recent financial innovations and the corresponding easing in liquidity constraints, the general contour of spending cannot be explained fully by those developments alone.

Building on these results, subsequent contributions to the literature attempt to parse the age-specific consumption behavior of households more finely. Part of this effort involved more sophisticated ways of isolating age cohort effects in the data. For instance, Gourinchas and Parker (2002) deploy a synthetic cohort technique on the CES to infer average age profiles of consumption and income over the working lives of households in different education and occupation groups. They find that these profiles line up reasonably well with the predictions of a flexible structural model of optimal life-cycle consumption; flexible, that is, in the sense that consumption behavior can change over the life cycle, from buffer-stock agents in youth to something more closely resembling a certainty-equivalent consumer in middle age. Fernández-Villaverde and Krueger (2007) account for roughly half of hump-shaped consumption paths over the life cycle with changes in household size, but with particularly large failures of consumption smoothing for consumer durables. Browning and Crossley (2001) examine the class of models associated with the life-cycle framework, arguing that the subset of models that have been taken to the data have had more successes than failures.
They also assert that the economics profession is just at the beginning of systematically applying these models to microdata.

This thought leads us directly to the chapters in the Handbook that focus on spending over the life cycle. The findings of Greig and Hamoudi (Chapter 6), that individuals’ spending is sensitive to predictable income shocks or those that have very small effects on lifetime income, demonstrate that the Carroll and Summers (1989) PIH critique is very much alive in the microdata. On a more constructive note, contemporary empirical applications are generally more careful in isolating cohort-specific effects. We see this as a necessary ingredient in Kurz et al. (Chapter 8) in their exploration of whether millennials’ spending patterns are inherently different from prior generations. Their finding that millennial spending profiles are not grossly different after properly accounting for cohort-specific effects is a new perspective on the relative role of demographic factors versus preferences as drivers of spending. Cohort-specific demographic effects are also quite important in the observations about the evolution of household assets (Chapter 3). Finally, the use of transactions microdata opens up the possibility of studying very granular changes in consumption around particular income transitions, such as retirement (e.g., Chapter 7). Though it has been argued that a host of the consumption puzzles
around retirement have been put to bed (Hurst, 2008; Hurd and Rohwedder, 2008), the interplay of consumption and income around the time of retirement is not, as it turns out, as stark or straightforward as one might have thought.
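
For reference, the two theoretical benchmarks discussed in this section can be stated compactly. This is a stylized sketch in notation assumed here for illustration (c_t for consumption, y_t for income, \varepsilon_t for an unforecastable innovation, and \lambda for the share of income accruing to rule-of-thumb consumers); it is not a reproduction of the exact specification in either paper.

```latex
% Random-walk implication of the PIH (Hall, 1978):
% changes in consumption are unforecastable.
\Delta c_{t+1} = \varepsilon_{t+1}, \qquad \mathbb{E}_t[\varepsilon_{t+1}] = 0

% Rule-of-thumb extension (Campbell and Mankiw, 1989):
% a share \lambda of income goes to consumers who spend current income.
\Delta c_t = \lambda \, \Delta y_t + (1 - \lambda) \, \varepsilon_t
```

Under the pure PIH, \lambda = 0 and predictable income changes should not move consumption; a positive \lambda is one way to capture the sensitivity of spending to predictable income changes described above.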

6. Measurement issues

Circling back to the earlier discussion of the evolving research landscape, the proliferation of bigger data and improving methodologies both contribute, at least in principle, to better measurement of consumer-related phenomena. But it is not so cut and dried. Jain (Chapter 11) emphasizes some of the trade-offs between more data and more noise in empirical analysis. Lazer et al. (2014) examine the case of Google Flu Trends, which was beset by, among other things, the overfitting of its predictive model to Google search data. There is also the more elemental issue that some of the central measurement problems pertaining to the consumer are difficult to overcome, regardless of the quantity of data thrown at them.

The measurement errors associated with real consumption statistics are both thorny and persistent. The nominal aspects of the computation, and inflation statistics in particular, have received the most attention in recent decades. One of the highest profile examples is the Advisory Commission to Study the CPI in the mid-1990s (also known as the “Boskin Commission” report). The Boskin report estimated substantial overstatement of the CPI index, on the order of 1 percentage point annually, as the combined result of a host of biases. On balance, the multitude of studies that followed the report documented a partial (though incomplete) improvement in the extent of measurement error over time. Gordon (2000), one of the Boskin Commission authors, estimated that improvements implemented in the aftermath of the report had lowered the overstatement of annual CPI inflation by 45 basis points, but that 65 basis points remained. In a survey article, Lebow and Rudd (2003) estimated that, after accounting for improvements but with the introduction of additional sources of bias, the estimated cumulative error remained at 90 basis points.
An important reason for the persistence of these errors is their multifaceted nature, including an array of substitution biases pertaining to the basket of goods used in the index. To name a few: (1) substitution from more expensive to less expensive goods within the basket (in response to relative price changes) creates difficulties with the fixed basket construction of the index; (2) new product introductions are problematic to the extent they are not included in the basket in a timely manner, or at all; and (3) price level differences across outlets are not always captured when consumers shift from high to low price sellers, as would be the case substituting from a mom-and-pop to a big-box store, or from a brick-and-mortar outlet to another firm selling online. Houseman et al. (2011) extend this logic to include substitution from US-produced intermediate inputs to those produced abroad. Then there is unobserved quality change. Shapiro and Wilcox (1996) refer to quality change as
the “house-to-house combat of price measurement.” The battle metaphor appears to have stuck. In the title of their article, Groshen et al. (2017) refer to the work underway to mitigate quality change and new goods biases in government statistics as “a view from the trenches.”

The treatment of data is another of the challenges inherent to consumer research. For instance, Hamilton (2017) uses empirical tests of Hall’s PIH (specifically, that consumption follows a random walk) to illustrate how HP-filtered data can potentially be misleading. Data revisions pose yet another problem. Stekler (1967) provides an early examination of this question, asking whether prerevision data are useful at all for the purpose of economic analysis. Croushore (2005) argues that the real-time nature of the data matters, as using latest-available data increases the ability of consumer-confidence indexes to predict consumer spending.

In response to these measurement challenges, the literature has evolved in two ways. The first includes more direct extensions of data and methodology. The application of hedonic methods to address quality change issues in national statistics is a good example, as are attempts to grapple with real-time data issues (Croushore, 2011). The second is in the pursuit and development of new empirical perspectives, several of which are highlighted in the Handbook. Aten (Chapter 12) develops regional price parity measures for the United States in order to better characterize real incomes across geographic areas with heterogeneous price levels. Bradley and Matsumoto (Chapter 13) address the quality change issue in medical care pricing by rerendering the price indexes in terms of health outcomes. And Garner and Fox (Chapter 14) describe the development of a supplemental poverty measure that better accounts for tax and transfer programs, as well as the particular spending needs of US households near the official poverty threshold.
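
The within-basket substitution bias described earlier in this section can be illustrated with a two-good example. The sketch below uses made-up prices and quantities and hypothetical function names. It compares a fixed-basket (Laspeyres) index with a current-basket (Paasche) index and with their geometric mean, the Fisher index, which is one standard way of allowing for substitution. It is an illustration of the index-number logic only, not a description of the BLS methodology.

```python
from math import sqrt


def laspeyres(p0, p1, q0):
    """Fixed-basket index: cost of the base-period basket at new vs. old prices."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Current-basket index: cost of the current basket at new vs. old prices."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))


def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Made-up data: the price of good A doubles and consumers shift toward good B.
p0, q0 = [1.0, 1.0], [10.0, 10.0]
p1, q1 = [2.0, 1.0], [5.0, 12.0]

lasp = laspeyres(p0, p1, q0)
paas = paasche(p0, p1, q1)
fish = fisher(p0, p1, q0, q1)
print(f"Laspeyres {lasp:.3f}  Paasche {paas:.3f}  Fisher {fish:.3f}")
```

In this example the fixed basket registers a larger price increase than the indexes that reflect the shift in quantities, which is the direction of the substitution bias discussed above.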

7. International perspectives

Notwithstanding the progress described in previous sections, analysis of the US consumer is still early along its path toward a comprehensive empirical perspective of consumer behavior. Looking to contributions from outside of the United States allows us to benchmark progress in this regard and, in some instances, to get a sense of where things are headed. One clear example of room for improvement (that is, with hopeful progress in other countries) is in the consolidation of survey and administrative data sources, along the lines of what has occurred in Sweden and Denmark. Browning and Leth-Petersen (2003) is an early example of merging universal administrative data on Danish disposable income and certain household assets in order to impute consumption (as the difference between household income and changes in net worth). Comparing their results to those in the Danish Expenditure Survey, they establish a proof of concept for that imputation methodology. Subsequent efforts in Sweden use disaggregated information on
real estate and financial assets to estimate more granular contributions of capital gains to consumption (Koijen et al., 2014) and to impute rents and other real estate-related flows (Kolsrud et al., 2017). In Norway, where there is still a wealth tax in place (and, hence, more comprehensive administrative data for household assets), Fagereng and Halvorsen (2017) and Eika et al. (2017) also employ the imputed consumption methodology.

Another useful comparison to the US consumer experience is that of emerging market economies, where some of the underlying drivers of consumption are in earlier stages of development and where relatively high levels of economic volatility put issues like consumption smoothing in sharp relief. From a microeconomic perspective, there are questions of both the cause and effect in the relationship between higher volatility and household behavior. An extensive literature, surveyed in Neumeyer and Perri (2004), documents the relatively high consumption volatility in emerging markets, with consumption actually more volatile than income (Kose et al., 2003; Aguiar and Gopinath, 2004). In an attempt to explain this oddity, Hicks (2015) uses survey microdata from Mexico to argue that consumption is actually much smoother than that implied by expenditures after accounting for substitution between market and nonmarket activities, an analogous argument to the one made in Aguiar and Hurst (2013) for the United States. Moreover, this substitution mechanism, and the fact that the nonmarket economy share is higher in Mexico than in the United States, helps to explain observed expenditure volatility differences between the two countries.

From a macroeconomic perspective, the evolution of expenditure types and consumption drivers in emerging markets has been an area of focus and provides important comparative context for the Handbook.
One topic of intense interest across both the academic and business communities is the evolution of consumer behavior in China and the extent to which it is converging to that in developed market economies. Atsmon et al. (2010) argue that the convergence process is well underway, with Internet usage and e-commerce as key mechanisms. Fan et al. (2018) argue that e-commerce in China has indeed been rising and that it helps to explain growing intercity trade and declining spatial consumption inequality. In Chapter 9, Tian et al. use microdata from one of China’s largest e-commerce platforms to understand the trajectory, frequency, and composition of this trend, providing intriguing context for an analogous increase in e-commerce in the United States, where it accounts for about a tenth of retail sales by value.

Finally, financial development and its manifestation at the household level feature prominently in studies of economic development and emerging market business cycles. Muller (2017) documents a 15 percentage point increase in household credit as a fraction of GDP in emerging economies from 1980 to 2014, at a time when corporate credit to GDP was roughly flat. In Chapter 4, Garber et al. use Brazilian credit registry data to characterize an entire household debt cycle between 2003 and 2016. They argue that the
Brazilian boom-and-bust experience in household credit growth over that period was largely supply driven and had a similar profile to prior experiences in emerging markets. The stylized nature of these experiences sheds light on credit dynamics elsewhere and their role in broader cyclical fluctuations.
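
The imputation approach cited earlier in this section (Browning and Leth-Petersen, 2003) rests on a simple budget identity: consumption equals disposable income minus the change in net worth over the period. A minimal Python sketch of that identity follows. The class, the function name, and the figures are hypothetical and for illustration only, and the sketch ignores the capital-gains adjustments that the later Swedish and Norwegian studies address.

```python
from dataclasses import dataclass


@dataclass
class HouseholdYear:
    """One household-year of registry-style data (illustrative fields)."""
    disposable_income: float
    net_worth_start: float
    net_worth_end: float


def imputed_consumption(obs: HouseholdYear) -> float:
    """Impute consumption as income minus the change in net worth."""
    saving = obs.net_worth_end - obs.net_worth_start
    return obs.disposable_income - saving


# Hypothetical household: income of 50,000 and net worth rising by 8,000.
example = HouseholdYear(50_000.0, 120_000.0, 128_000.0)
print(imputed_consumption(example))
```

Because unrealized capital gains raise measured net worth without any saving out of income, applying the identity to raw balance-sheet data would understate consumption in years when asset prices rise, which is why the more granular asset data matter.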

8. Concluding thoughts

The overarching goal of the Handbook is to gather empirically oriented contemporary papers on key themes in the analysis of the US consumer. Although we have attempted to describe those themes (nonexhaustively), it should be clear that no collection of 14 chapters can fully represent the universe of research in this area. Hence, this chapter simply attempts to fill in some of the blind spots by describing relevant related areas of research. In a similar endeavor, Pistaferri (2015) concludes that despite the centrality of the analysis of consumption to research and economics, “data on household consumption and spending in the US are few and problematic.” In other words, the transition toward empirical analysis that we have documented for this line of research (and which, indeed, characterizes economic research more broadly) is still hamstrung to some extent by issues of data availability and quality. Whereas administrative data sets for research have proliferated rapidly in Scandinavian countries, other regions, including the United States, lag significantly behind in their ability to link administrative data to surveys and other sources. Viewed through this lens, the diverse patchwork of public and private sector data available in the United States seems ripe for harmonization and consolidation. Doing so would no doubt require greater efforts to overcome substantial administrative and logistical hurdles, not to mention dealing with weighty issues pertaining to data privacy. It is also important to recognize the ultimate limitations of attempts at data consolidation. Gaps would still remain on key variables such as real income, where universal data coverage on nominal earnings exists through administrative data sets, but where survey data are still the predominant data source for deflators. Further, even in a world with full national integration of data, international comparisons would still be limited by comparability issues.
A nascent area of research that would benefit from greater data harmonization and integration is the study of international influences on the US consumer. Amiti et al. (2017) examine China’s trade liberalization in the early 2000s and its effect on US manufacturing price indexes. Implicit in their conclusion that changes in trade policy exerted significant downward pressure on US prices (through an array of channels) is the idea that consumer welfare benefits move in line with those terms-of-trade improvements. An analogous argument for foreign direct investment (FDI) into Mexico is provided by Atkin et al. (2018), where Mexican microdata are used to show substantial cost-of-living reductions due to foreign supermarket entry. Thus, assuming that the right data can be brought to bear on it, more direct delineation of globalization effects on US consumer
prices, spending patterns, and welfare is a fruitful area for future research. More complete data will also help to understand the effects of other policies on consumption, operating through uncertainty (Bertola et al., 2005; Bloom, 2014), the mechanics of fiscal stimulus as mentioned in Section 4, or monetary policy’s influence on consumption levels and inequality (Coibion et al., 2012). One final conjecture about the future is that research on consumption will move beyond the reconciliation of data with the life-cycle hypothesis or the permanent income hypothesis (PIH), and in a much more atheoretical direction. The rise of behavioral considerations to explain the economics of the consumer is already underway, as described in DellaVigna’s (2009) survey article. An open question is whether the diverse set of deviations from assumptions made in core economic theories (via nonstandard preferences, beliefs, and decision-making) ultimately leads to a more granular, but less centralized and coherent, understanding of the consumer.

References

Ackley, G., 1961. Macroeconomic Theory. Macmillan, New York.
Advisory Commission to Study the CPI, 1996. Final report of the Advisory Commission to Study the Consumer Price Index. In: Boskin, M.J., Dulberger, E., Gordon, R., Griliches, Z., Jorgenson, D. (Eds.), Final Report to the Senate Finance Committee. December 4, 1996. U.S. Govt. Print. Office, Washington, DC.
Aguiar, M., Bils, M., 2015. Has consumption inequality mirrored income inequality? American Economic Review 105 (9), 2725–2756.
Aguiar, M., Gopinath, G., 2004. Emerging Market Business Cycles: The Cycle is the Trend. Federal Reserve Bank of Boston Working Paper, No. 04-4.
Aguiar, M., Hurst, E., 2013. Deconstructing life cycle expenditure. Journal of Political Economy 121 (3), 437–492.
Ahmed, S., Levin, A., Wilson, B.A., 2004. Recent U.S. macroeconomic stability: good policies, good practices, or good luck? Review of Economics and Statistics 86, 824–832.
Amiti, M., Dai, M., Feenstra, R.C., Romalis, J., 2017. How Did China’s WTO Entry Affect U.S. Prices? FRBNY Staff Report No. 817.
Armantier, O., Topa, G., van der Klaauw, W., Zafar, B., 2017. An overview of the survey of consumer expectations. Economic Policy Review (23:2). Federal Reserve Bank of New York.
Atkin, D., Faber, B., Gonzalez-Navarro, M., 2018. Retail globalization and household welfare: evidence from Mexico. Journal of Political Economy 126 (1), 1–73.
Atkinson, A.B., Piketty, T., Saez, E., 2011. Top incomes in the long run of history. Journal of Economic Literature 49 (1), 3–71.
Atsmon, Y., Dixit, V., Magni, M., St-Maurice, I., September 2010. China’s new pragmatic consumers. McKinsey Quarterly.
Attanasio, O.P., Browning, M., 1995. Consumption over the life cycle and over the business cycle. American Economic Review 85 (5), 1118–1137.
Attanasio, O.P., 1999. Consumption. In: Taylor, J.B., Woodford, M. (Eds.), Handbook of Macroeconomics, vol. 1. Elsevier, pp. 741–812 (Chapter 11).
Attanasio, O.P., Pistaferri, L., 2016. Consumption inequality. Journal of Economic Perspectives 30 (2), 3–28.
Attanasio, O., Weber, G., 1995. Is Consumption Growth Consistent with Intertemporal Optimization? Evidence from Consumer Expenditure Survey. Journal of Political Economy 103, 1121–1157.
Bee, A., Meyer, B.D., Sullivan, J.X., 2015. The validity of consumption data: are the consumer expenditure interview and diary surveys informative? In: Carroll, C.D., Crossley, T.F., Sabelhaus, J. (Eds.), Improving the Measurement of Consumer Expenditures. University of Chicago Press.
Bernanke, B., 2004. The great moderation. In: At the Meetings of the Eastern Economic Association, Washington, D.C., February 20.
Bertola, G., Guiso, L., Pistaferri, L., 2005. Uncertainty and consumer durables adjustment. Review of Economic Studies 72 (4), 973–1007.
Bloom, N., 2014. Fluctuations in uncertainty. Journal of Economic Perspectives 28 (2), 153–176.
Blundell, R., 1988. Consumer behaviour: theory and empirical evidence – a survey. Economic Journal 98 (389), 16–65.
Blundell, R., Pashardes, P., Weber, G., 1993. What do we learn about consumer demand patterns from micro data? American Economic Review 83 (3), 570–597.
Browning, M., Crossley, T.F., 2001. The life-cycle model of consumption and saving. Journal of Economic Perspectives 15 (3), 3–22.
Browning, M., Leth-Petersen, S., 2003. Imputing consumption from income and wealth information. The Economic Journal 113 (488), F282–F301.
Campbell, J., Mankiw, N.G., 1989. Consumption, Income, and Interest Rates: Reinterpreting the Time Series Evidence. NBER Macroeconomics Annual 4, 185–216.
Carroll, C.D., Summers, L.H., 1991. Consumption Growth Parallels Income Growth: Some New Evidence. In: National Saving and Economic Performance. National Bureau of Economic Research, Inc., pp. 305–348 (NBER Chapters).
Case, K.E., Quigley, J.M., Shiller, R.J., 2005. Comparing wealth effects: the stock market versus the housing market. Advances in Microeconomics 5 (1), 1–32.
Case, K.E., Quigley, J.M., Shiller, R.J., 2013. Wealth Effects Revisited: 1975–2012. NBER Working Paper 18667.
Cavallo, A., Rigobon, R., 2016. The billion prices project: using online prices for measurement and research. Journal of Economic Perspectives 30 (2), 151–178.
Choi, H., Varian, H., 2012. Predicting the present with Google Trends. Economic Record 88 (1), 2–9.
Coibion, O., Gorodnichenko, Y., Kueng, L., Silvia, J., 2012. Innocent Bystanders? Monetary Policy and Inequality in the U.S. Mimeo.
Croushore, D., 2005. Do consumer confidence indexes help forecast consumer spending in real time? North American Journal of Economics and Finance 16, 435–450.
Croushore, D., 2011. Frontiers of real-time data analysis. Journal of Economic Literature 49 (1), 72–100.
D’Amuri, F., Marcucci, J., 2012. The Predictive Power of Google Searches in Forecasting Unemployment. Bank of Italy Temi di Discussione No. 891.
Davis, S.J., Kahn, J.A., 2008. Interpreting the great moderation: changes in the volatility of economic activity at the macro and micro levels. Journal of Economic Perspectives 22 (4), 155–180.
DellaVigna, S., 2009. Psychology and economics: evidence from the field. Journal of Economic Literature 47 (2), 315–372.
Dynan, K., Elmendorf, D., Sichel, D., 2012. The evolution of household income volatility. The B.E. Journal of Economic Analysis and Policy 12 (2).
Eika, L., Mogstad, M., Vestad, O., 2017. What Can We Learn About Household Consumption from Information on Income and Wealth. Working Paper.

Einav, L., Levin, J., 2014. Economics in the age of big data. Science 346 (6210).
Fagereng, A., Halvorsen, E., 2017. Imputing consumption from Norwegian income and wealth registry data. Journal of Economic and Social Measurement 42 (1), 67–100.
Fan, J., Tang, L., Zhu, W., Zou, B., 2018. The Alibaba effect: spatial consumption inequality and the welfare gains from e-commerce. Journal of International Economics 114 (C), 203–220. Elsevier.
Fernández-Villaverde, J., Krueger, D., 2007. Consumption over the life cycle: facts from consumer expenditure survey data. The Review of Economics and Statistics 89 (3), 552–565. MIT Press.
Friedman, M., 1957. A Theory of the Consumption Function. National Bureau of Economic Research, Princeton, N.J.
Gorbachev, O., 2011. Did household consumption become more volatile? American Economic Review 101 (5), 2248–2270. American Economic Association.
Gordon, R.J., 2000. The Boskin Commission Report and its Aftermath. NBER Working Paper No. 7759.
Gourinchas, P.-O., Parker, J.A., 2002. Consumption over the life cycle. Econometrica 70 (1), 47–89.
Groshen, E.L., Moyer, B.C., Aizcorbe, A.M., Bradley, R., Friedman, D.M., Spring 2017. How government statistics adjust for potential biases from quality change and new goods in an age of digital technologies: a view from the trenches. Journal of Economic Perspectives 31 (2), 187–210.
Hall, R., 1978. Stochastic implications of the life cycle-permanent income hypothesis: theory and evidence. Journal of Political Economy 86 (6), 971–988.
Hamermesh, D.S., 2013. Six decades of top economics publishing: who and how? Journal of Economic Literature 51 (1), 162–172.
Hamilton, J.D., 2017. Why You Should Never Use the Hodrick–Prescott Filter. NBER Working Paper 23429.
Hicks, D.L., 2015. Consumption volatility, marketization, and expenditure in an emerging market economy. American Economic Journal: Macroeconomics 7 (2), 95–123.
Houseman, S., Kurz, C., Lengermann, P., Mandel, B., 2011. Offshoring bias in U.S. manufacturing. Journal of Economic Perspectives 25 (2), 111–132.
Hurd, M.D., Rohwedder, S., 2008. The Retirement-Consumption Puzzle: Actual Spending Change in Panel Data. RAND Working Paper Series No. WR-563.
Hurst, E., 2008. Understanding consumption in retirement: recent developments. In: Ameriks, J., Mitchell, O. (Eds.), Recalibrating Retirement Spending and Saving. Oxford University Press.
Koijen, R., Van Nieuwerburgh, S., Vestman, R., 2014. Judging the quality of survey data by comparison with "Truth" as measured by administrative records: evidence from Sweden. In: Improving the Measurement of Consumer Expenditures. National Bureau of Economic Research, Inc., pp. 308–346.
Kolsrud, J., Landais, C., Spinnewijn, J., 2017. Studying Consumption Patterns Using Registry Data: Lessons from Swedish Administrative Data. CEPR Discussion Paper No. DP12402.
Kose, M.A., Prasad, E., Terrones, M., 2003. Financial Integration and Macroeconomic Volatility. IMF Working Paper.
Krueger, D., Perri, F., 2006. Does income inequality lead to consumption inequality? Evidence and theory. Review of Economic Studies 73 (1), 163–193.
Lazer, D., Kennedy, R., King, G., Vespignani, A., 2014. The parable of Google Flu: traps in big data analysis. Science 343 (6176), 1203–1205.
Lebow, D.E., Rudd, J.B., 2003. Measurement error in the consumer price index: where do we stand? Journal of Economic Literature XLI, 159–201.
Lee, D., van der Klaauw, W., 2010. An Introduction to the FRBNY Consumer Credit Panel. Federal Reserve Bank of New York Staff Reports #479.
Meyer, B.D., Mok, W.K.C., Sullivan, J.X., 2015. Household surveys in crisis. Journal of Economic Perspectives 29 (4), 199–226.
Mian, A., Sufi, A., 2012. The effects of fiscal stimulus: evidence from the 2009 Cash for Clunkers program. The Quarterly Journal of Economics 127 (3), 1107–1142.
Modigliani, F., Brumberg, R.H., 1954. Utility analysis and the consumption function: an interpretation of cross-section data. In: Kurihara, K.K. (Ed.), Post-Keynesian Economics. Rutgers University Press, New Brunswick, NJ, pp. 388–436.
Muller, K., 2017. Sectoral Credit Around the World, 1940–2014. Mimeo.
Neumeyer, P., Perri, F., 2004. Business Cycles in Emerging Economies: The Role of Interest Rates. Federal Reserve Bank of Minneapolis Research Department Staff Report 335.
Parker, J.A., Souleles, N.S., Johnson, D.S., McClelland, R., 2013. Consumer spending and the economic stimulus payments of 2008. American Economic Review 103 (6), 2530–2553.
Passero, W., Garner, T.I., McCully, C., 2015. Understanding the relationship: CE survey and PCE. In: Carroll, C.D., Crossley, T.F., Sabelhaus, J. (Eds.), Improving the Measurement of Consumer Expenditures. University of Chicago Press (Chapter 6).
Piketty, T., 2014. Capital in the Twenty-First Century. Harvard University Press.
Piketty, T., Saez, E., 2014. Inequality in the long run. Science 344, 838–843.
Pistaferri, L., 2015. Household consumption: research questions, measurement issues, and data collection strategies. Journal of Economic and Social Measurement 40, 97–123.
Sahm, C.R., Shapiro, M.D., Slemrod, J., 2012. Check in the mail or more in the paycheck: does the effectiveness of fiscal stimulus depend on how it is delivered? American Economic Journal: Economic Policy 4 (3), 216–250.
Shapiro, M.D., Wilcox, D.W., 1996. Mismeasurement in the consumer price index: an evaluation. In: Bernanke, B., Rotemberg, J. (Eds.), NBER Macroeconomics Annual 1996. MIT Press, Cambridge, MA, pp. 93–142.
Stekler, H.O., 1967. Data revisions and economic forecasting. Journal of the American Statistical Association 62 (318), 470–483.
Tobin, J., 1967. Life-cycle saving and balanced growth. In: Fellner, W., et al. (Eds.), Ten Economic Essays in the Tradition of Irving Fisher. John Wiley, London, New York, pp. 231–256.
Varian, H.R., 2014. Big data: new tricks for econometrics. Journal of Economic Perspectives 28 (2), 3–28.

Chapter 2

Handbook of the consumer chapter: trends in household debt and credit

Andrew Haughwout, Donghoon Lee, Joelle Scally, Lauren Thomas, Wilbert van der Klaauw
Federal Reserve Bank of New York, New York, NY, United States

Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00002-0
Andrew Haughwout, Wilbert van der Klaauw, Donghoon Lee, Joelle Scally and Lauren Thomas: 2019 Published by Elsevier Inc.

1. Overview

Since the onset of the 2008 financial crisis, consumer financial and borrowing behavior, once considered a relatively quiet corner of finance, has been of greatly increased interest to policymakers and researchers alike. Prior to the Great Recession, there was a historic run-up in household debt, driven primarily by housing debt, which coincided with a speculative bubble and sharp rises in home prices. Then, as prices began to fall, millions of households, unable to keep up with home payments, began defaulting on their mortgages, greatly contributing to the onset of the deepest recession since the 1930s. Following the steep increase in debt balances during the boom, households began rapidly paying off their loans during and immediately after the Great Recession. Since 2013, debt has been increasing again and has eventually risen above its previous peak, albeit at a much slower rate than before, at least partially due to stricter lending standards.

In this chapter, we examine the trends in household debt before, during, and since the 2000s financial crisis and Great Recession. As we will show, this period is unique in American history in several ways. Our analysis will show the sources of the historic run-up in debt during the bubble period of the early 2000s, the change in borrowing behavior that took place as the financial crisis and Great Recession took hold, and the nature of the recovery that began in 2013. We find that while total household debt has recovered to its previous level in nominal terms, its composition and characteristics have changed dramatically along many dimensions.

Our chapter also makes a methodological contribution, by describing and using a relatively new data set that is uniquely well suited (indeed, designed specifically) for the kind of analysis we present. The data set, known as the New York Fed Consumer Credit Panel (CCP), tracks individual and household debt over time for a large panel of individual consumers and households. The data set’s large size and administrative purpose make it unusual and, consistent with this Handbook’s theme of new data sources for consumer analysis, we will explain how it was constructed in some detail.

2. Data

Our primary source of data is the New York Fed’s CCP. The CCP contains information from a large, representative sample of consumer credit reports from Equifax, one of the three national credit reporting agencies. The data are reported quarterly, from 1999Q1 to the present.¹ While this feature is not a critical part of the analysis presented here, the data are very timely: updates are typically available within 2 months of the end of a quarter. Thus the data for 2018Q3 were posted in November 2018. The individuals in the primary sample are selected using the randomly assigned last two digits of their social security number, producing a dynamically updated panel data set that is representative of the population at every point in time.² The population from which the sample is drawn is thus individuals with a credit report and a social security number. In addition to the primary sample, the data include a household sample that provides the same information on all those persons with credit reports who reside at the same address as the individuals in the primary sample. Generally speaking, a credit report will exist for individuals who have obtained or sought loans from the formal financial sector, including the government. Altogether, the CCP captures the credit reports of more than 45 million Americans as of mid-2018.

The CCP contains detailed information drawn from each individual’s Equifax credit report. The data include information on loan accounts, public record and collection accounts, and some basic demographic information. For loan accounts, the data include balance, presence and extent of delinquencies, and payment information for each type of loan that each borrower holds at each point in time. Nearly all loan balances (roughly 97%) fall into five borrowing categories³:

• mortgages (first mortgages and home equity installment loans, sometimes referred to as closed-end second (CES) mortgages), secured by housing collateral;
• home equity revolving loans, sometimes referred to as Home Equity Lines of Credit (HELOCs), secured by home equity;
• auto loans (including automobile leases);
• bank credit cards; and
• student loans.

1. As of this writing, the data are available through the third quarter of 2018; our analysis goes through 2018Q2. Updated information from the data set is reported in the Quarterly Report on Household Debt and Credit, hosted on the New York Fed’s Center for Microeconomic Data.
2. See Lee and van der Klaauw (2010) for more details about the sampling design and content of the CCP.
3. The remaining 3% is in a category referred to as “other,” and includes such obligations as sales financing loans, personal loans, and retail credit cards.
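
The sampling rule described in this section can be sketched in a few lines. Selecting on the last two digits of an identifier that is assigned at random means that membership never has to be redrawn: anyone who first acquires a credit report with a selected ending enters the panel automatically, which is what keeps the sample representative at every point in time. The particular endings, the function names, and the identifiers below are hypothetical; the text does not say which or how many endings the CCP uses.

```python
# Hypothetical set of selected two-digit endings (an assumption for illustration).
SAMPLED_ENDINGS = frozenset({"07", "23", "48", "61", "95"})


def in_primary_sample(id_number: str, endings=SAMPLED_ENDINGS) -> bool:
    """Return True if the identifier's last two digits are a selected ending."""
    digits = "".join(ch for ch in id_number if ch.isdigit())
    return len(digits) >= 2 and digits[-2:] in endings


def sampling_rate(endings=SAMPLED_ENDINGS) -> float:
    """Expected share of the population selected, given uniform endings."""
    return len(endings) / 100


# Made-up identifiers, not real social security numbers.
print(in_primary_sample("000-00-1207"))
print(in_primary_sample("000-00-1208"))
print(sampling_rate())
```

Because the rule depends only on a fixed attribute of the person, the same individuals are followed quarter after quarter, while new borrowers join and deceased or inactive ones drop out without any resampling step.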

The level of data aggregation varies depending on the type of loan involved. The CCP reports information on up to 19 individual mortgages, including both first mortgages and home equity installment loans, and five home equity lines of credit; the data also include aggregates for each housing loan type. This allows very detailed analysis of housing debt, which has historically been by far the most important household obligation. In 2018Q2, housing debt comprised 71% of total household debt. The CCP also includes information on individual student loans. It is thus possible, for mortgages, HELOCs, and student loans, to track individual obligations, as well as the individuals who are responsible for them, over time. For credit cards and auto loans, data are reported at the individual borrower level. Thus we may see that a specific individual has three credit cards, with a total balance of $10,000 and a credit limit of $15,000, but we will not know the balance or limit on a specific card. The data set also encompasses information on foreclosures, bankruptcies, loans, and other accounts in collections. Finally, the data include a short list of individual characteristics: geographic location (down to the census block), birth year, and each individual’s Equifax Risk Score, a summary statistic intended to predict how risky a given borrower is, analogous to the well-known FICO risk score. The records are thoroughly anonymized, making it impossible to identify individuals in the data.

In Fig. 2.1, we combine 1999–2018 data from the CCP with the considerably longer, but less detailed, data from the Federal Reserve Board’s Financial Accounts of the United States.⁴ What is immediately apparent in the figure is the dramatic and unprecedented nature of the events of the 2000s relative to the rest of the postwar history of US household debt. As can be seen in Fig. 2.1, nominal household debt has increased quite steadily since World War II, with the very notable exception of the 2008–12 period.
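
One common way to join a short, detailed series to a longer one, and the approach the note to Fig. 2.1 describes, is to carry the longer series' growth rates backward from the first observation of the shorter series. The sketch below illustrates the mechanics under that assumption; the function name and the numbers are hypothetical and are not taken from either data source.

```python
def backcast(anchor_level, reference):
    """Extend a series backward using another series' growth rates.

    anchor_level -- first observed level of the short series (splice date)
    reference    -- chronological levels of the long series, ending at the
                    splice date
    Returns a list as long as `reference` whose last value is anchor_level
    and whose period-to-period growth rates match those of `reference`.
    """
    spliced = [anchor_level]
    # Walk backward through consecutive pairs of the reference series.
    for later, earlier in zip(reversed(reference), reversed(reference[:-1])):
        growth = later / earlier
        spliced.append(spliced[-1] / growth)
    spliced.reverse()
    return spliced


# Hypothetical example: the reference series grows 10% per period.
print(backcast(60.5, [100.0, 110.0, 121.0]))
```

The result preserves the shape of the longer series while matching the level of the shorter one at the splice date, so the two can be plotted as a single line without a visible break.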
4. This series, previously known as the Flow of Funds, provides aggregate debt statistics for households and nonprofits. Since this data set measures a slightly different concept of debt than the CCP, which is confined to households, we produce a simulated national accounts series by applying reported rates of change in the Flow of Funds backward from the 1999 CCP data. Student debt is excluded from the CCP figures prior to 2003, at which point it represented approximately 3% of total household debt. See Haughwout et al. (2017a, 2017b) for further detail.
5. Excluding student debt, which is included in the CCP data only after 2003, the figure is 20 of 21 quarters, since the rate in 2003Q3 was 9.2%.

FIGURE 2.1 Household debt in historical context. Series: total debt (estimated pre-1999), mortgage, HE revolving, auto loan, credit card, student loan, other; vertical axis in trillions of dollars, horizontal axis 1945–2015. Note: For comparability, we applied the pre-1999 quarterly growth rate from the Flow of Funds to splice the two series in 1999. Source: New York Fed Consumer Credit Panel/Equifax; Federal Reserve Board.

Indeed, debt increased in all but 8 of the 227 quarters between 1951Q1 and 2008Q3, and in no case did it decline for two quarters in a row; growth versus a year earlier was positive in literally every quarter. Less obvious because of the scale of the figure is that throughout much of the postwar period, debt has grown rapidly. During the 1950s, for example, nominal debt increased from $47 billion to $152 billion, a compound rate of 12.5% per year. Double-digit year-over-year percentage increases in the series have been fairly common historically, including a string of 21 consecutive quarters between mid-2002 and mid-2007.⁵ Notably, debt grew through strong national economic expansions (+105% between 1991Q1 and 2000Q4, for a 7.3% compound annual growth rate (CAGR)) and recessions (an average 3% CAGR during the 1990–91 and 2001 recessions).

The strong increase in debt during 2000–06 was associated with the housing bubble that was inflating during that period. Fig. 2.2 shows the steady path of house price increases through 2006, with a strong upward trend in price changes and increases exceeding an annual rate of 10% between late 2003 and early 2006. Mortgage and HELOC balances led the increases in household debt during the early 2000s, as these two forms of housing-secured debt more than tripled from $3.3 trillion (72% of the total) in 1999 to almost $10 trillion (78%) by 2008Q3. Other debt rose as well, from $1.3 to $2.7 trillion: substantial increases but not nearly as large in either percentage or dollar terms as
those seen in the housing-related space. Overall, rising housing debt comprised over 80% of the increase in household debt that occurred during the 1999–2008 credit boom. While mortgage debt was overwhelmingly the main contributor to the rise in housing debt in dollar terms, HELOCs grew faster: HELOC debt in 2008Q3 was 765% of its 1999Q1 level, a more than eightfold increase.

FIGURE 2.2 Housing price change, 1991–2006. Panel title: Housing Price Index, 1991–2006. Note: shading shows NBER recessions. Source: CoreLogic, a property data and analytics company.

The phase that commenced in the second half of 2008 (in the period following the downturn in home prices in early 2007, the collapse of Lehman Brothers, and the onset of the financial crisis) featured an unwinding of part of the debt increase that had occurred during the previous decade. It is this period, not the borrowing boom which preceded it, that is the most striking break from the historical pattern in Fig. 2.1. Indeed, the “Great Deleveraging” that took place between 2008 and 2013 was truly unprecedented historically: starting in 2008Q4, debt declined for nine consecutive quarters. In the previous 63 years (255 quarters), debt had never declined in consecutive quarters in nominal terms.

Zooming in on Fig. 2.1 with a shorter time period, we are better able to view the composition of household debt balances by product in Fig. 2.3, in which several things are clear. First, total household debt is dominated by housing debt, which comprises over two-thirds of the total. Second, there is a strong cycle in mortgage debt which drives the cycle in total household borrowing. Third, auto and credit card debt follow a pattern similar to that in housing debt, rising and then falling. Finally, there is a strong upward trend in
the amount of student debt outstanding, shown in red, which appears to be impervious to the overall pattern.

FIGURE 2.3 Total debt balance and its composition. Series: mortgage, HE revolving, auto loan, credit card, student loan, other; vertical axis in trillions of dollars. 2018Q2 total: $13.29 trillion, with labelled shares of 68% (mortgage), 3% (HE revolving), 9% (auto loan), 6% (credit card), 11% (student loan), and 3% (other). Source: New York Fed Consumer Credit Panel/Equifax.
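
The compound growth rates quoted in this section follow from the standard formula, (end / start)^(1 / years) − 1. The sketch below checks the 1950s figure, assuming a ten-year window between the two quoted balances; the function name is illustrative.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two levels `years` apart."""
    return (end / start) ** (1 / years) - 1


# Nominal household debt in the 1950s: $47 billion to $152 billion,
# treated here as a ten-year span (an assumption about the window).
rate = cagr(47, 152, 10)
print(f"{rate:.1%}")  # 12.5%
```

The same function converts any cumulative percentage change into an annualized rate once the length of the window is fixed, which is why the choice of start and end quarters matters when comparing expansions of different lengths.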

3. Decomposing the borrowing cycle

These facts about the important role of housing in the borrowing boom of the first part of the 2000s suggest that a focus on the evolution of mortgages and HELOCs will shed light on household behavior and, potentially, the dynamics that led and contributed to the financial crisis that began in late 2008. In this section, we trace out the sources of balance changes during the first part of the 2000s, the deleveraging period, and the recent return to positive debt growth. Changes in outstanding debt figures are the net result of a complex set of borrower activities: new borrowing, scheduled repayment as well as prepayment of outstanding balances, and defaults resulting in balance declines from charge-offs. In Figs. 2.4 and 2.5, we decompose the aggregate balance change into some of these channels, to isolate different behavioral patterns that are relevant to understanding household behavior during the boom, bust, and recovery. Turning first to Fig. 2.4, we focus on four-quarter growth in nonhousing debt, after removing the effect of charge-offs associated with defaults.⁶ We
further decompose the changes in these obligations into student loan and other debt. The reason for this is apparent from the figure: while there is evidence of a strong credit cycle in the blue line (auto, credit card, and other debt), annual student debt growth was consistently positive. After about 2005, growth in student debt is also remarkably steady at around $75 billion per year. In spite of a severe recession and a general deleveraging in the household sector, borrowing to fund educational expenses has been a steady contributor to outstanding household debt balances. We will return to the remarkable steadiness of student borrowing later.

6. We remove the effects of default to determine whether any deleveraging that occurred during the 2008–11 period was through the scheduled repayment of debt, or whether it was all accomplished through defaults.

FIGURE 2.4 Nonmortgage debt change other than charge-offs (four-quarter sum). Series: auto, credit card, and other; student loan. Source: New York Fed Consumer Credit Panel/Equifax.

Auto, credit card, and other debt display no such stability. Here we see strong direct evidence of the credit cycle, and perhaps indirect evidence of the use of housing wealth to fund general consumption. After frequently exceeding $200 billion in annual growth during the 2000–03 period, net borrowing in this category stepped down to about $100–$150 billion from 2003 to 2008. We interpret these figures as net new borrowing, excluding the effect of defaults, used to finance vehicle leases and purchases (auto debt), and general consumption (credit card debt and other). The fact that these kinds of new borrowing seem to have declined as the housing boom reached its peak in 2003–06 and the economy grew faster overall is suggestive of the use of relatively lower-cost housing debt in place of auto and credit card borrowing. While somewhat reduced after 2003, these forms of borrowing on net continue to be strongly positive until 2008, when they suddenly drop below zero. This reduction in balances outside of the default process results from actual pay down of debt by borrowers, which directly reduces the resources available for spending on cars and other elements of the household consumption basket. One mechanism by which large declines in nonmortgage debt were accomplished was account closings, shown in Fig. 2.5.

FIGURE 2.5 Total number of new and closed accounts and inquiries. Series: number of accounts closed within 12 months; number of inquiries within 6 months; number of accounts opened within 12 months; vertical axis in millions. Source: New York Fed Consumer Credit Panel/Equifax.
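
The quantity plotted in Fig. 2.4, the four-quarter change in balances other than charge-offs, can be sketched as follows. The idea is to add charged-off amounts back to each quarter's balance change, so that what remains reflects borrowing and repayment rather than default, and then to sum over four quarters. The chapter does not spell out its exact procedure, so the function names, the assumption that charge-offs are observed quarter by quarter, and the numbers below are all hypothetical.

```python
def change_excluding_chargeoffs(balances, chargeoffs):
    """Quarterly balance change with charged-off amounts added back.

    balances[t]   -- end-of-quarter outstanding balance
    chargeoffs[t] -- amount written off during quarter t
    Returns one value per quarter from the second quarter onward.
    """
    return [
        (balances[t] - balances[t - 1]) + chargeoffs[t]
        for t in range(1, len(balances))
    ]


def rolling_sum(values, window=4):
    """Trailing sum over `window` consecutive observations."""
    return [
        sum(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


# Hypothetical balances and charge-offs for six quarters.
balances = [100, 104, 103, 101, 96, 92]
chargeoffs = [0, 1, 2, 3, 2, 1]
quarterly = change_excluding_chargeoffs(balances, chargeoffs)
print(quarterly)               # [5, 1, 1, -3, -3]
print(rolling_sum(quarterly))  # [4, -4]
```

A positive value indicates net new borrowing; a negative value, as in the last window of the example, indicates that balances fell by more than defaults alone can explain, which is the signature of active paydown discussed in the text.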
During the recession, on net, 120 million credit card accounts were closed. Fig. 2.5 also indicates the lack of new accounts to replace those that were closed. Were lenders restricting credit and closing accounts, or were consumers voluntarily closing accounts and not opening new ones? Fortunately, the CCP can offer some clues to this puzzle. The number of accounts closed within the previous 12 months spiked from 2008 to 2010, during the recession and the following months, with a high of 375 million accounts reported closed over the past 12 months at the end of 2009Q3. The number of accounts opened maintained a steady pace prior to the recession, but began a decline in 2008 that continued through 2010. At its trough, new account openings had dropped nearly 40% to 158 million in 2010Q3 from their peak of roughly 250 million per quarter from 2005Q3 to 2007Q3. These patterns could be a result of either supply or demand; lenders may have tightened their standards, while consumers were most likely reluctant to take on new debt in such a vulnerable time. We can shed light on this picture by looking at the number of inquiries reported to Equifax, which

Handbook of the consumer chapter Chapter | 2


reflect applications for new credit accounts and which are also reported in Fig. 2.5 (see footnote 7). Holding the quality of applicants constant, if inquiries remained stable while the number of new accounts declined, we could infer that the reduction in credit was mainly due to stricter lending standards. On the other hand, if the number of inquiries, adjusted for quality of applicant, followed the path of account openings, then part of the decline in debt was due to consumers applying for less credit. Of course, it is important to note that this does not fully disaggregate supply and demand; while consumers may decrease their credit applications, and thus the number of credit inquiries, due to a decrease in demand, they may also be responding to perceived tighter lending standards. Fig. 2.5 also shows that the number of inquiries closely tracks the number of new accounts opened. Similar to the rate of new account openings, the number of inquiries dropped from a steady annual rate of 240 million to a trough of 150 million in 2010Q2, before bouncing back. This suggests that the decrease in new accounts was due at least in part to a drop in credit applications by borrowers. However, some data do suggest that lenders did indeed tighten their lending standards. Aggregate credit limits for credit cards, which may represent the amount of risk a lender is willing to tolerate (higher credit limit = higher risk), dropped at much higher rates than aggregate balances; see Fig. 2.6. Credit card limits decreased from a high of $3.7 trillion in 2008Q3 to $2.6 trillion in 2010Q4, a 30% drop; over the same period, balances dropped only 15%, from $858 billion to $730 billion. Tightening is also observed in the credit score distribution of credit card borrowers (see footnote 8). In the auto market, the median credit score for auto loan originations shifted up slightly from the low 680s in 2005–07 to just over 710 in 2009–10, soon followed by a return to the 690s in the early 2010s.

Fig. 2.7 shows a similar decomposition for housing-related debt. Here, we explicitly show, in the red line, the effect of charge-offs, which we had suppressed in Fig. 2.4. Unsurprisingly, charge-offs have an increasingly negative effect on mortgage balances from 2006 to 2010, reaching nearly $360 billion in the four quarters ending 2010Q4. After that, the effects of the foreclosure crisis begin to fade, although charge-offs remained an important negative contributor to the change in debt outstanding through 2014.
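The percentage declines quoted for credit card limits, balances, account openings, and inquiries can be checked with a few lines of arithmetic. The sketch below uses only the dollar amounts and counts stated in the text; the helper name `pct_drop` and the implied utilization ratio (balance divided by limit) are illustrative additions of this sketch, not measures the chapter reports.

```python
def pct_drop(start, end):
    """Percentage decline from start to end."""
    return 100.0 * (start - end) / start

# Figures as quoted in the text.
cc_limit_2008q3, cc_limit_2010q4 = 3700.0, 2600.0      # $ billions
cc_balance_2008q3, cc_balance_2010q4 = 858.0, 730.0    # $ billions
openings_peak, openings_trough = 250.0, 158.0          # millions of accounts
inquiries_steady, inquiries_trough = 240.0, 150.0      # millions of inquiries

limit_drop = pct_drop(cc_limit_2008q3, cc_limit_2010q4)
balance_drop = pct_drop(cc_balance_2008q3, cc_balance_2010q4)
openings_drop = pct_drop(openings_peak, openings_trough)
inquiries_drop = pct_drop(inquiries_steady, inquiries_trough)

# Illustrative derived measure (an assumption of this sketch): balance / limit.
utilization_before = cc_balance_2008q3 / cc_limit_2008q3
utilization_after = cc_balance_2010q4 / cc_limit_2010q4

print(f"limits fell {limit_drop:.1f}%, balances fell {balance_drop:.1f}%")
print(f"openings fell {openings_drop:.1f}%, inquiries fell {inquiries_drop:.1f}%")
print(f"implied utilization: {utilization_before:.3f} -> {utilization_after:.3f}")
```

Limits falling roughly twice as fast as balances is the pattern one would expect if lenders were cutting exposure, while the similar-sized declines in openings and inquiries are consistent with the demand-side reading discussed in the text.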

7. While all account openings are captured, not all inquiries are sent to and recorded by all three credit bureaus. Assuming stability in the share of inquiries reported to Equifax, the pattern shown in Fig. 2.5 accurately portrays the overall trend in credit account applications.
8. See Campbell, G., A. Haughwout, D. Lee, J. Scally and W. van der Klaauw, 2016. “Just Released: Recent Developments in Consumer Credit Card Borrowing,” Liberty Street Economics, August 9.


[Figure 2.6 plot. Title: Credit Limit and Balance for Credit Cards and HE Revolving. Series: CC Limit, CC Balance, HELOC Limit, HELOC Balance. Vertical axes: trillions of dollars, 0 to 4.]

FIGURE 2.6 Credit limit and balance for credit cards and HE revolving. Source: New York Fed Consumer Credit Panel/Equifax.

[Figure 2.7 plot. Title: Decomposition of Changes in Housing Debt (from 4 quarters ago). Series: first lien originations plus normal (non charge-off) payoffs; first lien amortization, refinances, and junior lien balance changes; first and second lien charge-offs. Vertical axis: $ billion, -600 to 1200.]

FIGURE 2.7 Decomposition of changes in housing balance (four quarter change). Source: Federal Reserve Bank of New York Consumer Credit Panel/Equifax.

[Figure 2.8 plot. Title: Refinance, First Lien Amortization and Junior-Lien Activity. Series: Refinance; Junior-Lien Activity; First Lien Paydown. Vertical axis: $ billion, -400 to 400.]

FIGURE 2.8 Refinance, first-lien amortization, and junior-lien activity (four quarter change). Source: Federal Reserve Bank of New York Consumer Credit Panel/Equifax.

The blue line in Fig. 2.7 depicts the change in mortgage balances that is attributable to the buying and selling of homes. Home buying has a positive effect on balances for two primary reasons: (1) Even with constant loan-to-value ratios (LTVs), the loan size required to buy a home will get larger as prices rise over time. (2) The seller’s LTV is on average lower than the buyer’s LTV, since the seller has typically paid down some of the balance on the loan (see footnote 9). While these reasons ensure that home buying activity will generate positive effects on outstanding balances, their strength clearly varies over time, producing a strong cyclicality in the blue line in Fig. 2.7. The rapid decline in the annualized purchase contribution from nearly a trillion dollars in 2006 to below $200 billion in 2009 reflects the sharp decline in house prices and housing transactions. Finally, the green line here shows the contribution of the remaining factors that affect housing-related debt balances. These factors are further broken out in Fig. 2.8, but before turning to that it is worth examining their aggregate effect in Fig. 2.7. Between 2003 and 2006, the combination of first-lien amortization (the pay down of balances as people make their mortgage payments),

9. In the case of new construction, the buyer’s mortgage replaces whatever kind of financing builders had used; typically we will not observe builders’ debts in the CCP and the buyer’s entire mortgage will be a net increase in the outstanding mortgage balance.


cash-out refinances, and changes in balances on junior liens was adding around $200 billion a year to aggregate housing debt balances. Thus, after making their required monthly mortgage payments, households drew down the value of their housing equity by as much as $290 billion annually, in 2003 ($880 billion in total for the years 2003–06), to fund their consumption plans. As shown in Fig. 2.8, this borrowing was accomplished through two channels: cash-out refinances, shown in the gray line, and junior liens, both by opening CES mortgages and by opening and drawing against HELOCs. It is well known that the growth of housing debt and house values went hand-in-hand during the boom years of the 2000s, as the use of the “House as ATM” became widespread. Figs. 2.7 and 2.8 provide precise information on how significant this phenomenon became, and how it was accomplished, while Fig. 2.4 complements this information by showing that a similar dynamic was in place outside of housing debt. Net new consumer borrowing ($200 billion annual rate from 2003 to 2006) and housing-related borrowing ($200 billion) added roughly $400 billion (annual rate) to households’ funds available for consumption. This pace of household borrowing to fund consumption other than housing began to slow sharply after 2006 and reversed by 2008. By 2009, households were paying down $100 billion of debt at an annual rate, a complete turnaround in their borrowing and saving behavior of nearly half a trillion dollars compared to 2006, and one which coincided with a major decline in personal consumption expenditures, which peaked in June 2008 and recovered to that previous nominal peak only in March 2010.
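The size of the turnaround described above follows directly from the annual rates quoted in the text. The short sketch below redoes that arithmetic; the variable names are illustrative, and the four-year average of equity extraction is a simple derived check rather than a figure the chapter reports.

```python
# Annual rates quoted in the text, in $ billions.
consumer_borrowing_boom = 200.0     # net new nonhousing borrowing, 2003-06
housing_borrowing_boom = 200.0      # housing-related borrowing beyond purchases, 2003-06
net_flow_2009 = -100.0              # households paying down debt by 2009

boom_total = consumer_borrowing_boom + housing_borrowing_boom
turnaround = boom_total - net_flow_2009

# Derived check: the $880 billion of equity extraction over 2003-06 averages
# close to the "around $200 billion a year" cited for this channel.
equity_extraction_total = 880.0
avg_equity_extraction = equity_extraction_total / 4

print(f"boom-era borrowing for consumption: ${boom_total:.0f} billion per year")
print(f"swing between the boom and 2009: ${turnaround:.0f} billion per year")
print(f"average equity extraction, 2003-06: ${avg_equity_extraction:.0f} billion per year")
```

A swing of this size, from adding funds for consumption to withdrawing them, is what the text summarizes as a turnaround of nearly half a trillion dollars.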

4. Trends in borrower characteristics

We have seen that the rapid increase in debt shown in Fig. 2.3 was partially due to increases in the value of homes and the debt required to purchase them, and partially due to increases in the appetite for debt-financed consumption. Who were the borrowers behind this surge? Figs. 2.9 and 2.10 show part of the answer to this question for mortgage borrowing (trends in new HELOC borrowing will be discussed separately below). Here, each bar shows the total of new mortgage originations (the gross flow of new borrowing) by age, in Fig. 2.9, and by credit score group, in Fig. 2.10. The first thing to note in these two figures is the very sharp decline in mortgage originations after 2007. But there are also shifts in borrower types. The age distribution of mortgage originations shows a pronounced cyclical pattern. The effects of the financial crisis and the subsequent tightening of mortgage lending standards are apparent in the age distribution of borrowers. After capturing an average 38% of mortgage originations from 2003 to 2008, borrowers under 40 years saw their share drop to 28% for 2012 and 2013. The share of mortgage lending to this group of borrowers, however, rebounded at the end of the period.

[Figure 2.9 plot. Title: Mortgage Originations by Age. Age groups: 18-29, 30-39, 40-49, 50-59, 60-69, 70+. Vertical axes: billions of dollars, 0 to 1,200.]

Note: Age is defined as the current year minus the birthyear of the borrower. Age groups are re-defined each year. Balances may not add up to totals due to a small number of individuals with unknown birthyears.

FIGURE 2.9 Mortgage originations by age. Source: New York Fed Consumer Credit Panel/Equifax.

[Figure 2.10 plot. Title: Mortgage Originations by Credit Score*. Vertical axis: billions of dollars.]

…0): Equal to one, if all outstanding uncollateralized debt is maxed out more than 80%
Score: Credit bureau’s estimated default risk within 12 months; a number between 0 and 100, where 0 is low default risk
Number of arrears: Number of arrears, including bank, government, and private claims

independent variables of interest. The pawn borrower sample is composed of people of relatively low socioeconomic status. It is no surprise that pawn borrowers have lower income and are less likely to own a house or to have income from capital than Swedes in the random sample. Furthermore, they exhibit a high demand for credit (inquiry) and high utilization of their uncollateralized

TABLE 5.2 Summary statistics.
Columns (1)-(5): random sample. Columns (6)-(10): pawn sample. A dash marks a value not reported.

Variable | (1) Mean | (2) sd | (3) Median | (4) obs | (5) N | (6) Mean | (7) sd | (8) Median | (9) obs | (10) N
1(Granted>0) | 0.83 | 0.38 | - | 1,018,160 | 163,352 | 0.78 | 0.41 | - | 514,318 | 102,907
Age | 49.85 | 18.31 | 49 | 4,179,000 | 174,125 | 46.27 | 15.19 | 45 | 5,869,824 | 238,359
Male | 0.49 | 0.50 | - | 4,179,000 | 174,125 | 0.46 | 0.50 | 0 | 5,869,824 | 238,359
1(Entrepreneur>0) | 0.12 | 0.33 | - | 4,179,000 | 174,125 | 0.11 | 0.31 | - | 5,869,824 | 238,359
Income before tax, SEK | 281,797 | 434,333 | 240,700 | 4,179,000 | 174,125 | 221,689 | 1,191,692 | 197,700 | 5,869,824 | 238,359
Log of income before tax | 12.25 | 0.92 | 12.39 | 4,179,000 | 174,125 | 12.05 | 0.84 | 12.19 | 5,869,824 | 238,359
1(capital_income>0) | 0.35 | 0.48 | - | 4,179,000 | 174,125 | 0.10 | 0.30 | - | 5,869,824 | 238,359
Value of real estate, SEK | 757,582 | 2,655,920 | 0 | 4,179,000 | 174,125 | 265,425 | 2,451,628 | 0 | 5,863,282 | 238,359
logHouse_value | 9.54 | 10.63 | 0 | 4,179,000 | 174,125 | 3.46 | 7.86 | 0 | 5,869,824 | 238,359
Debt to income ratio | 0.26 | 14.26 | 0 | 4,179,000 | 174,125 | 0.46 | 11.98 | 0.01512 | 5,869,824 | 238,359
Number of consumption credits | 3.14 | 3.20 | 2 | 4,179,000 | 174,125 | 3.27 | 3.68 | 2 | 5,869,824 | 238,359
Number of credit cards | 1.53 | 1.82 | 1 | 4,179,000 | 174,125 | 1.73 | 2.14 | 1 | 5,869,824 | 238,359
Credit card limit, SEK | 21,452 | 33,619 | 8000 | 4,179,000 | 174,125 | 26,368 | 40,200 | 10,000 | 5,869,824 | 238,359
Credit card balance, SEK | 4527 | 14,700 | 0 | 4,179,000 | 174,125 | 11,616 | 26,270 | 0 | 5,869,824 | 238,359
Number of installment loans | 0.07 | 0.28 | 0 | 4,179,000 | 174,125 | 0.07 | 0.28 | 0 | 5,869,824 | 238,359
Installment loan limit, SEK | 6012 | 36,259 | 0 | 4,179,000 | 174,125 | 5,527 | 31,940 | 0 | 5,869,824 | 238,359
Installment loan balance, SEK | 5997 | 36,180 | 0 | 4,179,000 | 174,125 | 5,509 | 31,859 | 0 | 5,869,824 | 238,359
Number of credit lines | 0.39 | 0.83 | 0 | 4,179,000 | 174,125 | 0.75 | 1.21 | 0 | 5,869,824 | 238,359
Credit line limit, SEK | 27,665 | 152,780 | 0 | 4,179,000 | 174,125 | 44,877 | 105,714 | 0 | 5,869,824 | 238,359
Credit line balance, SEK | 25,656 | 149,597 | 0 | 4,179,000 | 174,125 | 43,180 | 103,953 | 0 | 5,869,824 | 238,359
Number of mortgages | 0.94 | 1.57 | 0 | 4,179,000 | 174,125 | 0.41 | 1.18 | 0 | 5,869,824 | 238,359
Mortgage limit, SEK | 381,562 | 976,744 | 0 | 4,179,000 | 174,125 | 200,479 | 651,622 | 0 | 5,869,824 | 238,359
Mortgage balance, SEK | 381,540 | 976,711 | 0 | 4,179,000 | 174,125 | 200,469 | 651,546 | 0 | 5,869,824 | 238,359
Number of credit request last 12m | 0.94 | 1.70 | 0 | 4,179,000 | 174,125 | 1.83 | 2.98 | 1 | 5,869,824 | 238,359
1(maxed_out>0) | 0.05 | 0.22 | - | 4,179,000 | 174,125 | 0.12 | 0.32 | - | 5,869,824 | 238,359
Score | 3.14 | 11.96 | 0.2 | 4,179,000 | 174,125 | 18.36 | 26.51 | 1.5 | 5,863,282 | 238,359
Number of arrears | 0.73 | 5.36 | 0 | 4,179,000 | 174,125 | 5.04 | 15.17 | 0 | 5,863,282 | 238,359

Rationality in the consumer credit market Chapter | 5


credit. Because they have many more arrears, which trigger negative flags that remain on their credit records for 3 years, their access to mainstream credit is limited. This is reflected in their high score, which implies a high default risk and thus low creditworthiness.

5. Main results

5.1 Empirical implementation

We begin by estimating the probability of having a loan application granted for consumers in the random sample, using the information from their credit bureau files. We estimate the following probit regressions:

$$
\begin{aligned}
\mathbf{1}(\text{granted} > 0) ={}& \beta_1\,\text{age} + \beta_2\,\text{male} + \beta_3\,\text{entrepreneur} + \beta_4\,\text{income} + \beta_5\,\text{debtincratio} \\
&+ \beta_6\,\text{loghousevalue} + \beta_7\,\text{capitalinc.} + \beta_8\,\#\text{credit} + \beta_9\,\text{limitcredit} + \beta_{10}\,\text{score} \\
&+ \beta_{11}\,\#\text{creditapplications} + \beta_{12}\,\#\text{arrears} + \beta_{13}\,\text{maxedoutcredit} + \varepsilon,
\end{aligned}
\tag{5.1}
$$

where we divide all continuous variables into deciles to allow for nonparametric effects. We then use the covariates of this regression to make out-of-sample predictions for the group of borrowers who do not apply for mainstream credit. Note that for all individuals in our sample we have a monthly snapshot of their respective credit bureau files. As mainstream banks do not have access to pawn credit histories, we assume that we have all the information that the mainstream bank (algorithm) would have used in its granting decision for those who did not apply.
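Two mechanical steps behind Eq. (5.1) can be sketched in a few lines: assigning a continuous covariate to a decile, and turning a linear index into a predicted probability with the probit link (the standard normal CDF). The sketch below is only an illustration under stated assumptions: the function names, the toy data, and the coefficient values are all hypothetical, no estimation is performed, and nothing here reproduces the authors' actual specification or estimates.

```python
import math
from bisect import bisect_right

def decile_edges(values):
    """Nine cut points that split `values` into ten roughly equal-count bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // 10] for k in range(1, 10)]

def decile_of(x, edges):
    """Decile index (1..10) of x, given the nine cut points."""
    return bisect_right(edges, x) + 1

def normal_cdf(z):
    """Standard normal CDF, the probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def predicted_grant_probability(deciles, coefs, intercept=0.0):
    """Probit prediction: Phi(intercept + sum of each variable's decile coefficient)."""
    index = intercept + sum(coefs[name][d] for name, d in deciles.items())
    return normal_cdf(index)

# Toy data and made-up coefficients (hypothetical, not from the chapter).
incomes = list(range(1, 101))
edges = decile_edges(incomes)
toy_coefs = {
    "income": {d: 0.15 * (d - 5) for d in range(1, 11)},   # higher income decile helps
    "score": {d: -0.20 * (d - 5) for d in range(1, 11)},   # higher risk score hurts
}

p_low = predicted_grant_probability(
    {"income": decile_of(5, edges), "score": 9}, toy_coefs)
p_high = predicted_grant_probability(
    {"income": decile_of(95, edges), "score": 2}, toy_coefs)
print(f"low-income, high-risk applicant:  {p_low:.3f}")
print(f"high-income, low-risk applicant: {p_high:.3f}")
```

Using one coefficient per decile, rather than a single slope, is what lets the effect of a variable such as income vary freely across its distribution. Applying the same fitted mapping to people who never applied is the out-of-sample prediction step described in the text.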

5.2 Access to mainstream credit for all Swedes versus alternative credit users

In this section we look more closely at the individuals who did not try to get cheaper mainstream credit when they chose to take pawn credit, the so-called discouraged borrowers. This group forms the lion’s share of pawn credit borrowers (73.7%). We will first define discouraged borrowers and then investigate whether the pawn credit borrowers who did not apply for regular credit because they anticipated a rejection were correct about their chances, as limited access to mainstream (cheaper) credit by these pawn credit borrowers would suggest that pawn loans are sought as a last resort. In line with Lusardi et al. (2011), we reason that high-cost short-term borrowing is near the bottom of the “pecking order” of coping mechanisms that low-income households use to deal with a financial shock. Finally, we explore the possibility that immigrants’ knowledge about their own chances of obtaining regular credit differs from that of the Swedish-born pawnshop customers. Given imperfect information and application costs, some individuals do not apply for mainstream credit even when they need it. These are called


“discouraged borrowers.” They are defined as borrowers who do not apply for loans because they feel they will be rejected. Han et al. (2009) argue that discouragement can be viewed as a self-rationing mechanism in the application decision, implying that both bad and good borrowers can be discouraged. Such discouragement is therefore defined as efficient when bad borrowers are discouraged, but as inefficient when good borrowers are discouraged and/or when bad borrowers get into the loan pool. We can identify discouraged borrowers with no uncertainty about their need for credit, as we observe them taking pawn credit while not applying for a regular bank loan. We now investigate whether the individuals who did not apply for mainstream credit because they anticipated a rejection were correct about their chances. To estimate the predicted probability, we use the covariates obtained from the probit estimation of Eq. (5.1), which describes the mainstream banks’ decision to grant a loan application and is estimated using a random sample of the Swedish population. We then estimate an out-of-sample predicted probability of obtaining mainstream credit for the consumers in the random sample and the pawn borrower sample. As a benchmark, we use the median predicted granting probability for consumers in the random sample who received new mainstream consumption credit (the vertical solid line); the dashed vertical lines mark the 25th and 75th percentiles, respectively. We start by looking at the unconditional difference in predicted granting probabilities between the random sample (white) and the pawn sample (gray). The upper left panel in Fig. 5.2 shows that in the cross section, only 28% of the pawn borrowers have an above-median predicted probability of being granted mainstream credit. However, in the upper right panel, we can see that among the pawn borrowers who actually apply for mainstream credit, this share is much higher, at 60%.

[Figure 5.1 panel title: Panel A. Random sample]

FIGURE 5.1 Decision trees. This figure plots the probability of applying for mainstream credit and of taking a pawn loan for consumers in our random sample.


When we calculate the same probabilities for the consumers in the pawn borrower sample who took a pawn loan but did not apply for mainstream credit, the so-called “discouraged” borrowers (panel D), we see that only 36% have an above-median probability of being granted mainstream credit (Fig. 5.2). From the perspective of Han et al. (2009), the discouragement of this group of pawnshop customers could be viewed as efficient, as the vast majority of their loan applications would have been denied if the discouraged borrowers had applied at mainstream banks. There are, nevertheless, as we discussed above in Section 2, a number of reasons why people might prefer pawn credit

FIGURE 5.2 Access to mainstream credit for all Swedes and pawn borrowers. This figure shows the distribution of the predicted probability of having access to mainstream credit for different populations. To estimate the predicted probability, we use the covariates obtained from the probit estimation of Eq. (5.1), which describes the mainstream banks’ decision to grant a loan application and is estimated using a random sample of the Swedish population. We then estimate an out-of-sample predicted probability of obtaining mainstream credit for the consumers in the random sample (RS) and the pawn borrower sample (PS). As a benchmark, we use the median-predicted granting probability for consumers in the random sample that received new mainstream consumption credit (the vertical solid line); the dashed vertical lines mark the 25th and 75th percentiles, respectively. In panels A and B, we compare the predicted probability distributions of consumers in the random sample (white) and pawn sample (gray). Panel A shows that in the cross section, 28% of the pawn borrowers have an above-median predicted probability of being granted mainstream credit. In panel B, we document that among the pawn borrowers who actually apply for mainstream credit, 60% have an above-median predicted probability of obtaining mainstream credit. We calculate the same probabilities for the consumers in the pawn borrower sample who take a pawn loan and applied for mainstream credit in the last 2 months (panel C) or did not apply for mainstream credit (panel D), the so-called “discouraged” borrowers. In panel C, 53% of the consumers have an above-median probability of being granted mainstream credit, and in panel D this share is 36%.


over mainstream credit independent of the price difference. Examples include speed, the lack of a requirement to hand over proof of creditworthiness, and the lack of a direct risk of future exclusion from mainstream credit in case of default. Therefore, it is possible that discouraged borrowers do not apply for mainstream credit because they prefer pawn credit over mainstream credit.
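The benchmark comparison used in this section (what share of a group has a predicted granting probability above the median among consumers who were actually granted credit) can be sketched as follows. The predicted probabilities below are toy numbers invented for illustration, and the function name is hypothetical; only the 36% figure in the last lines comes from the text. The strict inequality follows the "above-median" wording and is an assumption of this sketch.

```python
from statistics import median

def share_above_benchmark(predicted, benchmark):
    """Fraction of predicted granting probabilities strictly above the benchmark."""
    return sum(p > benchmark for p in predicted) / len(predicted)

# Toy predicted probabilities (hypothetical, not the chapter's data).
granted_in_random_sample = [0.62, 0.70, 0.75, 0.81, 0.88]
discouraged_pawn_borrowers = [0.20, 0.35, 0.40, 0.50, 0.60,
                              0.70, 0.74, 0.76, 0.80, 0.90]

# Benchmark: median predicted probability among those granted mainstream credit.
benchmark = median(granted_in_random_sample)
toy_share = share_above_benchmark(discouraged_pawn_borrowers, benchmark)
print(f"benchmark = {benchmark:.2f}, share above benchmark = {toy_share:.0%}")

# With the share reported in the text for discouraged borrowers (panel D):
reported_share_above = 0.36
implied_share_below = 1.0 - reported_share_above
print(f"share at or below the benchmark: {implied_share_below:.0%}")
```

In the terms of Han et al. (2009), borrowers at or below the benchmark are the ones whose discouragement would count as efficient, while those above it are candidates for inefficient discouragement or for a genuine preference for pawn credit.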

5.2.1 Awareness about creditworthiness and immigrant status

In Table 5.3 we document that immigrants are highly overrepresented in the group of pawn credit borrowers: 14% of the general population in Sweden is foreign born, while 28% of the pawn borrowers are foreign born. We therefore investigate whether the largely correct assessment by discouraged borrowers of their access to mainstream credit differs by immigrant status. Fig. 5.3 plots the predicted probability distributions for consumers in the pawn borrower sample split by immigrant status. To the left we compare the unconditional difference between the random sample and first-generation immigrants (panel A), second-generation immigrants (panel B), and pawn borrowers born in Sweden with no immigrant parents, the “rest” (panel C). To the right we plot the difference in distributions conditional on taking pawn credit while not applying for mainstream credit in the previous 2 months. We estimate that only a quarter of first-generation pawn borrowers would have a reasonable chance of being granted mainstream credit; among first-generation discouraged borrowers this share rises to 36%, the same level as in the overall population of discouraged borrowers. The percentages are similar for second-generation immigrants in panel B (25% and 33%) and for the Swedish born without immigrant parents (30% and 33%, respectively). These results suggest that children of immigrants are able to move away from pawn credit, as the overrepresentation of second-generation pawn

TABLE 5.3 Sample description.

Group | Pawn sample: Percentage | Pawn sample: Number of people | Random sample: Percentage | Random sample: Number of people
First-generation immigrants | 27.9% | 68,163 | 14.1% | 24,488
Second-generation immigrants | 13.5% | 33,100 | 8.1% | 14,126
Swedish born with no immigrant parents | 58.6% | 137,096 | 77.8% | 135,511
Total | 100% | 238,359 | 100% | 174,125

FIGURE 5.3 Probability to access mainstream credit by immigrant status. The panels of this figure compare the predicted probability distributions of access to mainstream credit for consumers in the pawn borrower sample, split by immigrant status: first-generation immigrants (panel A), second-generation immigrants (panel B), and pawn borrowers born in Sweden with no immigrant parents, the “rest” (panel C). We compare their distribution to the distribution of the random sample (to the left) and show the distribution of pawn borrowers who take a pawn loan but do not apply for mainstream credit in the previous 2 months (to the right). To estimate the predicted probability for these samples, we use the covariates obtained from a cross-section probit estimation of Eq. (5.1), which describes the mainstream banks’ decision to grant a loan application and is estimated using a random sample of the Swedish population that excludes the alternative credit users. As a benchmark, we use the median-predicted granting probability for consumers in the random sample that received new mainstream consumption credit (the vertical solid line); the dashed vertical lines mark the 25th and 75th percentiles, respectively. In panel A, 25% of first-generation pawn borrowers overall have an above-median probability of being granted mainstream credit, compared with 36% of the first-generation pawn borrowers who take pawn loans without first applying for mainstream credit. The percentages are similar for second-generation immigrants in panel B: 25% and 33%, respectively. The numbers for the Swedish born without immigrant parents are 30% and 33%, respectively.


borrowers is smaller than that of first-generation immigrants (see Table 5.3). Furthermore, our findings suggest that second-generation immigrant borrowers are better informed about their chances of obtaining mainstream credit than their parents and are at least equally well informed as their Swedish-born counterparts.
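The degree of overrepresentation can be made explicit by dividing each group's share of the pawn sample by its share of the random sample, using the percentages in Table 5.3. The ratio itself is an illustrative measure chosen for this sketch, not one the chapter reports; a value above 1 means the group is overrepresented among pawn borrowers.

```python
# Percentage shares as reported in Table 5.3.
pawn_share = {
    "first_generation": 27.9,
    "second_generation": 13.5,
    "swedish_born": 58.6,
}
random_share = {
    "first_generation": 14.1,
    "second_generation": 8.1,
    "swedish_born": 77.8,
}

# Representation ratio: share in pawn sample / share in random sample.
ratios = {group: pawn_share[group] / random_share[group] for group in pawn_share}

for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

On this measure both immigrant groups are overrepresented, the second generation less so than the first, which is the pattern the text interprets as children of immigrants moving away from pawn credit.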

6. Conclusion

We find that demand for pawn credit is mainly driven by people who have not recently experienced a rejection when applying for mainstream credit. However, a mainstream credit rejection does increase the likelihood of taking out pawn credit instead. In addition, we find that the lion’s share of pawn credit borrowers who do not attempt to get cheaper mainstream credit before going to the pawnshop have only a small chance of having their loan application approved by a mainstream bank. Only 36% of this group have a predicted probability of approval that is at least as high as that of the median granted applicant in the general population. Put differently, 64% of this group is implicitly excluded from mainstream credit at the time they decide to take pawn credit. Furthermore, we find evidence that pawn borrowers with an immigrant background, who make up 28% of the pawn sample, are at least equally well informed about their chances of obtaining regular credit as their Swedish-born counterparts. As pawn credit demand is mainly driven by borrowers who did not apply for mainstream credit first, it is likely that either expectations (that is, an adequate or inadequate calculation by the borrower that he/she is ineligible for mainstream bank credit) or other behavioral factors (for instance, reputational concerns and repayment commitments) are important determinants of high-cost credit demand. These results contribute to our understanding of the individual-level effects of access to mainstream credit.

References

Agarwal, S., Driscoll, J.C., Gabaix, X., Laibson, D., 2011. Learning in the Credit Card Market.
Agarwal, S., Skiba, P.M., Tobacman, J., 2009. Payday loans and credit cards: new liquidity and credit scoring puzzles? The American Economic Review 99, 412–417.
Bertrand, M., Morse, A., 2011. Information disclosure, cognitive biases, and payday borrowing. The Journal of Finance 66, 1865–1893.
Bhutta, N., Skiba, P.M., Tobacman, J., 2015. Payday loan choices and consequences. Journal of Money, Credit and Banking 47, 223–260.
Bos, M., Carter, S., Skiba, P., 2012. The Pawn Industry and its Customers: The United States and Europe. Vanderbilt Law and Economics Research Paper No. 12-26.
Campbell, J.Y., Jackson, H.E., Madrian, B.C., Tufano, P., 2011. Consumer financial protection. The Journal of Economic Perspectives 25, 91–114.
Caskey, J.P., 1991. Pawnbroking in America: the economics of a forgotten credit market. Journal of Money, Credit and Banking 23, 85–99.
Caskey, J.P., 1994. Fringe Banking: Check-Cashing Outlets, Pawnshops, and the Poor. Russell Sage Foundation, New York.
Deaton, A., 1991. Saving and liquidity constraints. Econometrica 59, 1221–1248.
Gross, D.B., Souleles, N.S., 2002. Do liquidity constraints and interest rates matter for consumer behavior? Evidence from credit card data. Quarterly Journal of Economics 117, 149–185.
Guiso, L., Sapienza, P., Zingales, L., 2006. Does culture affect economic outcomes? The Journal of Economic Perspectives 20, 23–48.
Han, L., Fraser, S., Storey, D.J., 2009. Are good or bad borrowers discouraged from applying for loans? Evidence from US small business credit markets. Journal of Banking and Finance 33, 415–424.
Hubbard, R.G., Judd, K.L., Hall, R.E., Summers, L., 1986. Liquidity constraints, fiscal policy, and consumption. Brookings Papers on Economic Activity 1986, 1–59.
Jappelli, T., Pagano, M., 1999. The welfare effects of liquidity constraints. Oxford Economic Papers 51, 410–430.
Johnson, R.W., Johnson, D.P., 1998. Pawnbroking in the US: A Profile of Customers. Credit Research Center, School of Business, Georgetown University.
Ludvigson, S., 1999. Consumption and credit: a model of time-varying liquidity constraints. The Review of Economics and Statistics 81, 434–447.
Lusardi, A., Schneider, D., Tufano, P., 2011. Financially fragile households: evidence and implications. Brookings Papers on Economic Activity 83–151.
Melzer, B.T., 2011. The real costs of credit access: evidence from the payday lending market. The Quarterly Journal of Economics 126, 517–555.
Morse, A., 2011. Payday lenders: heroes or villains? Journal of Financial Economics 102, 28–44.
Osili, U.O., Paulson, A.L., 2008. Institutions and financial development: evidence from international migrants in the United States. The Review of Economics and Statistics 90, 498–517.
Skiba, P.M., Tobacman, J., 2008. Payday Loans, Uncertainty and Discounting: Explaining Patterns of Borrowing, Repayment, and Default.
Skiba, P.M., Tobacman, J., 2009. Do Payday Loans Cause Bankruptcy? Vanderbilt Law and Economics Research Paper.
Stegman, M.A., 2007. Payday lending. The Journal of Economic Perspectives 21, 169–190.

Chapter 6

How do consumers respond to real income shocks?
JPMorgan Chase Institute
JPMorgan Chase & Co, Washington, DC, United States

1. Introduction

How much do peaks and troughs in income feed through to consumers’ economic welfare? In its most stylized form, the permanent income hypothesis (PIH) posits that a family’s current consumption and its planned future consumption depend only on its preferences and expectations about prices and lifetime income, but not on how that income is structured over time. An empirical implication of this hypothesis is that a family’s consumption changes in response to news about a permanent change to income (for example, a lasting change to tax policy), but not in response to transitory or predictable fluctuations (for example, the arrival of a tax refund check). Economic data contain some patterns that are broadly consistent with the spirit of PIH, especially over decades-long time scales and landmark purchases (Hall, 1978; Bernanke, 1984). However, evidence has shown that transitory and predictable fluctuations in income do in fact drive changes in consumption, which contradicts the most stylized representation of PIH (see footnote 1). The JPMorgan Chase Institute has assembled high-frequency, finely categorized data on income and expenditure of millions of households. Drawing on these data, we describe empirical linkages between expenditures on the one hand and several specific fluctuations in income and prices on the other. The types of fluctuations we have analyzed vary in terms of their levels of predictability and their expected durations (Fig. 6.1). (See Fig. 6.13 in the Appendix for a detailed description of the sampling criteria and data asset in each study.) Consistent with the large body of evidence indicating limitations to consumption smoothing, we observe spending responses even to changes that were

1. For reviews of the extensive empirical literature, see Browning and Lusardi (1996) and Chapters 8-10 of Jappelli and Pistaferri (2017).

Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00006-8 Copyright © 2019 Elsevier Inc. All rights reserved.


Event | Income or price | Cash flow impact | Predictability & duration | Observed expenditure impact
Arrival of tax refund | Income | Positive | Highly predictable, one-time cash infusion | Dramatic increase in spending on healthcare (and nonhealthcare) services
Job loss | Income | Negative | Job loss may be predictable; final unemployment insurance payment is highly predictable | Significant decrease in spending at unemployment, and again in the month after the final unemployment insurance payment
Gas price decline | Price | Positive | Largely unpredictable, duration difficult to forecast | Dramatic increase in spending on non-durables
Mortgage resets | Price | Positive | Highly predictable, duration of at least one year | Small increase in spending on non-durables upon notification of adjustment, then larger increase when rates adjust
Income fluctuations around mortgage default | Income | Negative | Unknown predictability of income fluctuation; income fluctuation is temporary | People stop paying their mortgage when their income drops temporarily

FIGURE 6.1 We observe spending responses even to income and price changes that were predictable, that were likely to have no impact on an intertemporal budget constraint, and that have known limited duration.

predictable, likely had no impact on their lifetime income, and had known limited duration.

We are able to go beyond confirming the general principle that consumption smoothing is limited because we observe each of the individual transactions that add up to most of each household’s income and expenditure. Using this more granular and fuller view, we are able to describe two underdocumented dimensions of the relationship between cash flow and consumption. First, we are able to document leads and lags in spending responses, which allows us to identify how much spending changes with the arrival of information about a cash flow event and how much with the arrival of the cash itself. For example, do consumers misperceive their permanent income and therefore react to news about the size of a tax refund that should not have been surprising? Or does cash itself matter more than even very specific information that it will arrive? Second, we are able to characterize not only changes in total spending but also changes in specific categories of spending, including those which consumers would likely prefer to time according to needs rather than cash flow. For example, we can distinguish consumption of nondurables from spending on durables and services, such as healthcare.

To home in on these timing dynamics, we analyze debit and credit card spending at the time of purchase, rather than when the credit card bill is paid. Furthermore, in the case of healthcare spending, we distinguish between in-person card purchases likely made at the point of consumption and remote or online purchases that could represent bills paid for services received in the past.


Taken together, our findings across multiple studies indicate consumption patterns that violate not only the most stylized representation of PIH but also contemporary adaptations of dynamic models with perfectly forward-looking consumers. These findings require a frank acknowledgment of the fact that frameworks that rely on rational expectations and intertemporal optimization are inherently limited in terms of their usefulness for understanding consumer behavior or designing optimal policies and financial product offerings. After discussing these studies in more detail, we conclude with a discussion of the implications of these limitations.
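To make the stylized PIH benchmark described in this introduction concrete, the following minimal sketch compares the predicted consumption response to a one-time windfall with the response to a permanent income change. It is an illustration only and is not drawn from the underlying studies: the function name, the 2% real interest rate, and the infinite-horizon, no-uncertainty setup are all assumptions chosen for simplicity. The dollar figure is the average 2016 tax refund cited in Section 2.1.

```python
def pih_consumption_response(amount, annual_rate, permanent):
    """Change in annual consumption under a stylized infinite-horizon PIH.

    Assumptions (illustrative, not from the chapter): no uncertainty, a
    constant real interest rate, and constant consumption starting now.
    """
    if annual_rate <= 0:
        raise ValueError("annual_rate must be positive")
    if permanent:
        # A lasting change in annual income passes through one-for-one.
        return amount
    # A one-time windfall is spread over all future years: only its
    # annuity value, r / (1 + r) of the windfall, is consumed each year.
    return amount * annual_rate / (1 + annual_rate)


refund = 2860.0  # average 2016 refund cited in Section 2.1
rate = 0.02      # hypothetical real interest rate

transitory = pih_consumption_response(refund, rate, permanent=False)
lasting = pih_consumption_response(refund, rate, permanent=True)
print(f"one-time refund of ${refund:.0f}: +${transitory:.2f} per year")
print(f"permanent raise of ${refund:.0f}: +${lasting:.2f} per year")
```

Under these assumptions the predicted response to a one-time refund is a small fraction of the refund itself, which is the benchmark against which the sharp spending responses reported in this chapter stand out.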

2. JPMCI research on consumer spending responses to income and price changes

2.1 Healthcare spending and tax refunds

We constructed an event study around the arrival of tax refunds to show that consumer out-of-pocket spending on healthcare is significantly affected by cash flow dynamics (Farrell et al., 2018a). Tax refunds are a significant cash flow event for many households. In 2016, 73% of tax filers received a tax refund, with an average refund of $2860 (Internal Revenue Service, 2017). Families learn the size of their tax refund when they file, although they likely have a good idea of how much to expect even sooner than that. As shown in Fig. 6.2 (drawn from Farrell et al., 2018b), filers in any age or income group who are owed larger refunds are more likely to receive their refunds earlier in the season, which likely reflects the fact that many filers have an idea of the likely size of their refunds and that those who expect larger refunds have an incentive to submit their returns earlier. Although filers can control the timing of their refund on the scale of weeks or months by filing earlier or later in the season, once they have filed they cannot control or predict exactly when they will receive the funds. Tax refunds, therefore, represent significant positive cash flow that is highly predictable in its size but unpredictable in its timing.

FIGURE 6.2 Even within a demographic group, tax filers who are owed larger refunds file sooner.


We focus on the healthcare spending response to the arrival of the tax refund, as this is a category of spending where the timing of spending is more likely to be tied to a person’s physical health, and which consumers would likely prefer to time according to their health needs rather than cash flow. Therefore, for healthcare, more than for other spending categories, we can draw potential welfare implications from evidence that families’ cash flow patterns drive them to alter consumption that would have been feasible given their permanent income.

We observe that healthcare spending responded sharply to the arrival of the cash, as opposed to the arrival of information. Average healthcare spending was 60% higher in the week starting with the tax refund than in an average week over the 6 months prior, and it remained elevated by 20% for 75 days (Fig. 6.3, drawn from Farrell et al., 2018b). Moreover, 62% of the additional healthcare spending was done in person at healthcare service providers, indicating that it represented deferred care. Almost all of the remaining 38% represented deferred bill payment, with a negligible portion representing healthcare goods, such as drugs, which could be stockpiled.

Two pieces of evidence allow us to be confident that the cash infusion played a key role in enabling additional healthcare spending (Farrell et al., 2018a). First, the increase in healthcare spending the week after the arrival of the tax refund was entirely attributable to an increase in spending on debit cards (83% increase) and electronic payments (56% increase). There was no increase in credit card spending. Second, the response was 20 times larger for families with the lowest average daily balances in their checking accounts over the year before the refund ($536 or less, the bottom quintile) than for those with the highest ($3577 or more, the top quintile).

We also see evidence that the cash infusion had welfare implications. As evident in Fig. 6.4, those who increased their healthcare spending to a larger

FIGURE 6.3 Out-of-pocket healthcare spending responds immediately to the receipt of a tax refund.


FIGURE 6.4 Tax filers’ healthcare spending response to their refund is foreshadowed by their filing behavior.

extent after receiving their refund tended to have filed earlier in the season. These early filers also devoted a larger fraction of their healthcare spending response to deferred care (Farrell et al., 2018b). This indicates that people are motivated to file their taxes earlier not only to receive the cash infusion earlier but also because they prefer to consume healthcare services earlier. Families could, in principle, have received these same healthcare services even earlier, for example, by reprogramming their tax withholding, by spending out of savings and then replenishing with the refund, or by borrowing against the anticipated refund. It is implausible that most of these families should be so credit or savings constrained that these are not feasible options for them. Taken together, our results highlight the extent to which families’ consumption of healthcare services is sensitive to a predictable cash infusion even when it has no impact on permanent income, and they also provide evidence that families would prefer to minimize the impact that cash flow has on when they consume healthcare services.
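The event-study comparison used in this section (spending in the week starting with the refund relative to an average week over the prior 6 months) can be sketched as follows. This is a hedged illustration rather than the authors' actual procedure: the function name, the 180-day baseline, and the synthetic spending series are assumptions made for the example, and the numbers in the series are invented.

```python
from statistics import mean


def event_study_response(daily_spend, event_index, baseline_days=180, window_days=7):
    """Percent change in mean daily spending in the event window vs. baseline.

    daily_spend: average daily spending, one value per calendar day.
    event_index: position in the list of the day the refund arrives.
    """
    if event_index < baseline_days:
        raise ValueError("not enough pre-event history for the baseline")
    # Baseline: the days immediately before the event.
    baseline = mean(daily_spend[event_index - baseline_days:event_index])
    # Event window: the refund day and the days that follow it.
    window = mean(daily_spend[event_index:event_index + window_days])
    return 100.0 * (window - baseline) / baseline


# Synthetic series (invented numbers): $10 a day before the refund,
# $16 a day in the first week, $12 a day afterward.
series = [10.0] * 180 + [16.0] * 7 + [12.0] * 68
response = event_study_response(series, event_index=180)
print(f"spending in the refund week vs. baseline: {response:+.0f}%")
```

Aligning every account on its own refund date, rather than on the calendar date, is what lets a calculation of this kind separate the arrival of cash from seasonal patterns in spending.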

2.2 Consumer spending around job loss and the expiration of unemployment insurance benefits

We constructed an event study around the direct deposit of a first unemployment insurance (UI) benefit for 160,000 account holders who received up to 6 months of benefits starting in 2014 (Farrell et al., 2016). A spell of unemployment which qualifies for UI benefits is likely to have been unanticipated in terms of both timing and duration.² Furthermore, as shown in the green line in

2. To qualify for UI benefits, a job separation must meet specific conditions, including that it must be involuntary (for example, a layoff or a company restructuring).


FIGURE 6.5 Spending and income both drop sharply at the onset of a spell of unemployment.

Fig. 6.5, such a spell tends to entail a decline in income lasting at least 2 years. This implies that the job separation itself may have brought news of a decline in permanent income. Therefore, the fact that spending declines almost immediately with the onset of an unemployment spell (as shown in the blue line in Fig. 6.5) would be consistent even with the most stylized representation of PIH.

Leveraging the granular view of spending afforded by transaction-level data, we can characterize the decline in spending more thoroughly. The sharpest drops are in spending on flights and hotels, restaurant and entertainment, retail, and transport. Each of these categories declines by 9% to 11%. Some of this decline likely reflects a decreased need for work-related spending. When someone stops working, they no longer need to pay every day for expenses such as gas to get to work and food at the cafeteria. We also see modest declines in spending on groceries (4%) and utilities (2%). Out-of-pocket medical expenses actually increase by 4%, perhaps in part reflecting the impact of losing employer-provided health insurance.

These adjustments at the start of a spell of unemployment fit neatly with almost any hypothesis about consumer behavior that has rational expectations and intertemporal optimization at its core because it is entirely plausible that the beginning of a spell of unemployment might change a family’s beliefs about its lifetime income. However, UI policies in every state explicitly lay out the maximum duration of benefits. For those who remain unemployed for longer than that duration, the end of unemployment benefits represents a perfectly predictable drop in income. Yet, as shown in the blue lines in Fig. 6.6, spending drops sharply in response to this drop, a pattern that is


FIGURE 6.6 Among the long-term unemployed, spending drops sharply again at the exhaustion of UI benefits, even though that event is perfectly predictable.

impossible to reconcile with almost any rational expectations model of consumer behavior.³ Furthermore, as shown in Fig. 6.7, the consumer welfare impacts of this second spending adjustment are likely even more severe than the first. While spending on groceries, utilities, and debt service was not sharply impacted by the beginning of a spell of unemployment even among the long-term unemployed, spending in these categories drops sharply with exhaustion of UI benefits.⁴ Families would almost certainly prefer smoother spending patterns for nondurables like these. Cuts to payments on credit cards (17%), auto loans (9%), and mortgages (6%) when UI benefits run out are more modest than cuts to student loan payments (27%). One possible explanation is that the consequences of mortgage and auto delinquency (repossession) and credit card delinquency (loss of a liquid buffer) are more severe than the consequences of student loan delinquency. In addition, income-based repayment policies may allow people in some states to suspend or reduce their student loan payments when their income drops.

3. One way to make this pattern fit a rational expectations hypothesis might be to conjecture that these families would be gradually learning how hard it will be to find a new job, and so they revise their expectations about their intertemporal budget constraint downward a second time as they learn, dropping spending again. However, comparing the dotted blue to the solid blue line in Fig. 6.7 calls that conjecture into serious question; in states where unemployment benefits run out sooner, spending drops sooner. There is no plausible explanation for why families would suddenly revise their expectations in such close coincidence with their states’ policies regarding the end of the benefit period.

4. See also Figs. 3 and 6 of Ganong and Noel (2017).


FIGURE 6.7 The second spending adjustment among the long-term unemployed (in response to the exhaustion of UI benefits) involves deep cuts on nondurables and significant cuts to debt servicing.

These patterns reflect important limitations to any understanding of consumer behavior that is built on a framework of rational expectations. As families can foresee the drop in income caused by the exhaustion of UI benefits by tracking their benefits payments on a calendar, any model built on rational expectations and intertemporal optimization would imply that as the date of exhaustion draws nearer, they will make more and deeper spending adjustments in preparation. On the other hand, as illustrated by the fact that the blue lines are significantly smoother than the green lines in Figs. 6.6 and 6.7 (and as shown in Ganong and Noel, 2018), families on average do manage to smooth over the income changes to some extent. The patterns are therefore not consistent with any single canonical stylized model: neither the most stylized representation of PIH, nor the standard “buffer stock” model (Carroll, 1997), nor a stylized model of families living hand to mouth. One important possibility discussed by Ganong and Noel (2018) is that the average patterns indicated here might reflect heterogeneity among families, with distinct groups of families behaving in concordance with each of these different models.
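The contrast drawn in this section between a forward-looking household and a hand-to-mouth household can be illustrated with a deliberately simple simulation. It is a sketch under strong assumptions that do not come from the underlying studies: benefits are the only income, there is no interest or borrowing, the length of the unemployment spell is known in advance, and the benefit amount and spell length are hypothetical.

```python
def spending_paths(monthly_benefit, benefit_months, total_months):
    """Return (hand_to_mouth, smoothed) monthly spending paths.

    Stylized assumptions (illustrative only): no other income, no interest,
    no borrowing, and an unemployment spell of known length.
    """
    income = [monthly_benefit if m < benefit_months else 0.0
              for m in range(total_months)]
    # Hand-to-mouth household: spends exactly what arrives each month.
    hand_to_mouth = list(income)
    # Forward-looking household: spreads total benefits evenly over the spell,
    # saving during the benefit months and drawing down savings afterward.
    level = sum(income) / total_months
    smoothed = [level] * total_months
    return hand_to_mouth, smoothed


def drop_at_exhaustion(path, benefit_months):
    """Spending change between the last benefit month and the month after."""
    return path[benefit_months] - path[benefit_months - 1]


htm, smooth = spending_paths(monthly_benefit=1200.0, benefit_months=6, total_months=9)
print("hand-to-mouth drop at exhaustion:", drop_at_exhaustion(htm, 6))
print("forward-looking drop at exhaustion:", drop_at_exhaustion(smooth, 6))
```

In this toy setting the forward-looking household shows no change in spending at exhaustion, while the hand-to-mouth household cuts spending by the full benefit. The observed pattern, a sharp but partial drop, sits between these two extremes.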

2.3 Consumer spending and the decline of gas prices between 2014 and 2015

Gas prices were 25% lower in 2015 than in the prior year ($2.60 in 2015 compared with $3.47 in 2014).⁵ This price change was unpredictable in terms

5. Gas prices fell sharply in the fourth quarter of 2014 and remained low for most of 2015 despite some fluctuations. Starting at a peak monthly price of $3.77 in June 2014, prices fell precipitously and continuously to a trough of $2.21 in January 2015 (Fig. 6.1). Using a year-over-year comparison to account for seasonality, November 2014 was the first month in which gas prices were lower than in the prior year. National average gas prices subsequently rose in the first half of 2015 to $2.89 in June, and prices reached a high of $4 in California.


of both timing and duration. If families believed the change in prices was to be relatively short-lived, then we should see no material change in consumption behavior. If, on the other hand, families believed the change represented a structural shift, such that prices would remain low over the long run, then this may have led them to revise their beliefs about their total lifetime purchasing power. In that case, an increase in total spending would be consistent with almost any rational expectations model of consumer behavior. However, given that fuel represents just 5% of the average family’s total spending, the total lifetime purchasing power impact of even such a steep drop in fuel prices is relatively small. We examined a sample of 1 million core Chase customers to ascertain the magnitude of savings households experienced from lower gas prices, and whether and on what they spent these windfall gains; our findings are summarized in Fig. 6.8 (Farrell and Greig, 2016). We estimated that the 25% drop in gas prices generated a potential savings of $632 for middle-income households, of which only 42% went to durables purchases or savings. About three-fifths of the remaining 58% went to spending on nondurable goods and services other than fuel such as restaurants, retail, and groceries, while the rest went toward increased consumption of fuel. Some of this increase in fuel spending represented substitution away from other transit options.
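The arithmetic behind these shares can be laid out explicitly. The sketch below uses only figures quoted in this section; the one assumption is that "about three-fifths" is treated as exactly 0.6, so the resulting dollar amounts are approximations rather than reported results. Variable names are illustrative.

```python
# Figures quoted in this section.
price_2014, price_2015 = 3.47, 2.60
potential_savings = 632.0        # middle-income households
share_durables_or_saved = 0.42

# Year-over-year price decline, roughly 25%.
price_drop = (price_2014 - price_2015) / price_2014

# The remaining share went to nondurables, including additional fuel.
share_nondurables = 1.0 - share_durables_or_saved

# Assumption: "about three-fifths" is taken as exactly 0.6.
share_nonfuel_of_nondurables = 0.6

durables_or_saved = potential_savings * share_durables_or_saved
nondurables = potential_savings * share_nondurables
nonfuel_nondurables = nondurables * share_nonfuel_of_nondurables
extra_fuel = nondurables - nonfuel_nondurables

print(f"price drop: {price_drop:.1%}")
print(f"durables or savings: ${durables_or_saved:.0f}")
print(f"nonfuel nondurables: ${nonfuel_nondurables:.0f}")
print(f"additional fuel: ${extra_fuel:.0f}")
```

Note that the 58% nondurables share includes the additional fuel purchases, since fuel is itself a nondurable.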

FIGURE 6.8 Consumers spent 58% of their potential savings from lower fuel prices on nondurables.


If the price change is certain to be short-lived, then a family seeking to smooth its consumption would save almost all of the dollars freed up by the price change in preparation for when prices returned to normal. If the price change is certain to be permanent, then they would be able to spend all of it. So, “should” families have spent 58% of these freed up dollars on nondurables? As it turns out, average fuel prices in 2017 were still at similar levels to 2015, suggesting that so far the decision to spend almost three-fifths of the freed up dollars may reflect about the right amount of optimism regarding the duration of the price change in this particular case. However, as we illustrate in the next example, consumer responses to price changes do not always work out so well.

2.4 Consumption, investment, and mortgage resets

People increase their spending when they receive news that their adjustable-rate mortgage (ARM) payment will reset downward, and then increase it again when the reset actually takes effect. We examined a sample of 4321 deidentified US homeowners with a 5/1 ARM originated between April 2005 and December 2007, which reset to a lower interest rate between April 2010 and December 2012 (Farrell et al., 2017). Under Federal mortgage servicing rules, each of these homeowners was notified in advance of the adjustment what their new interest rate and monthly payment would be. When they received this information, they were also told that the new payment amount would last at least a year. We examined how these homeowners changed their credit card spending and revolving balances in response to the news of this predictable price change of known duration, as well as in response to the price change itself.

Fig. 6.9 shows the pace of credit card spending around the change in these homeowners’ mortgage payments. Average spending increased sharply about 8 months before the rate reset and again about 2 months prior. This is consistent with a family responding to news that its lifetime purchasing power has increased slightly, as the cost of its housing will be lower for at least a year. Spending increased sharply again in the month when the rate actually reset, even though this second change was perfectly predictable.

Looking across categories of spending (Fig. 6.10), we observed variation in the staging of the spending response. Spending on auto repair, transportation, home improvement, and services was highly responsive to news of the coming rate change and did not change much when rates actually changed, whereas spending on staples, healthcare, retail, and leisure was more responsive to the predictable change in payment amount than to news.
With the notable exception of healthcare, it is likely that spending changes in response to the news had broader consequences for families’ well-being than changes in response to the actual cash flow event (for example, delaying an auto repair likely has broader impacts on a family’s ability to meet its needs than delaying a retail purchase).


FIGURE 6.9 Credit card spending among consumers increased in response to information about a decline in their monthly mortgage payment and then increased again when the lower payment actually took effect.

FIGURE 6.10 Homeowners responded to news about a rate reset by increasing spending in some categories and to the rate reset itself by increasing spending in others.

[Fig. 6.11 chart: Cumulative average change in income, spending, and revolving balance. Income increase from mortgage reset: $8,964. Total spending increase: $9,327 ($8,399 likely paid for out of income; $928 financed on credit card). Revolving balance increase: $928 ($363 excess spending; $565 excess financing).]

FIGURE 6.11 The credit card spending response to a mortgage rate reset exceeded the total decline in housing costs by 4%, but their credit card debt increased even more than was necessary to cover the excess.

Overall, as shown by the left column in Fig. 6.11, lower payments freed up about $8964 per year for the average homeowner over the year following the rate reset. However, as shown by the middle column, the annual increase in credit card spending exceeded those freed-up dollars by 4%. Any understanding of consumer behavior that is built around a rational expectations framework would indicate that families should only increase consumption by more than the reduction in payments if they believed that their payments would continue to decline into the future. However, as the right column in Fig. 6.11 illustrates, the increase in revolving credit card balances was 2.6 times larger than the excess spending response. Simply by reallocating the dollars freed up by lower housing costs, these homeowners could have had exactly the same spending response with considerably lower credit card debt.

Regardless of how optimistic they might have been about future interest rate adjustments, the average homeowner overreacted to the price change, increasing their consumption more than their long-run income increased. This is especially clear because for the vast majority of homeowners, any increase in expected lifetime purchasing power resulting from lower housing costs was only offsetting a decline in lifetime wealth from lower home values. Median home values had declined by $84,000 in the period between when these mortgages were originated and when the payment resets occurred.

These patterns suggest that the impact of a change in monthly housing payments on monthly nonhousing consumption exceeds its impact on lifetime purchasing power. These implications are further supported by later research on the impacts of mortgage modifications that were offered by the Home Affordable Modification Program (HAMP) in the wake of the Great Recession (Ganong and Noel, 2017; Farrell et al., 2017b). That research


indicated that modifications which included a principal write-down were no more effective at reducing defaults than those that did not. These two classes of modifications had similar impacts on monthly payments, but very different impacts on total lifetime housing costs. Principal reduction that did not result in an incremental reduction in monthly mortgage payment had little impact on immediate cash flows and therefore little impact on default. Principal reduction also had no impact on consumption. Those who received payment and principal reduction spent no more than those who received only payment reductions, despite the wealth effect of principal reduction.
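The "4%" and "2.6 times" figures discussed earlier in this section follow directly from the dollar amounts shown in Fig. 6.11. The short calculation below reproduces them from those amounts; the variable names are illustrative, and no data beyond the figure's labels is used.

```python
# Dollar amounts as labeled in Fig. 6.11 (annual, average homeowner).
freed_up_by_reset = 8964.0      # income increase from mortgage reset
spending_increase = 9327.0      # total credit card spending increase
revolving_balance_increase = 928.0

# Spending beyond what the lower mortgage payment freed up.
excess_spending = spending_increase - freed_up_by_reset

# Excess spending as a share of the freed-up dollars (about 4%).
excess_share = excess_spending / freed_up_by_reset

# New revolving debt relative to the excess spending (about 2.6 times).
balance_multiple = revolving_balance_increase / excess_spending

# Borrowing that was not needed to fund the excess spending.
excess_financing = revolving_balance_increase - excess_spending

print(f"excess spending: ${excess_spending:.0f} ({excess_share:.1%} of freed-up dollars)")
print(f"balance increase is {balance_multiple:.1f}x the excess spending")
print(f"excess financing: ${excess_financing:.0f}")
```

The last quantity is the sense in which these homeowners could have had the same spending response with less credit card debt: that portion of the new revolving balance was not required to cover spending above the freed-up dollars.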

2.5 Income fluctuations around mortgage defaults

The affordability of a mortgage is traditionally gauged by long-run metrics like the ratio of steady-state debt payments to expected average monthly income over the period of the loan.⁶ However, short-run fluctuations in income and prices are also of primary importance. This is evident from the fact that not only do people increase their spending when their mortgage payment drops, but people also stop paying their mortgage when their income drops temporarily (Farrell et al., 2017b). Mortgage default closely followed a sharp drop in income, and mortgage payments recovered as income recovered (Fig. 6.12). This pattern held regardless of where homeowners fell in terms of traditional affordability metrics (e.g., payment-to-income ratio) or the amount of equity they had in their homes (e.g., “abovewater” borrowers, who have positive home equity in that they owe less on their home than their home is worth vs. “underwater” borrowers, who have negative home equity in that they owe more on their home than their home is worth). The income drop before default was similar across all of these groups, suggesting that it was an income shock rather than a high payment burden or negative home equity that triggered default. Payment-focused mortgage debt reduction was as effective at slowing default as principal-focused mortgage debt reduction precisely because it targeted relief to a household’s cash flow rather than to their balance sheet.

3. Conclusion

Leveraging high-frequency, finely categorized data on income and expenditure of millions of households, we have described how expenditure responds to four specific changes in prices and income with varying degrees of impact and predictability. Consistent with the large body of evidence indicating limitations to consumption smoothing, we observe spending responses even to

6. For example, for a lender to get all of the regulatory and financial advantages associated with Qualified Mortgages, they must confirm that all of the borrower’s steady-state debt payments amount to no more than 43% of their expected average monthly income.


FIGURE 6.12 On average, a substantial negative income shock preceded default for both below-median and above-median mortgage payment-to-income borrowers and both abovewater and underwater borrowers.

changes that likely had no impact on lifetime purchasing power. We are able to go beyond confirming the general principle that consumption smoothing is limited, showing that expenditure responds to both information about a future price or income change and the (temporary) change in purchasing power when it actually occurs. Furthermore, we find that expenditures for which timing is likely to have welfare implications, such as healthcare, adjust upward in response to positive news and downward in response to negative events. These findings indicate consumption patterns that violate not only the most stylized representation of the PIH but also contemporary adaptations built on a rational expectations framework. At the same time, the evidence indicates that consumers do not simply live hand to mouth; in two cases (mortgage rate resets and job separations), we observed consumers reacting to news about changes to lifetime purchasing power even before those changes took effect. This suggests that consumers do manage their cash flow and take the future into account when making consumption decisions. Understanding the nature


and extent of limitations to consumption smoothing is crucial for the design of optimal policies and financial product offerings. We offer a few specific examples below.

Consumers need tools that make it easier to budget, save, and adjust. In response to perfectly forecastable changes in their long-run purchasing power, we have observed consumers underreacting to the prospect of a decline (expiration of UI benefits) and overreacting to an improvement (mortgage rate resets). The cognitive burden associated with reprogramming a family’s spending, saving, and borrowing when circumstances change is almost never taken into account when consumption behavior is analyzed through the lens of a rational expectations framework. Many families would likely benefit from financial tools that help with the complicated process of making these adjustments. These tools would make budgeting more intuitive and make it easier to stick to a plan with “set-it-and-forget-it” (or “reset-it-and-forget-it”) functionality. For example, if more UI recipients were given a tool to preallocate a fraction of their benefits to a sidecar account to provision for the day they exhaust their eligibility, we may not have observed the sharp declines in grocery spending. In addition, many might benefit from simpler financial offerings that make it less likely these kinds of adjustments will be necessary in the first place.

Families need more shock absorbers, especially for healthcare. In three of our four examples (tax refunds, mortgage resets, and UI expiration), we observe predictable changes in cash flow having sharp and significant impacts on many important expenditure categories, including healthcare. Our research has also consistently shown that these impacts are smaller for families with higher levels of cash savings. It should be easier for families to save for emergencies. During tax filing season, families appear to look to their refunds to access a large cash infusion, but emergencies happen throughout the year. Health savings accounts, medical reimbursement accounts, and other tax-advantaged savings vehicles make it possible for families to provision specifically for health emergencies, but they carry complicated rules and impose significant penalties if the family ends up needing the cash for a nonhealth emergency. Policies and tools that enable families to provision more effectively might include many of the useful features of these existing tools, but without these rigidities.

Means testing for social safety nets should focus on income levels, not just asset levels. Asset tests penalize families for building up reserves necessary to smooth consumption. The impact of even short-run changes in purchasing power on current consumption exceeds their impact on lifetime purchasing power. Families who have provisioned for emergencies should not be forced to exhaust those savings before they can get help with managing the impacts of volatility.

Underwriting standards should account for income volatility, not merely steady-state income. Static affordability metrics like the ratio of steady-state


debt payments to expected average monthly income are not good predictors of default risk, in part because even transitory disruptions to income have larger impacts than would be predicted by any rational expectations model. Dynamic approaches which account for cash flow dynamics and forecast income and expense risk should be more widely integrated into underwriting practices.

There is value in tracking even transitory fluctuations in income and spending. For most economic monitoring applications, income and spending measures are adjusted to smooth out predictable variation (for example, seasonal variation). These adjustments are implicitly motivated by assumptions about the welfare impacts of predictable fluctuations. The findings we report here suggest that those assumptions should be revisited for some applications. In general, the justification for smoothing should be more explicit and question-specific.

We need a better theoretical framework for understanding consumer behavior. Many of the patterns we report here, and other patterns that have been consistently established in academic studies, are inconsistent with the workhorse rational expectations framework that has helped to organize thinking around consumer behavior. Some scholars have attributed a lack of consumption smoothing to liquidity constraints, but these models would fail to explain the excess consumption we observed, for example, in the consumption response leading up to and after mortgage resets (Carroll, 1997; Kaplan et al., 2014). Behavioral scientists have offered alternative frameworks that relax assumptions of rational expectations and intertemporal optimization, but none are as comprehensive or broadly applicable (for example, Madrian and Shea, 2001; DellaVigna and Malmendier, 2006; Olafsson and Pagel, 2018). Key features of some of these models include more robust conceptions of how consumers form and update expectations about the future, and how they approach intertemporal trade-offs. Many of these features have been used to extend the constrained optimization paradigm that lies at the foundation of modern microeconomics, but these extensions are often very application-specific. The design of optimal public policies and financial product offerings would benefit immeasurably from a coherent model of consumer behavior that captures these nuances. Products and policies designed to serve perfectly forward-looking consumers, or entirely myopic hand-to-mouth consumers, will always be of limited use to most real-world consumers.

Appendix

FIGURE 6.13. Description of sampling criteria and data asset for each study.


Event

Sampling criteria and data asset

Arrival of tax refund (Farrell et al., 2018a, 2018b), Figs. 6.2–6.4

This report draws on the JPMC Institute Healthcare Out-of-pocket Spending Panel (JPMCI HOSP) data asset and examines how healthcare payments vary in the days and weeks around when account holders receive their tax refunds. We analyze average out-of-pocket healthcare expenditure on over a dozen categories of healthcare goods and services for each day in the 100 days before and after a tax refund payment, for 1.2 million checking account holders in the JPMCI HOSP who received a tax refund between 2014 and 2016. The JPMCI HOSP data asset was constructed using a sample of deidentified core Chase customers for whom we observe financial attributes, including out-of-pocket healthcare spending between 2013 and 2016. For the purposes of our research, the unit of analysis was the primary account holder. We focused on accounts held by adults aged 18–64, as adults 65 and older were more likely to make payments using paper checks, which we could not categorize. To provide better visibility into income and spending, we selected accounts which met the following criteria:
1. Had at least five checking account outflows each month
2. Had at least $5000 in take-home income each year
3. Used paper checks, cash, and non-Chase credit cards for less than 50% of their total spending.

The JPMCI HOSP data asset includes customers who resided within the 23 states in which JPMorgan Chase has a retail branch presence. We reweighted our population to reflect the joint age and income distribution among the 18–64-year-old population within each state.

Job loss (Farrell et al., 2016), Figs. 6.5–6.7

From a universe of 28 million Chase checking account holders, this report assembled an anonymized sample of 160,000 families across 18 states who met the following five criteria:
1. Received direct deposit of their first unemployment insurance (UI) check after December 2013 and their last UI check before June 2015
2. Received UI for six or fewer contiguous months
3. Experienced one spell of receiving UI benefits
4. Live in states that offer 26 weeks of UI benefits. In finding 3, we compare this group to UI recipients in Florida, where benefits lasted 16 weeks in 2014 and 14 weeks in 2015
5. Have at least five outflows from their checking account in the 3 months before and after UI receipt.

Among our sample, median earnings among families that received UI benefits were $4,540, roughly comparable with the national median of $5,106. Among the families in our sample, we studied income and spending by analyzing inflows into and outflows from the checking account as well as on Chase debit and credit cards. We defined income as all inflows which are not explicitly categorized as transfers from other financial accounts, and we rescaled take-home labor income into pretax dollars. We defined spending to include debit card expenditures, Chase credit card expenditures, consumer debt payments (mortgages, auto loans, non-Chase credit cards, and student loans), bills (e.g., electricity, cable, insurance), and cash withdrawals from the ATM.


Gas price decline (Farrell and Greig, 2016), Fig. 6.8

For this report we rely on JPMorgan Chase anonymized data on consumer clients who are primary account holders. To avoid double counting of financial activity, all joint accounts are captured under the primary account holder. From a universe of over 28 million anonymized checking account holders, we created a sample of approximately one million debit card holders who meet the following sample criteria:
1. They have a checking account and at least five outflow transactions from their checking account per month between October 2012 and January 2016.
2. They do not hold a gas station-specific card.
3. They live in a zip code with at least 140 households in our sample.
4. They live in a metro area with at least 5 zip codes and at least 750 households in our sample.

These criteria give us confidence that we are focusing on core Chase clients and have sufficient coverage of the geographic areas in which we assess the impact of low gas prices on spending behavior. These criteria constrain our sample to the 23 states with Chase branch locations. The demographic characteristics of this sample are slightly different from the nation in that the sample overrepresents primary account holders between 25 and 54 years old, men, households in the West, and households with higher incomes compared to the US population.

Mortgage resets (Farrell et al., 2017), Figs. 6.9–6.11

From a universe of over 6 million mortgage customers, we created a sample of 4321 homeowners who met the following criteria:
1. Had one 30-year 5/1 adjustable-rate mortgage (ARM) originated between April 2005 and December 2007 that reset to a lower rate between April 2010 and December 2012
2. Had not modified or refinanced their mortgage before reset
3. Made interest-only or interest plus principal payments

To connect the impact of ARM resets to changes in spending, we then filter our sample to include those customers who have a Chase credit card that
4. Was active at least 24 months before the reset date of their ARM and
5. Had a median of at least 10 transactions per month in the 24-month window surrounding the reset date of their ARM.

We require the Chase credit card to be active at least 24 months before the reset date of their ARM to eliminate households who opened a credit card just before reset to make a large purchase. We also require a median of at least 10 transactions per month in the 24-month window surrounding the ARM reset date to eliminate households whose Chase credit card is not sufficiently active to be representative of their consumption.


For our sample, we observe loan amount, term, interest rate, monthly payment, home value estimate, monthly credit card spending, spending by category, revolving balance, and credit limit. We also have access to demographic information such as customer age and annual income. Our sample is not perfectly representative of the typical household with any type of mortgage. Our loan amounts and LTVs are in line with the Federal Housing Finance Agency benchmarks, while our mortgage rates are somewhat lower. The sample also exhibits higher income levels than Survey of Consumer Finances benchmarks. This is partially the result of studying hybrid ARMs. The income of our sample also increases as we screen for credit card holders and sufficient credit card activity.

Income fluctuations around mortgage default (Farrell et al., 2018a,b), Fig. 6.12

For this analysis, we included homeowners who met the following criteria:
1. Chase customers with only one mortgage and a Chase deposit account
2. First default date between October 2013 and October 2014
3. Had monthly mortgage data and an active deposit account for at least 12 months before and 12 months after default.

We used this sample of 10,815 mortgages to analyze the correlation between income and default. We then split this sample into above- and underwater borrowers and analyzed each subsample separately. To examine the correlation between income and default split by above- and below-median premodification mortgage payment to income, we used the sample above but added a restriction that the borrower must have received a modification in the 12 months before or after default. This limitation is necessary because we sourced borrower income from their modification application. This generated a sample of 1807 mortgages.
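The reweighting step mentioned for the tax refund sample (matching the joint age and income distribution within each state) can be illustrated with a minimal post-stratification sketch. This is not the authors' procedure, which the appendix does not spell out; the function name `poststratify`, the cell labels, and the target shares below are hypothetical and chosen only to show the mechanics.

```python
from collections import Counter

def poststratify(sample_cells, population_shares):
    """Return one weight per sampled account (hypothetical helper).

    sample_cells: list of (state, age_band, income_band) tuples, one per account.
    population_shares: dict mapping each cell to its target share of the
        population within that state (shares for one state sum to 1).
    """
    cell_counts = Counter(sample_cells)
    state_counts = Counter(cell[0] for cell in sample_cells)
    weights = []
    for cell in sample_cells:
        # Share of this cell among sampled accounts in the same state.
        sample_share = cell_counts[cell] / state_counts[cell[0]]
        # Up-weight cells that are underrepresented, down-weight the rest.
        weights.append(population_shares[cell] / sample_share)
    return weights

# Illustrative data: three younger low-income accounts, one older high-income account.
sample = [("NY", "18-34", "low")] * 3 + [("NY", "35-64", "high")]
targets = {("NY", "18-34", "low"): 0.5, ("NY", "35-64", "high"): 0.5}
weights = poststratify(sample, targets)
```

In this toy sample the overrepresented cell receives a weight below one and the underrepresented cell a weight above one, so the weighted cell shares equal the targets within the state.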

References

Bernanke, B.S., 1984. Permanent income, liquidity, and expenditure on automobiles: evidence from panel data. Quarterly Journal of Economics 99 (3), 586–614.
Browning, M., Lusardi, A., 1996. Household saving: micro theories and micro facts. Journal of Economic Literature 34 (4), 1797–1855.
Carroll, C.D., 1997. Buffer-stock saving and the life cycle/permanent income hypothesis. Quarterly Journal of Economics 112 (1), 1–55.
DellaVigna, S., Malmendier, U., 2006. Paying not to go to the gym. The American Economic Review 96 (3), 694–719.
Farrell, D., Greig, F., 2016. The Consumer Response to a Year of Low Gas Prices: Evidence from 1 Million People. JPMorgan Chase Institute.
Farrell, D., Greig, F., Hamoudi, A., 2018a. Deferred Care: How Tax Refunds Enable Healthcare Spending. JPMorgan Chase Institute.
Farrell, D., Greig, F., Hamoudi, A., 2018b. Filing Taxes Early, Getting Healthcare Late: Insights from 1.2 Million Households. JPMorgan Chase Institute.
Farrell, D., Bhagat, K., Narasiman, V., 2017a. The Consumer Spending Response to Mortgage Resets: Microdata on Monetary Policy. JPMorgan Chase Institute.
Farrell, D., Bhagat, K., Ganong, P., Noel, P., 2017b. Mortgage Modifications After the Great Recession: New Evidence and Implications for Policy. JPMorgan Chase Institute.
Farrell, D., Ganong, P., Greig, F., Noel, P., 2016. Recovering from Job Loss: The Role of Unemployment Insurance. JPMorgan Chase Institute.
Ganong, P., Noel, P., 2017. The Effect of Debt on Default and Consumption: Evidence from Housing Policy in the Great Recession. Mimeo, Harvard University. Available at: https://scholar.harvard.edu/ganong/publications/effect-debt-default-and-consumption-evidence-housing-policy-great-recession.
Ganong, P., Noel, P., 2018. Consumer Spending During Unemployment: Positive and Normative Implications. Mimeo, Harvard University. Available at: https://scholar.harvard.edu/ganong/publications/how-does-unemployment-affect-consumer-spending-job-market-paper.
Gelman, M., Kariv, S., Shapiro, M.D., Silverman, D., Tadelis, S., 2015. How Individuals Smooth Spending: Evidence from the 2013 Government Shutdown Using Account Data. No. w21025. National Bureau of Economic Research.
Hall, R.E., 1978. Stochastic implications of the life cycle-permanent income hypothesis: theory and evidence. Journal of Political Economy 86, 971–988.
Internal Revenue Service (IRS). Filing Season Statistics for Week Ending, 2016. Internal Revenue Service, 2017. Available at: https://www.irs.gov/newsroom/filing-season-statistics-for-the-week-ending-december-30-2016.
Jappelli, T., Pistaferri, L., 2017. The Economics of Consumption: Theory and Evidence. Oxford University Press.
Kaplan, G., Violante, G.L., Weidner, J., Spring 2014. The wealthy hand-to-mouth. In: Brookings Papers on Economic Activity.
Madrian, B.C., Shea, D.F., 2001. The power of suggestion: inertia in 401(k) participation and savings behavior. Quarterly Journal of Economics 116 (4), 1149–1187.
Olafsson, A., Pagel, M., 2018. The liquid hand-to-mouth: evidence from personal finance management software. The Review of Financial Studies 31 (11), 4398.

Chapter 7

Spending to and through retirement

Sharon Carson, Katherine Roy, Je Oh, Joseph Marlo
Retirement Solutions, J.P. Morgan Asset Management, New York, NY, United States

1. Introduction

Planning for retirement is difficult because there are so many unknowns, such as how long you will live, what investment return you may experience, and how much your purchasing power may be eroded over a long time frame. Assumptions related to these questions are regularly scrutinized, updated, and debated. Less prominent, but just as critical, is how much you may spend in the years leading up to and through retirement. Assumptions about spending have profound implications when trying to figure out the retirement planning puzzle, including how much to save, when to retire, how to invest, and whether spending more early in retirement, when individuals are more active, may be prudent.

Most individuals, whether they have a formal plan for retirement or not, use their current level of spending as a baseline. This is often the starting point because individuals generally have an understanding of their current spending and lifestyle and they lack information about how these will change with retirement and age. Some financial planners and planning tools may rely on a default assumption of slightly reduced spending at retirement. After retirement, expenses are usually assumed to increase with inflation to preserve purchasing power.

Using a unique data set (J.P. Morgan Chase information that has been deidentified), we are able to help individuals and their financial advisors make better assumptions. We have data for over 31 million households. We included clients who conducted significant banking activity and likely did most of their spending using Chase payment methods, which narrowed our focus to 5.8 million households. For analysis of the transition period around retirement, we analyzed nearly 60,000 households. We started exploring spending patterns in 2014 and have significant findings in two major areas: the life cycle of

Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00007-X Copyright © 2019 Elsevier Inc. All rights reserved.


spending, which is a long-term view from peak spending years through retirement, and a short-term view of the retirement transition period. Changes to retirement planning assumptions that are relatively small may make a significant difference in outcomes.

Our study on the long-term trend of spending in retirement is not longitudinal, but our findings are generally consistent with cohort and longitudinal studies cited in the literature review. Given the characteristics of our data set, we are able to provide unique insights for households who have higher wealth levels. The main points related to the life cycle of spending are the following:

• Traditional financial plans and financial planning models may overestimate spending in retirement. For those with at least $500,000 in assets, the result may be an overestimation of real spending by about 30% over 30 years. This may lead to an unnecessarily constrained lifestyle early in retirement or working longer than required (Roy, 2014).
• For all wealth groups, it is important to consider the composition of spending, how that may change over time, and the inflation rates for each spending category. Long-term care expenses are fundamentally different and should be accounted for separately.

Spending in the retirement transition period is an emerging, yet critical area of research that few researchers have explored in depth. Our data set is especially useful for analyzing this period, since we have annual and monthly longitudinal results for the same household for up to four consecutive years. Our most significant result:

• As households shift into the retirement transition period, the majority have extraordinary spending just before and/or just after retirement. This spending volatility may have a negative impact on portfolios and lifestyle if it is unanticipated. The consequences of spending spikes will be worse if they are concurrent with poor returns on investments.

2. Description of data sources

Research to date on spending patterns related to retirement has largely relied on publicly available sources. These include the Bureau of Labor Statistics' Consumer Expenditure Survey (CE Survey) or the University of Michigan's Health and Retirement Study (HRS) paired with the Consumption and Activities Mail Survey (CAMS).

2.1 CE Survey

The CE Survey is a nationwide household survey conducted by the US Bureau of Labor Statistics to measure how Americans spend their money. It consists of


two separate components: the Interview Survey and the Diary Survey. The Interview Survey, collected quarterly from approximately 6900 households, is used to collect data on large and recurring expenditures that consumers may be expected to recall for 3 months or longer, such as rent and utilities. The Diary Survey is answered by about 6900 households per year for two consecutive 1-week periods. It is designed to collect data on small, frequently purchased items, including most food and clothing purchases. New respondents are selected each year for both the Interview and Diary Surveys (US Bureau of Labor Statistics, 2016).

The CE Survey is one of the most comprehensive views of spending and spending categories available for US consumers. In addition to data on spending, the Survey collects information on amount and sources of family income, assets and liabilities, and demographic characteristics of family members. Researchers who use the data to analyze spending patterns often complete cohort studies using the annual data tables, which were made widely available starting in 1984 (US Bureau of Labor Statistics, 2017). Examples of these studies are provided in the literature review.

2.2 HRS/CAMS Survey

HRS is a longitudinal study of US residents that adds new waves of participants over time. It includes information on health, retirement, disability, resources, spending, family support, and demographic characteristics. It is designed to focus solely on those households that include an individual aged 50 years or older. CAMS is a mail survey taken from a subset of the HRS sample. The two surveys are taken in alternate years, but they are designed so that households may be linked. To date, there are 12 waves of core HRS data, with 18,000–23,000 households per wave. The CAMS started in 2001 with a survey of 5,000 households (Institute for Social Research and University of Michigan, 2017). Researchers often use HRS/CAMS data to track the same households for many years.

2.3 Chase data

Our data set is based on spending transaction records, which gives us a different view than survey data. This allows for a more accurate record, since interviewees' memories and journaling habits may fail at times. The tendency of people to misreport their financial records has been documented by researchers who have analyzed other types of data, such as surveys of consumer debt (Zinman, 2009; Brown et al., 2015). Also, the number of records we have to work with is much larger than in publicly available data. This allows us to parse the households into segments that are large enough to have greater confidence that the results are representative of the group being analyzed.


Categorization is not as complete as with the surveys, given that we are unable to classify paper checks and cash spending. In addition, some transactions may be completed at other institutions. The description of how we mitigate these issues is provided later in this chapter.

For studies that may be helpful to financial advisors, we have a large number of records which are typically representative of their client base, including higher wealth households. Specifically, our population includes a significant number of households that have $500,000 or more in financial wealth, including households with $500,000–$1 million and $1 million–$3 million.

The spending transactions are recorded on a frequent basis, which allows for unique insights into spending increases and decreases over short periods of time, what we call spending volatility. The J.P. Morgan Chase Institute has also used Chase data to analyze income and expense fluctuations over short time periods. For example, they found that US consumers in young families (18 to 29 years old) and families above 65 years had increased credit card debt after a large medical payment compared to the year before that payment (J.P. Morgan Institute, 2017a,b).

Data privacy

We have a number of security protocols in place which ensure all customer data are kept confidential and secure. We use reasonable physical, electronic, and procedural safeguards that are designed to comply with federal standards to protect and limit access to personal information. There are several key controls and policies in place to ensure customer data are safe, secure, and anonymous. Before J.P. Morgan Asset Management receives the data, all unique identifiable information, including names, account numbers, addresses, dates of birth, and Social Security numbers, is removed. J.P. Morgan Asset Management has put privacy protocols for its researchers in place. Researchers are obligated to use the data solely for approved research and are obligated not to reidentify any individual represented in the data. J.P. Morgan Asset Management does not allow the publication of any information about an individual or entity. Any data point included in any publication based on customer data may only reflect aggregate information. The data are stored on a secure server and can be accessed only under strict security procedures. Researchers are not permitted to export the data outside of J.P. Morgan Chase's systems. The system complies with all J.P. Morgan Chase Information Technology Risk Management requirements for the monitoring and security of data. J.P. Morgan Asset Management provides valuable insights to policymakers, businesses, and financial advisors, but these insights cannot come at the expense of consumer privacy. We take every precaution to ensure the confidence and security of our account holders' private information.


3. Literature review

There is a significant amount of research related to spending over the course of retirement. Michael Stein documented his thoughts on this 20 years ago, based on his experience as a financial advisor (Stein, 1998). Since that time there have been several quantitative studies on the topic.

3.1 The phased retirement hypothesis

Michael Stein's book The Prosperous Retirement (1998), while not well known, is important because it documented the experience of retirees who were in Stein's client base for many years. Stein hypothesized that spending in retirement occurs in three phases: active, passive, and final. Each phase is largely driven by health. As described by Stein, the "Go-Go" phase is a continuation of a retiree's preretirement lifestyle where work is replaced with free time and health has not substantially declined. During this time, retirees spend about equal to their preretirement spending. This lasts approximately 10 years. After the "Go-Go" period, there is a transition to the more passive "Slow-Go" decade when retirees reduce spending by 20%–30% in real terms. In the final phase, "No-Go," retirees continue to reduce spending across most categories but face unpredictable and potentially large healthcare and long-term care costs.
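To make the three-phase pattern concrete, the sketch below builds a real (inflation-adjusted) spending path and compares its total with a flat path at the preretirement level. It is an illustration under stated assumptions, not Stein's own calculation: the 25% "Slow-Go" cut is simply the midpoint of the 20%–30% range above, the 40% "No-Go" cut is a hypothetical placeholder (the text gives no figure), health and long-term care costs are ignored, and the function name is invented for this example.

```python
def phased_real_spending(preretirement_spending, years,
                         slow_go_cut=0.25, no_go_cut=0.40):
    """Real spending for each retirement year under a three-phase pattern.

    Years 1-10 ("Go-Go"): equal to preretirement spending.
    Years 11-20 ("Slow-Go"): reduced by slow_go_cut (the text gives 20%-30%).
    Years 21+ ("No-Go"): reduced by no_go_cut, a hypothetical value; the
    text says only that spending keeps falling in most categories.
    """
    path = []
    for year in range(1, years + 1):
        if year <= 10:
            factor = 1.0
        elif year <= 20:
            factor = 1.0 - slow_go_cut
        else:
            factor = 1.0 - no_go_cut
        path.append(preretirement_spending * factor)
    return path

# Illustrative household spending $60,000 a year before retirement.
path = phased_real_spending(60_000, 30)
flat_total = 60_000 * 30      # constant real spending for 30 years
phased_total = sum(path)      # total under the phased pattern
```

Comparing `phased_total` with `flat_total` shows how strongly the result depends on the assumed cuts, which is why the later-phase parameters are left adjustable.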

3.2 Quantitative evidence of spending reductions in retirement

Results from early studies that used the CE Survey generally aligned with Stein's hypothesis (Bernicke, 2005; Fisher et al., 2005). Both studies found a significant gradual decline in spending at older ages, after adjusting for inflation. This results in a real reduction in spending of 25% or more by the end of retirement. Fisher was careful to state that spending reductions do not always equate to consumption reductions since many younger households spend to build home equity; many older households continue to live in their own home but do not make mortgage payments.

Subsequent studies used the HRS/CAMS to help discern if retirees were spending less in real terms because of low income and asset levels at older ages (Banerjee, 2012, 2018; Browning et al., 2016). Stated more simply, they wanted to understand if retirees were running out of funds. They found this was true for some households. Banerjee found one-third of households without pensions had 20% or less of their starting assets left after 18 years in retirement. Not surprisingly, households with more income, starting wealth, and pensions were more likely to preserve their assets or increase their assets in retirement. Browning concluded that on average, those in the highest wealth quintile could spend as much as 47% more than they actually spent over the retirement time frame, including an allowance of 40% of starting assets for bequests and long-term care.


A study that utilized both the CE Survey and HRS/CAMS data came to a similar conclusion, with a slight twist (Blanchett, 2013). Blanchett found that on average, there is a real reduction in spending over retirement, after accounting for higher inflation that older individuals experience due to changes in the basket of goods and services they purchase over time. He also noticed an uptick in spending at older ages. This pattern was dubbed “the retirement spending smile.” Blanchett’s conclusion was that while personal situations vary, there is evidence that traditional models may overestimate real spending in retirement by about 20%.

3.3 The retirement transition period

Research on household spending in the retirement transition period is emerging. One of the first studies to examine this topic was from the Employee Benefit Research Institute (Banerjee, 2015). Banerjee found that average spending excluding mortgage principal payments declines the first few years after retirement. However, nearly half of households spent more than they did just prior to retirement; this declines to one-third of households by the sixth year after retirement. The findings were similar across income quartiles.

Research on the retirement transition period also includes insights on households who have a mix of work income and retirement income (Ramnath et al., 2017). The authors noted that self-employment at older ages may serve as a pathway that allows workers to gradually reduce work when they become eligible for Social Security. This was more prevalent for households with greater wealth. Similarly, the J.P. Morgan Institute also looked at self-employment of older households and saw evidence of increasing use by seniors of the online platform economy to generate income. Further information on this trend may be found in the previous chapter.

4. The life cycle of spending

Our analysis, which started in 2014, reinforces the conclusions of other researchers who have studied the decline in spending from the peak earnings years to old age. Our study also adds two new pieces to the puzzle: (1) an analysis of households with significant investable wealth, which provides a strong indication that the tendency for households to reduce spending at older ages is not due to lack of funds for this group and (2) a more complete picture of how the shift in the types of goods and services purchased by households as they age may impact inflation-adjusted spending in retirement.

4.1 Data refinement

We started with over 31 million households that have banking relationships with Chase. For these households, we analyzed de-identified and aggregated


[Figure 7.1 shows the successive data filters and the number of households remaining after each:
• 12.0 million who have at least $500 in deposits and 5 or more expenditures over the year
• 9.2 million who were included in the data every month
• 8.3 million that have customers in only 1 household (households that split due to divorce or other reasons not included)
• 8.0 million that have asset and income data and are between 25 and 100 years of age
• 5.8 million who spent a significant portion of their estimated income at Chase. Included: spent 50% or more of their estimated gross income at Chase OR spent at least the median amount spent by households with similar assets at Chase. Excluded: households with 50% or more of their spending on non-Chase credit card payments.]

FIGURE 7.1 Data filter for spending patterns by age (2016).

data from Chase credit cards (excluding some co-branded cards), debit cards, electronic payments, ATM withdrawals, and check transactions (Chase payment methods). For our analysis on the life cycle of spending, we started with data from January 1 to December 31, 2016, and added the filters shown in Fig. 7.1.

Walking through this data filter in detail: We applied minimum thresholds: we included those who regularly used Chase to transact, defined as those who had at least $500 in deposits and five or more expenditures per month. Then we excluded any customers who were not in the data set for all 12 months of 2016, which eliminated those who opened or closed their accounts during the year. Customers who were in two or more households during the year were also excluded. They likely included recently married or divorced customers, or those who spread their accounts across two families during the analysis period for other reasons. We then screened out customers younger than 25 years, older than 100 years, and those who had no asset or income estimates available.

Our colleagues in the Chase Consumer and Community Bank (CCB) developed rigorous proprietary models to estimate income and investable wealth for their clients with a relatively high degree of confidence. The estimates were provided in ranges of income and investable wealth rather than specific amounts to ensure client anonymity. Estimated income is pre-tax and includes investment income. Estimated wealth includes deposits, mutual


funds, and stock and bond investments held at Chase and away from Chase. Home equity was excluded.

The estimated income and financial wealth from these models helped us include those who were likely to have a significant proportion of their spending facilitated by a Chase payment method. We did this by selecting customers who had either spent at least 50% of their gross estimated income via Chase or who spent at least the median amount spent by other households in the same estimated asset range.

We had two additional exclusions related to the proportion of spending at Chase. The first: those customers who had 50% or more of their annual spending on non-Chase credit cards. (Our data file includes payments to other financial institutions, which allows us to identify payments to non-Chase cards.) While some of these customers may have been paying down old debt, others likely had significant non-Chase spending. The second: outliers in each asset group were eliminated, including households that were in the top 0.1% of spenders in any spending category.

After these filters are applied, we have a very large sample size for our analysis: 5.8 million customers. Our colleagues at the J.P. Morgan Institute typically adjust their samples to be nationally representative. We did not make these adjustments. Households in our data set are likely to have slightly higher income and investable wealth than the general US population since a primary criterion for inclusion is an active bank account. Since we frequently work with financial advisors and retirement plan sponsors, the unadjusted data set is more likely representative of their client base or their employees. We also did not adjust for geographic location, so although Chase customers reside in all states, our view will generally be more representative of the Chase retail bank branch footprint that spans 26 states.²
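The share-of-spending screens described above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the dictionary keys are hypothetical, income is treated as a point estimate although the chapter reports it in ranges, the group median is computed over all households passed in (the chapter does not say at which stage it is taken), and the top-0.1% outlier trim is omitted.

```python
from statistics import median

def select_households(households):
    """Keep households likely to do most of their spending through Chase.

    Each household is a dict with hypothetical keys: 'id', 'est_income'
    (gross), 'asset_range', 'chase_spend' (annual spending through Chase
    payment methods), and 'non_chase_card_payments' (the part of that
    spending that went to credit cards from other issuers).
    """
    # Median Chase spending within each estimated asset range.
    spend_by_range = {}
    for h in households:
        spend_by_range.setdefault(h["asset_range"], []).append(h["chase_spend"])
    group_median = {k: median(v) for k, v in spend_by_range.items()}

    kept = []
    for h in households:
        meets_income_rule = h["chase_spend"] >= 0.5 * h["est_income"]
        meets_median_rule = h["chase_spend"] >= group_median[h["asset_range"]]
        heavy_outside_cards = (
            h["non_chase_card_payments"] >= 0.5 * h["chase_spend"]
        )
        # Included if either spending rule holds; excluded if half or more
        # of spending is payments to non-Chase credit cards.
        if (meets_income_rule or meets_median_rule) and not heavy_outside_cards:
            kept.append(h)
    return kept

# Illustrative records only.
households = [
    {"id": "A", "est_income": 100_000, "asset_range": "low",
     "chase_spend": 60_000, "non_chase_card_payments": 5_000},
    {"id": "B", "est_income": 100_000, "asset_range": "low",
     "chase_spend": 20_000, "non_chase_card_payments": 0},
    {"id": "C", "est_income": 80_000, "asset_range": "low",
     "chase_spend": 45_000, "non_chase_card_payments": 30_000},
    {"id": "D", "est_income": 200_000, "asset_range": "high",
     "chase_spend": 70_000, "non_chase_card_payments": 0},
]
kept_ids = [h["id"] for h in select_households(households)]
```

Household D illustrates why the rule is an either/or test: it spends less than half of its estimated income through Chase but still qualifies by reaching the median for its asset range.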

4.2 The J.P. Morgan expenditure model

Generally, income replacement models and the default settings of financial planning software aim to estimate the amount needed to support a similar lifestyle in retirement. The models often assume a one-time reduction in spending at retirement from mortgage payments stopping, fewer work-related expenses (such as dry cleaning and commuting), and lower taxes. This spending is then adjusted in line with historical inflation or CPI-U each year in retirement.

Our analysis of spending over the life cycle allows us to examine these assumptions. Fig. 7.2 is a snapshot in time (2016) which illustrates that on average, spending peaks at midlife. It also shows that the mix of goods and services changes over time. Older households spend significantly more the

2. As of May 2018.


FIGURE 7.2 Median spending by age (2016). [Chart. Vertical axis: dollars, $0 to $60,000. Horizontal axis: age in 2016, in five-year groups from 25–29 to 85 plus. Categories: Checks, Cash, Travel, Apparel & Services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health Care.]

old-fashioned way, using checks, than younger households, as indicated by the top category on the chart. To facilitate use of this data set for retirement planning, we must consider age-related spending changes, including changes to higher inflation categories such as health care and lower inflation categories such as clothing. For the majority of the spending, merchant codes were used to categorize electronic transactions. We then used a two-step process to allocate the remainder of the check and cash spending. The first step: we developed a robust model to estimate ongoing health care expenses after the age of 65 years. This includes spending increases due to consumption of more care at older ages. The output is an estimate of the median cost with traditional Medicare and the most comprehensive supplemental plan that covers co-pays and deductibles for hospital and doctor charges, plus out-of-pocket expenses for prescriptions, vision, dental, and hearing. The second step: after we allocated checks and cash to the ongoing health care expense category after 65 years of age, we divided the remainder according to the distribution of spending by category in the CE Survey. Because we often work with financial advisors who have college-educated clients, we filtered the CE data to include only college-educated respondents.

Health care spending estimates
Our health care estimates do not include subsidies from former employers, Medicaid coverage, and prescription subsidies for those with low incomes; we also did not account for relatively healthy individuals with less or different insurance coverage who may have lower costs. Those with relatively high out-of-pocket prescription expenses and those who go outside the Medicare system for care may have higher costs than our model output.


We developed our health care cost estimates and growth rates in 2017 with analysis of data from several sources, including Employee Benefit Research Institute data as of December 31, 2017 (proprietary non-public file); SelectQuote data as of January 18, 2018 (purchased data file); and the Centers for Medicare and Medicaid Services (CMS) website, January 22, 2018.

Fig. 7.3 illustrates the results. The health care category is shown in the bottom category of the chart. It is important to note that our categorization of checks and cash did not change the total dollars spent. Therefore, Fig. 7.3 includes the same total median spending at each age as shown in Fig. 7.2. Spending on all items except health care and the “other” category tends to decline at older ages. The “other” spending category includes items such as gifts and donations, bail bonds, gambling, and personal care. Based on the CE Survey spending by age, we believe it is gifts and donations that are increasing at older ages in the “other” category. Another category to note is housing, which is shown just above health care toward the bottom of the chart in Fig. 7.3. We included all mortgage payments, including principal. Our rationale is that mortgage payments are necessarily a big part of the monthly outflow for many households, and this changes once the mortgage is paid off. When this happens, households will require fewer resources to support a similar lifestyle. Housing also includes rent payments, real estate taxes, utilities, maintenance, insurance, homeowner association fees, and furnishings. Older individuals may experience cost increases related to real estate taxes and maintenance of an older home. This may include replacing or repairing an aging roof or appliance, renovations, and paying for someone else to maintain the property. Therefore, housing expenses may fluctuate or increase later in life, even if mortgage payments have stopped.

FIGURE 7.3 Median spending with estimated categorization of checks and cash. [Chart. Vertical axis: dollars, $0 to $60,000. Horizontal axis: age in 2016, in five-year groups from 25–29 to 85 plus. Categories: Travel, Apparel & Services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health Care.]
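The two-step allocation of uncategorized check and cash spending can be sketched as follows. This is only an illustration of the idea: the function name, the category shares, and the dollar amounts are hypothetical, and the chapter does not specify how the health care model output is applied to an individual household.

```python
def allocate_checks_and_cash(uncategorized, health_care_estimate, ce_shares):
    """Split uncategorized check and cash spending across categories.

    Step 1: assign up to `health_care_estimate` to health care.
    Step 2: divide the remainder in proportion to survey category shares.
    """
    health = min(uncategorized, health_care_estimate)
    remainder = uncategorized - health
    total_share = sum(ce_shares.values())
    allocation = {
        category: remainder * share / total_share
        for category, share in ce_shares.items()
    }
    allocation["health_care"] = allocation.get("health_care", 0.0) + health
    return allocation

# Hypothetical shares and amounts, for illustration only.
ce_shares = {"housing": 0.5, "food_beverage": 0.3, "other": 0.2}
result = allocate_checks_and_cash(10_000.0, 4_000.0, ce_shares)
```

Because every dollar of the input is assigned to exactly one category, the allocation leaves total spending unchanged, which is the property the text relies on when comparing Fig. 7.3 with Fig. 7.2.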


4.3 Generational view One question that we receive about spending declining with age: since we are showing a snapshot in time, how do we account for generational differences? Using the CE Survey for the older generations, we are able to take cohorts back in time and compare their spending at older ages to their spending at younger ages, on an inflation-adjusted basis. Fig. 7.4 shows inflation-adjusted spending (2017 dollars) for the cohort born between 1925 and 1945. This includes CE Survey data for each age group excluding pensions and cash contributions, and including mortgage principal payments in the housing category. In 2018, this group is between the ages of 73 and 93 years. By the age of 75 years, this cohort decreased their average spending by 44% from their peak spending years. This clearly illustrates a significant decrease in spending at older ages compared to peak earnings years for the same cohort. Will the baby boomers be different? Because the leading edge of the boomers is now 72 years old, we have preliminary indications. The results in Fig. 7.5 are similar so far, although less dramatic. This is partly due to the fact that the oldest boomers have not yet reached the oldest ages. The latest data file available includes those aged 65–68 years in the 2014 survey and those aged 65–69 years in the 2016 survey. These cohorts reduced their spending by one-fifth from their peak earnings years. It seems reasonable to assume that the boomers will not mirror the older generation exactly. It is also reasonable to expect that some age-related spending adjustments will be made by the boomers due to household size declining at older ages, a reduction in income in retirement and less discretionary spending in the eventual “No-Go” years.
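The cohort comparison above rests on two calculations: restating each year's spending in base-year dollars, and measuring the decline relative to the cohort's peak. A minimal sketch follows; the function names, index values, and spending figures are hypothetical and are not taken from the CE Survey.

```python
def to_real(nominal, index_then, index_base):
    """Convert a nominal amount to base-year dollars using a price index."""
    return nominal * index_base / index_then

def decline_from_peak(real_spending_by_age):
    """Percent decline of each age group's spending relative to the cohort peak."""
    peak = max(real_spending_by_age.values())
    return {
        age: 100 * (peak - spending) / peak
        for age, spending in real_spending_by_age.items()
    }

# Hypothetical example: $10,000 spent when the index was 100, base-year index 250.
restated = to_real(10_000, 100.0, 250.0)

# Hypothetical real spending (base-year dollars) for one cohort over time.
cohort = {"45-54": 50_000, "55-64": 45_000, "65-74": 40_000, "75+": 30_000}
declines = decline_from_peak(cohort)
```

Measuring each age group against the same cohort's own peak, rather than against a different generation observed in the same year, is what separates an age effect from a generational difference.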

4.4 Investable wealth levels Another question: Are people spending less at older ages because they are running out of funds? While our study will not provide the complete answer to

FIGURE 7.4 Historical view: average spending of the silent generation (1984–2016) adjusted for inflation to 2017 dollars. [Chart. Vertical axis: dollars, $0 to $60,000. Horizontal axis: age groups 45–54, 55–64, 65–74, 75+. Categories: Apparel and services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health care. Percentage labels on the chart: 20%, 29%, 44%.]

FIGURE 7.5 Preliminary view: average spending patterns for the early boomers (1984–2016) adjusted for inflation to 2017 dollars. [Chart. Vertical axis: dollars, $0 to $60,000. Horizontal axis: age groups 45–54, 55–64, 65–69*. Categories: Apparel and services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health care. Percentage labels on the chart: 18%, 20%.]

this question, it does provide some insight. Our hypothesis is that some households are at risk for reduced spending later in life due to a lack of resources, particularly those with lower levels of investable wealth. We also hypothesize that if households with higher levels of wealth are spending less as they age, that is a good indication that their decline in spending is largely discretionary. We analyzed the data by investable wealth level rather than income because income tends to drop significantly at retirement. While financial wealth may also decline, it is likely to be more gradual for most households. Moreover, some wealthy individuals may have increasing investment account balances in retirement, especially when financial markets are positive. For households with less than $500,000 in estimated investable wealth, shown on Fig. 7.6, fixed expenses may be a significant part of total spending. This may include spending on health care, housing, and food, most of which may be difficult to reduce with age. Health care in particular is likely to be a growing expense for households that purchase supplemental insurance and for households that have less insurance and are in poor health. This supposition is supported by other research: according to the 2018 Retirement Confidence Survey, about 4 in 10 retirees say expenses are higher than they anticipated, and 61% of this group said they “cut back spending or did without” in response. The most common expense that is greater than anticipated is health care costs (Greenwald and Associates, 2018). We also analyzed households with higher levels of investable wealth. Figs. 7.7 and 7.8 show the results for those with $500,000–$1 million and $1 million–$3 million. The same pattern is evident, although with a much more pronounced decline in spending at older ages. At the $1 million–$3 million range, we also see an increase in spending between the ages of 50 and 59 years. The 50s are likely peak earning years for many who are able to accumulate this level of wealth.


FIGURE 7.6 Life cycle of spending: $0–$500,000 in assets. Median spending with estimated categorization of checks and cash. [Chart. Vertical axis: dollars, $0 to $60,000. Horizontal axis: age in 2016, in five-year groups from 45–49 to 85+. Categories: Travel, Apparel & Services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health Care.]

FIGURE 7.7 Life cycle of spending: $500,000–$1 million in assets. Median spending with estimated categorization of checks and cash. [Chart. Vertical axis: dollars, $0 to $200,000. Horizontal axis: age in 2016, in five-year groups from 45–49 to 85+. Categories: Travel, Apparel & Services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health Care.]

Compared to the overall population, another key difference for households with more investable wealth is that the funds needed for ongoing health care expenses are a relatively small part of total spending. For the median households with $500,000–$1 million or $1 million–$3 million, the reduction in spending at older ages is likely to be a natural progression since household size is declining and discretionary spending may go down in tandem with a less active lifestyle.

FIGURE 7.8 Life cycle of spending: $1 million–$3 million in assets. Median spending with categorization of checks and cash. [Chart. Vertical axis: dollars, $0 to $250,000. Horizontal axis: age in 2016, in five-year groups from 45–49 to 85+. Categories: Travel, Apparel & Services, Entertainment, Other, Transportation, Food & Beverage, Education, Housing (includes mortgage), Health Care.]

This research is a quantitative validation of Michael Stein’s work. It also reinforces findings of other researchers, adds evidence that most households with significant financial assets are not reducing spending due to financial constraints, and allows us to provide more specific guidance about planning for increases in health care expenses while taking into account a natural decline in spending in all other spending categories.

4.5 Accounting for long-term care costs Currently, using the Chase data, we are unable to classify long-term care costs separately from other check spending. Even if we were able to classify these costs, relying on median spending, or the distribution of spending at any one point in time, would be problematic. This is due to the episodic nature of care needs. For retirement planning purposes, we recommend assuming that long-term care costs are not included in our results. They should be treated as a possible separate expense later in life to ensure this risk is accounted for in retirement plans. HRS/CAMS design includes those who have family members in nursing homes or who transfer there after the initial survey. It also conducts an exit interview with the respondents’ family after death. Even so, there may be some reporting bias due to difficult circumstances when care needs arise (Institute for Social Research and University of Michigan, 2017). The US Department of Health and Human Services (HHS) combined HRS/CAMS with other data sources to create a picture of the total cost of paid care. According to HHS, the average American turning 65 years who requires paid care will incur $266,000 in future long-term care costs that may be financed by setting aside $134,000. Some costs are likely to be covered by other payers, especially for households with low incomes and low wealth (Department of Health and Human Services, USA, 2016).


Use of averages for those who may need paid care may be inadequate for individuals who wish to avoid reliance on Medicaid or family in the event of a longer care need: about 1 in 10 men and 2 in 10 women will need care for more than 5 years (Department of Health and Human Services, USA, 2016). These individuals may wish to consult with a financial professional who has experience in planning; an age of 50 years may be a good time for this assessment. Financial planning firms may include long-term care costs in a cash flow–based plan around the age of 80 years and assess various options. To help more fully assess tail risk, some insurance carriers provide local cost estimates that may be used in personalized plans.

4.6 Implications of the life cycle of spending for plan providers and employers We noted earlier that estimates for spending in retirement are widely used, both by plan providers and employers who are calculating the amount of income their employees will need to replace. In light of the evidence on how spending changes over time, we have four suggestions for employers and plan providers. First, provide an annual income estimate, since employees may have difficulty calculating how much their retirement saving balance may be worth if turned into monthly or annual payments. Second, for pre-retirees, provide information about health care costs in retirement in a holistic way. Employees may be concerned about ongoing health care costs in retirement because they have read articles that state they need to save $147,000 for a woman who wants a 90% chance of having enough money for health care ($131,000 for a man), not including funds for possible long-term care costs (Fronstin and Vanderhei, 2017). This information is important because it should help reduce the number of employees who are surprised by high Medicare-related costs. It may also seem overwhelming. It is important to put costs in context because retirement savings goals include some funds for health care costs, and employees’ current spending is an input into these goals. Employers are in a unique position: they have information about how much employees are spending for health care currently. Making this information readily available alongside information on possible long-term care costs and estimated annual Medicare-related expenses in retirement may help demystify the picture. (Our estimates for Medicare-related costs are provided in the next section: “Implications of the Life cycle of Spending for Firms that Provide Financial Planning.”) Third, use replacement rate targets that reflect the fact that spending patterns differ by income and wealth level. 
Most highly compensated employees and executives may be looking at overstated savings targets derived from traditional assumptions. For the amount required, provide a range of outcomes from retaining their full purchasing power to a view that takes into


account some households that may have naturally declining expenses as they age. Our model suggests that the average household with over $500,000 in financial wealth decreases their spending over time to the age of 95 years. Including category-specific inflation adjustments, this may result in real spending that is 30% lower than traditional models suggest for households with $500,000–$1 million and 36% lower for households with $1 million–$3 million. Lastly, when selecting a postretirement investment menu, recognize that the level of spending may change over time. Real spending may decline as discretionary spending decreases in less active years, especially for those with more wealth. Households may also be subject to a spending spike late in life due to health care issues and/or long-term care needs. For these reasons, consider a range of solutions inclusive of products that address longevity risk as well as the need for some flexibility.

4.7 Implications of the life cycle of spending for firms that provide financial planning While employers and plan providers usually have limited information about their employees, firms that provide financial planning have the ability to provide more personalized guidance. Our recommendations assume the use of financial planning software that is based on cash flows, including annual projections of expenses. We recommend separating Medicare-related health care expenses, lifestyle expenses, and long-term care expenses. For a pre-retiree's retirement plan to be complete, it must include ongoing health care costs as a separate line item. Our estimate is $5,200 annually per person for Medicare-related premiums and out-of-pocket expenses at age 65 years in 2018, assuming no employer-provided retiree coverage. Depending on the type of Medicare plan selected, costs may be lower, but cheaper options are generally more variable in terms of out-of-pocket expenses. This expense should grow at a rate that includes both health care–specific inflation and increased use of care at older ages. Our combined projected growth rate is 5.8% annually after age 65 years, increased to 6.5% including an uncertainty factor that takes into account inflation variability and the possibility of future Medicare funding issues (J.P. Morgan Asset Management, 2018). For high-income clients, some planning software will also adjust for income-based Medicare premiums; this is important so that clients are not surprised by higher than expected costs. It may also be an opportunity for an advisor to work with a client’s tax advisor in order to minimize the occurrence of higher premiums. Some retirees may want to spend more early in retirement when they are healthier, more active, or able to travel. They may naturally spend less when they are older. To address this, reconsider the default assumption that expenses will continue to grow in line with inflation (2.5% or 3% annually) in


retirement, especially for clients with assets over $500,000. A default rate of 1.5% or 2% may be appropriate for lifestyle expenses excluding health care.
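To see how these growth assumptions play out in a cash flow–based plan, the sketch below compounds each expense line separately. The $5,200 starting cost and the 5.8% and 6.5% growth rates come from the text; the $50,000 lifestyle budget, the 30-year horizon, and the function name are hypothetical choices made only for illustration.

```python
def project(start_amount, annual_growth, years):
    """Compound a starting annual expense forward at a fixed growth rate."""
    return [start_amount * (1 + annual_growth) ** t for t in range(years + 1)]

# Medicare-related costs: $5,200 per person at age 65 (chapter estimate).
health_base = project(5_200, 0.058, 30)      # 5.8% combined growth rate
health_cautious = project(5_200, 0.065, 30)  # 6.5% including the uncertainty factor

# Lifestyle expenses excluding health care: hypothetical $50,000 starting budget.
lifestyle_traditional = project(50_000, 0.025, 30)  # traditional default
lifestyle_alternative = project(50_000, 0.015, 30)  # lower default discussed above

for age in (65, 75, 85, 95):
    t = age - 65
    print(age, round(health_base[t]), round(health_cautious[t]),
          round(lifestyle_traditional[t]), round(lifestyle_alternative[t]))
```

Keeping health care and lifestyle expenses as separate lines lets each grow at its own rate, so a lower lifestyle default does not understate the faster growth in health care costs.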

5. Shifting into retirement Shifting into retirement refers to the transition period around the retirement time frame, when households may be making important changes in their lifestyle. Our research into this phase reinforces other researchers’ findings and breaks new ground in two areas. Regarding work in the retirement transition phase, our analysis is different from other studies because it captures the timing of payroll income, Social Security and pensions, including overlapping inflows of these sources. Second, monthly aggregation of the data helped us provide a unique, in-depth view of spending right before and after retirement. Specifically, we isolated the year before retirement (the benchmark year) and compared spending that year to each of the 3 years after retirement. We also created a view of median spending over rolling 12 month periods before and after retirement.
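The benchmark-year comparison and the rolling 12-month view can both be expressed as sums over windows of months indexed relative to the retirement month. The sketch below is a hedged illustration: the month-offset convention (0 is the retirement month, counted as part of the first year in retirement), the helper names, and the household data are all assumptions, not details given in the chapter.

```python
from statistics import median

def window_total(monthly, start, length=12):
    """Sum spending over `length` consecutive months beginning at offset `start`.

    `monthly` maps month offsets relative to retirement (negative = before)
    to spending in that month.
    """
    return sum(monthly[m] for m in range(start, start + length))

def rolling_median(households, first_start, last_start):
    """Median across households of 12-month spending, for each window start."""
    return {
        start: median(window_total(h, start) for h in households)
        for start in range(first_start, last_start + 1)
    }

def make_household(base, bump):
    """Hypothetical household: flat spending with a bump around retirement."""
    return {m: base + (bump if -6 <= m < 6 else 0) for m in range(-24, 36)}

households = [
    make_household(3_000, 500),
    make_household(4_000, 1_000),
    make_household(5_000, 0),
]
medians = rolling_median(households, first_start=-24, last_start=24)
benchmark_year = medians[-12]  # the 12 months before retirement
first_year = medians[0]        # the first 12 months of retirement
```

Moving the window start forward one month at a time produces the series of rolling medians; the benchmark year and the first year in retirement are simply two particular windows in that series.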

5.1 The retirement transition period data filter While the data set for the long-term view of spending is a snapshot from 2016, our analysis of the retirement transition period is a longitudinal view from 2012 to 2016. Specifically, we started by applying the same filters as we did for the long-term view (shown in Fig. 7.1). We then narrowed the population to households who entered retirement between 2013 and 2015. This allowed us to track their spending for at least 12 months prior to the retirement month and at least 12 months after the retirement month. In addition, we excluded households who started and stopped retirement and households with multiple primary account holders at the same address where one account holder was classified as retired and the other was classified as not retired. (We will study these “nontraditional” households in the future.) When these filters were complete, we had a robust data set of 60,000 households.
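The transition-period filters can be sketched as a single predicate applied to each household. The argument names and the way "started and stopped retirement" is encoded (a count of status changes) are hypothetical; the 2013–2015 window and the 12-month requirements come from the text.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def in_transition_sample(retired_on, first_obs, last_obs, status_changes, mixed_status):
    """Apply the retirement-transition filters to one household.

    `status_changes` counts switches between not retired and retired;
    `mixed_status` flags households where one primary account holder is
    retired and another is not. Both encodings are illustrative assumptions.
    """
    entered_in_window = date(2013, 1, 1) <= retired_on <= date(2015, 12, 31)
    enough_before = months_between(first_obs, retired_on) >= 12
    enough_after = months_between(retired_on, last_obs) >= 12
    retired_once = status_changes == 1  # did not start and stop retirement
    return (entered_in_window and enough_before and enough_after
            and retired_once and not mixed_status)

# Hypothetical households observed from January 2012 through December 2016.
ok = in_transition_sample(date(2014, 6, 1), date(2012, 1, 1), date(2016, 12, 1), 1, False)
too_late = in_transition_sample(date(2016, 3, 1), date(2012, 1, 1), date(2016, 12, 1), 1, False)
```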

5.2 Defining “retirement” Since we do not have a customer-identified retirement date, we developed a process to determine the month that households transitioned into retirement. One of the most important factors in this analysis is the mix of work and retirement income inflows, but other factors, such as age and specific income type were also taken into account. We started by sorting the households that seemed to have a fairly clear status: those with only income from work such as payroll income were classified as “not retired”; those who had only pension, annuity, or Social Security income were classified as “retired”; and those who only had unemployment or


FIGURE 7.9 Estimating the retirement date. [Flow diagram. The population is divided into three groups: only labor income, a mix of labor and retirement income, and only retirement income. In the classification process, an algorithm sorts out individuals with income mixtures. Outcomes: not retired, retired.]

disability income were excluded from this analysis.3 To classify the remainder, who had a mix of work and retirement income sources, we developed an algorithm. Development of the algorithm shown in Fig. 7.9 was a two-step process:

• Step 1: We created initial rules to determine if a household is likely retired, based on types and proportion of income inflows.

• Step 2: We verified and further refined the model using a “truth set” of credit card applications where the applicants self-identified as not retired or retired. (Use of de-identified applications was limited to model verification and development.)
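The two steps above can be sketched as a rule-based classifier plus a check against self-reported labels. The chapter does not publish its rules, so everything specific here is an assumption made for illustration: the income type names, the 50% retirement-income threshold for mixed households, and the function names.

```python
RETIREMENT_TYPES = ("pension", "annuity", "social_security")

def classify(income_by_type, retirement_share_threshold=0.5):
    """Return 'retired', 'not retired', or 'excluded' for one household.

    `income_by_type` maps income type to annual amount. The threshold for
    mixed-income households is a placeholder, not the chapter's rule.
    """
    labor = income_by_type.get("payroll", 0.0)
    retirement = sum(income_by_type.get(k, 0.0) for k in RETIREMENT_TYPES)
    if labor == 0 and retirement == 0:
        return "excluded"      # e.g., only unemployment or disability income
    if retirement == 0:
        return "not retired"   # only income from work
    if labor == 0:
        return "retired"       # only retirement income
    share = retirement / (labor + retirement)
    return "retired" if share >= retirement_share_threshold else "not retired"

def agreement(households, self_reported):
    """Step 2: share of households whose rule-based status matches the truth set."""
    hits = sum(classify(h) == label for h, label in zip(households, self_reported))
    return hits / len(households)

# Hypothetical households, for illustration only.
households = [
    {"payroll": 70_000},
    {"pension": 20_000, "social_security": 18_000},
    {"payroll": 10_000, "social_security": 24_000},
    {"disability": 12_000},
]
labels = [classify(h) for h in households]
```

In practice the threshold and any additional rules (such as age or specific income type) would be tuned by comparing `agreement` across candidate rule sets on the truth set.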

After the rules were completed, they were used to determine the most likely retirement status. To simplify our initial analysis, we limited our household classifications to “retired” and “not retired.” For households who were retired in the 2012–16 time frame, we also determined the month that retirement started, which allowed us to track income type and spending in the months before and after retirement. This analysis led us to one of our more interesting insights. Many employers and most financial planning firms adhere to the assumption that work suddenly and completely stops and then a person is “retired.” Others believe that a new definition of retirement will involve a change in the type of work or a gradual reduction of work for a period of time. From our analysis, we see that for many households, a new reality of phasing

3. Receipt of Social Security income prior to age 60 years by the primary account holder was deemed likely to be disability income. While individual and spousal retirement benefits may be taken as early as age 62 years, retirement benefits for widows and widowers may start as early as age 60 years.


FIGURE 7.10 Distribution of retirement age at retirement. [Chart of percent retired, 2012–2016, by retirement age from 45 to 80. Annotations: median retirement age, 64; most common, 60–69.]

out of work into retirement is not just around the corner; it has arrived. The evidence: we found that the majority of households who started receiving retirement income for the first time also received at least some labor income over a period of time. Examining the group that has a mix of retirement and labor income in greater depth is an important step that we are looking forward to in our future work.

5.3 At what age do people retire? Age 65 is a common default retirement date assumption used by employers, researchers, and some financial planners. It is also the expected age of retirement for employees (Greenwald and Associates, 2018) and when most will become eligible for Medicare. Our analysis shows there is a fairly wide distribution in observed retirement dates. As shown on Fig. 7.10, age 64 is the median retirement age; the most common retirement ages are between 60 and 69 years, a range that encompasses about three-quarters of those households we identified as newly retired. This reflects the experience of recent retirees, including the trend of more people working at older ages than in the past, as shown on Fig. 7.11 (J.P. Morgan Asset Management, 2018). Looking at Fig. 7.10, we also see a slight spike in retirement at age 55 years. We believe this may be due to some professions such as teachers and other public sector workers becoming newly eligible for pensions at age 55. We found that pensions and “early retirement” often go together. Six in ten households who retired prior to age 60 years had pension income; this applied

FIGURE 7.11 Percent of people in the civilian labor force, 1996–2026. Bureau of Labor Statistics, Employment Projections, Tables 3.2 and 3.3. Actual data to 2016 and projection to 2026. Based on noninstitutionalized population. [Chart. Series: ages 65–69, 70–74, and 75–79. Horizontal axis: 1996, 2006, 2016, 2026. Vertical axis: proportion in the labor force, 0 to 0.4.]

FIGURE 7.12 Retirees by retirement age by pension status. [Chart. Vertical axis: percent of retirees, 0% to 40%. Horizontal axis: age at retirement, in five-year groups from 45–49 to 80–84. Series: retirees with a pension, retirees without a pension. Annotations: 61% have pensions; 27% have pensions.]

to fewer than 3 in 10 households who retired after 60 years.4 Fig. 7.12 has more information on the distribution of retirement age and pension status. This has implications for public policy and for private companies, since fewer people will have pension income in the future: over half of households aged 70

4. Since the start of pension income was an important indicator for determination of the retirement date, there may be some confirmation bias.


years or older in 2018 have a pension versus only 20% of households aged 40–49 years (LIMRA Secure Retirement Institute, 2015).5 For the next few sections, we will focus only on those who retired between the ages of 60 and 69 years. This will include the majority of retirees as well as put the focus on those households who were less likely to have pension income. In light of pension coverage and retirement date trends, this may result in a view of spending that is more representative of future retirees than if we included a broader retirement age range.

5.4 Average spending the year before versus the year after retirement For those who retired between the ages of 60 and 69 years, average spending declined by about 1% the year after retirement compared to the prior year (see Fig. 7.13). This is in line with the long-term decline in spending with age discussed earlier. On further examination, we found that only half the retired households spent less the year after retirement compared to the year before retirement. This reminded us that while averages may be very useful when looking at a large population over a long time, individual households are likely to have different paths, especially in the relatively short time periods when they are in a life transition. As a next step, we took a closer look at the average spending of those who spent more the first year after retirement compared to those who spent less the year after retirement, as shown in Figs. 7.14 and 7.15.

FIGURE 7.13 Average spending the year before and after retirement. [Bar chart by spending category. Vertical axis: dollars, 0 to 20,000. Series: year before retirement, first year in retirement. Annotation: 1.2% less spending the year after retirement on average.]

5. Derived from LIMRA Secure Retirement Institute analysis of the 2013 Survey of Consumer Finances, Federal Reserve Board, 2015. For married couples, age group assignment based on age of oldest spouse in 2013.

FIGURE 7.14 Average spending the year before and after retirement for those who spent less the year after retirement. [Bar chart by spending category: Checks, Cash, Travel, Other, Transportation, Housing, Education, Entertainment, Apparel, Health care, Food & Beverage. Vertical axis: dollars, 0 to 25,000. Series: year before retirement, first year in retirement. Annotation: 25% is the average decrease in total spending, especially large in checks, cash and housing (inflation adjusted).]

FIGURE 7.15 Average spending the year before and after retirement for those who spent more the year after retirement. [Bar chart by spending category: Checks, Cash, Travel, Transportation, Other, Housing, Health care, Entertainment, Education, Apparel, Food & Beverage. Vertical axis: dollars, 0 to 25,000. Series: year before retirement, first year in retirement. Annotation: 31% is the average increase in total spending, especially large in checks, cash and housing (inflation adjusted).]

Results for these two groups are similar, in opposite directions. On average, compared to the prior year, those who spent less the year after retirement spent 25% less. In comparison, on average, those who spent more the year after retirement spent 31% more. The spending categories with the largest absolute dollar changes in spending were the same for both groups: checks, cash, and housing, with the biggest changes in check spending. While the spending category that checks are used for is opaque to us, we were able to discern that for most households, it was a few big checks that accounted for the spending differences just before and after retirement. This suggests that some of the changes in spending were related to the timing of large purchases or payments.
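The split into households that spent less and households that spent more, and the average change for each group, can be sketched as follows. The spending pairs are hypothetical, and the choice to compare group averages (rather than averaging each household's own percentage change) is an assumption; the chapter does not state which definition it uses.

```python
from statistics import mean

def pct_change(before, after):
    """Percentage change from `before` to `after`."""
    return 100 * (after - before) / before

def summarize(pairs):
    """Summarize (year-before, year-after) spending pairs, one per household."""
    less = [(b, a) for b, a in pairs if a < b]
    more = [(b, a) for b, a in pairs if a > b]

    def avg_change(group):
        # Change in the group's average spending, not the average of
        # household-level percentage changes.
        return pct_change(mean(b for b, _ in group), mean(a for _, a in group))

    return {
        "share_spent_less": len(less) / len(pairs),
        "avg_change_less": avg_change(less),
        "avg_change_more": avg_change(more),
    }

# Hypothetical households, for illustration only.
pairs = [(40_000, 32_000), (60_000, 48_000), (50_000, 60_000), (30_000, 36_000)]
summary = summarize(pairs)
```

An overall average near zero can therefore conceal two sizable groups moving in opposite directions, which is the point the text makes about the limits of averages during a life transition.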


5.5 Average spending the year before versus the year after retirement: households with at least $500,000 in investable wealth Fewer than 2% (887) of the households in our data set who retired at the age of 60–69 years had at least $500,000 in investable wealth. Therefore, our overall results described in the previous section of this chapter are a good reflection of spending behavior for households below this wealth threshold. For the households in the higher wealth subset, the average spending decline the year after retirement compared to the prior year was 4%; this is 3 percentage points more than for the results overall. This is not surprising given the long-term results by wealth level discussed earlier. However, when we took a closer look at spending in the transition phase, we found no other significant differences between those with lower and higher wealth. With few differences by wealth, these initial results suggest that some of the spending at retirement is due to behavioral and lifestyle changes.

5.6 Beyond averages: distribution of changes in spending the year before versus the year after retirement

The distribution of spending the year before versus the year after retirement, by percentage change, is shown on Fig. 7.16. Compared to the prior year, those on the left side of the chart spent less the year after retirement, and those on the right side spent more the year after retirement. The median change in spending is between 0% and 5%. The distribution is roughly bell-shaped, with a long tail on the right side of the chart, which indicates that there is a wider variation for those who spent more the year after retirement. About two-thirds of the households had between a 30% decrease in spending and a 25% increase in spending the year after retirement. We also see some households who more than doubled their spending. Overall, these results

FIGURE 7.16 Spending change in first year of retirement compared to the prior year. Retirement age: 60–69 years. [Histogram of households by percentage change in spending. Axis: number of households, 0 to 3,500. Annotation: most common range, -30% to +25%.]

184 Handbook of US Consumer Economics

illustrate a wide range in spending changes the year after retirement compared to the prior year. This generally supports a hypothesis of some households making large one-time purchases around the retirement time frame. Admittedly, spending the year after retirement compared to the prior year is a short time frame. Therefore, we took the next step of extending the timeline around retirement as far as feasible, given the limitations of our current data set.

5.7 Evidence of a retirement spending surge

In Fig. 7.17, for the population who retired at age 60–69 years, we plotted median household spending for rolling 12-month periods, starting 1–2 years before retirement. We continued, moving forward 1 month with each rolling time period, until we ended with median household spending 2–3 years after retirement. The resulting graphic provides unique evidence of a spending increase starting roughly 7–18 months before retirement and ending 18–29 months afterward, with the peak right at retirement. This shows that there may be a spending surge around the time of retirement. It also gives us a good indication that the wide variation in spending the year before versus the year after retirement is likely to be the result of extraordinary spending that is occurring as people are entering a new life stage. It is important to note that the line that represents median spending over time is representative of the group studied; it is not necessarily the path of any one household.

[Figure 7.17: line chart of median household spending in US dollars (axis from 39,000 to 46,000) over rolling periods in months before and after retirement. Retirement is marked, with separate series for spending before retirement and spending after retirement.]

FIGURE 7.17 Median spending rolling periods before and after retirement. Retirement ages: 60–69 years.

This led us to another question: Are the same households that are spending more just before retirement spending more just after retirement? Our hypothesis: some households know what they are retiring to and prepare ahead of time by paying off loans, buying a second home, taking care of health issues, or renovating their house. Other households may have only considered what they are retiring from; they may not have contemplated changes ahead of time. Once they arrive in retirement, their spending may temporarily and unexpectedly spike as they make adjustments related to their new lifestyle.
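The rolling-period calculation behind Fig. 7.17 can be illustrated with a minimal Python sketch. It is not the code used for this chapter: the function names, the data layout (one dictionary per household, keyed by month offset, with 0 as the retirement month), and the toy spending figures are all assumptions made for illustration.

```python
from statistics import median

def rolling_median_spending(households, window=12, start=-24, end=36):
    """Median of household spending totals over rolling `window`-month periods.

    households: list of dicts mapping month offset (0 = retirement month,
    negative = before retirement) to spending in dollars. This layout is
    an assumption for illustration.
    Returns a list of (first_month_of_window, median_total) tuples.
    """
    results = []
    # Slide the window forward one month at a time.
    for first in range(start, end - window + 1):
        months = range(first, first + window)
        totals = [
            sum(h[m] for m in months)
            for h in households
            if all(m in h for m in months)  # skip households with missing months
        ]
        if totals:
            results.append((first, median(totals)))
    return results

def toy_household(base, surge_months, surge):
    """Hypothetical household: flat monthly spending plus a temporary surge."""
    return {m: base + (surge if m in surge_months else 0) for m in range(-24, 36)}

# Hypothetical data: two households with a surge around retirement, one without.
households = [
    toy_household(3000, range(-6, 6), 1000),
    toy_household(3500, range(-6, 6), 500),
    toy_household(4000, range(0), 0),  # no surge
]

series = rolling_median_spending(households)
peak = max(series, key=lambda point: point[1])
print(peak)
```

In this toy data the median peaks for the 12-month window that begins six months before retirement, because that is the only window fully covered by the hypothetical surge. As in the chapter, the median series describes the group and not the path of any single household.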

5.8 Spending volatility

To test the theory that some households spend more before retirement while others do so afterward, we decided to use each household’s spending the 12 months before the retirement month as a benchmark year. We then compared the benchmark year to each of the 3 years following the retirement month. (The years following the retirement month are not based on calendar years; they are 12-month time periods that vary based on the retirement month for each household.) What we found was a significant amount of spending volatility around the time of retirement, as shown in Fig. 7.18 (Carson and J.P. Morgan Asset Management, 2018). Looking into these results in more detail:

• When it comes to spending in the transition phase, there are relatively few “Steady Eddies.” One-fifth of households kept their spending in each of the 3 years after retirement within 20% of spending in the benchmark year.
• Fewer than expected were “Down-shifters”: only 15% decreased their spending by more than 20% each year compared to the benchmark. This is surprising, in light of fairly widespread replacement rate and individual retirement planning assumptions that include an immediate drop in spending at retirement.
• A relatively small but not insignificant number were “Up-shifters”: 9% increased their spending by more than 20% each year compared to the benchmark. Perhaps these households purchased a second home and are maintaining two residences, or they are traveling more than when they were working.
• The remainder of the households had volatile spending compared with the benchmark year, with increases or decreases in some years by more than 20%.

FIGURE 7.18 Spending volatility: for each of the three years after retirement, a greater than 20% change in spending compared to the year before retirement.

This last group, the volatile spenders, is more than half of the households we examined. Compared to the benchmark year:

• About one-quarter of retirees temporarily increased their spending. Specifically, they increased spending in one or two of the 3 years after retirement by more than 20%.
• About one-quarter of retirees temporarily decreased their spending. Specifically, they decreased spending in one or two of the 3 years after retirement by more than 20%.
• The remaining 7% had both ups and downs in spending; we dubbed this group the “Roller-coasters.” This group experienced the most spending volatility. Compared to the benchmark year, in the 3 years after retirement, they had at least 1 year with an increase in spending of more than 20% and at least 1 year with a decrease of more than 20%.

As a group, the households who experienced spending volatility around the time of retirement generally fit a narrative of some extraordinary spending either before or after retirement. This conclusion is reinforced by the results we saw earlier that suggest a spending surge around retirement.
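The grouping described in this section can be expressed as a short classification rule. The Python sketch below is an illustration only, not the code used for the analysis: the function name and the labels "Temporary increase" and "Temporary decrease" are hypothetical, and the dollar figures are made up.

```python
def classify_household(benchmark, later_years, threshold=0.20):
    """Assign a household to a spending group.

    benchmark: spending in the 12 months before the retirement month.
    later_years: spending in each of the 12-month periods after retirement.
    threshold: relative change treated as a large move (20% in the chapter).
    """
    changes = [(year - benchmark) / benchmark for year in later_years]
    ups = sum(change > threshold for change in changes)
    downs = sum(change < -threshold for change in changes)
    n_years = len(changes)

    if ups == 0 and downs == 0:
        return "Steady Eddie"          # every year within the threshold
    if downs == n_years:
        return "Down-shifter"          # large decrease every year
    if ups == n_years:
        return "Up-shifter"            # large increase every year
    if ups > 0 and downs > 0:
        return "Roller-coaster"        # at least one large up and one large down
    # Only ups or only downs, in some but not all years (hypothetical labels).
    return "Temporary increase" if ups > 0 else "Temporary decrease"

# Hypothetical households, each with a $40,000 benchmark year.
examples = {
    "A": [41000, 39000, 42000],
    "B": [30000, 31000, 29000],
    "C": [50000, 52000, 49000],
    "D": [50000, 41000, 40000],
    "E": [50000, 30000, 40000],
}
for name, years in examples.items():
    print(name, classify_household(40000, years))
```

The rule compares every post-retirement year to the same benchmark year, which matches the comparison described above; it does not measure year-to-year changes.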

5.9 Spending volatility beyond the transition phase

While our results clearly show spending fluctuations in the transition period, some questions remain about the frequency and magnitude of spending volatility throughout retirement. The possibility of large spending spikes from year to year is a particular concern when planning for retirement. In light of this, in addition to looking at changes in spending compared to a benchmark year, we analyzed spending increases greater than 20% from year to year as shown in Fig. 7.19. We found a trend of fewer households experiencing upward year-to-year spikes further from retirement. However, between the second and third years after retirement, about 1 in 10 households experienced a relatively large increase in spending. To see if this trend held true for even greater year-to-year spending spikes, we also looked at spending increases greater than 50% as shown in Fig. 7.20. On this chart, we see an even bigger decrease in the number of households experiencing relatively large year-to-year spending spikes further from retirement.
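The year-to-year comparison differs from the benchmark comparison in that each year is measured against the year immediately before it. A minimal Python sketch of that calculation follows; the function name, data layout, and spending figures are assumptions for illustration and do not reproduce the chapter's data.

```python
def spike_shares(spending_by_household, threshold=0.20):
    """Share of households with a year-to-year spending increase above `threshold`.

    spending_by_household: list of sequences, each holding annual spending in
    order (year before retirement, 1st year after, 2nd year after, ...).
    Returns one share per consecutive pair of years.
    """
    n_households = len(spending_by_household)
    n_pairs = len(spending_by_household[0]) - 1
    shares = []
    for i in range(n_pairs):
        spikes = sum(
            1
            for years in spending_by_household
            if years[i + 1] > years[i] * (1 + threshold)
        )
        shares.append(spikes / n_households)
    return shares

# Hypothetical annual spending for four households.
toy = [
    [40000, 50000, 50000, 50000],
    [40000, 40000, 61000, 61000],
    [40000, 41000, 40000, 42000],
    [40000, 70000, 60000, 60000],
]
print(spike_shares(toy, threshold=0.20))
print(spike_shares(toy, threshold=0.50))
```

Raising the threshold from 20% to 50% can only keep the share in each period the same or lower it, since every increase above 50% is also above 20%.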

[Figure 7.19: bar chart of the share of households with a year-to-year spending increase greater than 20%. Year before retirement compared to 1st year after retirement: 26%. 1st year after retirement compared to 2nd year after retirement: 16%. 2nd year after retirement compared to 3rd year after retirement: 12%.]

FIGURE 7.19 Households with year-to-year spending increase >20%. Retirement age: 60–69 years.

[Figure 7.20: bar chart of the share of households with a year-to-year spending increase greater than 50%. Year before retirement compared to 1st year after retirement: 10%. 1st year after retirement compared to 2nd year after retirement: 5%. 2nd year after retirement compared to 3rd year after retirement: 2%.]

FIGURE 7.20 Households with year-to-year spending increase >50%. Retirement age: 60–69 years.

While we are planning to do more research, including some analysis using control groups, our preliminary conclusion is that spending in the retirement transition period is different from later phases of retirement. Also, although experienced by significantly fewer households than in the transition phase, there are still some households with significant year-to-year spending increases a few years out from retirement. Other researchers, including our colleagues in the J.P. Morgan Institute, have more closely analyzed spending volatility across all life stages. Their work includes Why Managing Expenses is Not an Easy Task (2017) and Coping with Costs, Big Data on Expense Volatility and Medical Payments (2017).


6. Implications

6.1 Key takeaways for employers

Implications for employers include recognizing that spending may be different from the assumptions used in traditional models. This is especially true if looking past medians and averages to the paths of individual households. This has several implications. Most critically, spending volatility in the transition phase creates a significant risk if the employees’ savings are invested too aggressively. This is due to sequence-of-return risk. Specifically, a large withdrawal from an account just before retirement or early in retirement may be a problem for savings adequacy by itself and is more likely to be very problematic if it coincides with negative investment returns. Thus, it is critical that target date funds, which are often the default investment vehicle in defined contribution plans, take into account actual spending behavior. Of course, employers must balance this spending and investment risk with the need for some level of equity exposure to adequately fund increasing health care expenses throughout retirement, per the findings in our long-term view of spending.

Likewise, if employers are considering product options and defaults for the post-retirement period, they should take into account that there is evidence of a need for some liquidity in the transition phase due to spending volatility and large outlays that may occur at that time. In addition, while a less frequent occurrence, there is some evidence of continued spending fluctuations in later years.

Lastly, since those who retire at older ages are less likely to have a pension, and there is evidence of a partial retirement transition phase, there are potential issues and opportunities for employers. Helping employees reach their retirement savings goals may help employees who no longer want to work leave the company. And employers facing a staffing shortage or a need for experienced employees, along with their workers, may benefit from some accommodations for older workers.
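Sequence-of-return risk can be made concrete with a small numerical sketch in Python. Everything in it is hypothetical: the starting balance, the withdrawals, the returns, and the simplifying assumption that each withdrawal is taken at the start of the year before that year's return is applied. The same three annual returns are used in both scenarios and only their order differs.

```python
def ending_balance(start, annual_returns, withdrawals):
    """Balance after withdrawing at the start of each year, then applying that year's return."""
    balance = start
    for annual_return, withdrawal in zip(annual_returns, withdrawals):
        balance = (balance - withdrawal) * (1 + annual_return)
    return balance

start = 500_000
withdrawals = [100_000, 40_000, 40_000]  # large outlay in the first year (hypothetical)
bad_first = [-0.20, 0.05, 0.25]          # negative return coincides with the large outlay
good_first = [0.25, 0.05, -0.20]         # same returns, reversed order
no_withdrawals = [0, 0, 0]

print(round(ending_balance(start, bad_first, withdrawals)))
print(round(ending_balance(start, good_first, withdrawals)))
print(round(ending_balance(start, bad_first, no_withdrawals)))
print(round(ending_balance(start, good_first, no_withdrawals)))
```

With no withdrawals, the order of returns does not matter and both scenarios end at about $525,000. With the withdrawals, the scenario in which the loss comes first ends at about $317,500, compared with about $354,400 when the loss comes last, which is the effect described above.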

6.2 Ideas for firms that develop retirement plans for individuals

Given our results, we believe that comprehensive retirement plans also need to account for the possibility of a spending surge in the transition phase of retirement. In particular, we have three suggestions for financial firms and advisors.

First, help clients identify big expenditures: some clients have plans and dreams for retirement. About 5 years before the retirement transition period is a good time to incorporate such changes if they have not already been accounted for in the plan. Ask about planned housing changes, such as moving to a new location or renovating a current home; a new vehicle; travel; plans to pay off debt; and lifestyle changes (such as going from skiing 1 week a year to skiing 3 months a year), and include them in the plan.

Second, stress-test a spending surge: many clients may know what they are retiring from, but not what they are retiring to. For these clients, stress-test a spending surge during one or more of the years just before and after retirement.

Third, build a cushion: opportunistically move money to liquid assets prior to retirement to build a cushion that can mitigate the effects of unforeseen spending volatility and reduce sequence-of-return risk.

7. Suggestions for further research

Our findings also may have implications for public policy. With the prevalence of a transition period with a mix of work and retirement income, paired with the trend toward more people working at older ages, there may be benefits to society in making it easier for individuals to continue to work. A greater proportion of people working at older ages means more individuals paying payroll and income taxes; this may slightly offset some of the budget issues inherent with supporting an aging population. Older workers who enjoy their work but suffer from burnout or a lack of time off may benefit from bridge jobs that allow them to continue to earn some income, stay engaged with others, and contribute to the economy. Employers who struggle to find talent may be helped as well.

One specific area to be explored is government subsidies for health care costs of workers aged 65 years or older. The cost of health care at older ages may be a barrier to employment for some older workers if employers prefer hiring and retaining younger workers who have lower health care costs. To offset this, might employers be given incentives to hire or retain workers eligible for Medicare, which is highly subsidized by the government?6 Researchers interested in exploring this would need to gauge the interest of employers in retaining and/or hiring these workers; what amount of incentive would be required to increase the number of older workers employed; and the subsequent impact on the budget given offsetting increases in taxes and less government-subsidized Medicare expense. While a deeper discussion of these issues is beyond the scope of this chapter, we hope that our research, together with work from the J.P. Morgan Institute, will spur additional research and assist in the development of better policy decisions to benefit everyone.

6. Aside from payroll tax funding for Medicare Part A, there is significant financial support for Medicare from general revenues of the US Treasury for Medicare Parts B and D (74% of premiums) (Board of Trustees et al., 2018).


8. Closing

In closing, we hope this summary spurs changes in assumptions used by retirement plan providers, employers, and firms that provide financial planning. An overall summary of the implications follows:

• Many households with significant financial wealth may spend about one-third less than traditional models assume, after accounting for changes in the basket of goods these households tend to purchase as they age. This means that replacement rates and planning assumptions may need to be adjusted and/or a range of possibilities presented.
• Given funds required for ongoing health care costs and possible health risks late in life, it is important to help individuals plan for these expenses. An annual estimate for ongoing expenses with a growth rate, ideally in context of their current spending on health care, is a good place to start. Long-term care is fundamentally different; individuals will need a target amount to set aside and/or an individual plan to address tail risk.
• Spending in the year before retirement and first few years after retirement is often particularly volatile as households make adjustments related to a major life change. When making default investment choices and investment menu decisions, employers will need to carefully balance sequence-of-return risk with other risks that their employees will face. For personalized plans, individuals should be encouraged to think about what large payments they may make in this period, and set aside funds to cover them in advance. If clients are unsure, advisors may encourage a reserve fund.
• Lastly, employers and firms that provide financial planning should recognize a need for some liquidity in the retirement transition period.

References

Banerjee, S., 2012. Expenditure Patterns in Older Americans 2001–2009. Employee Benefit Research Institute.
Banerjee, S., 2015. Change in Household Spending after Retirement: Results from a Longitudinal Sample. Employee Benefit Research Institute.
Banerjee, S., 2018. Asset Decumulation or Asset Preservation? What Guides Retirement Spending? Employee Benefit Research Institute.
Bernicke, T., 2005. Reality retirement planning: a new paradigm for an old science. Journal of Financial Planning.
Blanchett, D., 2013. Estimating the True Cost of Retirement. Morningstar.
Board of Trustees, Federal Hospital Insurance, Federal Supplementary Medicare Insurance Trust Funds, 2018. 2018 Annual Report of Federal Hospital Insurance and Federal Supplementary Medicare Insurance Trust Funds.
Brown, M., Haughwout, A., Lee, D., van der Klaauw, 2015. Do we know what we owe? Consumer debt as reported by borrowers and lenders. FRBNY Economic Policy Review.
Browning, C., Guo, T., Cheng, Y., 2016. Spending in retirement: determining the consumption gap. Journal of Financial Planning.
Carson, S., J.P. Morgan Asset Management, 2018. Three Ways to Manage Spending Volatility as Clients Transition into Retirement.
Department of Health and Human Services, USA, 2016. Long-term services for older Americans: risks and financing. ASPE Issue Brief.
Fisher, J., et al., 2005. The Retirement Consumption Conundrum: Evidence from a Consumption Survey. Boston College Center for Retirement Research.
Fronstin, P., Vanderhei, J., 2017. Savings Medicare beneficiaries need for health expenses: some couples could need as much as $370,000 up from $350,000 in 2016. Employee Benefit Research Institute 38 (10).
Greenwald and Associates, 2018. Employee Benefits Research. In: Retirement Confidence Survey.
Institute for Social Research, University of Michigan, 2017. Health and Retirement Study, Aging in the 21st Century.
J.P. Morgan Asset Management, 2018. 2018 Guide to Retirement.
J.P. Morgan Institute, 2016. Paychecks, Paydays, and the Online Platform Economy.
J.P. Morgan Institute, 2017a. Coping with Costs, Big Data on Expense Volatility and Medical Payments.
J.P. Morgan Institute, 2017b. Coping with Medical Costs throughout Life.
LIMRA Secure Retirement Institute, 2015. Retirement Income Reference Book.
Ramnath, S., Shoven, J., Slavov, S., 2017. Pathways to retirement through self-employment. National Bureau of Economic Research.
Roy, K., 2014. The Lifecycle of Spending. J.P. Morgan Asset Management, Retirement Insights.
Stein, M., 1988. The Prosperous Retirement.
US Bureau of Labor Statistics, 2016. Handbook of Methods, Consumer Expenditures and Income.
US Bureau of Labor Statistics, 2017. Consumer Expenditure Surveys (CE). United States Department of Labor.
Zinman, J., 2009. Where is the missing credit card debt? Clues and implications. Review of Income and Wealth 55 (2).

Chapter 8

Are millennials different?*

Christopher J. Kurz, Geng Li, Daniel J. Vine
Division of Research and Statistics, Federal Reserve Board of Governors, Washington, DC, United States

1. Introduction

Over the past decade, millennials have received a substantial amount of attention as they have transitioned into adulthood. In the fields of business and economics, the unique tastes and preferences of millennials have been cited as reasons why new car sales were lackluster during the early years of the recovery from the 2007–09 recession, why many brick-and-mortar retail chains have run into financial trouble (through lower brand loyalty and goods spending), why the recoveries in home sales and construction have remained slow, and why the indebtedness of the working-age population has increased.1 The general narrative is that the consumption behavior of millennials differs so much from that of earlier generations that the transition of this generation into the prime working-age cohort has induced meaningful changes in macroeconomic outcomes. The narrative sounds plausible, especially if there are no offsetting changes to the spending patterns of the other cohorts as they age.

* We thank our colleagues at the Federal Reserve Board for helpful discussions and comments. Special thanks to James Calello and Bo Yeon Jang for invaluable research assistance and feedback. The views presented here are those of the authors and do not necessarily reflect those of the Federal Reserve Board or its staff.

1. For discussions of declining auto sales, see, for example, “Why Car Companies Can’t Win Young Adults,” Fortune (2013). For retail spending, see “Millennials Aren’t Spending Money Like Their Parents Did,” Business Insider (2016), and “Retailers should be terrified of millennials and Gen Z,” Business Insider (2016). Similarly, an analysis by J.P. Morgan, “The State of the U.S. Consumer” (2016), also finds a larger share of “experiential” spending, with the share of spending on travel, entertainment, and dining differing between the general population and millennials; see also the discussion in “NOwnership, No Problem: Why Millennials Value Experiences Over Owning Things,” Forbes (2015). In terms of housing construction, see “Homebuilders Are Targeting Millennials, But It Will Hit Their Margins,” CNBC (2017), and for a summary of delayed homeownership and the possible causes, see Bleemer et al. (2017). For debt, see “Younger Generation Faces a Savings Deficit,” Wall Street Journal (2014) and, more recently, Chien and Morris (2018).

Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00008-1 Christopher Kurz, Daniel J. Vine and Geng Li: 2019 Published by Elsevier Inc.

However, distinguishing the shifts in the population’s spending patterns that reflect the unique characteristics of its rising generation from those that reflect secular trends or cyclical forces can be challenging. First, the population includes many generations, and each is unique in some way: Each generation’s members were born within a particular range of years and were subject to a new set of initial conditions. Each generation was also surrounded by a particular mix of actors, cultural changes, and world events during its formative years. As a result, change between generations is a fairly permanent source of change in the population.2 Second, economic trends, such as the introduction of new goods and technology, can affect each generation differently, with younger generations typically being more willing to adopt these innovations. Accordingly, some economic behaviors that appear to reflect the unique tastes and preferences of a new generation might actually just reflect general technological change. Third, the effects of the business cycle on economic behavior, especially those of large downturns, can vary for households in different age groups. Some of these effects likely dissipate as the economy recovers, even though they can mimic generation-specific tastes and preferences for quite some time. But some of these effects may become part of that generation’s permanent tastes and preferences. For example, the severity of the 2007 global financial crisis and the recession that followed may have left a lasting impression on millennials, who were coming of age at that time, much like the Great Depression left a lasting impression on the Greatest Generation.

2. For a discourse on generational research, see, for example, Pew Research Center (2015b).

In an effort to sort through these effects, this chapter uses survey and administrative data to compare the saving and consumption patterns of millennials to those of earlier generations, taking into account some important differences in their demographics (such as age, race, family composition, education, and marital status) and economic characteristics (such as income and employment). Using this information, we make an effort to distinguish the effects on household consumption behavior of generation-specific factors from those of the business cycle and secular trends.

We show that there are important demographic differences between millennials and earlier generations, consistent with the work of several extant studies. However, we also show that these differences primarily reflect the continuation of existing trends in the overall population. In the economic sphere, millennials appear to have paid a price for coming of age during the Great Recession: millennials tend to have lower income than members of earlier generations at comparable ages, although the income of young households has not changed much; the difference likely reflects, in part, the rising labor force participation of women. For balance sheet variables, we show that millennials own fewer assets than members of earlier generations and also have less debt at the individual level than Generation X. The comparison is somewhat different for debt at the household level, as millennial households appear to have roughly the same debt as Generation X and higher debt than the baby boomers.

Conditional on these factors, we find that the spending patterns of millennial households are not very different from those of previous generations. In particular, we find that the tastes and preferences parameter of a consumption function that includes age, income, and other demographic and economic factors is not different for millennials than for members of earlier generations. We also review the detailed data on certain categories of consumer spending and the associated spending shares, and we show that there is little evidence of generation-specific preferences after age, income, and other demographic and economic factors are taken into account. For example, for spending on motor vehicles, which accounts for roughly 20% of retail sales and is highly sensitive to the business cycle, we find little evidence that millennial households have significantly different tastes and preferences than households of previous generations. We find similar results for spending on food and housing-related expenses.

The next section presents the definitions of generations that we use and reviews the relevant literature on consumption, age, and birth year cohorts. Section 3 follows with an overview of the data sources we employ in this chapter. Sections 4–6 provide generational comparisons of demographics, household balance sheets, and household consumption expenditures, respectively. Section 7 concludes.

2. Definitions of generations and a review of research on age, generations, and economic decisions

This chapter mostly adheres to the definitions of millennials and earlier generations described in a number of Pew Research Center reports.3 Millennials are individuals born between 1981 and 1997, with ages ranging from 21 to 37 years in 2018.4 The two generations that precede millennials are Generation X, which describes individuals born between 1965 and 1980 (ages 38 to 53 years in 2018), and baby boomers, who are individuals born between 1946 and 1964 (ages 54 to 72 years in 2018). Older cohorts are the Silent Generation, which describes individuals born between 1928 and 1945 (ages 73 to 90 years in 2018), and the Greatest Generation, which describes individuals born between 1915 and 1928 (ages 90 to 103 years in 2018).

3. See, for example, Fry (2015).
4. Recently, the Pew Research Center has attempted to redefine millennials to be the cohort ending in 1996; see Dimock (2018). We stick to the original definition due to a lack of overall consensus at this time. For example, the Census Bureau has published studies with an end date of 2000; see Census Bureau (2015). Moreover, our cohort analysis is somewhat restricted by the small sample size of the millennial cohort, and lowering the birth date range for millennials would only exacerbate the problem. Importantly, the empirical results that follow are not qualitatively different when taking the revised definition into account. Similarly, we do not see different consumption patterns when redefining each generation into young and old cohorts, that is, doubling our generational definitions.

Naturally, the size of each generation affects its influence on macroeconomic aggregates. Fig. 8.1 uses census population (and projections) to plot the fraction of the total population in each generation. As the figure shows, millennials became the largest generation in the United States in 2015, overtaking the baby boomer generation, which had been the largest for roughly 60 years.5 Interestingly, Generation X never attained the status of being the largest generation. Reflecting their current size and prime working-age status, millennials tend to be the focus of news articles and industry studies on the expected effects of generational transitions on economic activity.

5. The status of being the largest generation alive is often short lived. According to Census data, Generation Z, or the “postmillennial” generation (not shown in Fig. 8.1), became the largest generation in 2016.

[Figure 8.1: line chart of the percent of the total population (axis from 0 to 50) in each generation (Greatest, Silent, Baby Boomer, Gen X, Millennial), 1960 to 2060.]

FIGURE 8.1 Population shares of selected generations. Source: US Census Bureau, Population Division. 2014 National Population Projections.

In the economics literature, the frameworks most often used to tackle questions about the age-related factors that affect households’ decisions on labor and consumption are the life-cycle consumption and permanent income models introduced in the 1950s (Modigliani and Brumberg, 1954; Friedman, 1957). In modern renditions of these models, consumption is part of a dynamic optimization problem and is determined jointly with other decisions, such as labor supply, household formation, fertility, and planned bequests. Over time, academics have added many features to these models in an effort to match key properties of consumption data, including the well-known hump shape of household spending over the life cycle. Attanasio (1999) discusses many variants of these consumption models.
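The birth-year definitions given at the start of this section can be written as a simple lookup, which is how a cohort variable might be built from survey data. The Python sketch below is an illustration and not the authors' code; the names are hypothetical. The text lists 1928 in both the Silent and Greatest Generation ranges, so the sketch assigns 1928 to the Silent Generation, which is an assumption.

```python
# (name, first birth year, last birth year), following the definitions in the text.
GENERATIONS = [
    # The text gives 1915-1928 for the Greatest Generation; 1928 is assigned
    # to the Silent Generation here to avoid overlap (an assumption).
    ("Greatest Generation", 1915, 1927),
    ("Silent Generation", 1928, 1945),
    ("Baby boomer", 1946, 1964),
    ("Generation X", 1965, 1980),
    ("Millennial", 1981, 1997),
]

def generation(birth_year):
    """Return the generation name for a birth year, or None if outside all ranges."""
    for name, first, last in GENERATIONS:
        if first <= birth_year <= last:
            return name
    return None

def age_range(name, year=2018):
    """Return (youngest, oldest) age of a generation's members in `year`."""
    for gen_name, first, last in GENERATIONS:
        if gen_name == name:
            return (year - last, year - first)
    raise ValueError(f"unknown generation: {name}")

print(generation(1990), age_range("Millennial"))
print(generation(1970), age_range("Generation X"))
```

Switching to the revised Pew definition mentioned in footnote 4 would only require changing the millennial end year from 1997 to 1996.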


Importantly, this literature suggests several reasons why younger households might choose consumption differently than older households, even conditional on the same observed current income. Young households may face borrowing constraints that older households do not, for example, or they may have different expected income growth, different socioeconomic characteristics that affect the marginal utility of consumption, or different trade-offs between leisure and labor supply. Unfortunately, the consumption decisions of younger households have not been as vigorously explored in empirical work as those of households around the age of retirement.6 The literature also suggests a few reasons why the birth year of a household might affect its consumption.7 For example, Malmendier and Nagel (2011, 2016) show that the economic conditions that an individual has experienced in the past can have long-lived, if not permanent, effects on his investment decisions and inflation expectations. This result may be particularly salient for millennials, who came of prime age during the Great Recession, when new entrants to the labor market faced historically weak labor demand and unusually tight credit conditions. The effects of these unfavorable conditions on labor force attachment and attitudes toward saving and spending may have been more permanent for millennials than for members of generations that were more established in their careers and lives at that time. In other empirical studies of the effects of age and generation cohorts on household economic decision-making, researchers have employed even larger and richer household-level data. Dettling and Hsu (2014, 2017) use the Survey of Consumer Finances (SCF) to document that the median real net worth of young households in 2013 is lower than that of previous generations when they were young. 
Chien and Morris (2018) conduct a similar analysis with the 2016 SCF and show that millennials tend to have fewer assets and slightly more debt, and hence a lower net worth, relative to Generation X members. In addition, Paulin (2018) leverages multiple years of the Consumer Expenditure Survey (CE) to compare millennials with members of earlier generations.8 The analysis finds that millennials are more racially and ethnically diverse, are more educated, and spend more on food away from home. Importantly, Paulin uses the longitudinal dimension of the survey to show that expenditure shares do not vary much between generations.9

6. See, for example, Banks and Blundell (1998) and Bernheim et al. (2001).
7. In politics, DeSilver (2014) and Ghitza and Gelman (2014) show that major events such as World War II, the Korean and Vietnam Wars, and the Watergate scandal had persistent effects on the voting behaviors of the generation becoming politically aware at that time.
8. In addition, the BLS started posting experimental tables in 2015 showing expenditures by generation.
9. Other studies include a postrecession comparison of young adults (Fry, 2013), a comparison of auto purchases between different age cohorts (Kurz et al., 2016), and a report on deficient millennial retirement saving (Brown, 2018).


As previously mentioned, this chapter brings together multiple sources of data to bear upon the question of whether millennials show consumption behavior that differs from that of members of earlier generations.10 Interestingly, a similar question was posed 20 years ago when baby boomer profligacy was being compared to the Silent Generation’s penchant for saving. Speaking to that debate, Sabelhaus and Manchester (1995) were able to separate fact from popular myth at the time and provided evidence that consumption had not increased as much as income, and that baby boomer asset accumulation had in fact outpaced that of the previous generation.

3. A comparison of demographics by generation

In this section, we compare the demographic and economic characteristics of various generations. This exposition lays the groundwork for comparing generational consumption behavior, as demographic factors are part of what determines consumer behavior. We first address the demographic factors and then turn to economic factors, such as income and balance sheets.

The demographic characteristics for which we provide comparisons of the population by generation over time include race, educational attainment, and marital status. These categories merit some attention because there has been substantial discussion of differences between the demographic composition of millennials and that of members of earlier generations. However, it is less often emphasized that most of these differences are part of secular trends in the population rather than aberrations coming from one particular generation. Specifically, while it is apparent that millennials are the most diverse and most educated cohort and have the lowest marriage rates, it is also the case that these superlatives could have been applied at some point in time to each of the earlier generations vis-a-vis its predecessors.

We begin by examining race. The United States has become more racially diverse over time, meaning that the white share of the population has declined and the share of the population with other racial backgrounds has increased. The increase in diversity stems from many factors, such as immigration, interracial marriage, and differential birth rates.11 Fig. 8.2 presents the racial composition for the five largest generations alive in 2017. The data used are from the June 2017 population estimates, and race is categorized into six groups.12 As seen in the figure, each generation is more racially diverse than

10. The data sources are summarized in the data appendix.
11. See Pew Research Center (2015b) for information on immigration’s impact on diversity, Frey (2012, 2014a, and 2014b) for a review of the increase in diversity and multiracial marriage, and CDC public use files for information on birth rates by race.
12. As a robustness check, we measured these statistics at different points in time, for instance, when the older generations were younger and perceptions of race may have been different. This exercise did not change the qualitative story of the population becoming more diverse over successive generations.

FIGURE 8.2 Racial composition of the population by generation. [Figure: percent of population (vertical axis, 0 to 90) in each racial group (White, Black, Asian or Pacific Islander, Hispanic, Native American, Two or more) for the Greatest, Silent, Baby Boomer, Generation X, and Millennial generations.] Source: US Census Bureau, Population Division. June 2017 Monthly Population Estimates.

the generation that precedes it in age. Fifty percent of millennials are white, well below the 80% share for the Greatest Generation. The decline in the proportion of the white population between adjacent generations averages about 6 percentage points. The decline in the share of the white population is particularly stark between baby boomers and Generation X members (almost a 12 percentage point drop), and the decline is actually fairly moderate between Generation X and millennials (around 4.5 percentage points).

The educational attainment as of 2017 for each generation is also markedly higher than that of the generation that precedes it in age. Fig. 8.3 presents a decomposition of educational attainment for each generation.13 As seen in the figure, 65% of millennials have an associate’s degree or higher, well above the 50% rate for the Silent Generation. While it is important to note that some generations pictured here, particularly millennials, may still include some people who had not yet completed their educational arc in 2017, the comparisons of the Silent Generation with baby boomers and baby boomers with Generation X both display notable shifts in educational attainment. Specifically, the share of individuals with exactly a high school diploma decreases by roughly 5 percentage points per generation, while the share with a bachelor’s degree increases by a similar amount.

Next we look at the marriage rate, which can be influenced by factors such as educational opportunity, the openness of the labor force, and cultural norms.14 Fig. 8.4 presents marriage rates from the CPS Household Surveys

13. The decomposition is performed for those older than 30 years in 2017. Different year decompositions and age cutoffs do not change the qualitative story. 14. See Goldin (2006) for a review of the evolution of employment and marriage for women.

FIGURE 8.3 Educational attainment for individuals older than 30 years by generation. [Figure: percent (vertical axis, 0 to 45) with a high school diploma, associate degree, bachelor’s degree, or postgraduate degree for the Silent, Baby Boomer, Generation X, and Millennial generations.] Note. Postgraduate degrees include professional and advanced degrees. Source: Flood, S., King, M., Rodgers, R., Ruggles, S., Robert Warren, J., 2018. Integrated Public Use Microdata Series, Current Population Survey: Version 6.0 [2017]. IPUMS, Minneapolis, MN. https://doi.org/10.18128/D030.V6.0.

FIGURE 8.4 Marriage rate by age and generation. [Figure: percent married (vertical axis) by age, 20 to 60 (horizontal axis), with one series each for the Millennial, Generation X, Baby Boomer, Silent, and Greatest generations.] Source: Flood, S., King, M., Rodgers, R., Ruggles, S., Robert Warren, J., 2018. Integrated Public Use Microdata Series, Current Population Survey: Version 6.0 [2017]. IPUMS, Minneapolis, MN. https://doi.org/10.18128/D030.V6.0.

from 1962 to 2017 by generation and age in a cohort graph. A cohort graph presents the outcome variable (in this case, the fraction of the population that is married) for each generation at each age. This presentation allows for easy visual comparisons of the outcome variable between generations at comparable ages rather than at comparable points in time. For example, roughly 6% of millennials were married at age 20 (observed between 2001 and 2017),


whereas 35% of the Silent Generation were married at that same age (observed between 1948 and 1965). Marriage rates, particularly at younger ages, have shown a pronounced downward trend in the US population over time, much like the measures of racial diversity and educational attainment discussed earlier. For people in their 20s, the largest decline in marriage rates between successive generations is between the Silent Generation and baby boomers, a transition that occurred primarily after the late 1960s.
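The logic of a cohort graph can be sketched in a few lines. The example below is only an illustration under stated assumptions: the birth-year cutoffs (in particular the treatment of 1928 and the 1997 end year for millennials), the function names, and the toy records are hypothetical and are not taken from the CPS files. It assigns each respondent to a generation from survey year minus age and then computes the share married in each generation-by-age cell.

```python
from collections import defaultdict

# Birth-year ranges are assumptions for illustration only; published
# definitions of the generations differ slightly across sources.
GENERATIONS = [
    ("Greatest", 1915, 1927),
    ("Silent", 1928, 1945),
    ("Baby Boomers", 1946, 1964),
    ("Generation X", 1965, 1980),
    ("Millennial", 1981, 1997),
]


def generation_of(birth_year):
    """Return the generation label for a birth year, or None if out of range."""
    for label, first, last in GENERATIONS:
        if first <= birth_year <= last:
            return label
    return None


def cohort_rates(records):
    """Compute the share married for each (generation, age) cell.

    records: iterable of (survey_year, age, is_married) tuples.
    """
    married = defaultdict(int)
    total = defaultdict(int)
    for survey_year, age, is_married in records:
        generation = generation_of(survey_year - age)
        if generation is None:
            continue  # respondent falls outside the generations considered
        key = (generation, age)
        total[key] += 1
        married[key] += int(is_married)
    return {key: married[key] / total[key] for key in total}


# Hypothetical toy records, not survey data.
records = [
    (1965, 20, True), (1965, 20, False), (1965, 20, False),  # born 1945
    (2010, 20, False), (2010, 20, False),                    # born 1990
    (2010, 45, True),                                        # born 1965
]
rates = cohort_rates(records)
for (generation, age), share in sorted(rates.items()):
    print(f"{generation:13s} age {age}: {100 * share:.0f}% married")
```

Keying the result on age rather than on survey year is what lets generations be compared at the same point in the life cycle, even though they are observed decades apart.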

4. Comparison of income and balance sheets by generation

In this section, we compare income, asset holdings, debt, and net worth across generations. We present summary statistics of these variables for individuals and for families or households, a distinction that matters for some comparisons. We also use a linear regression model to show that controlling for observable demographic and socioeconomic characteristics does not fully explain the lower real individual income of millennials. All of the figures are adjusted for inflation and expressed in 2016 dollars.

4.1 Income

Table 8.1 presents snapshot comparisons of individual labor earnings and family income of married couples in the years 1978, 1998, and 2014 from the Panel Study of Income Dynamics (PSID). Each of these years was part of an expansion phase of the business cycle. We compare the statistics from each year for the entire sample of households and for households younger than 33 years, the oldest age at which we can observe millennials in these data. Individuals in this age group in each of the 3 years in the table represent baby boomers, Generation X members, and millennials, respectively, at similar young ages.

To begin, we compare annual labor earnings of full-time workers who worked more than 30 hours per week (or 1560 hours per year). As shown in the top row of the table, average real full-time labor earnings of male heads of all households declined between 1978 and 1998 and then rebounded over the next 16 years. On net, real average full-time labor earnings for males increased 10% between 1978 and 2014. However, younger male workers appear to have been left out of the labor earnings increase. Specifically, the real average full-time labor earnings of a millennial male household head in 2014 were about the same as those for a comparable male Generation X household head in 1998 and over 10% lower than those for a comparable male baby boomer household head in 1978.

For female heads of all households, real average full-time labor earnings increased moderately between 1978 and 1998 and between 1998 and 2014, reflecting, in part, rising female educational attainment. However, the median

TABLE 8.1 Real income by year and generation (thousands of 2016 dollars).

                                      2014                                1998                                1978
                                      Young households  All households    Young households  All households    Young households  All households
                                      (millennials)                       (Generation X)                      (boomers)
Male head labor income                49.5              74.1              49.5              64.0              56.1              67.2
                                      [40.6]            [54.7]            [44.2]            [51.5]            [53.4]            [60.7]
Female head labor income              39.1              46.6              36.5              44.4              35.4              37.1
                                      [31.2]            [39.9]            [32.4]            [38.3]            [33.1]            [33.2]
Family income married couples         78.2              112.0             73.6              103.8             77.5              88.0
                                      [67.2]            [87.0]            [62.8]            [81.0]            [73.6]            [79.5]
Memo: income inequality
Gini male head income                 0.35              0.42              0.31              0.37              0.26              0.29
Gini family income married couples    0.35              0.41              0.34              0.40              0.26              0.31

Note: The table reports average labor earnings and family income (levels and inequality) of the youngest working cohort and the population in 1978, 1998, and 2014. In 1978, the youngest cohort was the baby boomers; in 1998, it was Generation X; and in 2014, it was the millennials. Male head labor income refers to single males or the male spouses of married couples. Female head labor income refers to single females. Family income refers to married couples. Items in brackets are medians. Source: Panel Study of Income Dynamics, Public Use Data Set. Produced and Distributed by the Institute for Social Research, University of Michigan, Ann Arbor, MI, 2018.


labor earnings of female millennial household heads in 2014 were about 3% lower than those of comparable female Generation X household heads in 1998.

For families, the data show that real income of married couples grew, on net, from 1978 to 2014; this trend is seen in the sample of all households and in the sample of households headed by individuals younger than 33 years old and likely reflects the rise in the female labor force participation rate and the increase in the prevalence of dual income households. However, the net growth of real family income was smaller for young married couples than for married couples of all ages during this span of years.

The PSID data also show that income inequality has increased considerably during the past few decades. For the entire PSID sample, the Gini coefficient increased from about 0.3 to above 0.4. Income inequality among adults younger than 33 years largely shows the same pattern; the Gini coefficient for millennials in 2014 is around 0.35, which is higher than the 0.25 for young baby boomers in 1978.

How much of the difference in real average income between millennials and the earlier cohorts at comparable ages is due to differences in race and educational attainment? To address this question, we estimate the following linear regression model using pooled PSID data of households headed by individuals younger than 33 years from 1974 to 2014:

\[
\log(Y_i) = \alpha + \beta_1\,\mathrm{GenX} + \beta_2\,\mathrm{Boomer} + \gamma Z_i + \varepsilon_i. \tag{8.1}
\]

The model is estimated separately for male heads of household, for female heads of household, and for families. Yi denotes full-time labor earnings for household i. GenX and Boomer are generation cohort dummies, which makes millennials the omitted group. Z is a vector of demographic variables that includes age bins, educational attainment dummies, race, marital status, and family size.15 The estimated β1 and β2 coefficients for each gender/family group are reported in Table 8.2. As shown in column 1, average real labor earnings for young male household heads working full time are 18% and 27% higher for Generation X and baby boomers, respectively, than for millennials after controlling for age, work status, and a number of demographic variables. For young female heads of household working full time, these generational gaps in labor earnings are in the same direction but somewhat smaller: 12% and 24%, respectively. For family income, the regression shows that Generation X and baby boomer households have a family income that is 11% and 14% higher, respectively, than that of demographically comparable millennial households.
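The percentages quoted above read the dummy coefficients of Eq. (8.1) as approximate percent differences (100 times the coefficient). Because the dependent variable is in logs, the exact proportional gap implied by a dummy coefficient is exp(coefficient) minus 1, which is somewhat larger when the coefficient is not small. The short sketch below uses the coefficients reported in Table 8.2 to compare the two readings; the function name is our own, and the calculation is an illustration rather than part of the chapter's estimation.

```python
import math

# Coefficients on the generation dummies as reported in Table 8.2
# (millennials are the omitted group).
coefficients = {
    "Male labor income": {"Generation X": 0.181, "Boomers": 0.274},
    "Female labor income": {"Generation X": 0.120, "Boomers": 0.243},
    "Family income": {"Generation X": 0.114, "Boomers": 0.139},
}


def exact_percent_gap(beta):
    """Exact percent difference implied by a dummy coefficient in a log model."""
    return 100.0 * (math.exp(beta) - 1.0)


for outcome, betas in coefficients.items():
    for generation, beta in betas.items():
        approx = 100.0 * beta
        exact = exact_percent_gap(beta)
        print(f"{outcome:20s} {generation:13s} "
              f"approx {approx:5.1f}%  exact {exact:5.1f}%")
```

The two readings agree closely for the smaller coefficients and diverge by a few percentage points for the largest ones, so the qualitative ranking of the generational gaps is unaffected.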

15. We include age bins in the regression to capture any remaining changes over time in average age of the 37- and younger population.

TABLE 8.2 Regression analysis of income by gender/family status and generation.

                      Gender or family status
Variable              Male labor income (1)     Female labor income (2)   Family income (3)
Generation X          0.181***                  0.120***                  0.114***
                      (0.010)                   (0.015)                   (0.014)
Boomers               0.274***                  0.243***                  0.139***
                      (0.009)                   (0.014)                   (0.013)
Control for
  Age bins            Yes                       Yes                       Yes
  Race                Yes                       Yes                       Yes
  Education           Yes                       Yes                       Yes
  Work status         Yes                       Yes                       Yes
  Marital status      Yes                       Yes                       Yes
  Family size         Yes                       Yes                       Yes
Memo
  R-squared           0.19                      0.22                      0.17
  N                   51,228                    16,494                    39,387

*** denotes statistical significance at the 99% level. Note: The table presents coefficient estimates from regressions of various income measures on cohort dummies and household socioeconomic and demographic characteristics. The sample includes three generations: baby boomers, Generation X, and the millennials. In each regression, the millennials are the omitted group. Source: Panel Study of Income Dynamics, Public Use Data Set. Produced and Distributed by the Institute for Social Research, University of Michigan, Ann Arbor, MI, 2018.

4.2 Debt

We now study the debt portfolios of millennials using data from the Federal Reserve Bank of New York Consumer Credit Panel/Equifax (CCP), a panel that covers individual borrowers who have a credit score. Because the panel is relatively new, we are only able to compare the borrowing of millennials with that of Generation X members.16 Table 8.3 presents the summary statistics for

16. The CCP data set starts in 1999, and it began reporting student loan data in 2004.

TABLE 8.3 Real liabilities by year and generation (thousands of 2016 dollars).

                          2017                                    2004
Liability type            Young individuals   All individuals     Young individuals   All individuals
                          (millennials)                           (Generation X)
Total debt
  Mean                    43.7                47.9                49.0                48.1
  Median                  19.6                21.3                23.1                22.4
  Share (pct)             81.8%               73.4%               80.2%               74.5%
Mortgage
  Mean                    24.3                34.0                33.7                36.1
  Median                  105.4               89.4                94.0                81.8
  Share (pct)             18.6%               27.0%               27.7%               31.4%
Auto
  Mean                    5.3                 4.5                 5.2                 4.1
  Median                  10.2                10.6                11.6                11.4
  Share (pct)             40.4%               32.1%               35.5%               27.7%
Credit card
  Mean                    2.3                 3.0                 3.4                 3.9
  Median                  1.8                 2.1                 2.5                 2.4
  Share (pct)             58.3%               55.0%               58.4%               57.4%
Student
  Mean                    10.6                5.0                 4.5                 1.8
  Median                  17.9                16.4                12.8                10.5
  Share (pct)             33.5%               16.3%               19.8%               9.1%
Other debt
  Mean                    1.2                 1.4                 2.2                 2.2
  Median                  1.4                 1.3                 1.6                 1.3
  Share (pct)             32.0%               32.2%               41.5%               38.2%
Memo
  No. of open accounts    3.8                 3.9                 4.0                 4.1
  No. of inquiries        2.8                 2.1                 4.5                 3.2

Note: The table reports various components of household liabilities for the youngest working cohort and the population in 2004 and 2017. In 2004, the youngest cohort was Generation X, and in 2017 it was the millennials. Median values are conditional on having a positive balance. Source: Federal Reserve Bank of New York Consumer Credit Panel/Equifax.


borrower debt in the third quarters of 2004 and 2017; both years are from periods considered credit expansions.17 For the full CCP sample, the average and median real total debt balances for individual borrowers and the shares of the population holding debt were little changed between 2004 and 2017. Specifically, the share of consumers with a positive debt balance fell by only 1 percentage point (74.5% in 2004 vs. 73.4% in 2017), and the average total debt balance declined by only $250 ($48,123 in 2004 vs. $47,872 in 2017). In addition, the average number of open accounts per consumer edged down just slightly between those years (4.1 in 2004 vs. 3.9 in 2017).

For younger borrowers, the comparison of average indebtedness shows a more substantial decline between 2004 and 2017. The real average total debt balance was around $49,000 for Generation X members in 2004 and about $44,000 for millennials in 2017. Median debt levels, conditional on having a nonzero balance, tell a qualitatively similar story. Later in the chapter, we use the SCF to study household balance sheets and find debt levels are not appreciably different between millennials and members of Generation X at similar ages.

The lower levels of debt outstanding for millennial borrowers in 2017 compared with Generation X borrowers in 2004 mainly reflect lower mortgage debt, although millennials also have significantly less credit card debt and miscellaneous other debt. In 2004, 28% of Generation X members had a mortgage, well above the 19% share of millennials that had one in 2017. Accordingly, average real mortgage balances were considerably lower for millennials in 2017 than those for Generation X members in 2004 ($24,000 vs. more than $33,000). That said, the median mortgage balance for millennial mortgage borrowers in 2017 was somewhat higher than that for Generation X mortgage borrowers in 2004 ($105,000 vs. $94,000), reflecting, in part, the net increase in real house prices during the same period.
For auto loans, contrary to the stories in the popular press that millennials have a more subdued demand for cars than members of earlier generations, the Equifax/CCP data show that 40% of millennials had an auto loan in 2017, compared with 36% of Generation X members in 2004. The mean outstanding balances on auto loans in the two cohorts are similar at about $5,200. One loan category for which millennials in 2017 had a notably higher average balance than Generation X members in 2004 was student loans. While only 20% of Generation X members had a student loan balance in 2004, more than 33% of millennials had one in 2017. Moreover, the median balance among student loan borrowers was substantially higher for millennials in 2017 than for Generation X members in 2004 (about $18,000 vs. $13,000). Accordingly, the average student loan balance for millennials in 2017 was more than double the

17. The year 2004 was in the middle of the credit expansion prior to the 2008 financial crisis, and the credit market had largely returned to precrisis conditions by 2017, with the exception of mortgage lending to subprime borrowers.


average loan balance for Generation X members in 2004. The rise of student loan borrowing among young consumers reflects, in part, the rising real cost of higher education, the increase in college enrollment due to the Great Recession, and the increasingly limited capacity of parents to contribute.

Both credit supply and credit demand factors may have contributed to the lower amount of borrowing by millennials in 2017 compared with Generation X members in 2004. On the supply side, while financing conditions improved substantially from the aftermath of the 2008 financial crisis, lending standards remained tight for quite some time in certain market segments, such as mortgage and credit card lending to subprime borrowers. Many millennials in 2017 still lacked a solid credit history because of their young age and may have faced additional headwinds in credit markets. On the demand side, the financial crisis may have made some consumers more averse to debt, as they witnessed the destructive effects of overborrowing on personal finance and the economy in general. Indeed, the number of inquiries on credit histories, a common measure of demand for credit, was substantially lower in 2017 than in 2004, and the decline is particularly pronounced when comparing millennials with Generation X members.18
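The claim that lower mortgage debt accounts for most of the gap in average debt can be checked directly from the mean balances in Table 8.3. The sketch below assumes that those means are averages over all individuals, including those with no balance in a category; this is our reading rather than something the table note states, although it is consistent with the category means adding up to the total. Variable names are our own.

```python
# Mean balances in thousands of 2016 dollars, copied from Table 8.3.
millennials_2017 = {
    "Mortgage": 24.3, "Auto": 5.3, "Credit card": 2.3,
    "Student": 10.6, "Other debt": 1.2,
}
generation_x_2004 = {
    "Mortgage": 33.7, "Auto": 5.2, "Credit card": 3.4,
    "Student": 4.5, "Other debt": 2.2,
}
total_2017, total_2004 = 43.7, 49.0  # "Total debt" means from the same table

# Change in the mean balance of each category between the two cohorts.
changes = {
    category: round(millennials_2017[category] - generation_x_2004[category], 1)
    for category in millennials_2017
}

for category, change in changes.items():
    print(f"{category:12s} {change:+5.1f}")
print(f"{'Sum':12s} {sum(changes.values()):+5.1f}")
print(f"{'Total debt':12s} {total_2017 - total_2004:+5.1f}")
```

Under this reading, the decline in mean mortgage balances is larger than the decline in total debt, and the increase in mean student loan balances offsets roughly two-thirds of it.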

4.3 Assets and net worth

We next turn to household asset holdings and net worth using data from the triennial SCF, which measures these balance sheet variables at the level of the household. Table 8.4 presents summary statistics on these balance sheet measures from three generations at points in time when each was at a similar young age. Specifically, we compare millennial households in 2016, Generation X households in 2001, and baby boomer households younger than 35 years in 1989.19 Because the youngest baby boomers were 25 years old in 1989, we apply the same age restriction to Generation X members in 2001 and millennials in 2016.

As shown in the table, there was a substantial increase in mean total assets held by the entire SCF sample from about $400,000 in 1989 to around $785,000 in 2016. However, a comparison of the changes in the mean and median suggests the higher mean asset holding was due to an increase for the wealthiest households, as the median asset holding increase was much more moderate, from near $152,000 in 1989 to about $190,000 in 2016. For younger households, the mean value of assets held by millennials in 2016 was about $176,000, almost the same as baby boomers ($173,000) of comparable ages in 1989 and much lower than Generation X members in 2001

18. The Equifax/CCP data do not include any information regarding consumers’ race, educational attainments, marital status, and family sizes. As a result, we cannot run the conditional regression analysis as we did for income comparisons. 19. As mentioned previously, our analysis is similar to Dettling and Hsu (2014, 2017). Using 2013 SCF data, they find that the median real net worth of young households in 2013 is lower than that of young households in 2001, largely reflecting lower asset holdings.

TABLE 8.4 Real assets by year and age group/generational cohort (thousands of 2016 dollars).

                                      2016                                2001                                1989
Asset type                            Young households  All households    Young households  All households    Young households  All households
                                      (millennials)                       (Generation X)                      (boomers)
Total assets
  Mean                                176.3             785.0             227.4             612.5             173.2             402.1
  Conditional median                  55.0              189.5             104.7             200.8             63.3              152.1
Homeownership rate (pct)              40.7%             63.7%             50.2%             67.7%             47.5%             63.9%
House value
  Mean                                84.1              191.9             95.1              166.2             79.0              126.6
  Conditional median                  165.0             185.0             135.5             166.6             119.4             130.6
Financial assets
  Mean                                30.9              215.1             52.1              185.8             31.9              98.8
  Conditional median                  4.4               8.0               6.8               16.6              5.6               13.9
Stock ownership (pct)                 14.7%             19.7%             27.5%             30.0%             13.8%             20.0%
  Mean                                10.6              103.7             18.0              78.6              3.5               22.0
  Conditional median                  8.0               52.0              9.3               36.6              5.6               14.9
Retirement balance
  Mean                                18.8              119.1             16.8              75.0              6.6               26.4
  Conditional median                  15.0              60.0              11.5              39.8              7.5               20.5
Vehicle ownership (pct)               82.6%             85.2%             82.5%             84.8%             84.3%             83.8%
Vehicle value
  Mean                                17.4              21.6              17.5              21.1              12.5              15.3
  Conditional median                  16.0              17.0              17.1              18.3              11.4              13.1
Other real estate
  Mean                                10.1              79.3              10.3              57.8              10.4              53.7
  Conditional median                  71.0              130.0             94.8              94.8              41.0              80.2
Business and other assets
  Mean                                15.1              158.0             35.6              106.6             32.9              81.2
  Conditional median                  20.0              50.0              36.6              67.7              20.5              37.3
Memo
Total debt
  Mean                                84.6              95.5              79.4              73.9              59.3              48.7
  Conditional median                  54.0              60.0              54.2              52.4              31.5              28.0
Net worth
  Mean                                91.7              689.5             148.0             538.6             113.9             353.3
  Conditional median                  42.2              133.1             48.0              146.8             38.8              117.3
Gini of total assets                  0.70              0.81              0.69              0.75              0.69              0.73
Gini of net worth                     0.75              0.83              0.75              0.78              0.74              0.76

Note: The table reports various household balance sheet elements (prevalence, levels, and inequality) for the youngest working cohort and the population in 1989, 2001, and 2016. In 1989, the youngest cohort was the baby boomers; in 2001, it was Generation X; and in 2016, it was the millennials. Conditional medians are conditional on having a positive balance. Source: Survey of Consumer Finances.

($227,000). Using median asset holdings yields very similar comparisons: The median total assets held by millennials in 2016 were significantly lower than those of baby boomers in 1989 and only about half those of Generation X members in 2001.

The decrease in asset holdings of younger households was widespread across asset categories. Homeownership among this age group, for example, was near 50% for Generation X members in 2001 and about 41% for millennials in 2016. Accordingly, the mean real housing assets for Generation X members in 2001 were $95,000, while they were about $11,000 lower at $84,000 for millennials in 2016. Holdings of financial assets such as stocks have also declined notably for younger households in recent decades. The young Generation X households in 2001 held, on average, more than $52,000 in financial assets, about $21,000 more than millennial households in 2016. However, millennials in 2016 held more in retirement savings than other cohorts at comparable ages; this change likely reflects, in part, the replacement over time of defined-benefit retirement pensions with defined-contribution retirement accounts.

For debt, millennial households in 2016 held, on average, about $85,000, slightly more than the $79,000 that was held by young Generation X households in 2001, a finding that was noted by Dettling and Hsu (2014) and subsequently by Chien and Morris (2018).20 That said, the median levels of indebtedness among households with nonzero balances are about the same for millennial households in 2016 and Generation X households in 2001. These young millennial and Generation X households have both taken on more debt than did the baby boomers when they were at a similar age.

Turning to net worth, which puts together the asset and debt comparisons described above, we find that millennials in 2016 have substantially lower real net worth than earlier cohorts when they were young.
In 2016, the average real net worth of millennial households was about $92,000, around 20% less than baby boomer households in 1989 and nearly 40% less than Generation X households in 2001. Finally, we note that while the inequality of asset holdings and net worth has risen appreciably for the entire population during the past few decades, these inequality measures were largely flat among the younger households.
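For readers unfamiliar with the inequality measure used in these comparisons, the following is a minimal sketch of a Gini coefficient for a list of nonnegative values, using the standard rank-based formula. It is unweighted and rejects negative values, both of which are simplifying assumptions on our part: the chapter's Gini coefficients come from survey data, and how weights and negative net worth were handled there is not something this sketch attempts to reproduce. The input values are made-up illustrations.

```python
def gini(values):
    """Unweighted Gini coefficient for nonnegative values.

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where x_1 <= ... <= x_n are the sorted values and i is the rank.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or xs[0] < 0 or total <= 0:
        raise ValueError("need a nonempty list of nonnegative values "
                         "with a positive sum")
    rank_weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * rank_weighted / (n * total) - (n + 1.0) / n


# Hypothetical asset holdings (thousands of dollars), not SCF data.
equal_holdings = [50.0, 50.0, 50.0, 50.0]
concentrated_holdings = [0.0, 0.0, 0.0, 200.0]

print(gini(equal_holdings))         # no inequality
print(gini(concentrated_holdings))  # one household holds everything
```

With n observations the measure ranges from 0 (everyone holds the same amount) to (n - 1) / n (one unit holds everything), which is why values of 0.7 to 0.8 for assets and net worth indicate far more concentration than the 0.3 to 0.4 values reported for income.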

5. Comparison of consumption behavior by generation

Having compared the demographic, economic, and household balance sheet factors of millennials with those of members of earlier generations, we now contrast the consumption patterns of these generations. We first present views of household consumption from the CE survey. Next, we test for differences

20. Chien and Morris (2018) define the generations with birth years that differ slightly from those used here.


FIGURE 8.5 Real annual expenditures per household by age and generation. Note. Annual sum of total reported household expenditures. Nominal purchases in each quarter are deflated with the chain price index for personal consumption expenditures (PCE). Generation is determined by the birth year of the male spouse, if present, or the head of household. Observations that exceed $200,000 are not shown in the figure. Source: Consumer Expenditure Surveys (1986-2016) from the Bureau of Labor Statistics and the PCE price index from the Bureau of Economic Analysis.

between generations in total household consumption. Lastly, we test for differences between generations in consumption of selected categories of household spending, such as automobiles, housing, and food.

5.1 Household spending in the CE survey by age and generation

A natural way to organize consumption microdata is to construct life-cycle consumption profiles by birth cohort. Fig. 8.5 shows real annual total consumption for all households in our sample of 1986-2016 CE surveys. The generation of the household is denoted by the shading of the dots and their location along the age axis: Black dots at younger ages represent spending by millennial households, gray dots at intermediate ages represent spending by Generation X and baby boomer households, and black dots at older ages represent spending by the Silent Generation and Greatest Generation households.21 What is notable about this figure is that data points from each generation occupy a limited range of ages in the life cycle. This pattern reflects the limited time frame over which some generations are observed in this consumer survey, and it also underscores the importance of using data

21. Some households reported high levels of spending that exceeded the top of the Y-axis on the graph. One drawback of this visualization is that many of the Generation X spending points overlap with those of the baby boomers, etc.


with a long history to make intergenerational comparisons. For example, with the oldest millennials having only just turned 37 years old in 2018, researchers will have to wait about 35 more years until the data allow a direct comparison of older millennials to today’s oldest baby boomers.

The central tendencies of these dots organized into annual averages by age and generation are shown as lines in Fig. 8.6. The real consumer spending profiles for each generation in Fig. 8.6 differ somewhat from one another but generally show the hump shape that peaks at around $60,000 at age 50, consistent with other well-known depictions of life-cycle consumption patterns. Beyond that point, household consumption tends to decline with age. Importantly, it looks quite plausible that a single hump-shaped curve could adequately describe the collection of total household spending profiles, although the spending profiles seem to fan out a bit more for older age bins, which are primarily covered in our data by baby boomers and members of the Silent Generation and the Greatest Generation. Baby boomers also appear to have maintained a higher (and flatter) consumption profile as they have moved into retirement compared with members of the Silent Generation and the Greatest Generation. But a casual assessment of the curves in Fig. 8.6 suggests that the spending profiles of the three generations for which our data provide a fair amount of overlap (millennials, Generation X members, and baby boomers) are quite similar.
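To make the construction of such profiles concrete, the sketch below computes survey-weighted average spending for each generation and five-year age bin. It is an illustration only: the record layout, the function names, the bin width, and the toy households are assumptions of ours and do not reflect the actual CE file structure or the exact procedure behind Fig. 8.6.

```python
from collections import defaultdict


def age_bin(age, width=5):
    """Label the age bin that contains `age`, e.g. 27 -> '25 to 29'."""
    low = (age // width) * width
    return f"{low} to {low + width - 1}"


def weighted_profile(households):
    """Weighted mean spending by (generation, age bin).

    households: iterable of dicts with keys
    'generation', 'age', 'spending', and 'weight'.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for household in households:
        key = (household["generation"], age_bin(household["age"]))
        weighted_sum[key] += household["weight"] * household["spending"]
        weight_total[key] += household["weight"]
    return {
        key: weighted_sum[key] / weight_total[key]
        for key in weight_total
        if weight_total[key] > 0
    }


# Hypothetical households, not CE data.
households = [
    {"generation": "Millennial", "age": 27, "spending": 40000.0, "weight": 1.0},
    {"generation": "Millennial", "age": 29, "spending": 50000.0, "weight": 3.0},
    {"generation": "Millennial", "age": 31, "spending": 55000.0, "weight": 1.0},
    {"generation": "Generation X", "age": 27, "spending": 45000.0, "weight": 2.0},
]
profile = weighted_profile(households)
for (generation, bin_label), mean_spending in sorted(profile.items()):
    print(f"{generation:13s} {bin_label}: {mean_spending:,.0f}")
```

Cells with few households produce noisy averages, which is one reason a generation observed at only a handful of ages contributes little to comparisons across the whole life cycle.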

FIGURE 8.6 Real average annual expenditures per household by age and generation. Note. Averages are based on the survey weights for the consumer unit (averaged over the four surveys) and the birth year of the head of household. Nominal purchases in each quarter are deflated with the chain price index for personal consumption expenditures. Source: Consumer Expenditure Surveys (1986–2016) from the Bureau of Labor Statistics and the PCE price index from the Bureau of Economic Analysis.


5.2 An empirical assessment of generational consumption patterns

Before we analyze these household consumption profiles more formally, we should point out some caveats about the construction of spending profiles and the CE data. First, the CE survey asks about consumer spending at the household (or, more precisely, consumer unit) level, and it aggregates the spending of all household members, who may vary in age. Because we use the age and generation of the head of household to represent the spending of the whole household and cannot separately track the spending of young adults who live with their parents, the panel we constructed is not well suited to studying how changes over time in household formation patterns affect consumption, which could potentially be an important difference between millennials and the households of earlier generations when they were young.22 Second, the construction of household consumption profiles by age group and generation requires a fair amount of parsing, which can be challenging even in a reasonably large data set such as the CE survey. Table 8.5 shows the number of consumer units of each generation that are in each of the age bins shown in Fig. 8.6. The number of households in the full sample from 1986 to 2016 is 129,200, which includes 52,000 baby boomer households, 30,000 Silent Generation households, 24,000 Generation X households, and fewer than 6,000 millennial households. The main reason why the sample of households headed by millennials is small is that this generation is still fairly young and is represented in the CE survey only near the end of our sample. Looking over the age bins, millennials are reasonably well represented in the 20 to 24 and 25 to 29 age bins. The starting point of our data set in 1986 similarly limits our ability to observe earlier generations at younger ages. For example, we do not observe many baby boomers younger than 30 years or Silent Generation members younger than 45 years.
A consequence of these considerations is that the sample averages are noisier for millennials than for members of earlier generations and for younger households than for older households. Third, the CE collects data on household expenditures instead of consumption. Household expenditures and consumption, though intimately related, are not identical, and the differences therein are more important for housing, vehicles, and other durable goods. For example, the purchase of a car is not the same as consuming the stream of services from acquiring the asset. Bearing this distinction in mind, we will use the two terms interchangeably throughout this section. Taking those caveats into consideration, the average household spending shown by each profile in Fig. 8.6 reflects the effects of factors such as age, cohort preferences, the state of the business cycle that prevailed for each generation at each age, and measurement error and sample variability. The differences in the profiles of adjacent cohorts at comparable ages reflect the

22. See Paciorek (2016) for a discussion of the recent patterns in household formation.

TABLE 8.5 Households in the 1986–2016 consumer expenditure surveys by age and generation.

Age | Millennials (1981–97) | Generation X (1965–80) | Boomers (1946–64) | Silent (1928–45) | Greatest (1915–27)

20 years for most | ~5 years
Rich | Difficult to obtain rich data for new questions | Richness easier to obtain with creative sampling and deeper understanding of data

The traditional data source is aggregated, whereas the web-scale data may or may not be aggregated, as per researcher requirement.


“Are you working?,” “Did you make an attempt in the last 4 weeks to find a job?” (Baumol, 2013). The benefits of such traditional data are the careful design, with a stratified sample that ensures balanced demographics, and continuity over time. However, the survey is conducted only monthly, and because of the small sample size, it may be difficult to tease out whether young black males are more discouraged, as opposed to white females who are retiring because higher stock market returns have boosted their wealth portfolios. Additionally, based on the BLS data, it seems that the response rate for voluntary household surveys such as the consumer price index (CPI-housing) survey has been declining over time (Fig. 11.7), whereas the number of tweets and searches, as well as the percentage of people who tweet, has been increasing substantially (Fig. 11.8).8 In contrast to the monthly household survey, there are about 500 million tweets per day, and even if only 1% of those are relevant to the US labor economy, that is around 5 million responses per day, which is orders of magnitude larger. The other difference is that because people are free to tweet about any topic they like (Proserpio et al., 2016), they mention the issues affecting them when those issues become important rather than waiting for an official to modify the survey to include them.

FIGURE 11.7 BLS Household survey response rates are declining over time. ATUS, American Time Use Survey; BLS, Bureau of Labor Statistics; CE, Consumer Expenditure Survey; CPI, Consumer Price Index; CPS, Current Population Survey; TPOPS, Telephone Point of Purchase Survey. Source: BLS. https://www.bls.gov/osmr/response-rates/home.htm.

8. Of course, the growth in the number of Twitter users, searches, and other alternative data sources will slow down, and these numbers are not expressed as a percentage of the population. But the world population did not increase tenfold from 2010 to 2018, so the growth rate of active Twitter users is a good indicator of increasing penetration in the population.


FIGURE 11.8 Number of active Twitter users has grown dramatically over the last 8 years. Source: https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/.

Quick warning: The alternative data sources that we mention here, such as Twitter, search, and click data, suffer from their own problems, such as spam, a high degree of noise, and a lack of independence in social data sources, where popular or trending items receive a higher ranking and hence become positively autocorrelated over time by construction. The fact that this “warnings” paragraph is short should not be taken to mean that alternative data are a panacea or that the issues stated above are a comprehensive list; it simply reflects the fact that this is an exciting but immature area. We are still learning about these data, and it is too soon to have very firm opinions or even commonly accepted stylized facts. Additionally, some of the practical difficulties of using such alternative data will become more obvious in the case study that is presented as a major part of this chapter.

Traditional and alternative data sources (comparing information content and evaluating complementary usage): It appears that we can compare the information content of two data sets in a reasonable way. Say we are examining employment data and decide to compare the information about the employment sector in the household survey data with that in self-reported employment-related tweets. We can check whether the alternative data source is statistically significant in predicting the next household employment data release after controlling for Wall Street expectations, to account for knowledge in the public domain, and for the previous household employment data releases, to account for trend effects. There are two main problems with this. The first is statistical: the alternative data series are short and noisy, which reduces the power of the tests, and the problem becomes even harder as we increase the number of alternative data series. The second is philosophical: the noisy government data (the household employment survey) are being held as


the gold standard. Additionally, the alternative data contain more detail and rich information from the entire population, and we may not be looking at the “right features.” One may argue that the government data move the markets and are therefore important. That is a reasonable argument (in fact, we advance it as well), but there may still be a circularity. If we all believe that the central bank responds to the small sample of data collected every month, those data, even if less accurate, will continue to have a price impact in markets where central banks matter, and the belief becomes self-fulfilling.9 We could imagine a scenario in which, keeping everything else the same, we all start believing that the government will respond if a sizable number of citizens communicate that they find the labor market “difficult” and if searches for jobs, as well as tax receipts and credit card purchases, are decreasing dramatically; those data might then have a higher price impact. Thus, whatever data we pay attention to as a society also become important. Recognizing those limitations, in practice we look for “economic intuition,” or a story about why any data matter, in addition to the market-based tests described above. In our experience, these two types of data end up being complements. Functionally, having the entire population inform us of something (however subjectively), captured as alternative data, ends up being more informative: it exposes new trends, interesting issues, and rich and frequent detail that researchers or policy officials may not have thought of. Carefully and objectively collected data that are comparable across cycles (long time series) and cross-sectionally, in turn, can be helpful in calibrating the policy response. We believe that some alternative data will become more mainstream as they prove their usefulness over time, and thus a better understanding of the substitutability and complementarity between both types of data sources will emerge.
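The market-based test described above (does an alternative data series help predict the next official release once consensus expectations and the previous release are controlled for?) can be sketched as an ordinary least squares regression. This is a minimal illustration on synthetic data, not the procedure used in the chapter: the variable names `alt_signal`, `consensus`, and `prev_release`, the coefficients in the simulated data, and the sample length are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 72  # assumed months of history; alternative data series are typically short

# Synthetic, purely illustrative series (not real data).
alt_signal = rng.normal(size=n)     # hypothetical alternative-data indicator
consensus = rng.normal(size=n)      # stand-in for Wall Street expectations
prev_release = rng.normal(size=n)   # stand-in for the previous official release
noise = rng.normal(scale=0.5, size=n)
release = 0.6 * consensus + 0.2 * prev_release + 0.3 * alt_signal + noise


def ols_t_stats(y, X):
    """Return OLS coefficients and t-statistics (an intercept is added as column 0)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof          # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)  # classical OLS covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))


controls_and_signal = np.column_stack([consensus, prev_release, alt_signal])
beta, t = ols_t_stats(release, controls_and_signal)
print(f"alt-data coefficient: {beta[3]:.2f}, t-stat: {t[3]:.2f}")
```

With a short sample the t-statistic on the alternative data coefficient is itself noisy, which is the statistical problem noted above; adding more candidate series to the regression only makes it worse.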
Usefulness of Uncorrelated Noise: The BLS NFP number tends to have a 90% confidence interval of 115,000, while the average release is around 90,000 (Table 11.2).10 Thus, alternative data that do not rely on the same industry surveys could help make the overall growth estimate more accurate (for instance, in the nowcasting framework) by providing data about employment from an uncorrelated source. This lack of correlation would cause the standard error to decrease when we combine these sources

9. One can argue that an efficient market will correct for this, but if there is a large agent (a central bank or a government) whose objectives are not purely economic, and there are limits to liquidity, that argument may not apply. Some question whether a central bank is even relevant; in that case, I suggest reading Romer (2016), where he discusses the role of the Federal Reserve in curbing inflation. Of course, the market may eventually converge to better data by yielding excess returns to participants who collect the better data, which is the mechanism by which the market approaches efficiency.

10. https://www.bls.gov/news.release/empsit.tn.htm.


appropriately. Continuing with the example above, if we could get a similarly noisy but completely uncorrelated data source, the confidence interval for the average NFP release would shrink to about 81,000 (115,000 divided by the square root of 2), and the release of 90,000, which was not statistically significant at the 90% level, would now indicate a significant increase in employment, thus giving us a more accurate picture of the employment sector.
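The arithmetic behind this example can be checked in a few lines. The sketch assumes the two sources are unbiased, equally noisy, and independent, and that they are combined by a simple average; the variable names are illustrative only.

```python
import math

ci_single = 115_000   # 90% confidence interval for one NFP release (from the text)
release = 90_000      # typical monthly release (from the text)
n_sources = 2         # official survey plus one equally noisy, uncorrelated source

# Averaging n independent, equally noisy estimates divides the standard error,
# and hence the width of the confidence interval, by sqrt(n).
ci_combined = ci_single / math.sqrt(n_sources)

print(round(ci_combined))        # 81317
print(release > ci_single)       # False: not significant with one source
print(release > ci_combined)     # True: significant once the sources are combined
```

If the second source were correlated with the survey, the reduction would be smaller than the factor of the square root of 2 used here.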

5.1 The microfoundations of macro: alternative data in the context of the Lucas and Romer critique

To bring out the potential of alternative data, it might help to take a very brief (and extremely incomplete) tour through the history of economic modeling, which is nevertheless useful because it provides a flavor of the challenges economists face when modeling our complex economic machine. With macroeconomic crises and technological progress, models may come and models may go, but the need for better data goes on forever!11 There are two main kinds of models in macroeconomics: structural models, which explicitly use economic theory, and nonstructural models, which “let the data speak.” Reiss and Wolak (2005) provide a good framework for understanding both. They take the case of a set of observable “endogenous” variables, y, that are related to another set of observable “explanatory” variables, x. In their framework, the nonstructural approach would measure both x and y and then estimate a conditional density using the appropriate statistical methods. The structural approach would seek to clarify how institutional and economic conditions might affect x and y and try to embed that in the model. Thus, the task of the structural model is to clearly make connections “between institutional, economic, and statistical assumptions.” The benefit of a structural model is increased context, and the inevitable consequence, by design, is that the structural approach has some embedded theory about how the “economic machine” should work. We human beings exhibit adaptive and complex behavior, which makes the task of modeling our aggregate (economic) behavior challenging if not impossible.
For any model, there is a tension between nailing down what is happening (positive) in all its detail and what should happen (normative).12 Diebold (1998) presents a 10,000-ft view of how macroeconomic forecasting developed over time in an excellent survey paper “The Past, Present, and Future of Macroeconomic Forecasting.” He outlines how there is an “interplay between measurement and theory” and how this interplay affects the

11. Apologies for the repurposing of Alfred Lord Tennyson’s poem, The Brook.

12. Modelers are not unaware of the difficulty, and hence there is a common saying, “all models are wrong but some are useful,” which is attributed to the statistician George Box (1976). I personally find two of his phrases quite useful: a good model should seek to find an “economical description of natural phenomena” and be “simple but evocative” (1979).


nonstructural and structural approaches to forecasting. We can see the application of Karl Popper’s philosophy of empirically motivated critical thinking to the field of macroeconomic modeling in Diebold’s paper. Diebold’s narrative follows Popper by showing how theories evolve and become more popular as they explain current phenomena better than older models do, and models seem to change more after macroeconomic crises! Keynesian macroeconomic theory was written during and after the Great Depression of the 1930s, as Keynes advocated the use of fiscal and monetary policy to mitigate the adverse effects of downturns in aggregate demand. In response to theory, measurement techniques evolved, and many statistical contributions, such as those by Fisher, Neyman, and Pearson, were made. However, economists were not satisfied with the Keynesian “systems of equations” approach, as Prescott (1986) named it. This approach involved ad hoc postulated decision rules about consumption, investment, and the treatment of expectations and did not have a fundamental justification for “sticky prices” and “sticky wages.” Along with this intellectual dissatisfaction, the divergence of empirical facts from the theory in the 1970s sent the profession in a new direction: incorporating rational expectations. In the 1970s, supply shocks resulted in the economy experiencing both “high inflation” and “low growth,” a phenomenon that was deemed unlikely in the Keynesian aggregate demand-driven paradigm, which involves a trade-off between high growth and high inflation. In his now famous “Lucas critique,” Lucas (1976) argues that “optimal decision rules vary systematically with changes in the structure of series relevant to the decision maker,” where the series refers to the relevant time series that individuals would consider while making their decisions. This observation from Lucas is at the heart of the issue he raises about the “clearcut conflict” between structural and nonstructural modeling.
Lucas outlines the policy maker’s problem with nonstructural models by pointing out that purely econometric models, which use aggregate past data and are successful at prediction, “in principle, can provide no useful information as to the actual consequences of alternative economic policies.”13 Diebold outlines how economists such as Fair and Taylor incorporated more “realistic behavior,” in the form of rational expectations, into the “systems of equations” approach, as well as a better feedback loop that brings model assessment and fit into the framework. Thus rescued, these models are used by policy organizations, including central banks. The same energy shock that sowed seeds of doubt about the existing theory resulted in more interest in nonstructural approaches. Diebold mentions an important paper by Sargent

13. Lucas provides several examples, including one of estimating the actual consumption of a consumer via the permanent income hypothesis. In this example, a purely statistical model that predicts consumption from past data would not capture policy changes that are known in advance, but a theoretical model would have better success.


and Sims (1977) titled “Business Cycle Modeling Without Pretending to Have too Much a Priori Theory.” Then Diebold moves to the theoretical flavor of the day: dynamic stochastic general equilibrium (DSGE) models, which allow rational agents to make optimal decisions rather than relying on an “ad hoc” system of equations. In this neoclassical framework, technology shocks by and large explain the business cycle. Fast forward to the global financial crisis of 2008, and the profession is again filled with voices for change. It is in this context that Romer’s (2016) critique can be examined. Romer makes the point that the current modeling machinery has gotten so cumbersome that model parameters cannot be statistically identified, and that the attribution of most business cycle fluctuations to technology shocks reaches “bewildering conclusions.” This path of macroeconomic theory and modeling is normal for any science, whether we understand these events as paradigm shifts in Kuhn’s framework, exposited in “The Structure of Scientific Revolutions,” or think in terms of Popper’s (1961) test of incremental and better empirical content. What is common to all these tremendous intellectual contributions, particularly the critiques from both Nobel laureates (Lucas, 1995; Romer, 2018), is a desire to get to a “fundamental understanding.” This goal of understanding human behavior fundamentally and embedding it in an economic framework has been a common dream of researchers, often called the “microfoundations of macroeconomics” (for example, Barro, 1993). In his critique, one of the issues Romer takes with rational expectations and DSGE models is the lack of microeconomic evidence: “There is no microeconomic evidence for the negative phlogiston shocks that the model invokes.”

The future? Now imagine that consumers were empowered to communicate, in real time, individual statements about their understanding of the government’s policy and to state their own response!
The resulting analysis of policy impact would be unusually rich because of the individual-level actions as well as a record of how people arrived at their decisions. People would self-report (to the extent they are comfortable) their emotions as well as the information they sought in making decisions. Thus, we could actually see how temporary or permanent they believe an income shock to be, and we would also get a bit more microeconomic evidence on how people behave. Such research is now a reality, and herein lies perhaps the biggest promise of these “web-scale” alternative data, which capture, for the first time, voluntary, individual, user-generated, self-reported data at a large and accessible scale. Thus, we can supplement purely theoretical arguments and our assumptions with better data. Over time, as we get newer and more detailed microdata that truly describe the behavior of individuals, perhaps our theories will improve. These microfoundations of user behavior might eventually help us better predict responses to policies that cannot be modeled purely statistically, as Lucas (1976) suggests; or, as Soros’ reflexivity (2009) might imply, the data will change in response to the actions taken! In any case, these data will lead to interesting research.


Some caution: If we take this idea of more data to its logical conclusion, it seems that we could fulfill the dream of an accurate economic map in the spirit of the humongous cartographic maps that Borges (1658) refers to in “On Exactitude in Science.” Here we must be careful and realize that simply having more frequent data may not be enough. For example, common sense suggests that measuring how one particular business performs on a minute-by-minute basis on a single day of an economic expansion is unlikely to clarify how that business will fare in a world with different tariffs and an economic recession. However, a detailed model of individual consumers’ preferences and their sensitivities to price might increase our ability to price a product optimally or to predict better than before how they might react to a competing product. Additionally, it is critical that consumers understand the value of their data, as well as their rights regarding it, to prevent abuse of this power, a kind of “data colonialism” (Couldry and Mejias, 2018).

Using individual-level psychological variables to predict national unemployment: An example of using alternative microdata to understand the more “fundamental behaviors” that lead to macroeconomic prediction can be found in Proserpio et al. (2016), who predict the 1-month-ahead national unemployment rate for the United States using individual Twitter data over time. Proserpio et al. (2016) analyze more than 1.2 billion tweets, posted from 2010 to 2015 by over 230,000 individual users who lost or gained a job, using a “differences in differences” estimation. They analyze the effect of job loss and gain on intuitive psychological variables such as anxiety, sadness, and anger and then define a behavioral macroeconomic model that selects the relevant variables using stepwise regression.
Using only the psychological variables, their model predicts the national unemployment rate for the United States with half the mean squared error (MSE) of an autoregressive (AR) model. Interestingly, they also find a significant correlation between the average level of anger in the population and how difficult jobs are to obtain (the “Jobs Hard to Get” measure from the Conference Board’s consumer confidence survey) (Figs. 11.9 and 11.10). Such studies were not possible even a decade earlier, when not only would the lack of computational power have been an issue, but these types of self-reported data from large groups of the population at such a high frequency were also not easily available. This growth in computational power and data has enabled a rich literature in computational social science and in the more traditional domains of economics.
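To make the comparison with an AR benchmark concrete, the sketch below computes out-of-sample MSE for an AR(1) model and for the same model augmented with one exogenous indicator. It is not the model of Proserpio et al. (2016): the data are synthetic, the indicator `x` is a hypothetical stand-in for a psychological variable, the data-generating process is assumed, and a single train/test split replaces their rolling 1-month-ahead predictions.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
x = rng.normal(size=T)   # hypothetical psychological indicator (synthetic)
y = np.zeros(T)
for t in range(1, T):
    # Assumed data-generating process, for illustration only.
    y[t] = 0.5 * y[t - 1] + 1.0 * x[t - 1] + 0.1 * rng.normal()


def fit_predict(X_train, y_train, X_test):
    """Fit OLS with an intercept on the training set and predict the test set."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta


target = y[1:]                                  # value to predict one step ahead
ar_feats = y[:-1].reshape(-1, 1)                # AR(1): lagged value only
aug_feats = np.column_stack([y[:-1], x[:-1]])   # lagged value plus lagged indicator
split = 150                                     # first 150 points train, rest test

pred_ar = fit_predict(ar_feats[:split], target[:split], ar_feats[split:])
pred_aug = fit_predict(aug_feats[:split], target[:split], aug_feats[split:])
mse_ar = np.mean((target[split:] - pred_ar) ** 2)
mse_aug = np.mean((target[split:] - pred_aug) ** 2)
print(f"AR(1) MSE: {mse_ar:.3f}, augmented MSE: {mse_aug:.3f}")
```

Because the simulated indicator is strongly informative by construction, the augmented model wins easily here; with real data the size of the improvement is an empirical question.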

6. A framework for alternative data

The JPM Big Data and AI Strategies Guide (2017) suggests there are three main types of data available: individual activity, business processes, and sensor data. Examples of individual activity data would be data like social media, news and

Figure 11.9 panels: Achievement; Anger. Horizontal axis: -12 to 12. Series labels: Employment; Unemployment.

FIGURE 11.9 Users’ psychological variables before and after the employment economic shock which happens at time 0 (in months). From Proserpio, D., Scott, C., Jain, A., 2016. The Psychology of Job Loss: Using Social Media Data to Characterize and Predict Unemployment. ICWSM.

FIGURE 11.10 The rolling 1-month-ahead predictions of US unemployment rate from the behavioral macroeconomic model that relies only on the psychological variables from 1.2B tweets. From Proserpio, D., Scott, C., Jain, A., 2016. The Psychology of Job Loss: Using Social Media Data to Characterize and Predict Unemployment. ICWSM.

reviews, and web searches; business process data would be data like credit cards or accounting data; and sensor data would be satellite data. The JPM guide estimates the current size of the Big Data, related technology, and analytics market at $130B, and it is expected to grow to over $200B by 2020. There are more than 500 data providers, and they fall into providers of raw data, semiprocessed data, and final predictions or signals for the investment industry. The formal framework we will adopt in this chapter is something my team uses in practice. This framework is based on our goal of modeling macroeconomic data, and we will provide both practical and theoretical motivation


for it. In a practical sense, imagine we have a certain data budget and we are trying to assess how much we should pay for a data source that a vendor is presenting to us. A natural question we would ask is: How much incremental information will this data source provide relative to the amount of money we would pay for it?

In the context of modeling various sectors of the economy (or the stock market), one can divide this incremental information question into two subquestions:

Question 1: How much extra information (edge) versus existing data sources does a data set provide for a given sector or stock?

Question 2: How many sectors or stocks (breadth) does it cover?

If we are thinking in terms of cross-sectional stock prediction, these questions are in the spirit of the “fundamental law of active management” (Grinold, 1989), which states that the “information ratio,” or volatility-scaled excess active return achieved by an investment manager, can be described as the information coefficient (edge or skill) multiplied by the breadth (typically the square root of the number of independent forecasts). More formally, the language of precision and recall is used in the field of machine learning (ML) and is also related to the type I and type II errors of statistics.

$$\text{Precision} = \frac{\text{true positive}}{\text{true positive} + \text{false positive}} \tag{11.1}$$

$$\text{Recall} = \frac{\text{true positive}}{\text{true positive} + \text{false negative}} \tag{11.2}$$
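Eqs. (11.1) and (11.2) can be computed directly from predicted and true labels. The following is a minimal sketch in the ordinary classification sense; the function name `precision_recall` and the toy labels are illustrative and not from the chapter.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision (Eq. 11.1) and recall (Eq. 11.2) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels: 4 actual positives, 4 predicted positives, 3 of them correct.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```

Returning 0.0 when a denominator is zero is a convention chosen for this sketch; other conventions leave the metric undefined in that case.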

Thus precision (in a classification problem sense) is the fraction of values identified as belonging to a certain class that do in fact belong to that class, and recall is the fraction of values of that class that the classifier identifies. The sense in which we are using precision and recall in the data context is slightly different from that in a classification problem. When evaluating a data source, precision and recall refer to our expectation of the incremental value generated from the data source, based on how we believe the data intrinsically relate to the phenomenon we are trying to model. For instance, we are aware that most people use credit cards, and all firms fulfilling a certain asset size criterion with some equity securities must file accounting statements like the 10-K and 10-Q. The time series for the accounting and credit card data also cover several business cycles, which is helpful for modeling purposes. In addition, we may have beliefs about the quality of the data: How precisely do they actually measure the phenomenon? Does this credit card data provider accurately capture the spending of most consumers? We may match the data to existing data and verify their statistical properties. In practice, substantial due diligence is required before making a data purchase and even with continued


FIGURE 11.11 Classification of data sources along expected precision and recall axes. Italics font, ex ante data; regular font, ex post data.

usage. After the purchase, we check whether our expectations of the information content of the data were “well formed” (Fig. 11.11 and Table 11.4). For purposes of exposition, we have divided the data into four main boxes with high and low precision and high and low recall, based on the author’s beliefs. For instance, we believe that satellite data are currently helpful only for a few sectors of the economy, such as retail or energy, since they can capture the number of vehicles parked at malls or the equipment being used by energy companies.14 Over time, we expect that technological improvements, such as better satellite pictures on cloudy days, will increase the precision of the data, and the time series will naturally grow in length and cover more business cycles. Additionally, intelligent data aggregation has tremendous potential to increase the recall, or population coverage, of alternative data. For instance, if we want to get a better understanding of overall economic activity, we will benefit from matching credit card data, tax data, and company accounting data by district. Social or user-generated online data such as searches and tweets are different in one crucial aspect: they are more predictive of users’ future actions (ex ante) rather than an extremely fast record of actions that have already occurred (ex post). To take the example of search and employment, as Chancellor and Counts (2018) noted, 80% of job seekers use search engines to

14. There are always game-theoretic aspects to these or almost any data: we may believe that some companies could respond by increasing the particular metric being measured (for example, by encouraging more cars to park in their lots) rather than the output (actual sales), for short-term gains or to diffuse the information content in the data. For now, we are ignoring these aspects.


TABLE 11.4 Classification of alternative data.

Data type | Precision (accuracy for the sector) | Recall (number of sectors covered) | Timing | Comment
Satellite parking lot | Low | Low | Ex post | We expect fast improvements in accuracy in this sector
E-commerce transaction | High | Low | Ex post |
Footfall traffic | High | Low | Ex post |
Mobile data | Low | High | Ex post |
Email receipt | High | Low | Ex post |
Product review | High | Low | Ex post/Ex ante | Initial reviews may predict future sales and reviews
Twitter | Low | High | Ex ante | Investor mental state from tweets can help predict future actions
Credit and debit card | High | High | Ex post |
Payment systems | Low | High | Ex post |
News | High | High | Ex post |
Web search | Low | High | Ex ante | People generally search before they act
Microstructure data | Low | High | Ex post |
Tax return data | High | High | Ex post |
Accounting data | High | High | Ex post |
Mortgage application data | High | Low | Ex post |
Point of sale data | High | Low | Ex post |
Intercompany payments | Low | Low | Ex post |
Order and shipment tracking | High | High | Ex post |

Some types of data from JPM Big Data and AI Guide (2017). Estimation of precision and recall from the author’s research and judgment.


find new employment; thus the data source has high recall. Given the inevitable automation needed to process such large quantities of data as a search engine handles, and the arbitrary and ambiguous nature of queries, there is inevitably some noise. However, there is an interesting forward-looking quality to search data. A user currently searching for employment is more likely to find employment (everything else equal) in the future and spend more money on consumption. The increase in searches for employment is noisily measured but predictive of future credit card spending, which will be quite precisely measured but occurs later. Thus, there is a natural complementarity between the quadrants, especially between ex ante, noisy data like search and ex post, precise data like credit cards or accounting. In this chapter, we will mainly explore the application of web search data to predicting macroeconomic variables.

7. Predicting data releases with search data

Web searches are queries users type into a search engine to obtain information. Choi and Varian’s (2011) seminal work analyzed aggregated Google Trends data (Google Insights for Search was later merged into Google Trends) using a query index. As they describe it, “the query index starts with a query share: the total query volume for search term in a given geographic region divided by the total number of queries in that region at a point in time. The query share numbers are then normalized so that they start at 0 in January 1, 2004. Numbers at a later date indicate the percentage deviations from the query share on January 1, 2004.” Using these queries, Choi and Varian predicted the upcoming releases of several indicators of current economic activity, such as retail sales, automotive sales, home sales, and travel. They insist they are not forecasting the future but “predicting the present,” merely aggregating and counting faster than the government, which releases the various economic indicators mentioned above with a lag. The search data we use are based on the Microsoft search engine Bing. Fig. 11.12 compares the population of Bing users with the US population and shows that the Bing search data do not have large geographic or demographic biases. Fig. 11.13 shows the market share of Bing over time.
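The query index quoted above from Choi and Varian can be sketched in a few lines. This is only an illustration of the definition as quoted: the function name `query_index` and the counts are hypothetical, and the first observation stands in for the January 1, 2004 base period.

```python
def query_index(term_counts, total_counts):
    """Percentage deviation of each period's query share from the first period's share."""
    if not term_counts or len(term_counts) != len(total_counts):
        raise ValueError("series must be non-empty and of equal length")
    # Query share: volume for the term divided by all queries in the region.
    shares = [term / total for term, total in zip(term_counts, total_counts)]
    base = shares[0]
    if base == 0:
        raise ValueError("the base-period query share must be positive")
    # Normalize so the first period is 0 and later values are percentage deviations.
    return [100.0 * (share - base) / base for share in shares]


# Hypothetical counts for one search term and for all queries in three periods.
term_counts = [1200, 1500, 900]
total_counts = [100_000, 100_000, 90_000]
index = query_index(term_counts, total_counts)
print([round(value, 1) for value in index])
```

Because the index is a ratio of shares, a rise in total query volume alone does not move it; only a change in the term's share does.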

FIGURE 11.12 Demographic and Geographic distribution of Bing searches versus US population. Source: MSFT and US Census. Numbers are in percentages.


FIGURE 11.13 Search Engine market share over time. Source: comscore.com. Microsoft sites correspond to Bing.

As discussed, employment concerns policy makers as well as investors and individuals. Accordingly, we will model the NFP data release. To do so, we follow the careful curation procedure of Chancellor and Counts (2018) rather than the aggregation of Choi and Varian (2011), which tends to lose much of the detail and introduces considerable noise through the heavy normalization described above. The data contain a sample of English-language queries from January 2012 to March 2018 from mobile and desktop devices. Chancellor and Counts filter the queries for the appearance of four keywords: “job,” “jobs,” “career,” and “careers.” Other terms such as “employment” generated more false positives and hence were dropped. Subsequently, they applied a high-level job category classifier that broadly follows the BLS job classification. There are 18 potential categories as well as a generic job query category that captures nonspecific queries such as “job in Seattle.” Two researchers generated an initial set of 500 job titles taken from BLS and Census data and from searches of job sites such as Glassdoor.com. The two researchers then hand-annotated which of the 18 BLS categories each query in a random sample of 250 queries matched. These lists were updated iteratively until they found fewer than 10 new titles per 250 searches. Additionally, Chancellor and Counts validated these searches on a data set of 10,000 searches by running a string-matching system (Table 11.5).
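The filtering and string-matching steps described above might look roughly like the sketch below. It is not the Chancellor and Counts implementation: the title lists in `CATEGORY_TITLES` are tiny hypothetical stand-ins for their curated lists, and `classify_query` is a name introduced here for illustration.

```python
import re

# The four keywords named in the text; word boundaries avoid matching e.g. "jobless".
JOB_KEYWORDS = re.compile(r"\b(job|jobs|career|careers)\b", re.IGNORECASE)

# Hypothetical, heavily abbreviated title lists (the real lists hold hundreds of titles).
CATEGORY_TITLES = {
    "Health care": ["surgical tech", "nurse", "mental health"],
    "Technology": ["software architect", "sql dba"],
    "Transportation": ["cdl", "railroad"],
}


def classify_query(query):
    """Return None for non-job queries, a job category, or 'Generic' if no title matches."""
    if not JOB_KEYWORDS.search(query):
        return None
    lowered = query.lower()
    for category, titles in CATEGORY_TITLES.items():
        for title in titles:
            if re.search(r"\b" + re.escape(title) + r"\b", lowered):
                return category
    return "Generic"


for q in ["Surgical tech jobs", "job in Seattle", "weather in Seattle"]:
    print(q, "->", classify_query(q))
```

In practice the title lists would be grown iteratively, as the text describes, until new hand-annotated samples stop surfacing unseen titles.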

7.1 Why curate? The Google Flu story

As is clear from the details of the employment data curation described previously, data curation is a painstaking process. Part of the reason we go


TABLE 11.5 Examples of searches for the 15 job categories in the employment sector.

Category | Raw # | % | Examples from our data set
Generic job queries | 184,500,000 | 82 | “Careers in nz,” “fedex careers,” “most commonly asked interview questions for a job”
Job-Specific queries | 40,500,000 | 18 | -
Total | 225,000,000 | 100 | -
Architecture/Engineering | 648,000 | 1.6 | “Entry level biomedical engineering jobs,” “auto-cad jobs in central IN,” “engineering careers firearms”
Art | 1,539,000 | 3.8 | “Freelance writing jobs for beginners,” “voice acting careers,” “winterthur museum curator job”
Business | 3,118,500 | 7.7 | “vp of operations jobs,” “marketing and product preference jobs,” “hr career Springfield ma”
Construction | 1,134,000 | 2.8 | “Construction laborer jobs in reno nv,” “welder jobs in Wisconsin,” “construction inspection jobs 06415”
Education | 6,520,500 | 16.1 | “Community college professor jobs,” “Atlanta nanny jobs,” “washoe county school district careers”
Finance | 4,090,500 | 10.1 | “Financial banking jobs in vt,” “mortgage lender careers,” “medical insurance specialist job”
Food | 810,000 | 2.2 | “Bartender jobs in minneapolis,” “foodservice jobs,” “craigslist dishwasher jobs”
Health care | 12,555,000 | 31.0 | “Surgical tech jobs,” “mental health jobs westemmass,” “clinic job m jax”
Leisure/Hospitality | 1,822,500 | 4.5 | “Hollywood casino jobs,” “fitness jobs in new hampshire,” “laundry jobs in hotel”
Manufacturing | 1,377,000 | 3.4 | “Machine operator jobs in Columbia sc,” “jobs in shipfitting in jacksonville,” “machinist jobs in nj”
Retail | 1,255,500 | 3.1 | “Electric boat jobs,” “clothing store job applications online,” “retail career at outlets near me”
Science | 850,500 | 2.1 | “Psychology research associate jobs,” “jobs in r&d in dc,” “boston scientific careers”
Technology | 1,701,000 | 4.2 | “Computer jobs in the army,” “software architect career,” “sql dba jobs near me”
Transportation | 3,078,000 | 7.6 | “Chicago airport runway jobs,” “cdl jobs in boise id,” “railroad jobs in kansas”

Source: Chancellor, S., Counts, S., 2018. Measuring employment demand using internet search data. In: Proceedings of the 35th Annual ACM Conference on Human Factors in Computing Systems. CHI.

through this effort is to guard as much as possible against outcomes like Google Flu, as elegantly covered by Lazer et al. (2014) in “The Parable of Google Flu: Traps in Big Data Analysis.” Google built an indicator, Google Flu Trends (GFT), to predict influenza-like illness (ILI) and thereby anticipate the future Centers for Disease Control and Prevention (CDC) numbers. After initial success, in February 2013 GFT was predicting more than double the proportion of doctor visits for ILI than the CDC reported. The authors discuss two main reasons for the failure. The first is big data hubris: the idea that big data are a substitute for traditional data collection, which ignores foundational issues of measurement. The second is algorithm dynamics: as Google changed its main product, the search engine, it also changed the data generating process. We attempt to address the first issue, hubris, by working with subject matter experts and by recognizing that the data generating process is imperfectly measured and can change. Constant monitoring of data properties and gathering as much detail as possible on the data generating process (the search engine in this case) are the main methods we use. To address the second issue, changing algorithm dynamics, we select and update our own keywords, as detailed previously, rather than rely on passive usage of the Bing search engine.

7.2 Modeling differences rather than levels

When making a model to understand an economic sector better, we have the choice of modeling levels or differences. While there is no strict rule, currently we lean toward modeling differences. Lazer et al. raise a conceptual point: an autoregressive (AR) model captures 90% of the trend, which calls into question the potential benefit of using alternative data at higher granularity. They go on to suggest that alternative data may be better for providing detail than for predicting economic metrics of interest. We agree that search or social data can provide detail, but we believe that predicting the deltas, or first differences, Value(t + 1) - Value(t), rather than only Value(t + 1), may address their primary concern. More formally, imagine that Value(t + 1), which could stand for the level of unemployment, is strongly autoregressive:

\[ \mathrm{Value}(t+1) \sim a + b \cdot \mathrm{Value}(t) + E(t+1) \qquad (11.3) \]

where b is close to but smaller than 1 and the R-squared is close to 85% to 90%. It may then be better to define the innovation R(t + 1) as:

\[ R(t+1) = \mathrm{Value}(t+1) - \mathrm{Value}(t) \qquad (11.4) \]

We could model R(t + 1), which will have better statistical properties and considerably less predictability than the AR model. Back to our employment sector, this means modeling the monthly increases in jobs, such as NFPs, rather than the current unemployment rate. In fact, delta modeling may also be more interesting for economics and finance, since it is the unpredictable part of the economic data, rather than the largely known trend, that matters to economic decisions and moves asset prices, as Cochrane (2001) explains. We also find that modeling deltas tends to be more robust. We can imagine that changes in search happen for the various reasons discussed previously, such as changes in search engine behavior, user technological preferences, seasonality, and the economic phenomenon being modeled, as informally summarized in Eq. (11.5) below.

\[ \Delta\,\text{search} \sim \Delta\,\text{search engine behavior} + \Delta\,\text{user technological preferences} + \text{seasonality} + \Delta\,\text{economic behavior} + \varepsilon(t) \qquad (11.5) \]

As Lazer et al. mention, an abrupt change in search engine behavior can have a major impact on estimation. We find this to be true when we examine the correlations of NFP with the levels of employment-related searches versus the deltas, or month-over-month changes, in searches. If the level of searches is not adjusted for the Bing market share, which was growing rapidly, we find very different results than if we do adjust the searches for the Bing market share. In contrast to levels, deltas also tend to be more robust to this misspecification, as we can see in Fig. 11.14. The grey bars are the correlations between the levels of the various employment search categories and NFP, whereas the red bars represent the correlations between monthly search changes (deltas) and NFP. Both sets of red bars have negative signs across


FIGURE 11.14 Comparison of correlation of NFP with levels and deltas when adjusting (dark grey) or not adjusting (light grey) for market share. NFP, nonfarm payrolls.

various categories, confirming the intuition that as the labor market improves and more people are employed, the number of searches per job decreases (Fig. 11.14). When we examine the grey bars, which represent the correlations of the search levels, we see that the correlations change sign depending on the normalization. Additionally, the relationship between employment search and NFP no longer has the intuitive negative sign: about half the categories show a positive sign and the other half a negative sign. Essentially, the noise has drowned out the signal for employment search levels. For employment search deltas, however, the results are quite close whether or not we adjust the employment searches for the Bing market share. Of course, a more formal identification would attempt to estimate Eq. (11.5) or some similar setup.
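The robustness of deltas to an unadjusted market share can be illustrated with a small simulation. Everything here is an assumption made for illustration, not the chapter's data: the data generating process (search intensity falls when NFP is strong), the linear growth in market share, and the names `pearson` and `diff`.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def diff(series):
    """First differences (month-over-month deltas)."""
    return [b - a for a, b in zip(series, series[1:])]


rng = random.Random(0)
months = 75
# Assumed process: standardized monthly job gains, and job-search intensity
# whose monthly change moves against job gains, plus noise.
nfp = [rng.gauss(0, 1) for _ in range(months)]
intensity = [100.0]
for t in range(1, months):
    intensity.append(intensity[-1] - nfp[t] + 0.5 * rng.gauss(0, 1))
# Assumed market share growing from 10% to 25%; the engine only sees its own share.
share = [0.10 + 0.15 * t / (months - 1) for t in range(months)]
raw = [level * s for level, s in zip(intensity, share)]

corr_level_raw = pearson(nfp, raw)
corr_level_adj = pearson(nfp, intensity)
corr_delta_raw = pearson(nfp[1:], diff(raw))
corr_delta_adj = pearson(nfp[1:], diff(intensity))
print("levels:", round(corr_level_raw, 2), round(corr_level_adj, 2))
print("deltas:", round(corr_delta_raw, 2), round(corr_delta_adj, 2))
```

In this setup the two delta correlations stay close to each other and clearly negative, while the level correlations are dominated by the trend in market share and by the random-walk behavior of the level itself.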

7.3 Housing, retail, and auto sectors with alternative data

As with employment, we curate search data (footnote 15) for other economic sectors such as home sales, retail sales, and auto sales. The various categories, their correlations with the corresponding government data releases, and the 10th and 90th percentiles of the bootstrapped confidence intervals are displayed in Table 11.6. As an example of the data cleaning process, we show the raw Bing searches related to existing homes. The particular terms related to homes were chosen carefully, in a manner similar to employment, by a team of data scientists, researchers, and economists familiar with estimating macroeconomic housing statistics.

15. Ethics Review and Data Protection. This study was found in line with the Common Rule for exemption by the Microsoft Research Ethics Advisory Board under protocol 7. Our data were gathered historically; there were no interactions with users through changes to search results. All data were anonymized and aggregated to county level. No session information is used in our data set. Our use and storage of these data is in agreement with Bing’s End User License Agreement and Privacy Policy.

The raw searches display considerable seasonal patterns as well as

TABLE 11.6 Correlation between curated search features for economic categories and traditional economic data.

Median (50th percentile) correlation

10th percentile

90th percentile

Employment (nonfarm payrolls)

Overall employment related Searches

0.30

0.43

0.19

All users

0.18

0.32

0.02

Architecture and engineering

0.37

0.48

0.26

Art

0.20

0.36

0.04

Business

0.22

0.35

0.08

Construction

0.15

0.29

0.01

Education

0.21

0.35

0.05

Finance

0.29

0.40

0.17

Food

0.30

0.42

0.16

Health

0.24

0.35

0.13

Leisure

0.14

0.29

0.02

Manufacturing

0.07

0.21

0.06

Retail

0.15

0.32

0.02

Science

0.04

0.24

0.16

Technology

0.25

0.38

0.12

Transportation

0.26

0.41

0.12

Government

0.30

0.45

0.15

Search field

Official data category

Home sales (pending home sales)

Existing homes

0.24

0.06

0.40

New homes

0.15

0.01

0.29

Real estate website activity

0.05

0.09

0.20

Home financing

0.25

0.04

0.44

Home refinancing

0.38

0.19

0.55

Home bankruptcy

0.10

0.08

0.27

0.10

0.26

0.10

Home inspection and appraisal

0.17

0.02

0.30

Home foreclosure

0.20

0.04

0.34

Home closing

0.01

0.18

0.18

Realtors

Retail sales (retail sales ex auto and gas)

Beauty services

0.22

0.05

0.37

0.11

0.35

0.15

Restaurants

0.21

0.04

0.37

Noncyclical services

0.18

0.01

0.35

Cyclical services

0.06

0.08

0.21

Durable retail goods

0.15

0.01

0.29

Nondurable retail goods

0.14

0.03

0.31

Discount retail goods

0.26

0.10

0.41

Fast food


0.21

0.00

0.38

Utilities

0.33

0.20

0.44

Mature entertainment

0.09

0.10

0.29

Motor vehicle

0.12

0.03

0.26

Auto websites

0.61

0.68

0.52

Auto makers

0.51

0.35

0.65

Auto reviews

0.59

0.70

0.48

Auto price

0.23

0.04

0.40

Auto comparison

0.40

0.26

0.51

Auto insurance

0.63

0.54

0.74

Auto financing

0.91

0.89

0.93

New car dealers

0.73

0.62

0.86

Used car dealers

0.78

0.70

0.84

Motorcycle dealers

0.79

0.73

0.84

The bootstrapped 10th and 90th percentile are also presented.


Auto sales (annualized total US auto sales monthly)

Shopping malls


FIGURE 11.15 Raw Bing searches related to existing homes. Source: MSFT. January 2012 is normalized to 100 for graphical purposes.

“spikes,” which indicate an unusually high volume of searches. We clean the data in four steps. First, we adjust the data for the market share of Bing. Second, we winsorize the data to remove the highest search volumes: because of the trending nature of searches, popular results are more likely to be shown to users, so the relationship between the volume of searches and the economic phenomena differs at different levels of search. Third, we remove the seasonality. Fourth, we average the data over a month to relate them to the monthly economic releases from the traditional data sources. Fig. 11.15 shows the cleaned level of search data corresponding to home searches, and we relate it to the actual level of US home sales. The relationship seems reasonable upon visual inspection. The actual modeling does not use levels but the month-over-month changes, or deltas, for the reasons described above. Table 11.6 shows the correlations of the curated search features with NFPs, month-over-month retail sales changes (excluding autos and gas), pending home sales, and auto sales. We see that the correlations follow intuitive signs, with searches for employment declining when the economy improves but searches for cars, housing, and auto features by and large increasing under the same conditions (Fig. 11.16). In the next section we use the curated data to make a model.
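The cleaning steps above (market share adjustment, winsorizing, deseasonalizing, monthly averaging, then differencing) can be sketched as follows. This is a minimal illustration under stated assumptions: the upper-tail cap at a chosen quantile, the removal of seasonality by subtracting calendar-month means after averaging, and all function names and numbers are choices made here, not the chapter's actual procedure.

```python
from collections import defaultdict
from statistics import mean


def winsorize_upper(values, upper_quantile=0.95):
    """Cap values above the chosen upper quantile (upper tail only)."""
    ordered = sorted(values)
    cap = ordered[min(len(ordered) - 1, int(upper_quantile * len(ordered)))]
    return [min(v, cap) for v in values]


def clean_search_series(records, upper_quantile=0.95):
    """records: (year, month, raw_count, market_share) rows, e.g. one per day.
    Returns {(year, month): cleaned monthly value}."""
    # Step 1: adjust raw counts for the engine's market share.
    adjusted = [raw / share for _, _, raw, share in records]
    # Step 2: winsorize to tame spikes.
    capped = winsorize_upper(adjusted, upper_quantile)
    # Monthly averages, to line up with monthly official releases.
    by_month = defaultdict(list)
    for (year, month, _, _), value in zip(records, capped):
        by_month[(year, month)].append(value)
    monthly = {key: mean(vals) for key, vals in by_month.items()}
    # Seasonality (assumption): subtract the mean for each calendar month.
    seasonal = defaultdict(list)
    for (year, month), value in monthly.items():
        seasonal[month].append(value)
    seasonal_mean = {m: mean(vals) for m, vals in seasonal.items()}
    return {key: monthly[key] - seasonal_mean[key[1]] for key in sorted(monthly)}


def month_over_month(cleaned):
    """Deltas between consecutive months, keyed by the later month."""
    keys = sorted(cleaned)
    return {b: cleaned[b] - cleaned[a] for a, b in zip(keys, keys[1:])}


# Hypothetical rows: three observations per month, with one spike in Feb 2013.
records = []
for year, month, raws, share in [
    (2012, 1, [10, 10, 10], 0.1),
    (2012, 2, [20, 20, 20], 0.1),
    (2013, 1, [24, 24, 24], 0.2),
    (2013, 2, [44, 44, 1000], 0.2),
]:
    records += [(year, month, raw, share) for raw in raws]

cleaned = clean_search_series(records, upper_quantile=0.9)
deltas = month_over_month(cleaned)
print(cleaned)
print(deltas)
```

The spike of 1000 is capped before averaging, so the February 2013 value is not distorted by a single trending day.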

8. Modeling case study: Nonfarm payrolls

For this section, we imagine an analyst with reasonable proficiency in statistical learning as well as macroeconomics who is tasked with building a model from alternative search data in a live prediction environment. Our analyst faces many decisions, and the frameworks are best revealed through specific examples of these decisions.


FIGURE 11.16 Raw searches for existing homes on Bing (January 2012 is normalized to 100).

Our analyst begins by gathering all the possible alternative data sources she is being asked to use. In this case, she starts with the sources she believes give her better ex ante prediction and will then complement the noisier ex ante prediction with data that have higher accuracy and recall but are ex post, such as credit cards or accounting. Fig. 11.17 shows the various data sources available. For Twitter data, we pick specific tweets, similar to Proserpio et al. (2016). For search data, we could use impressions, which are the content that appears when we type in a search query, or actual clicks, which show a higher intention by the user. Also available are the so-called “local searches,” which are searches near the user’s zip code and which can indicate a higher intention for the information sought, whether it is purchasing a local

[Figure 11.17 content. Target releases: Retail Sales, Housing Stats, Employment, Auto Sales. Data sources: Twitter Firehose; II. Search Results Impressions; III. Search Results Clicks; Local Searches; V. Internet Explorer (Website Visits); VI. IE Toolbar. Samples shown: #jobopportunities, #job posting, “got a new job,” “I got laid off,” “I lost my job”; unemployment queries, food stamps queries, welfare, disability work; 50 states unemployment claim websites, Dwd.Wisconsin.gov/u iben.]

FIGURE 11.17 The low precision, high recall, and ex ante data for employment available to the analyst.


FIGURE 11.18 An indicative timeline of NFP initial and final data release, and the prediction from economists’ consensus and search-based variables. NFP, nonfarm payrolls.

product or applying for unemployment claims at the local office. Internet Explorer is the browser that Microsoft offers, and our analyst has access to the aggregated, anonymized total clicks on various websites, which she can use to obtain the total clicks on various unemployment claim websites.

Now our analyst must decide which particular government data release related to employment to model. There are many official data releases, such as the unemployment rate, NFPs, initial jobless claims (IJC), and wages, and there are other statistics (Baumol, 2013 is a great source) such as Help-Wanted Online Advertising, Corporate Layoff and Hiring Announcements, Mass Layoff Statistics, and even the ADP National Employment report. Among these statistics, as Baumol suggests, IJC, the unemployment rate, and NFP are important; the rest are helpful characterizations but do not move the markets or get covered by the media as much. The unemployment and NFP series come out in the same release, called the Employment Situation, which is typically available at https://stats.bls.gov/news.release/pdf/empsit.pdf. Between the unemployment rate and NFP, we pick NFP, following the reasoning in the previous section in favor of predicting and using differenced data. The NFP release (Fig. 11.18) informs us of the jobs created each month and shows larger changes, whereas the overall level of unemployment has a high AR component and will not change much compared with the previous month. The BLS conducts two main surveys: the household survey (around 60,000 households) and the establishment survey (around 440,000 corporate and government work sites). The NFP number comes from the establishment survey. Our analyst also checks, as suggested in Jain (2019), that the NFP data release does indeed move the market. That market move is a clear indication of the information content in the data release, since financial markets react to new information, per the vast literature on event studies. Fig. 11.19


FIGURE 11.19 Movement in the USD versus NFP surprises (actual release minus Wall Street consensus). NFP, nonfarm payrolls. Data: Bloomberg; time period from January 1998 until today. Source: Jain, A., Does Web Search Know Something Wall Street and Government Statistics Do Not? Presented at the AI and Data Science in Trading Conference, New York, March 19, 2019.

shows the result of the event study analysis of NFP surprises versus price returns for the dollar index.
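The event-study check described above amounts to regressing the market move on the surprise. The sketch below shows the arithmetic only; the payroll figures, consensus values, and dollar index returns are made-up numbers for illustration, and `ols_slope_intercept` is a helper introduced here.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one regressor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical releases (thousands of jobs) and same-day dollar index returns (%).
actual = [250, 150, 200, 100]
consensus = [200, 180, 190, 160]
usd_return = [0.12, -0.05, 0.01, -0.10]

# Surprise: actual release minus the consensus forecast.
surprise = [a - c for a, c in zip(actual, consensus)]
slope, intercept = ols_slope_intercept(surprise, usd_return)
print(surprise, round(slope, 5), round(intercept, 5))
```

A slope that is reliably different from zero across many releases would indicate that the release carries information the market had not already priced in.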

8.1 Interpretable versus Blackbox or top-down versus bottom-up models via Kuhn

Now our analyst must decide which statistical learning procedures to use. Here she faces a trade-off between exposition, that is, the ease of explaining her model to the team, and better prediction with more flexible and recent techniques from the machine learning (ML) literature that are harder to explain to nonexperts or to interpret with a narrative. Fig. 11.20, borrowed from James et al. (2015), shows this trade-off, which is explained quite well in the book.

There is a deeper philosophical question of why we make models. One answer is “top down”: to develop intuition about the dynamics of something incredibly complex in order to allow for better decisions. For example, many economic models capture some essential part of the complex economic machine in a framework simple enough to let us ask scenario-based questions: given a simplification of the central bank into three variables, say employment, inflation, and the interest rate, that are connected in a simplified way such as a Taylor rule, what would be the effect of increased employment on nominal interest rates? Such models tend to have an


FIGURE 11.20 The interpretability versus flexibility trade-off for common statistical learning techniques. Source: James, G., Witten, D., Hastie, T., Tibshirani, R., 2015. An Introduction to Statistical Learning with Applications in R. Springer.

underlying economic narrative or paradigm around which the economics profession can coordinate its research efforts, even if the predictive power of these models is low. From the previous section on the Lucas critique, we can see that “top-down” models are more likely to be structural in nature. Kuhn (1962), in his seminal work The Structure of Scientific Revolutions, addresses the progress of scientific knowledge within a paradigm: a particular paradigm may not be rejected until the “anomalies” relative to it become so difficult to handle that the profession moves toward a different paradigm that offers better and simpler solutions. This dynamic of coordination and communication around a paradigm applies even to the particular group in which the analyst might work! Hence the group may be more accepting of a model that everyone can interpret easily and agree on than of a model they cannot understand.

Another answer to why we make models is “bottom up”: to predict the outcome of a complex process better than our intuition can! Most humans cannot process differential equations or invert a 50 by 50 matrix in their minds, and hence we have computer programs to help us with that. A parallel in the world of modeling is that we may believe NFP is affected by many variables, such as weather, people’s attitudes about jobs, and many kinds of searches. We may also posit that the real relationship between searches and NFP is nonlinear and is better modeled empirically with a flexible ML method that can fit the data closely and extract as much predictability as possible, rather than with a simple mathematical model of the dynamics. Typically, “bottom-up” models tend to be more nonstructural, but no model is purely devoid of economic or social psychology theory or completely identified with it. So “top-down” versus “bottom-up” is a useful alternative characterization.


8.2 The practical reason: modeling noise in small data sets

Away from the theoretical concerns above, the practical reason for simpler models in our case is that macroeconomic data are noisy, as shown above, and, just as in the parable of Google Flu, we have an almost infinite number of potential search variables that can fit these noisy data. If we allow extremely complex nonlinear techniques, we will end up fitting noise and may be unable to detect the true data generating process with our model. Thus, the other reason most live, practical prediction settings avoid Blackbox techniques is the concern that the curse of dimensionality, as James et al. (2015) refer to it, combined with noisy data and a limited time period (there are no recessions in our data set from 2012 to 2018), would result in worse predictions.

8.3 The five keys: clean data, internal consistency, shrinkage, bootstrapping, and ensembling

In this section, before we continue with our analyst’s investigations into modeling NFP, we frame the five key principles that we find useful across the many kinds of models we make.

Cleaning (and visualizing) the data is critical, and we address the failures associated with not doing it properly in the next section. Visualization is a part of cleaning the data and is also helpful when developing “stories” or hypotheses to be tested.

Internal consistency refers to a unified and logical approach from data to model to testing. For example, if we decide to proceed with a structural or top-down model of growth shocks, then it should apply across all sectors. If we find that increasing searches signal a higher propensity to buy cars because we believe people form intentions for future actions by gathering information via searches, then the same idea with the same sign should apply to the housing market. A model with a positive coefficient for search on cars but a negative one on housing would find it difficult to pass the internal consistency test.

Diebold (1998) defines shrinkage as the idea of coaxing or “shrinking” parameter values in certain directions, for instance, in the direction of a prior we believe in, such as 0. Shrinkage (by and large) reduces the mean squared error (MSE) by lowering the variance considerably at the potential cost of a small increase in bias (say, if a coefficient shrunk to 0 is not exactly 0). Shrinkage has long been known to be practically useful: Diebold (1998) mentions the case of vector autoregressions using Bayesian shrinkage producing “drastically superior forecasts over the unrestricted vector autoregressions.” In the high-dimensional problems that we will tackle, we find that reducing the number of parameters often turns out to be quite useful, and the resulting model with fewer parameters also has the benefit of being more interpretable. Polson (2017) provides an interpretation of Ridge regression as a Bayesian hierarchical model with a normal likelihood and prior.
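The Bayesian reading of ridge regression mentioned above can be written out in a few lines. This is the standard textbook derivation rather than a reproduction of Polson's exposition; the symbols \sigma^2 (noise variance), \tau^2 (prior variance), and \lambda are notation introduced here.

```latex
% Normal likelihood and a normal prior centered at zero
y \mid \beta \sim N(X\beta,\ \sigma^2 I), \qquad \beta \sim N(0,\ \tau^2 I)

% Negative log posterior, up to an additive constant
-\log p(\beta \mid y) = \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert^2
                      + \frac{1}{2\tau^2}\,\lVert \beta \rVert^2 + \text{const}

% The posterior mode is the ridge estimator with penalty \lambda = \sigma^2 / \tau^2
\hat{\beta}_{\text{ridge}} = \arg\min_{\beta}\ \lVert y - X\beta \rVert^2 + \lambda \lVert \beta \rVert^2
                           = (X^\top X + \lambda I)^{-1} X^\top y
```

A tighter prior (smaller \tau^2) means a larger \lambda and stronger shrinkage toward 0, while letting \tau^2 grow without bound recovers ordinary least squares.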


Alternative data sources tend to have “spiky” distributions, as we can see in Fig. 11.16, which plots the searches for new homes. More technically, we are not fully aware of the statistical properties of these data, which also have small sample sizes in at least one dimension; hence asymptotic inference may be unreliable in addition to being analytically complicated. Kogan (2010) suggests using simulation methods such as the bootstrap to deal with analytically challenging problems, to adjust for bias, and to improve the precision of asymptotic approximations in small samples, such as confidence intervals and test rejection regions. The bootstrap confidence interval is naturally accurate asymptotically, since we are just repeatedly sampling the given distribution, and it has advantages such as the one pointed out by Kogan (2010): for a “t-statistic,” the bootstrapped distribution is more accurate than the large-sample normal distribution. Additionally, with these noisy data, the possibility of “influential points” affecting estimates is quite high, especially when considering nonlinear models. Thus, repeated sampling, or bootstrapping, is a key ingredient for inference and for understanding how the statistical procedures we use interact with the data we have.

Ensembling is simply combining the outputs of different models to produce one prediction; the fundamental idea is variance reduction by combining imperfectly correlated outputs. Better results from ensembling are obtained when combining various internally consistent models built on different philosophies. For example, we believe that ensembling an internally consistent, “linear,” top-down Lasso model and an internally consistent, “bottom-up,” “nonlinear” random forest or deep learning model is fundamentally more robust than simply averaging two similar models. Opitz and Maclin (1999) express this idea eloquently: “research has demonstrated that a good ensemble is one where the individual classifiers in the ensemble are both accurate and make their errors on different parts of the input space.”
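The bootstrapped percentiles of a correlation, of the kind reported in Table 11.6, can be computed along the following lines. This is a sketch under assumptions: it resamples (search, official) pairs independently with replacement, which ignores any serial dependence in monthly data, and the series are synthetic. The function names are introduced here for illustration.

```python
import random


def pearson(x, y):
    """Pearson correlation, or None when a resample has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def bootstrap_correlation(x, y, n_boot=2000, seed=0):
    """Return the 10th, 50th, and 90th percentiles of resampled correlations."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    while len(draws) < n_boot:
        # Resample observation pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:
            draws.append(r)
    draws.sort()

    def percentile(q):
        return draws[min(n_boot - 1, int(q * n_boot))]

    return percentile(0.10), percentile(0.50), percentile(0.90)


# Synthetic monthly deltas with a built-in negative relationship.
data_rng = random.Random(1)
search_delta = [data_rng.gauss(0, 1) for _ in range(60)]
nfp_delta = [-s + 0.5 * data_rng.gauss(0, 1) for s in search_delta]
p10, p50, p90 = bootstrap_correlation(search_delta, nfp_delta)
print(round(p10, 2), round(p50, 2), round(p90, 2))
```

If the series are strongly autocorrelated, a block bootstrap that resamples runs of consecutive months would be a more defensible choice than the independent resampling used here.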

8.4 The model overconfidence metric

Given the discussion above, our analyst decides to explore most of the techniques in James et al.’s (2015) book for modeling NFP. She will use the data from 2012 to 2016 as the modeling data set and the rest as a holdout sample to test the efficacy of the techniques. Table 11.7 provides an example of the type of notes the analyst should make while thoughtfully applying each technique. After applying the various techniques, she takes the ratio of the out-of-sample error (the mean squared error, MSE, of the fitted model on the holdout sample) to the in-sample error (the in-sample MSE) to generate the “model overconfidence metric (MOM).” Formally, she defines the MOM as

\[ \text{Model Overconfidence Metric} = \frac{\text{Out-of-sample MSE}}{\text{In-sample MSE}} \qquad (11.6) \]

TABLE 11.7 Example of notes for each technique for modeling nonfarm payrolls.

Modeling methodology | Particular technique | Variables chosen | Decision criterion | In-sample MSE | Holdout-sample MSE | Overconfidence measure | Comments
Linear regression | Linear regression | All | | 3405 | 9110 | 2.7 | With average correlation of 0.56 among variables and only 59 data points versus 18 possible variables, using all variables will compound the overfitting. Lack of interpretation and lower power of the test. We find that the overall F statistic of the regression is not significant.
Subset Selection | Bootstrapped correlation variable selection | Overall, Architecture and Engineering, finance, technology, and government | Bootstrapped correlations and wide coverage | 4807 | 8265 | 1.7 | We examine the bootstrapped confidence intervals and aim for a wide variety of fields in addition to just the overall job-related searches.

TABLE 11.7 Example of notes for each technique for modeling nonfarm payrolls (continued).

Modeling methodology: Elastic net

Particular technique | Variables chosen | Decision criterion | In-sample MSE | Holdout-sample MSE | Overconfidence measure
Linear regression with cross validation | Archeng, overall | MSE and fivefold cross validation | 4630 | 7275 | 1.6
Best subset selection | Archeng, Science, overall | Min Cp | 4117 | 7723 | 1.9
Forward and backward selection | Archeng, Science, overall | | 4117 | 7723 | 1.9
Lasso | Overall, food, government, Archeng | | 4559 | 6543 | 1.4

Comments: We chose between models suggested by manual variable selection above.

Macro forecasting using alternative data Chapter | 11


FIGURE 11.21 Graphical representation of the error in holdout versus in-sample via MOM metric.

For example, a model that has a low in-sample MSE (say, due to overfitting) and a high out-of-sample MSE would be deemed quite overconfident because its MOM is high. Conversely, a model whose out-of-sample error is low relative to its in-sample error scores low on the MOM and is therefore the better-performing model. Below, we reproduce the metric for this case study (Fig. 11.21 and Table 11.8).
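As a minimal sketch of Eq. (11.6), the ratio can be computed directly from the in-sample and holdout MSE values reported in Table 11.7. The function and variable names below are our own illustrative choices, not part of the case study's code.

```python
def model_overconfidence_metric(in_sample_mse: float, holdout_mse: float) -> float:
    """Return holdout MSE divided by in-sample MSE, as in Eq. (11.6)."""
    if in_sample_mse <= 0:
        raise ValueError("in-sample MSE must be positive")
    return holdout_mse / in_sample_mse


# (in-sample MSE, holdout MSE) pairs copied from Table 11.7
table_11_7 = {
    "Linear regression": (3405, 9110),
    "Bootstrapped correlation variable selection": (4807, 8265),
    "Linear regression with cross validation": (4630, 7275),
    "Best subset selection": (4117, 7723),
    "Lasso": (4559, 6543),
}

# Round to one decimal place, the precision used in the tables
mom = {
    name: round(model_overconfidence_metric(in_mse, out_mse), 1)
    for name, (in_mse, out_mse) in table_11_7.items()
}

# Smaller is better, so sort ascending
ranked = sorted(mom, key=mom.get)
for name in ranked:
    print(f"{name}: MOM = {mom[name]}")
```

Sorting in ascending order puts the least overconfident technique first, which matches the "smaller is better" reading of the metric.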

8.5 Discussion of case study results

Challenges for linear regression: We find that in this case study, simple linear regression, which tends to be one of the favorite techniques of the economics profession, does not perform well; neither do naive decision trees or even boosted trees, which typically guard against overfitting by iterating slowly. The boosted trees did perform better after many iterations to find suitable step sizes. We believe there are two chief culprits: first, the high dimensionality of our variables and, second, the noisiness of the macro data, which can result in local overfitting. The majority of these results are in line with the known literature. As far as linear regression is concerned, Polson (2017) points out that even in simulated data we do better than simple linear regression by shrinking the coefficients to reduce the variance. The typical maximum likelihood or ordinary least squares estimator is designed to have zero bias, which means that it can suffer from high variance. According to him, the main advantage of linear regression is interpretability!

The usefulness of cross validation and shrinkage: We find that typical aggregation techniques such as principal components perform a bit better, but the out-of-sample MSE is still almost two times the in-sample MSE even after fivefold cross validation. Combining some economic intuition by selecting

314 Handbook of US Consumer Economics

TABLE 11.8 MOM metric for the particular model of NFP using search variables.

| Modeling methodology | Model overconfidence metric (MOM) = holdout MSE / model MSE (smaller is better) | Holdout MSE |
|---|---|---|
| Random forest | 1.3 | 6801 |
| Ridge | 1.4 | 6394 |
| Lasso | 1.4 | 6543 |
| Linear regression with cross validation | 1.6 | 7275 |
| Linear regression with variable selection | 1.7 | 8265 |
| Best subset selection | 1.9 | 7723 |
| Pruned boosting | 1.9 | 6877 |
| Principal components regression | 2.0 | 8126 |
| Pruned decision tree | 2.0 | 6919 |
| SVM regression | 2.2 | 7994 |
| Partial least squares regression | 2.5 | 8945 |
| Linear regression | 2.7 | 9110 |
| Decision tree | 4.4 | 8888 |
| Naive boosting | 5.6 | 13,650 |

All techniques used fivefold validation; the modeling sample was 2012–2016, and the holdout sample was from January 2017 onwards.

some variables was helpful, as were cross validation and best subset selection for linear regression. The improvement of linear regression with the addition of cross validation showcases the importance of bootstrapping as a philosophy, especially when dealing with social data that can be nonnormal, and also in providing a sense of how dependent models may be on “influential data points” in small samples. The success of best subset selection points to the other theme, “shrinkage,” which reduces the number of parameters to estimate and results in more robust models. The same theme shows in the fact that the ridge and Lasso techniques, which apply a penalty term as the number of terms in a model grows, show the best results along with techniques based on random forest.
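The two themes above, shrinkage and fivefold cross validation, can be sketched together with a closed-form ridge estimator. This is an illustration under stated assumptions rather than the case study's actual code: the data are synthetic (59 observations and 18 predictors with pairwise correlation near 0.56, mimicking the dimensions noted in Table 11.7), the penalty grid is arbitrary, and the folds are contiguous blocks that ignore the finer points of time-series validation.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients; lam = 0 reduces to ordinary least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def kfold_mse(X, y, lam, k=5):
    """Average validation MSE over k contiguous folds."""
    all_idx = np.arange(len(y))
    errors = []
    for val_idx in np.array_split(all_idx, k):
        train_idx = np.setdiff1d(all_idx, val_idx)
        beta = ridge_fit(X[train_idx], y[train_idx], lam)
        residuals = y[val_idx] - X[val_idx] @ beta
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))


# Synthetic stand-in data (an assumption, not the search data from the chapter)
rng = np.random.default_rng(0)
n, p = 59, 18
common = rng.normal(size=(n, 1))                    # shared factor -> correlated columns
X = 0.75 * common + 0.66 * rng.normal(size=(n, p))  # pairwise correlation about 0.56
beta_true = np.zeros(p)
beta_true[:3] = [1.0, 0.5, -0.5]                    # only three predictors matter
y = X @ beta_true + rng.normal(scale=2.0, size=n)

# Pick the penalty with the lowest fivefold cross-validated MSE
lambdas = [0.0, 1.0, 10.0, 100.0]
cv_mse = {lam: kfold_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_mse, key=cv_mse.get)

# Shrinkage pulls the coefficient vector toward zero relative to OLS
beta_ols = ridge_fit(X, y, 0.0)
beta_ridge = ridge_fit(X, y, 10.0)
print("OLS coefficient norm:  ", np.linalg.norm(beta_ols))
print("Ridge coefficient norm:", np.linalg.norm(beta_ridge))
print("Penalty chosen by CV:  ", best_lam)
```

The ridge coefficient vector is always shorter than the OLS one for a positive penalty, which is the variance-reducing shrinkage that Polson (2017) describes; whether it also lowers the cross-validated error depends on the data.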


Random forest, bootstrapping nonlinearity with no theory: We see that random forest techniques perform quite well. A random forest is essentially a decision tree-based technique that uses bootstrapping with decorrelated iterations (where new variables have to be selected each iteration). This indicates that there are nonlinearities in the data, especially since the random forest-based techniques selected some different variables than the linear methods, but the nonlinearities can be modeled robustly only when we use bootstrapping and go through all possible variables. The good performance of a purely empirical nonlinear technique such as random forest, which does not start ex ante with some economic theory, is also interesting and is similar to the recent encouraging results in the field of deep learning, a very nonlinear technique that seems to have useful applications for noisy and complex data where no particular theory is obvious (Heaton et al., 2016). In our case study of modeling NFP, we find that random forest performs better than the linear regression-based techniques and also selects different variables from those selected by linear regression.16

Ensembling helps: As suggested in the previous section, we find that by ensembling random forest and Lasso-based methods we get a good mix of parsimonious linear and nonlinear methods, and the resulting out-of-sample MSE is much lower than with either technique alone (Fig. 11.22).
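To illustrate why ensembling can lower the out-of-sample error, the sketch below averages two sets of predictions whose errors partly offset each other. All numbers are hypothetical and chosen only for the arithmetic; they are not outputs of the Lasso or random forest models in the case study, and the equal weighting is an assumption.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def ensemble(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction vectors."""
    return weight_a * np.asarray(pred_a) + (1.0 - weight_a) * np.asarray(pred_b)


# Hypothetical holdout values and predictions (e.g., monthly job gains in thousands)
actual = np.array([200.0, 150.0, 180.0, 100.0])
lasso_like = np.array([230.0, 140.0, 170.0, 130.0])   # stand-in for a linear model
forest_like = np.array([180.0, 170.0, 200.0, 90.0])   # stand-in for a tree-based model

combined = ensemble(lasso_like, forest_like)

mse_lasso_like = mse(actual, lasso_like)
mse_forest_like = mse(actual, forest_like)
mse_ensemble = mse(actual, combined)

print("Linear stand-in MSE:", mse_lasso_like)
print("Forest stand-in MSE:", mse_forest_like)
print("Ensemble MSE:       ", mse_ensemble)
```

An equal-weight average can never have an MSE above the average of the two individual MSEs, and it falls below both when the models err in different directions, as in this constructed example.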
Our informal results on which techniques performed best for the particular case of modeling NFP from 2012 to 2018 using specific curated search variables have been examined much more formally in the statistical learning literature. Caruana and Niculescu-Mizil (2006), for example, perform a formal large-scale empirical comparison of supervised learning methods (SVMs, neural nets, logistic regression, naive Bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps) using a variety of performance criteria. Like our case study, they find that random forests and bagged trees perform quite well across a variety of problems and do not require as much calibration. Other techniques perform much better for specific problems and under specific types of calibration (Fig. 11.23).

9. Live production results

Having discussed the various possible models for NFP, for the purposes of live production we find the best results with a forward-moving ensemble of Lasso and random forest models; the model improves with each data point and is able to adapt to the dynamic social data. In addition to NFP, we extend a similar modeling approach to other sectors using some of the features shown in Table 11.6. We note the actual performance of indicators such as employment, retail, home, and consumer credit up to 4 weeks before the data release in Table 11.9. We find that in terms of MSE, they are about the same as or worse than the Wall Street consensus. However, in terms of directional accuracy, these numbers tend to be better than the Wall Street consensus, especially on the downturns, but the samples are too small for a high degree of statistical confidence in that result.

16. A comment on tree-based methods from the case study is that sorting based on entropy or node purity measures tends to result in substantially different models for noisy and high-dimensionality problems.

FIGURE 11.22 Example of selecting the right number of principal components by fivefold cross validation while modeling NFP.

FIGURE 11.23 Example of picking the optimal value for lambda for the Lasso Technique for modeling NFP.

TABLE 11.9 Performance of employment, housing, and retail indicators in data upticks and downticks across 1–4 weeks before the data release.

| Data release | Accuracy (4 weeks ahead overall) | Wall Street accuracy (overall) | Accuracy (4 weeks ahead upward) | Wall Street accuracy (upward) | Accuracy (4 weeks ahead downward) | Wall Street accuracy (downward) |
|---|---|---|---|---|---|---|
| Nonfarm payrolls | 79% | 77% | 77% | 77% | 81% | 77% |
| Retail sales (ex-auto and gas) | 73% | 69% | 74% | 78% | 72% | 60% |
| Pending home sales | 76% | 73% | 80% | 84% | 73% | 62% |
| Consumer credit | 100% | 100% | 100% | 100% | 100% | 100% |

Data source: MSFT and Bloomberg.


What is remarkable, however, is the ex ante or predictive nature of the search data-based measures, since they are already at their optimal performance 4 weeks before the release. More data over time does not seem to help them: including weeks 3 through 1 does not improve the directional accuracy in our sample. Hence our characterization of search data as noisy (low precision) but with wide recall and ex ante.
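The directional accuracy reported in Table 11.9 can be sketched as follows. The chapter does not spell out the exact definition, so this assumes that "upward" and "downward" accuracy condition on the direction of the actual change in the released figure, and that a predicted change of zero counts as a downward call. The sample changes are hypothetical.

```python
def directional_accuracy(actual_changes, predicted_changes):
    """Share of periods in which the predicted direction matches the actual one.

    Returns overall accuracy plus accuracy conditional on actual upticks and
    actual downticks. Periods with no actual change are skipped.
    """
    hits = {"overall": [0, 0], "upward": [0, 0], "downward": [0, 0]}
    for actual, predicted in zip(actual_changes, predicted_changes):
        if actual == 0:
            continue  # no direction to predict
        bucket = "upward" if actual > 0 else "downward"
        correct = (actual > 0) == (predicted > 0)
        for key in ("overall", bucket):
            hits[key][0] += int(correct)
            hits[key][1] += 1
    return {key: (c / n if n else None) for key, (c, n) in hits.items()}


# Hypothetical month-over-month changes, not data from the chapter
actual = [12, -5, 8, -3, -7, 4, 9, -2]
predicted = [6, -1, 2, -4, -3, 5, 1, 3]

accuracy = directional_accuracy(actual, predicted)
print(accuracy)
```

Splitting the hit rate by the direction of the actual move is what allows a comparison such as the one in the text, where the search-based measures do relatively better on downturns.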

9.1 Prediction in practice: the main mistakes17

Modeling and predictions are conducted in a team environment, and the overall process can be sketched as the loop shown in Fig. 11.24, where generating hypotheses, data exploration, and cleaning are part of the initial development, and testing of the hypotheses happens partly in the holdout sample and partly in the live production environment. Typically, because of all the data mining concerns, in such a team environment 12 data points of actual live prediction carry much more weight than the same number of data points from a holdout sample. Strong process controls and awareness of potential mistakes can reduce the performance gap between the holdout sample and the live performance of the model. Because of the noise in the series, judging performance in real time is difficult, so a high overall process quality is the best insurance a team has of being able to stay committed to the model.

FIGURE 11.24 The typical model development loop followed.

17. These are mostly mistakes I have either made myself directly or seen firsthand!


There are four main categories of mistakes in the model development loop described above:

1) Data processing mistakes
2) Statistical learning mistakes
3) Conceptual mistakes
4) Organizational mistakes

Data processing mistakes, such as not cleaning the data properly (say, not removing GDP figures like 400% growth in some quarter, or not investigating whether 6700 searches for pizza in one evening in a village of population 400 are correct), are fairly common due to the unstructured nature of the data and the domain knowledge required to spot them. Thus, data cleaning is something the entire team should be aware of and help with. As Hadley Wickham (2014) states, data cleaning is not studied enough when in fact 80% of data analysis is data cleaning. No data are unconditionally clean: each data set tends to be cleaned relative to the problem being modeled, and any decisions on cleaning data are decisions on how to model the data. I believe it is critical to have the person modeling the data be very involved with the cleaning of it; hence critiques of the academic economics profession, where data cleaning, modeling, and theorizing sometimes seem too detached for practical purposes, resonate (Orphanides, 2001; Romer, 2016).

Statistical learning mistakes refer either to technical mistakes, like using linear regression to model unit root processes, or to using ML techniques that one does not conceptually understand and having a high degree of overconfidence in the model.

Conceptual mistakes are more along the lines of inherent contradictions in our goals and processes: for instance, modeling business cycle dynamics and using those to arrive at asset allocation decisions while removing financial crises such as 2007–08 from the data. Occasionally, such processes are justified by the desire not to have outliers influence statistical inference, but there is a philosophical issue: for the purposes of an asset allocation model or of making central bank policy, these crises are the times with substantial impacts on outcomes, and any description of reality that does not contain them is inherently flawed.
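The kind of plausibility check described under data processing mistakes can be sketched as a simple per-capita screen. The record layout, field names, and the ceiling of one event per resident are hypothetical choices for illustration; a flagged record is a prompt for investigation by someone with domain knowledge, not an automatic deletion.

```python
def flag_implausible(records, max_events_per_capita=1.0):
    """Return the records whose event count exceeds a per-capita ceiling."""
    flagged = []
    for rec in records:
        per_capita = rec["count"] / rec["population"]
        if per_capita > max_events_per_capita:
            # Keep the original fields and attach the computed rate for review
            flagged.append({**rec, "per_capita": per_capita})
    return flagged


# Hypothetical one-evening search counts; the first mirrors the example in the text
records = [
    {"place": "village", "query": "pizza", "count": 6700, "population": 400},
    {"place": "city", "query": "pizza", "count": 6700, "population": 250000},
]

flagged = flag_implausible(records)
for rec in flagged:
    print(f"Check {rec['place']}: {rec['per_capita']:.2f} searches per resident")
```

The same count passes in a large city and fails in a small village, which is why the check is relative to population and why the threshold is itself a modeling decision.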
Organizational mistakes render teams ineffective for two principal reasons: first, a lack of focus, of which the absence of a tangible goal is often a telling symptom, and second, organizational frictions in which teams get mired. Different skill sets are necessary to perform the various functions in the model development loop above, especially as alternative data sources, high computing power, and ML techniques have become popular in the finance and economics domains. These different skills come together as part of a team. In Table 11.10, we list the typical archetypes that might fulfill different roles in the process. Developing common ground and constant communication among these various types of expertise is critical for functioning better as a team. We represent the ideal process in Fig. 11.25 as the intersection of domain expertise, the research mindset where we can keep iterating and one failed


TABLE 11.10 Archetypes of the desired skills.

| Archetype | Expertise | Typical strength | Typical challenge |
|---|---|---|---|
| Domain expert | Institutional and markets knowledge | Good intuition; respects how difficult forecasting better than the wisdom of the crowds is | May not recognize own cognitive biases; may be suspicious of ML methods |
| Technology expert | Data and machine learning (ML) techniques | Skilled at data cleaning, efficient and standardized data storage; can glean insights from data using visualization and modeling | Likely to overfit the data; might make suboptimal model due to low context about the problem |
| Researcher | Systematic thinking in a conceptual framework | Designs an efficient research agenda; mitigates overfitting | May not be open to new findings that go against favorite theoretical framework; prefers to engage in long-term projects that take time to pay off |

For illustration purposes, no offense intended!

FIGURE 11.25 The ideal process combines a clear goal, lean organizational structure, and as much overlap between different skills as possible. The closer the team to the center, the better the results.


experiment does not doom the enterprise, along with newer data sets and artificial intelligence (AI)-based techniques such as ML that are supported by a robust technological architecture. The closer the team can get to the center of the diagram, the better the research and product output.

9.2 Public benefits of microfoundations of macro

Real people plan their lives based on their understanding of the economic opportunities around them, which are communicated (at least in part) through economic data. Institutional actions such as central bank policy, fiscal aid, and even bailouts, and other aid from, say, the World Bank, also depend on data to know the location and the extent of any socioeconomic problem. Thus, economic measurement has a big role to play in the social and democratic sphere since it influences the actions and relationships of individuals, institutions, and communities. All actors could use better data, and economic noise is not evenly distributed:

The first asymmetry of measurement: Economic data are noisier in times of recessions and low growth.

The second asymmetry of measurement: Poor countries have much worse economic data.

The first asymmetry of measurement was shown in Table 11.2, where we showed that both the volatility and the revisions are higher in times of low growth in the United States, one of the most powerful and sophisticated countries in the world. So even as citizens are losing employment in certain states, the central bank or the government cannot act for 3–6 months due to lack of data. Poor data have also created longer-term problems; for instance, the Federal Reserve policy makers’ belief that the economy was operating much below its potential in the 1960s and 1970s might have contributed to the overheating and high inflation of the economy later (Orphanides, 2002). Another area for improvement is state-level GDP, which is only released on a yearly basis! State-level GDP seems to be a critical input to help the government decide on a response to any economic issue! Instead of GDP, we can use other indicators such as NFP, the unemployment rate, average hours worked in manufacturing, and real wages, which are released monthly (Fig. 11.26).
Novak (2013) uses these monthly indicators to compute a proxy for state-level economic activity and finds that while they are reasonably effective overall, with a correlation of 0.69, for states with more agriculture and mining the correlation can be quite low, below 0.40. This would make it difficult to gauge economic conditions, especially in a situation where the national economy is performing well and these sectors are suffering. On average, the states more reliant on agriculture and mining are also the states that tend to be economically challenged, so the lack of good data would affect them disproportionately: the second asymmetry of economic measurement.


FIGURE 11.26 Novak (2013). Correlations between state GDP and state coincident indices are not uniform. This might make it harder to spot economic problems particular to the yellow (white in print version)- and light green (light grey in print version)- colored states. GDP, gross domestic product.

Given the amount of effort needed for a calculation as complex as GDP, it is not surprising that less developed countries would face challenges. However, the extent and the fundamental nature of the measurement problem in developing countries is surprising. Diane Coyle (2014) presents the case of Ghana, whose GDP increased by 60% in November 2010! This happened as Ghana updated the weights used to compute the price index for calculating real GDP for the first time since 1993 (Fig. 11.27). The new GDP calculation in turn changed the country’s official classification to “low-middle income.” This is not a one-off issue. Alwyn Young (2012) notes how sub-Saharan living standards have been growing at about 3.5 times the rate indicated in international data sets for the last

FIGURE 11.27 The dramatic change in Ghana gross domestic product upon updating weights from Coyle (2016). Data from World Bank. [Line chart of GDP in billions of USD (0 to 60B) against year (1960 to 2010). Labeled values: Kenya 55.24 billion USD, Ghana 48.14 billion USD (2013), Mali 10.94 billion USD.]


two decades! Young says that of 45 sub-Saharan African countries, which are among the poorest regions of the world, 24 did not have any benchmark study of prices in the popular Penn World Tables. Coyle (2014) agrees: “the number of countries for which GDP data were available had reached only sixty as late as 1985. Thirty years’ worth of annual data for sixty countries is not much when it comes to testing detailed causal explanations of growth.” But these data, or the lack thereof, are used by economists and “experts” to make international comparisons, tell stories, and, most alarmingly, decide policy!

9.3 Two main contributions: accurate measurement and more detail

Measuring and visualizing real activity directly: Better measurement is not a new idea. In fact, even official GDP measurement apparently came about when the US Congress tasked Simon Kuznets in 1932 to estimate the national income over the past few years. Only after seeing the actual data he produced did the full extent of the Depression become clear.18 Young (2012) offers a potential solution to the sub-Saharan GDP measurement problem described in the previous section by estimating GDP from demographic and health surveys that collect irregular but in-depth data. He uses four areas, (1) ownership of durables, (2) housing conditions, (3) children’s nutrition and health, and (4) household time and family economics, and relates them to GDP. By using alternative data creatively, perhaps in the spirit of Young, we may be able to help direct aid efforts better and complement and sanity check the government statistics. These data would measure and show real activity directly; they are measured frequently, do not rely upon price levels or on government data collection that could be politically distorted, and would provide detailed information in real time.19 By avoiding the problems associated with measuring price levels, they could provide a complement to the computed GDP numbers. Imagine mapping all mobile phone, search, Twitter, web browser, and mobile location data relevant to employment at an aggregated, anonymized level. This map of people’s behavior could help policy makers understand the effects of a financial crisis in remote areas in real time and divert aid as needed. This tool could be tremendously useful for policy makers as well as citizens. Such direct mapping of activity could help mitigate concerns about GDP no longer being valid for the “digital world” of Wikipedia and Linux (Coyle, 2014).

18. https://www.economist.com/briefing/2016/04/30/the-trouble-with-gdp.

19. There is already some excellent work in the field, for example, Blumenstock (2016) on using mobile data to understand poverty in developing countries. The Billion Prices Project at MIT (http://www.thebillionpricesproject.com) is another great example of the use of such data; it offered the public an alternative to the CPI when Argentina’s official statistics were being manipulated by the government.


Visualization is a key part of this solution since most of us understand pictures much better than words or mathematics. Thus, all citizens could participate in a more informed manner in public debates in the era of “fake news” and of such politicization of data that the former chief of Greece’s statistical agency is being criminally prosecuted “for getting the numbers right.”20

More detail and better democracy: The second contribution is in providing a platform where every citizen can choose to have an economic voice. Self-reporting of issues that citizens face and collecting granular data will prove more helpful for policy makers than simple aggregates. The then chair of the Federal Reserve, Ben Bernanke (2012), acknowledged concerns with purely aggregate statistics while dealing with the aftermath of the global financial crisis:21 “aggregate statistics can sometimes mask important information. For example, even though some key aggregate metrics – including consumer spending, disposable income, household net worth, and debt service payments – have moved in the direction of recovery, it is clear that many individuals and households continue to struggle with difficult economic and financial conditions.” It costs very little to tweet something about our employment situation. In Proserpio et al. (2016), these individual-level tweets about employment or the lack thereof are “heard,” and providing this kind of authentic, direct, and individual feedback to the policy maker could make the entire process more democratic.

9.4 Mitigating data colonialism?

The current technology platforms on which we ordinary citizens lead our normal lives leave a trail of “data exhaust,” which helps companies extract value by creating opportunities to know us, the consumers, holistically in a way that did not exist before. If our data, a “natural resource,” are taken from us without our informed consent and used to the companies’ benefit and our detriment at a large scale,22 and this starts seeming “normal” or a “way of life,” it is dangerous and undemocratic. In fact, I will go out on a limb and say that historically the combination of a pure profit motive and a vast asymmetry of power (say, via technology) seems not to have led to “moral” choices that benefit all of humanity.23 In a useful and timely article, Couldry and Mejias (2018) point out that “unlike oil, data are not a substance found in nature. It must be appropriated.” (emphasis ours). In that article, the authors explore parallels between today’s economic machine and historic colonialism to show how it normalizes resource appropriation and redefines social relations so that dispossession seems natural

20. https://www.ft.com/content/31995e48-6073-11e6-b38c-7b39cbb1138a.

21. https://www.federalreserve.gov/newsevents/speech/bernanke20120806a.htm.

22. For example, to sell us more products made by the same company at higher prices due to an information monopoly via network effect.

23. Reading the history of the East India Company in India and China may be interesting.


(emphasis ours). They define data colonialism as follows: “Data colonialism combines the predatory extractive practices of historical colonialism with the abstract quantification methods of computing.” So how can we make appropriation, dispossession, and asymmetric exploitation of our data not the normal state of affairs? The topic is too big and complex for any one person to have an effective solution, but too important not to suggest something. In the hope of being at least a small catalyst for discussion, I would like to suggest two keys.

The first key is to provide better information and protection or rights to consumers regarding their data. This seems to be increasing with changes in data privacy regulation such as the General Data Protection Regulation,24 which is the “most important change in data privacy regulation in 20 years.”

The second key lies in doing something positive and communal that brings us all together instead of a purely exploitative practice. We can do this by increasing universal or population-wide benefits of and access to this technology and information. As a self-serving example, consider creating the detailed map of economic activity suggested above, along with making the aggregated underlying data and code freely available in a transparent manner. It will help citizens and central banks in poor countries, especially in times of recession. Such public good projects would create goodwill. Platforms like Google, MSFT Bing, Facebook, Amazon, Twitter, Instagram, and Uber have information that is definitely valuable to the owners but, combined, may be even more valuable to society. As an initial step, by providing aggregated, anonymized, and lagged data, these firms can keep a reasonable amount (per regulators) of competitive advantage in understanding microlevel consumer behavior intact and simultaneously contribute enormously to broader society, central banks, and ordinary citizens.
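What “aggregated, anonymized, and lagged” might mean in practice can be sketched as follows. This is only an illustration under our own assumptions: the event layout, the weekly regional cells, the minimum of 50 distinct users per cell, and the 28-day lag are all hypothetical, and suppressing small cells is a simple safeguard rather than a formal privacy guarantee.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_for_release(events, today, min_users=50, lag_days=28):
    """Turn (user_id, region, day) events into region/week counts of distinct users.

    Only weeks that ended at least `lag_days` before `today` are released, and
    cells with fewer than `min_users` distinct users are suppressed.
    """
    cutoff = today - timedelta(days=lag_days)
    users_by_cell = defaultdict(set)
    for user_id, region, day in events:
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        week_end = week_start + timedelta(days=6)
        if week_end <= cutoff:
            users_by_cell[(region, week_start)].add(user_id)
    # User identifiers are dropped here; only counts leave the function
    return {
        cell: len(ids)
        for cell, ids in users_by_cell.items()
        if len(ids) >= min_users
    }


# Hypothetical job-search events; a tiny threshold keeps the example readable
events = [
    ("u1", "north", date(2019, 1, 7)),
    ("u2", "north", date(2019, 1, 9)),
    ("u2", "north", date(2019, 1, 10)),  # same user, same week: counted once
    ("u3", "south", date(2019, 1, 8)),   # cell too small: suppressed
    ("u1", "north", date(2019, 2, 20)),  # too recent: withheld by the lag
    ("u4", "north", date(2019, 2, 21)),
]

released = aggregate_for_release(events, today=date(2019, 3, 1), min_users=2)
print(released)
```

The lag and the minimum cell size are the two levers a regulator and a platform could negotiate: a longer lag protects more of the firm's informational advantage, and a larger minimum cell size protects individuals in sparsely populated regions at the cost of less detail.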

Acknowledgments

I believe there are no sole authors, and this chapter is no exception. I want to thank amazing colleagues at MSFT, the first being Scott Counts, my co-conspirator on many projects and a great source of humor, enthusiasm, and guidance; Jens Nordvig at ExanteData for critical revisions and encouragement; Geraint Jones for thoughtful comments; Nikita Artizov for defending R programming and suggesting I put more emphasis on the social good section; Lisa Schirf for her editing advice; Martin Ryan for being a key collaborator over the years; Chris Quirk for his NLP guidance and humor; the researchers at the NY Fed, DC Fed, and St. Louis Fed, especially Kevin Kliesen for his encouragement; the Bank of Canada research team, especially Bob Fay at CIGI for his comments; other authors of the book attending the conference; the BingPredicts team for their diligence with the data; Ben Mandel and Andrew Haughwout for excellent comments and editing; my family; and my wife Sara for her patience, many readings, and constructive criticism. All errors are my own.

24. https://eugdpr.org/.


References

Aruoba, S.B., Diebold, F.X., Scotti, C., 2008. Real-Time Measurement of Business Conditions. https://www.philadelphiafed.org/-/media/research-and-data/real-time-center/business-conditions-index/real-time-measurement-of-business-conditions14.pdf?la=en.
Barro, R.J., 1993. Macroeconomics, fourth ed. The MIT Press.
Baumohl, B., 2013. The Secrets of Economic Indicators: Hidden Clues to Future Economic Trends and Investment Opportunities, third ed. Pearson Education, Inc.
Blumenstock, J.E., 2016. Fighting poverty with data. Science 353 (6301), 753–754.
Borges, J.L., 1658. On exactitude of science. Viajes de varones prudentes (Libro IV, Cap. XLV, Lérida).
Box, G.E.P., 1976. Science and statistics. Journal of the American Statistical Association 71, 791–799.
Box, G.E.P., 1979. Robustness in the Strategy of Scientific Model Building. In: Robustness in Statistics. Academic Press, pp. 201–236.
Caruana, R., Niculescu-Mizil, A., 2006. An empirical comparison of supervised learning algorithms. In: Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA.
Chancellor, S., Counts, S., 2018. Measuring employment demand using internet search data. In: Proceedings of the 35th Annual ACM Conference on Human Factors in Computing Systems. CHI 2018.
Choi, H., Varian, H., 2011. Predicting the Present with Google Trends. Technical report, Google. http://people.ischool.berkeley.edu/~hal/Papers/2011/ptp.pdf.
Cochrane, J.H., 2001. Asset Pricing (Chapter 1), and (Chapter 20). Princeton University Press.
Couldry, N., Mejias, U.A., September 2, 2018. Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject. http://journals.sagepub.com/doi/10.1177/1527476418796632#_i1.
Coyle, D., 2014. GDP: A Brief but Affectionate History. Princeton University Press.
Coyle, D., 2016. https://www.weforum.org/agenda/2016/04/the-trouble-with-gdp-and-emergingmarkets/.
Diebold, F.X., 1998. The Past, Present, and Future of Macroeconomic Forecasting. Journal of Economic Perspectives 12 (2), 175–192.
Goel, S., Hofman, J.M., Lahaie, S., Pennock, D.M., Watts, D.J., October 12, 2010. Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences 107 (41), 17486–17490.
Granger, C.W.J., 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37 (3), 424–438. https://doi.org/10.2307/1912791. JSTOR 1912791.
Grinold, R.C., 1989. The fundamental law of active management. The Journal of Portfolio Management, Spring, 15 (3), 30–37. https://doi.org/10.3905/jpm.1989.409211.
Heaton, J.B., Polson, N.G., Witte, J.J., 2016. Deep Learning for Finance: Deep Portfolios. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2838013.
https://www.federalreserve.gov/newsevents/speech/bernanke20120806a.htm.
Jain, A., 2019. Web Search Know Something Wall Street and Government Statistics Do Not? Presented at AI and Data Science in Trading Conference, New York.
James, G., Witten, D., Hastie, T., Tibshirani, R., 2015. An Introduction to Statistical Learning with Applications in R. Springer.
Kliesen, K.L., 2014. A guide to tracking the U.S. Economy. Federal Reserve Bank of St. Louis Review 96 (1), 35–54. First Quarter.
Kogan, L., 2010. Teaching Notes for 15.450. https://ocw.mit.edu/courses/sloan-school-of-management/15-450-analytics-of-finance-fall-2010/lecture-notes/MIT15_450F10_lec09.pdf.
Kolanovic, M., Krishnamachari, R.T., 2017. Big data and AI strategies: machine learning and alternative data approach to investing. In: Global Quantitative & Derivatives Strategy. J.P. Morgan.
Kuhn, T.S., 1962. The Structure of Scientific Revolutions. University of Chicago Press.
Lazer, D., Kennedy, R., King, G., Vespignani, A., 2014. Parable of Google Flu: Traps in Big Data Analysis. Science, March 14, 2014. https://gking.harvard.edu/files/gking/files/0314policyforumff.pdf.
Lucas Jr., R.E., 1976. Econometric policy evaluation: a critique. In: Brunner, K., Meltzer, A.H. (Eds.), The Phillips Curve and Labor Markets, Carnegie-Rochester Conference Series on Public Policy, vol. 1. Elsevier, New York, pp. 19–46.
Nallareddy, S., Ogneva, M., August 2015. Predicting Restatements in Macroeconomic Indicators Using Accounting Information. http://ssrn.com/abstract=2444014.
Newey, W., West, K., 1987. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55 (3), 703–708.
Novak, J., 2013. The Effectiveness of the State Coincident Indexes. Special Report. Philadelphia Federal Reserve. https://www.philadelphiafed.org/-/media/research-and-data/publications/research-rap/2012/the-effectiveness-of-the-state-coincident-indexes.pdf.
Opitz, D., Maclin, R., 1999. Popular ensemble methods: an empirical study. Journal of Artificial Intelligence Research 11, 169–198.
Orphanides, A., 2001. Monetary policy rules based on real-time data. American Economic Review, Papers and Proceedings 91 (4).
Orphanides, A., 2002. Monetary-policy rules and the great inflation. The American Economic Review 92 (2), 115–120.
Orphanides, A., Williams, J.C., 2007. Robust monetary policy with imperfect knowledge. Journal of Monetary Economics 54 (5), 1406–1435.
Polson, N., 2017. Teaching Notes for Probability 41901. http://faculty.chicagobooth.edu/nicholas.polson/teaching/41900/beamer41901-2.pdf.
Popper, K., 1961. The Poverty of Historicism, second ed. Routledge, London.
Prescott, E.C., 1986. Theory Ahead of Business Cycle Measurement. Staff Report 102. Federal Reserve Bank of Minneapolis.
Proserpio, D., Counts, S., Jain, A., 2016. The Psychology of Job Loss: Using Social Media Data to Characterize and Predict Unemployment. ICWSM.
Reiss, P.C., Wolak, F.A., 2005. Structural econometric modeling: rationales and examples from industrial organization. In: Handbook of Econometrics, vol. 6A. Elsevier B.V. https://web.stanford.edu/~preiss/makeit.pdf.
Romer, P., January 5, 2016. The Trouble with Macroeconomics. Commons Memorial Lecture of the Omicron Delta Epsilon Society. https://paulromer.net/wp-content/uploads/2016/09/WPTrouble.pdf.
Sargent, T.J., Sims, C.A., 1977. Business cycle modeling without pretending to have too much a priori economic theory. Working Papers 55. Federal Reserve Bank of Minneapolis.
Soros, G., 2009. The Crash of 2008 and What it Means. Perseus Books Group, New York.
Wickham, H., 2014. Tidy data. Journal of Statistical Software. http://vita.had.co.nz/papers/tidy-data.pdf.
Young, A., 2012. The African growth miracle. Journal of Political Economy 120 (4).

Chapter 12

Regional price parities in the United States

Bettina H. Aten
Bureau of Economic Analysis, Suitland, MD, United States

The success and expansion of the International Comparison Program (ICP) has led to an increase in interest and effort on the estimation of subnational price levels and purchasing power parities (PPPs). The ICP highlighted a difficulty that large countries such as Brazil, Russia, India, and China face during the price-collection phase, namely how to obtain average prices when there are large disparities in many types of expenditure categories, such as housing prices between rural and urban settings. The fact that such disparities were in evidence led to more research on within-country PPPs or regional price parities (RPPs). The difference between RPPs and the PPPs is that the former are in the same currency, while PPPs are usually converted to a reference country or currency by the exchange rate, such as the United States Dollar or the Euro. This chapter describes the methodology used to estimate the RPPs and their effect on measuring regional incomes within the United States.

1. Introduction

In April 2014, the RPPs and the price-adjusted estimates of regional personal income became official statistics of the United States Bureau of Economic Analysis (BEA) and are now published annually (Aten and Figueroa, 2014, 2015). The estimates are based on Consumer Price Index (CPI) microdata, which are not publicly available, and on rent data from the census that are accessible to users.1 The results shown here are for the latest year, 2015, highlighting the RPPs for several expenditure categories and RPP-adjusted regional incomes in current dollars.2 This is an updated version of an article published in Social Indicators Research (Aten, 2017a, 2017b).

There have been many studies on measuring or controlling for regional price differentials within countries; see, for example, the comprehensive literature reviews in Deaton and Dupriez (2011), Aten and Heston (2005), and Biggeri and Laureti (2015). Geographically large countries such as India (Deaton, 2003; Deaton and Tarozzi, 2005), Brazil (Deaton and Dupriez, 2011; Aten, 1999), Indonesia (Arndt and Sundrum, 1975), the United States (Koo et al., 2000), and Australia (Wingfield et al., 2005; Waschka et al., 2003) as well as geographically smaller ones like the United Kingdom (Fenwick and O’Donoghue, 2003), Germany (Roos, 2006), the Philippines (McCarthy, 2010), and Italy (Biggeri and Laureti, 2009, 2015) appear to have significant subnational differences.

One of the major challenges both across and within countries is to obtain consistent and representative price data from multiple sources and outlets, and to match these observations with weights that reflect actual consumption patterns. There are services that are particularly difficult to compare, such as construction, housing, education, medical, and government services, and the documentation and research on handling these categories of consumption is extensive, dating back to the origins of the ICP (Kravis et al., 1982).

The CPI calculated by most national statistical offices is the most obvious source of price data, but it is often collected in a manner that makes it difficult to estimate spatial price differences. Researchers have turned to private survey companies, scanner data, and Internet scraping projects, but their data are not necessarily representative of the entire country or of the average consumer; for example, see Dunn et al. (2012) on medical data. Deaton (2006), Deaton and Dupriez (2011), and more recently the World Bank (2015) have matched detailed consumer expenditure household surveys with ICP product specifications, thus tracking both the spending patterns of particular populations and their relative price levels.

Another motivation for obtaining accurate measures of regional price levels is their relevance to assessing poverty levels and income inequality indicators within countries. The former has a long history in the United States, beginning with the landmark study in the 1960s (Orshansky, 1969). More recently, following a report by the National Academy Panel on Poverty and Family Assistance (Citro and Kalton, 2000), the Census Bureau began publishing research and estimates on alternative poverty measures that take into account geographic price differences (Short and O’Hara, 2008; Garner and Short, 2009; Renwick, 2011; Short, 2015; Renwick et al., 2017).

1. The CPI microdata are available to us through an interagency agreement between the Bureau of Labor Statistics (BLS) and BEA, while the rent data are also microdata but closely match those from the PUMS (Public Use Microdata Sample) of the Census Bureau.
2. BEA publishes constant dollar RPP-adjusted regional incomes, termed Real Regional Incomes, using the annual US personal consumption expenditure (PCE) chain price index as the current to constant adjustment factor. In this chapter, we focus only on the current dollar incomes.

Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00012-3 Copyright © 2019 Elsevier Inc. All rights reserved.


The BEA, in a joint project with the Bureau of Labor Statistics (BLS), first estimated RPPs for consumption goods and services for 38 metropolitan and urban areas of the United States for 2003 and 2004 (Aten, 2005, 2006). These areas, for which BLS produces the CPI, represent about 87% of the total population. The method was expanded to cover the remaining nonmetropolitan portions of each state. Estimates for 2005 and 2006 were reported in the Survey of Current Business in November 2008 (Aten, 2008; Aten and D’Souza, 2008). Experimental estimates for 2007 incorporate the multiyear American Community Survey (ACS) from the Census Bureau, as do official estimates for 2008 forward.3

The current methods and results differ from versions prior to 2014 because they now include a two-stage, rolling average estimation process. The first stage consists of estimating annual multilateral price level indexes for CPI areas and for several consumption expenditure classes such as apparel, food, and transportation.4 In the second stage, the price levels and expenditure weights are allocated from CPI areas to all counties in the United States.5,6 They are then recombined for regions, such as states and metropolitan areas, and merged with data on rents from the Census Bureau’s ACS. The ACS provides a broader geographic coverage than the CPI areas, including county-level data, thus allowing us to augment the allocated CPI price levels with observed housing observations. The final RPPs are obtained by combining the 5 years of the first-stage results, plus the annual rent indexes, and calculating the multilateral aggregate price index for all goods and services and rents. For example, the 2010 RPP is a 5-year average of the 2008–12 CPI-derived price indexes for goods and services excepting rents, plus the observed 2010 rent indexes from the ACS.

Note that both prices and weights are used as inputs when creating a price index, whether in a time-to-time index or a spatial index such as the aggregate RPPs.7 In the case of the RPPs, we use expenditure weights derived from the Consumer Expenditure survey and controlled to the national Personal Consumption Expenditure (PCE) totals published by the BEA. The two sources vary in their distribution because they measure different consumption concepts. The RPPs are only consistent with the weights used as inputs, and care must be taken when users apply them to a different distribution of expenditure totals (see Figueroa et al., 2014; Renwick et al., 2017).

The following sections describe in more detail the use of the price levels and expenditure data from the CPI and the housing data from the ACS, how their geographies are reconciled, and how the overall indexes are computed.

3. Publications describing these results may be found at https://www.bea.gov/research/meet-the-researchers/bettina-h-aten.
4. The 38 CPI index areas are designed to represent the US urban and metropolitan population. Of the 38 areas, 31 represent large metropolitan areas, 4 represent small metropolitan regions, and 3 represent urban nonmetropolitan regions. For more information on these BLS-defined areas, see www.bls.gov/cpi. A list of the counties sampled in each area can be found in Aten (2005).
5. Expenditure weights used in the CPI are known as cost weights and are derived from BLS Consumer Expenditure (CE) survey data. See the “Consumer Price Index” in the BLS Handbook of Methods, Chapter 17 at www.bls.gov.
6. For a description of input data and methods used to estimate RPP expenditure weights, see Figueroa et al. (2014).
7. Relative price levels and RPPs are conceptually similar, but the RPPs refer to the aggregate price levels, multiplied by 100.
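The rolling-average step can be illustrated with a small sketch. It assumes a centered 5-year window, as in the 2010 example above (2008–12), and uses a plain weighted geometric mean as a simplified stand-in for the averaging; the function names and the price levels are hypothetical and are not BEA's implementation.

```python
import math

def cpi_window(reference_year, half_width=2):
    """Years of CPI-derived price levels averaged for a reference year.

    Assumption: a centered window, matching the 2010 -> 2008-12 example.
    """
    return list(range(reference_year - half_width, reference_year + half_width + 1))

def weighted_geometric_mean(levels, weights):
    """Weighted geometric mean of positive price levels."""
    total = sum(weights)
    return math.exp(sum(w * math.log(p) for p, w in zip(levels, weights)) / total)

years = cpi_window(2010)  # [2008, 2009, 2010, 2011, 2012]

# Hypothetical price levels (US = 1.0) for one area and one expenditure class.
levels_by_year = {2008: 1.04, 2009: 1.05, 2010: 1.06, 2011: 1.05, 2012: 1.07}
weights_by_year = {year: 1.0 for year in years}  # equal weights, for illustration only

five_year_level = weighted_geometric_mean(
    [levels_by_year[y] for y in years],
    [weights_by_year[y] for y in years],
)
print(years, round(five_year_level, 4))
```

The rent index for the reference year itself would then be combined with this averaged level in the final aggregation.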

2. Price levels for CPI areas

CPI price data cover a wide array of consumer goods and services, ranging from high-expenditure goods, such as new automobiles, to low-expenditure services, such as haircuts. Over a million price quotes are collected each year and are classified into more than 200 item strata, each consisting of detailed entry-level items (ELIs), which may be further divided into Clusters. The item strata can be combined into nine expenditure groups: apparel, education, food, housing, medical, recreation, rents, transportation, and other goods and services.8

Because the CPI was not designed to measure geographic price level differences, items with identical characteristics are not always priced in all areas. Therefore, for the ELIs in the 75 item strata with the highest expenditure weights (accounting for roughly 85% of expenditure weights), we estimate hedonic regressions that take into account the variation in the characteristics of the sampled items. For the “Women’s Tops Excluding Active and Outerwear” Cluster, for example, we use a hedonic price model to adjust for the type of clothing (jacket, sweater, or blouse), the fiber content, the length of the sleeves, the closure type, the size range, the brand category (exclusive/luxury, national, or private), country of origin, and the type of outlet where it was sold. An example of an item-specific hedonic regression may be found in Aten (2006).

For the remaining item strata, we use a shortcut approach consisting of a weighted regression with areas and ELIs (and Clusters when available) as independent variables. This regression is a modified version of the weighted country-product-dummy (CPD-W) approach as originally described in Summers (1973) and Kravis et al. (1975), and more recently in Rao (2004), Sergeev (2004), and Diewert (2005). Overall results do not differ greatly whether detailed hedonic regressions are run on all item strata, or on only the top 75 in combination with this shortcut approach (Aten, 2006).
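A minimal sketch of a weighted CPD-type regression may help fix ideas. It regresses log prices on area and item dummies by weighted least squares and takes the antilog of the area coefficients. The function name, the toy data, and the normalization (an unweighted geometric mean of one across areas) are assumptions made for illustration; this is not the production specification.

```python
import numpy as np

def cpd_w(area_idx, item_idx, prices, weights, n_areas, n_items):
    """Weighted country-product-dummy regression: ln(price) = area + item + error.

    Returns area price levels whose (unweighted) geometric mean is one.
    """
    area_idx = np.asarray(area_idx)
    item_idx = np.asarray(item_idx)
    y = np.log(np.asarray(prices, dtype=float))
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))

    n_obs = y.size
    X = np.zeros((n_obs, n_areas + n_items))
    X[np.arange(n_obs), area_idx] = 1.0            # area dummies
    X[np.arange(n_obs), n_areas + item_idx] = 1.0  # item dummies

    # Weighted least squares via row scaling. The dummy set is collinear, so
    # lstsq returns a minimum-norm solution; differences between area effects
    # are still identified and are all that is used below.
    coef, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    area_effects = coef[:n_areas]
    area_effects = area_effects - area_effects.mean()
    return np.exp(area_effects)  # antilog of the coefficients

# Hypothetical data: three areas, two items, one item not priced in area 2.
area_idx = [0, 0, 1, 1, 2]
item_idx = [0, 1, 0, 1, 0]
prices = [10.0, 4.0, 12.5, 5.0, 8.0]
weights = [1.0, 2.0, 1.0, 2.0, 1.0]

levels = cpd_w(area_idx, item_idx, prices, weights, n_areas=3, n_items=2)
print(np.round(levels, 3))
```

Because the toy prices are exactly multiplicative in area and item effects, the regression recovers the area price levels even though one item is missing in one area, which is the situation the CPD approach is designed to handle.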
After the area-ELI price levels are estimated, they are aggregated to yield area-item strata price levels, using the CPD-W approach, with weights corresponding to the importance of the ELIs within the item strata. One advantage of the CPD-W approach is that it can yield weighted average prices of the item stratum or of the ELI-Cluster, simply by taking the antilog of the corresponding coefficients. This also facilitates a thorough outlier checking process that we apply to the ELIs and item strata, described in detail in Aten et al. (2011). It is modeled after the Quaranta tables first used in the OECD-Eurostat basic heading level price comparisons.9 We flag observations that are (1) very large or small relative to the mean in that area and ELI; (2) large or small relative to the variance of the ELI observations; or (3) large or small once they have been adjusted for the relative price level of the area. It is an iterative process that looks at the raw price data as well as the relative prices after the hedonic adjustment.

The outcome of this process (hedonic and CPD-W regressions) is a matrix of 38 by 200-plus area-item price relatives, with the weighted geometric average of the 38 areas in each item stratum indexed to one. Any of the 38 areas could be the numeraire, but their geometric average provides a more neutral interpretation.

In order to obtain price (and quantity) indexes that combine the 38 by 200-plus area-item price relatives with their corresponding expenditure weights, there are several aggregation methods that could be used. The most well known are the EKS-Törnqvist and Fisher indexes, the CPD-W approach, the GAIA index, and the Geary index. The advantages and disadvantages of these methods and their corresponding statistical and economic properties have been thoroughly reviewed elsewhere; see Kravis et al. (1982), Diewert (1999), and Balk (2009), for example. Aten and Reinsdorf (2010) have shown that the Geary RPPs for the 38 BLS index areas do not deviate significantly from the other possible indexes.10

The Geary multilateral price level index, $P_{Geary}$, is given by:

$$
P^{c}_{Geary} = \frac{\sum_{n=1}^{N} (pq)^{c}_{n}}{\sum_{n=1}^{N} \pi_{n}\, q^{c}_{n}},
\qquad
\pi_{n} = \sum_{c=1}^{M} \frac{(pq)^{c}_{n} / P^{c}_{Geary}}{\sum_{d=1}^{M} q^{d}_{n}}
$$

where $p$ is the relative price of the item stratum or expenditure class, $\pi$ is the national relative price of the item stratum or expenditure class, and $q$ is the notional quantity, equal to the expenditures divided by the area relative price: $(pq)/p$. The superscripts $c$ and $d$ index the BLS areas, which take a value of 1 through $M = 38$, and the subscript $n$ is the item stratum, which takes a value of 1 to $N$, depending on the desired aggregation level.11 Note that $(pq)^{c}_{n}$ are the nominal expenditures, or weights, for each area $c$ and item stratum $n$, and $p^{c}_{n}$ are the price relatives derived from the hedonic or shortcut regressions, also for each area and item stratum.

For example, when we estimate a $P_{Geary}$ for each of the 38 areas in the formula above, we choose to do so for 16 expenditure classes related to Food, Apparel, Transportation, Housing, Education, Recreation, Medical, Rents, and Other. Each of these groups is broken down into Services and Goods, with the exception of Apparel, with only a Goods component, and Rents, with only a Services component. Thus, $N$ varies by the number of item strata in its class. Rents consists of $N = 3$ item strata: (1) Rent of Primary Residence, (2) Owner’s Equivalent Rents, and (3) Owner’s Equivalent Rents for Secondary Residence, when it exists. The other 15 expenditure classes have more item strata, such as Food Goods, with $N = 56$, and Food Services, with $N = 6$ item strata.

The Geary indexes, in addition to being base-invariant and transitive, like all other multilateral methods, are additive, or matrix consistent. The latter means that one can easily obtain subgroups or aggregates of the indexes once the $\pi_{n}$ and the overall price index across all $N = 207$ are estimated, for example, Rents versus Nonrents, or Goods versus Services. Renwick et al. (2017) describe an index consisting solely of Food, Apparel, and Rents that was used for comparisons of expenditures of low-income consumers in a poverty analysis.

8. See the “Consumer Price Index,” in the BLS Handbook of Methods, Chapter 17 at www.bls.gov.
9. The process is modeled after the Quaranta method used by the Organisation for Economic Co-operation and Development and Eurostat (OECD/Eurostat, 2012) and the International Comparison Program of the World Bank (World Bank, 2015).
10. The Geary formula is solved simultaneously for the area RPPs and the expenditure class price levels (notation and formulas follow Deaton and Heston, 2010).
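Because the area price levels and the national relative prices in the Geary system depend on each other, they are solved simultaneously. One simple way to do so is a fixed-point iteration, sketched below under stated assumptions: the function name and toy inputs are hypothetical, and the normalization (total deflated expenditures equal total nominal expenditures) is a choice made here for illustration, since the system determines the price levels only up to a scalar.

```python
import numpy as np

def geary_price_levels(expenditures, price_relatives, tol=1e-12, max_iter=1000):
    """Iteratively solve the Geary system for area price levels and national prices.

    expenditures: (M areas, N items) nominal expenditures (pq).
    price_relatives: (M, N) area-item price relatives p.
    """
    E = np.asarray(expenditures, dtype=float)
    p = np.asarray(price_relatives, dtype=float)
    q = E / p                          # notional quantities, (pq) / p
    area_totals = E.sum(axis=1)

    P = np.ones(E.shape[0])            # starting values for the area price levels
    for _ in range(max_iter):
        # National relative price of each item, given the current area levels.
        pi = (E / P[:, None]).sum(axis=0) / q.sum(axis=0)
        # Area price levels, given the national relative prices.
        P_new = area_totals / (q * pi).sum(axis=1)
        # Normalize: deflated expenditures sum to nominal expenditures.
        P_new *= (area_totals / P_new).sum() / area_totals.sum()
        converged = np.max(np.abs(P_new - P)) < tol
        P = P_new
        if converged:
            break

    pi = (E / P[:, None]).sum(axis=0) / q.sum(axis=0)
    return P, pi

# Hypothetical check: every price in area c is a_c times a common level,
# so the Geary price levels should be proportional to a_c.
E_demo = np.array([[30.0, 70.0], [50.0, 50.0], [20.0, 80.0]])
a_demo = np.array([1.0, 1.2, 0.9])
p_demo = a_demo[:, None] * np.ones_like(E_demo)

P_demo, pi_demo = geary_price_levels(E_demo, p_demo)
print(np.round(P_demo / P_demo[0], 4))
```

In this toy case the ratios of the estimated area price levels reproduce the ratios of the assumed area factors, and the national relative prices are equal across items, as expected when prices differ only by an area-wide factor.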

3. Regional price parities for states and metropolitan areas

The second stage consists of expanding the Geary indexes from the geography of the 38 BLS areas to other more commonly used geographies, such as states and metropolitan areas. It is also here that the Geary price indexes (P_Geary) for each area and for the 16 expenditure classes are augmented by more detailed Rent price levels and expenditure data from the Census Bureau, and where an index combining all classes into a single “All Item” RPP index is calculated.

11. There were a total of 207 item strata (N) between 2008 and 2015.


The process begins with the allocation of price levels and expenditure weights from CPI areas to counties. Price levels for each county are assumed to be those of the CPI sampling area in which the county is located. For example, counties in Pennsylvania are assigned price levels from either the Philadelphia or Pittsburgh areas or from the Northeast small metropolitan area. Rural counties are not included in any of the 38 urban areas for which stage one price levels are estimated; therefore, these counties are assigned price levels of the urban area that (1) is located in the same region and (2) has the lowest population threshold.12 Expenditure weights in the second stage include CPI data for rural regions, and thus, in combination with the 38 urban areas, cover all US counties.13 Weights are allocated from each CPI area and rural region to the component counties in proportion to household income.14

The county-level allocations undergo two adjustments. First, rent weights are replaced with directly observed rent expenditures from the 5-year ACS file plus imputed owner-equivalent rent expenditures. The latter are obtained as follows:

1. The ratios of monthly tenant rents to owner-equivalent rents in the BLS CPI housing file are estimated for several types of housing units, from studio apartments to detached houses with three or more bedrooms. The components of these ratios, that is, tenant rents and owner-equivalent rents for each housing type, are the weighted geometric means of all the observations in the CPI.
2. The ratios are multiplied by the observed unit rents in the ACS, resulting in an estimated monthly owner-equivalent value for each housing type, by county.15,16
3. This estimated owner-equivalent value is multiplied by 12 and by the number of owner-occupied housing units in each type, yielding an annual estimate of owner-occupied housing expenditures, by county.

Note that the ratio of tenant rents to owner-equivalent rents is across all BLS sampling areas, that is, there is only one scalar for each of the seven housing types. This vector of scalars is applied to different geographies in the ACS file, with only the distribution of rents and number of units varying across geographies.17 Total expenditures are the sum of the observed annual rents and the estimated owner-occupied expenditures from step 3 above.

The second adjustment to the county-level weights is to control the national shares of the 16 expenditure classes to BEA’s PCE shares. The adjustment shifts the distribution of weights across expenditure classes, notably reducing the share of rent expenditures in total consumption in the United States from 30.2% to 20.6%.18 Conversely, the shares of Food, Recreation, Education, Apparel, and Other goods and services are higher in BEA’s national accounts.

Once the price levels and expenditure weights have been obtained for each class, year, and county, we combine the 5 years and obtain their weighted geometric means.19 Rent price levels are estimated directly from tenant rent observations in the ACS: annually for states, and across 3 years for metropolitan areas. These replace the rent price levels estimated from the CPI. No imputation of owner-occupied rents is used in the price levels; instead, we use rent price levels for both renters and owners.20,21 The rent price levels are quality-adjusted estimates using a hedonic model that controls for basic unit characteristics such as the type of structure, the number of bedrooms, the total number of rooms, when the structure was built, whether it is located in an urban or rural area, and whether utilities are included in the monthly rent.22 Additional research comparing rent price levels using the ACS and CPI housing surveys is available in Martin et al. (2011).

The second multilateral aggregation uses the 5-year weighted geometric mean of the 15 expenditure classes derived from the BLS CPI, together with the 1-year state-level rents and 3-year metropolitan area rents from the Census ACS, to estimate the final all items RPPs. This implies that for the reference year 2013, for example, final state-level RPPs are composed of rent price levels in 2013 plus an average of the price levels for goods and services other than rents between 2011 and 2015. Table 12.1 below illustrates the rolling average combinations of CPI and ACS data.

The second-stage Geary multilateral is analogous to the first stage, but instead of M = 38 areas and N = 207 item strata as inputs, and Geary price indexes for 16 expenditure classes as outputs, the inputs are 51 states or 383 metropolitan areas, and N = 16 expenditure classes. The resulting P_Geary is a single “All Items” price index for each state and each metropolitan area. In addition, because of the additive properties of the Geary discussed in Section 2, we are able to obtain the subaggregates of the 16 expenditure classes very easily, and do so for several groups: Goods, Services, Rents, as well as eight subgroups: Food, Apparel, Transport, Housing (excluding Rents), Education, Recreation, Medical, and all Other Goods and Services.

12. Price levels in rural counties in the South, Midwest, and West regions are assumed to be the same as those in the BLS urban, nonmetropolitan area for the region. BLS has no urban, nonmetropolitan area for the Northeast, so rural counties are assumed to have the same price levels as those in the BLS-defined small, metropolitan area for the Northeast.
13. They are derived from the biannual Consumer Expenditure survey but are modified to be consistent with the CPI and are called cost weights.
14. The allocation uses county-level ACS money income. Money income is defined as income received on a regular basis (exclusive of certain money receipts such as capital gains) before payments for personal income taxes, social security, union dues, Medicare deductions, etc. Therefore, money income does not reflect the fact that some families receive part of their income in the form of noncash benefits. For more information, see www.census.gov. In past papers, population was used to distribute the weights; for a comparison, see Figueroa et al. (2014).
15. Unit rents are the sum of rent expenditures divided by the number of units of each housing type for each area.
16. In earlier work (Aten, 2005, 2006), we imputed BLS owner-equivalent rent price levels to other geographies. Here, we only use the BLS data to obtain owner-equivalent rent expenditures; we do not impute owner-equivalent rent price levels.
17. For more information on how the RPP program estimates expenditures on owner-occupied rents, see Figueroa et al. (2014).
18. The adjustment is based on BLS research providing PCE-valued weights for CPI item strata (Blair, 2012).
19. The weighted geometric means are the least squares marginal means across the 5 years, obtained using a weighted CPD with a dummy variable for the year.
20. In Aten and D’Souza (2008), the imputation for county-level owner-occupied rent levels used owner’s monthly housing cost data from the 5-year ACS housing file, together with the annual CPI Housing Survey from BLS. In more current work (Aten et al., 2011, 2012, 2013), only observed price levels from the ACS were used, making no imputations for the owner-occupied rent levels.
21. ACS data for 2012 did not incorporate a revision made by BEA to its MSA definitions (see Survey of Current Business, “Comprehensive Revision of Local Area Personal Income”, December 2013, page 17). Among other changes, the revision designated 23 new MSAs. ACS rents for these MSAs were estimated from ACS data for state metropolitan and nonmetropolitan portions. These revisions have now been included in all the estimates going back to 2008.
22. The rent observations are the “Contract Rents” in the ACS.
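The three-step imputation of owner-equivalent rent expenditures described in this section can be sketched as follows. The function names, housing types, and numbers are hypothetical. The sketch also assumes that the scalar for each housing type is expressed as owner-equivalent rent per dollar of tenant rent, so that multiplying it by an observed tenant unit rent yields an owner-equivalent value; the text does not spell out this direction, so it is an interpretation.

```python
import math

def weighted_geomean(values, weights):
    """Weighted geometric mean of positive values."""
    total = sum(weights)
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)) / total)

def oer_ratio(owner_equiv_rents, owner_weights, tenant_rents, tenant_weights):
    """Step 1: one scalar per housing type from CPI housing observations.

    Assumption: owner-equivalent rent relative to tenant rent.
    """
    return (weighted_geomean(owner_equiv_rents, owner_weights)
            / weighted_geomean(tenant_rents, tenant_weights))

def annual_owner_expenditures(ratios, unit_rents, owner_units):
    """Steps 2 and 3 for one county; all inputs are dicts keyed by housing type."""
    total = 0.0
    for housing_type, ratio in ratios.items():
        monthly_value = ratio * unit_rents[housing_type]         # step 2
        total += monthly_value * 12 * owner_units[housing_type]  # step 3
    return total

# Hypothetical county with two of the housing types.
ratios = {"studio": 1.1, "detached_3br": 1.2}
unit_rents = {"studio": 800.0, "detached_3br": 1500.0}  # observed monthly ACS unit rents
owner_units = {"studio": 100, "detached_3br": 1000}     # owner-occupied units

owner_total = annual_owner_expenditures(ratios, unit_rents, owner_units)
observed_annual_rents = 5_000_000.0                     # hypothetical tenant rent total
total_rent_expenditures = observed_annual_rents + owner_total
print(owner_total, total_rent_expenditures)
```

Only the unit rents and unit counts vary across counties in this sketch; the ratios are national scalars, mirroring the description above.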

TABLE 12.1 Reference years and source data.

| Year | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 |
|---|---|---|---|---|---|---|
| Source data | Initial | Revised | Final | Final | Final | Final |
| BLS–CPI | 2011–15 | 2011–15 | 2011–15 | 2010–14 | 2009–12 | 2008–11 |
| ACS-Rents: MSAs | 2013–15 | 2012–14 | 2011–13 | 2010–12 | 2009–11 | 2008–10 |
| ACS-Rents: States | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 |

Based on data from the Bureau of Economic Analysis, Regional Prices Branch.

4. Selected results

An important application of the RPPs is to control for price level differences across regions when measuring economic activity such as income levels. This can be done by dividing a current dollar measure, such as income levels, by the RPP for each geographic unit. However, as the RPPs themselves are based on expenditure weights specific to the CPI, applying the RPPs to any other set of expenditure values will require a rebalancing of the final RPPs.23 The rebalancing ensures that the totals in both current nominal values and in RPP-adjusted values will be equal, that is, the total value that is being measured does not change by virtue of the RPP adjustment, and the US RPP will remain equal to one.24
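The adjustment and rebalancing can be sketched in a few lines. The sketch assumes a simple proportional rescaling so that the adjusted total equals the nominal total; the exact rebalancing procedure is not detailed here, so this is one plausible reading. The income figures are hypothetical, while the two RPPs are the 2015 All Items values cited in this chapter for Mississippi (86.2) and Hawaii (118.8).

```python
def rpp_adjust(nominal, rpp):
    """Deflate current-dollar values by regional RPPs (US = 100), then rebalance.

    nominal, rpp: dicts keyed by region. Rebalancing here is a proportional
    rescaling (an assumption) so that the adjusted total equals the nominal total.
    """
    deflated = {region: nominal[region] / (rpp[region] / 100.0) for region in nominal}
    scale = sum(nominal.values()) / sum(deflated.values())
    return {region: value * scale for region, value in deflated.items()}

# Hypothetical current-dollar incomes for two regions.
nominal_income = {"Mississippi": 100.0, "Hawaii": 100.0}
all_items_rpp = {"Mississippi": 86.2, "Hawaii": 118.8}

adjusted_income = rpp_adjust(nominal_income, all_items_rpp)
print({region: round(value, 1) for region, value in adjusted_income.items()})
```

With equal nominal incomes, the adjusted income is higher in the low-price region and lower in the high-price region, while the two-region total is unchanged.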

4.1 Regional price parities for states

Table 12.2 shows the overall or All Items RPPs for 2015 for each state and the District of Columbia.25 The All Items RPP is subdivided into three groupings: Goods, Services other than Rents, and Rents, followed by eight additional subgroups: Food, Apparel, Transport, Housing (exclusive of Rents), Education, Recreation, Medical, and Other goods and services. These groups correspond to the CPI expenditure classes. The original 16 categories consist of the subgroups further divided into Goods and Services (except for Apparel, which consists only of Goods, and Rents, which are only Services) but are not shown here due to space constraints.

The first line of Table 12.2 shows the RPPs for the total across states, with the All Items RPP equal to 100, by construction. That is, the sum of nominal expenditures across states for Goods, Services, and Rents is equal to the sum of RPP-adjusted expenditures. The same is true for the 16 expenditure categories: the sum of nominal expenditures across states will equal the sum of their RPP-adjusted expenditures. Rents are 101.1, meaning they are on average 1.1% higher across all states than the price level of other goods and services. The RPP for Goods, on the other hand, is slightly below the average price level, at 99.4, and Services overall are just at 100, indicating that price relatives of Services are generally higher than those of Goods when calculated across all states. By looking at the other lines, corresponding to each individual state, one can see the variability of the RPPs, from a low of 86.2 in Mississippi to a high of 118.8 in Hawaii, followed closely by the District of

23. The weights in the first stage are CPI-based cost weights, derived from the Consumer Expenditure weights, but adjusted to reflect sampling and periodic differences. The weights in the second stage are controlled to the US Personal Consumption Expenditure totals and distribution across the 16 classes, so they no longer reflect the CE distribution.
24. This is also the reason why users should not “mix and match” RPPs that belong to different geographic units, such as MSAs with states: the total of the United States will be the same, but the distribution of the expenditure weights will vary according to the geographic units that are chosen.
25. Results are available for MSAs and other geographic divisions, but for conciseness, are not shown in this chapter. The full tables are available on the web at www.bea.gov. See box titled “Data Availability” on how to access the data.

TABLE 12.2 Regional price parities by state, 2015.

| | All items | Goods | Services (excluding Rents) | Rents | Food | Apparel | Transport | Housing (excluding Rents) | Education | Recreation | Medical | Other |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| United States | 100 | 99.4 | 100.0 | 101.1 | 99.9 | 100.0 | 99.6 | 99.4 | 100.3 | 99.5 | 99.7 | 99.3 |
| Alabama | 86.8 | 95.9 | 93.8 | 62.8 | 97.0 | 92.7 | 97.4 | 92.2 | 93.5 | 96.3 | 92.6 | 93.0 |
| Alaska | 105.6 | 101.0 | 96.9 | 139.4 | 99.9 | 93.3 | 99.5 | 95.0 | 93.5 | 94.7 | 107.2 | 122.1 |
| Arizona | 96.2 | 98.1 | 97.4 | 91.4 | 95.0 | 101.5 | 96.4 | 98.6 | 91.2 | 99.3 | 101.8 | 108.1 |
| Arkansas | 87.4 | 94.7 | 93.9 | 63.9 | 96.1 | 89.2 | 98.9 | 91.2 | 92.0 | 95.2 | 91.6 | 93.0 |
| California | 113.4 | 103.6 | 106.1 | 147.3 | 105.7 | 107.6 | 103.6 | 111.0 | 100.0 | 102.4 | 102.7 | 102.2 |
| Colorado | 103.2 | 100.1 | 100.1 | 114.7 | 98.6 | 99.3 | 96.2 | 96.0 | 109.5 | 99.4 | 102.4 | 111.8 |
| Connecticut | 108.7 | 104.5 | 108.6 | 116.8 | 105.7 | 107.0 | 105.5 | 110.8 | 105.7 | 106.8 | 112.6 | 97.5 |
| Delaware | 100.4 | 99.7 | 103.1 | 97.6 | 101.0 | 91.9 | 102.2 | 105.6 | 103.3 | 103.7 | 97.9 | 98.9 |
| District of Columbia | 117.0 | 105.9 | 109.7 | 154.3 | 109.7 | 117.0 | 101.0 | 106.2 | 113.8 | 116.1 | 98.5 | 108.5 |
| Florida | 99.5 | 98.2 | 97.2 | 105.4 | 101.9 | 96.5 | 101.7 | 93.9 | 93.0 | 97.8 | 93.2 | 93.8 |
| Georgia | 92.6 | 96.8 | 95.2 | 81.1 | 94.5 | 98.8 | 96.8 | 96.4 | 99.1 | 98.2 | 91.9 | 93.3 |
| Hawaii | 118.8 | 109.2 | 104.3 | 163.4 | 110.6 | 113.6 | 102.2 | 113.6 | 100.1 | 102.8 | 111.1 | 98.6 |
| Idaho | 93.4 | 98.0 | 97.3 | 78.7 | 97.6 | 94.4 | 96.9 | 95.3 | 94.7 | 95.1 | 100.2 | 115.9 |

TABLE 12.2 Regional price parities by state, 2015.dcont’d Regional price parities

Goods

Services (excluding Rents)

Transport

Housing (excluding Rents)

Rents

Food

Apparel

Education

Recreation

Other

Medical

99.7

100.1

99.4

99.4

96.1

107.4

95.9

95.9

109.4

101.5

99.0

110.2

Indiana

90.7

97.2

93.6

74.9

93.8

95.8

94.3

93.6

99.1

96.4

96.1

102.9

Iowa

90.3

95.4

91.7

75.3

92.5

90.6

93.8

91.6

96.0

93.2

96.5

100.0

Kansas

90.4

95.8

93.5

74.6

95.3

95.1

92.5

91.9

95.1

95.1

95.5

103.4

Kentucky

88.6

94.3

93.6

68.9

95.2

88.6

98.5

91.2

92.0

94.9

91.5

92.8

Louisiana

90.6

96.2

93.8

76.2

97.2

93.4

97.1

92.4

93.8

96.6

92.9

93.0

Maine

98.0

98.5

98.6

95.8

98.2

89.6

97.9

103.4

99.2

98.5

107.1

89.8

Maryland

109.6

103.4

106.7

123.9

107.8

107.5

101.9

102.0

110.2

107.9

101.7

101.2

Massachusetts

106.9

100.7

105.4

123.3

103.6

100.1

99.2

106.8

108.7

101.9

112.7

90.4

Michigan

93.5

97.7

96.5

81.1

93.5

95.8

97.7

99.6

96.8

99.5

97.0

102.1

Minnesota

97.4

100.8

94.7

95.0

98.5

99.0

94.8

95.3

101.4

96.9

104.0

101.3

Mississippi

86.2

93.9

93.9

63.1

95.5

87.1

99.7

90.7

91.0

94.5

91.1

93.0

Missouri

89.3

95.2

92.2

73.6

96.1

96.4

91.8

90.3

96.8

93.3

92.3

95.9

Montana

94.8

98.5

95.3

85.5

96.3

95.0

95.2

92.6

94.7

95.4

101.4

121.1

All items Illinois

Nebraska | 90.6 | 95.9 | 92.0 | 76.3 | 92.8 | 91.6 | 93.9 | 92.1 | 96.5 | 94.0 | 96.2 | 100.7
Nevada | 98.0 | 96.8 | 101.8 | 95.3 | 100.8 | 93.1 | 101.1 | 102.2 | 94.4 | 94.7 | 96.8 | 101.6
New Hampshire | 105.0 | 100.1 | 103.6 | 118.1 | 102.2 | 97.5 | 98.8 | 105.8 | 106.3 | 100.9 | 111.0 | 90.4
New Jersey | 113.4 | 102.7 | 113.4 | 132.8 | 104.1 | 98.6 | 105.7 | 113.8 | 119.4 | 110.8 | 111.4 | 107.1
New Mexico | 94.4 | 97.3 | 100.1 | 81.2 | 99.5 | 93.7 | 99.5 | 99.4 | 94.5 | 94.9 | 98.3 | 107.6
New York | 115.3 | 108.6 | 111.5 | 133.9 | 109.4 | 113.2 | 108.9 | 115.4 | 106.0 | 109.0 | 118.7 | 98.8
North Carolina | 91.2 | 96.0 | 93.8 | 78.7 | 97.1 | 92.9 | 97.3 | 92.2 | 93.6 | 96.4 | 92.8 | 93.0
North Dakota | 92.3 | 95.2 | 91.5 | 86.4 | 92.4 | 90.1 | 93.7 | 91.3 | 95.7 | 92.9 | 96.6 | 99.7
Ohio | 89.2 | 96.0 | 91.6 | 72.9 | 92.9 | 95.2 | 92.8 | 92.7 | 96.4 | 94.1 | 96.9 | 96.0
Oklahoma | 89.9 | 95.4 | 93.9 | 72.0 | 96.6 | 91.1 | 98.1 | 91.7 | 92.9 | 95.8 | 92.1 | 93.0
Oregon | 99.2 | 98.7 | 98.7 | 101.5 | 98.0 | 101.4 | 97.5 | 95.8 | 97.1 | 99.2 | 97.8 | 109.5
Pennsylvania | 97.9 | 99.6 | 101.7 | 88.7 | 102.2 | 92.9 | 99.6 | 105.7 | 104.3 | 100.3 | 101.2 | 91.6
Rhode Island | 98.7 | 98.3 | 98.3 | 100.2 | 97.9 | 89.0 | 97.8 | 103.2 | 98.8 | 98.4 | 106.3 | 89.8
South Carolina | 90.3 | 96.3 | 93.8 | 76.3 | 97.3 | 93.6 | 97.0 | 92.4 | 93.8 | 96.6 | 93.0 | 93.0
South Dakota | 88.2 | 95.0 | 91.3 | 68.5 | 92.2 | 89.7 | 93.6 | 91.1 | 95.4 | 92.6 | 96.8 | 99.4
Tennessee | 89.9 | 95.9 | 93.8 | 73.7 | 97.0 | 92.6 | 97.4 | 92.1 | 93.4 | 96.3 | 92.7 | 93.0
Texas | 96.8 | 96.8 | 99.1 | 92.9 | 97.0 | 102.7 | 101.0 | 95.5 | 98.1 | 95.5 | 96.2 | 97.0
Utah | 97.0 | 97.1 | 100.8 | 91.2 | 100.0 | 93.4 | 100.1 | 100.5 | 94.4 | 94.8 | 97.7 | 105.2
Vermont | 101.6 | 98.4 | 98.3 | 117.0 | 98.0 | 89.1 | 97.8 | 103.3 | 98.7 | 98.4 | 106.9 | 89.8
Virginia | 102.5 | 99.5 | 100.6 | 111.8 | 101.9 | 102.4 | 99.1 | 97.1 | 102.6 | 103.8 | 94.9 | 98.6
Washington | 104.8 | 103.8 | 101.5 | 113.2 | 103.4 | 106.4 | 99.7 | 100.8 | 99.9 | 104.7 | 100.6 | 114.0
West Virginia | 88.9 | 94.6 | 95.3 | 66.0 | 96.3 | 89.1 | 100.0 | 91.6 | 93.2 | 95.9 | 91.5 | 94.3
Wisconsin | 93.1 | 96.2 | 93.3 | 85.9 | 93.3 | 93.5 | 94.3 | 92.2 | 95.1 | 95.2 | 102.2 | 100.1
Wyoming | 96.2 | 98.4 | 95.9 | 91.5 | 96.7 | 94.8 | 95.7 | 93.4 | 94.6 | 95.3 | 101.0 | 119.5
Maximum | 118.8 | 109.2 | 113.4 | 163.4 | 110.6 | 117.0 | 108.9 | 115.4 | 119.4 | 116.1 | 118.7 | 122.1
Minimum | 86.2 | 93.9 | 91.3 | 62.8 | 92.2 | 87.1 | 91.8 | 90.3 | 91.0 | 92.6 | 91.1 | 89.8
Range | 32.6 | 15.2 | 22.0 | 100.7 | 18.3 | 29.9 | 17.1 | 25.1 | 28.3 | 23.5 | 27.6 | 32.4

Based on data from the US Bureau of Economic Analysis; Bureau of Economic Analysis, Regional Prices Branch.


Columbia (117.0), New York (115.3), and California and New Jersey (113.4 each). Alabama (86.8) and Arkansas (87.4) also have relatively low RPPs. The range for Rents is much wider: between Hawaii (163.4) and Alabama (62.8). Accounting for over 20% of average expenditures, Rents have the greatest impact on the overall RPP. Medical RPPs also have a large spread, which may reflect the variance in some of the comparison-resistant services, such as Hospital Services. The overall Medical RPP is 99.3, below that of Rents and the lowest of the groups, perhaps in part because only the cost to the consumer is included in the CPI. Medical insurance costs are not included, and this is a particularly hard-to-measure category, with a range of RPPs close to that of Rents. Not surprisingly, many of the predominantly rural states, or states with just one or two larger metropolitan areas, such as Alabama, Mississippi, Arkansas, West Virginia, South Dakota, and Kentucky, have low rent price levels (RPPs below 70) and also relatively low All Items RPPs (below 90). On the other hand, states like New Jersey, New York, California, the District of Columbia, and Hawaii all have rent RPPs above 130 and All Items RPPs above 110. The RPPs for the subgroups, with the exception of Rents, are 5-year averages of CPI price levels. Thus, year-to-year differences during this 5-year period (2011–15) will only reflect shifts in the expenditure weights, and not actual price level differences of Goods and Services other than Rents.26

4.2 Adjusted personal incomes for metropolitan and nonmetropolitan portions of states27
Each county in the census is designated as metropolitan or nonmetropolitan, allowing us to subdivide each state into two portions. We examine the difference it makes to total and per capita regional incomes when we adjust them by their corresponding RPPs.28 Table 12.3 shows the results. The first five columns are the metropolitan portions of states and the next five the nonmetropolitan portions. Delaware, the District of Columbia, New Jersey, and Rhode Island do not have nonmetropolitan portions. The third column of numbers in each set is the RPPs divided by 100, the relative price levels, and the adjusted values are the nominal values divided by the price levels, in both

26. The main reason for using 5-year averages of CPI price levels is for consistency and robustness of the estimates. In some cases, the number of observations for which we can obtain overlap across characteristics and item definition is small. Pooling the data was found to be an effective way to control for sparseness in the geographic coverage for the purposes of the RPPs (see Aten and Reinsdorf, 2010).
27. Personal income is defined as the income received by all persons from all sources. It is the sum of net earnings by place of residence, property income, and personal current transfer receipts. This chapter uses personal income estimates released by BEA’s Regional Income Division on September 30, 2014. For more information, see www.bea.gov/regional.
28. There is a third designation, Micropolitan, but BEA only publishes personal income for metropolitan and nonmetropolitan portions of states.

TABLE 12.3 Adjusted incomes by state: metropolitan and nonmetropolitan portions, 2015.

No. | Metropolitan portion | Income, total ($billion) | Income, per capita ($000s) | Price level (RPP/100) | Adjusted income, total ($billion) | Adjusted income, per capita ($000s) | No. | Nonmetropolitan portion | Income, total ($billion) | Income, per capita ($000s) | Price level (RPP/100) | Adjusted income, total ($billion) | Adjusted income, per capita ($000s)
1 | Alabama | 148 | 40 | 0.88 | 169 | 46 | 1 | Alabama | 38 | 33 | 0.82 | 46 | 40
2 | Alaska | 29 | 58 | 1.09 | 27 | 54 | 2 | Alaska | 13 | 53 | 0.99 | 13 | 54
3 | Arizona | 259 | 40 | 0.96 | 269 | 42 | 3 | Arizona | 11 | 31 | 0.86 | 13 | 36
4 | Arkansas | 80 | 43 | 0.89 | 90 | 49 | 4 | Arkansas | 36 | 32 | 0.84 | 44 | 38
5 | California | 2098 | 55 | 1.14 | 1848 | 48 | 5 | California | 36 | 43 | 0.97 | 37 | 45
6 | Colorado | 250 | 53 | 1.04 | 240 | 50 | 6 | Colorado | 32 | 47 | 0.96 | 34 | 49
7 | Connecticut | 234 | 69 | 1.09 | 215 | 63 | 7 | Connecticut | 11 | 58 | 1.03 | 10 | 57
8 | Delaware | 44 | 47 | 1.00 | 44 | 47 | – | – | – | – | – | – | –
9 | District of Columbia | 50 | 74 | 1.17 | 42 | 63 | – | – | – | – | – | – | –
10 | Florida | 896 | 46 | 1.00 | 899 | 46 | 8 | Florida | 24 | 34 | 0.90 | 26 | 37
11 | Georgia | 363 | 43 | 0.94 | 387 | 46 | 9 | Georgia | 55 | 31 | 0.85 | 65 | 37
12 | Hawaii | 59 | 51 | 1.22 | 49 | 42 | 10 | Hawaii | 10 | 38 | 1.00 | 10 | 39
13 | Idaho | 43 | 39 | 0.94 | 46 | 42 | 11 | Idaho | 21 | 38 | 0.90 | 23 | 42
14 | Illinois | 596 | 53 | 1.01 | 589 | 52 | 12 | Illinois | 56 | 38 | 0.85 | 66 | 45
15 | Indiana | 222 | 43 | 0.92 | 242 | 47 | 13 | Indiana | 55 | 37 | 0.85 | 64 | 44
16 | Iowa | 86 | 47 | 0.93 | 93 | 50 | 14 | Iowa | 57 | 45 | 0.86 | 66 | 52
17 | Kansas | 98 | 50 | 0.92 | 106 | 54 | 15 | Kansas | 39 | 41 | 0.85 | 46 | 48
18 | Kentucky | 110 | 43 | 0.90 | 122 | 47 | 16 | Kentucky | 60 | 33 | 0.85 | 71 | 38
19 | Louisiana | 173 | 44 | 0.92 | 189 | 48 | 17 | Louisiana | 27 | 35 | 0.83 | 32 | 42
20 | Maine | 36 | 45 | 0.99 | 36 | 46 | 18 | Maine | 21 | 39 | 0.94 | 23 | 42
21 | Maryland | 330 | 56 | 1.10 | 299 | 51 | 19 | Maryland | 7 | 49 | 0.90 | 8 | 54
22 | Massachusetts | 420 | 63 | 1.07 | 393 | 59 | 20 | Massachusetts | 6 | 60 | 1.01 | 6 | 60
23 | Michigan | 362 | 45 | 0.94 | 384 | 47 | 21 | Michigan | 65 | 36 | 0.87 | 75 | 41
24 | Minnesota | 227 | 53 | 1.00 | 228 | 54 | 22 | Minnesota | 53 | 43 | 0.87 | 61 | 49
25 | Mississippi | 52 | 38 | 0.88 | 58 | 43 | 23 | Mississippi | 52 | 32 | 0.83 | 63 | 39
26 | Missouri | 206 | 46 | 0.90 | 228 | 50 | 24 | Missouri | 51 | 33 | 0.84 | 61 | 39
27 | Montana | 16 | 45 | 0.97 | 17 | 46 | 25 | Montana | 28 | 41 | 0.92 | 30 | 45
28 | Nebraska | 62 | 51 | 0.93 | 67 | 55 | 26 | Nebraska | 32 | 48 | 0.86 | 37 | 55
29 | Nevada | 113 | 43 | 0.98 | 115 | 44 | 27 | Nevada | 12 | 43 | 0.93 | 12 | 46
30 | New Hampshire | 47 | 57 | 1.07 | 44 | 53 | 28 | New Hampshire | 25 | 51 | 1.00 | 25 | 51
31 | New Jersey | 538 | 60 | 1.13 | 475 | 53 | – | – | – | – | – | – | –
32 | New Mexico | 54 | 39 | 0.96 | 56 | 41 | 29 | New Mexico | 25 | 36 | 0.89 | 28 | 41
33 | New York | 1103 | 60 | 1.17 | 944 | 51 | 30 | New York | 53 | 38 | 0.94 | 57 | 41
34 | North Carolina | 340 | 43 | 0.92 | 370 | 47 | 31 | North Carolina | 75 | 34 | 0.85 | 88 | 40
35 | North Dakota | 19 | 52 | 0.93 | 21 | 56 | 32 | North Dakota | 23 | 59 | 0.91 | 25 | 65
36 | Ohio | 420 | 45 | 0.90 | 468 | 51 | 33 | Ohio | 89 | 38 | 0.85 | 105 | 44
37 | Oklahoma | 125 | 48 | 0.91 | 137 | 52 | 34 | Oklahoma | 47 | 37 | 0.86 | 55 | 42
38 | Oregon | 154 | 46 | 1.00 | 154 | 46 | 35 | Oregon | 25 | 37 | 0.92 | 27 | 41
39 | Pennsylvania | 581 | 51 | 0.99 | 589 | 52 | 36 | Pennsylvania | 56 | 38 | 0.91 | 62 | 42
40 | Rhode Island | 53 | 50 | 0.99 | 53 | 50 | – | – | – | – | – | – | –
41 | South Carolina | 166 | 40 | 0.91 | 182 | 44 | 37 | South Carolina | 24 | 32 | 0.82 | 29 | 39
42 | South Dakota | 21 | 52 | 0.92 | 23 | 56 | 38 | South Dakota | 20 | 44 | 0.84 | 23 | 52
43 | Tennessee | 228 | 45 | 0.91 | 251 | 49 | 39 | Tennessee | 50 | 33 | 0.84 | 59 | 39
44 | Texas | 1167 | 48 | 0.98 | 1194 | 49 | 40 | Texas | 118 | 39 | 0.88 | 134 | 44
45 | Utah | 106 | 40 | 0.97 | 109 | 41 | 41 | Utah | 13 | 41 | 0.94 | 14 | 44
46 | Vermont | 11 | 52 | 1.03 | 11 | 51 | 42 | Vermont | 19 | 47 | 1.00 | 19 | 47
47 | Virginia | 399 | 55 | 1.05 | 381 | 52 | 43 | Virginia | 38 | 36 | 0.88 | 43 | 40
48 | Washington | 351 | 54 | 1.05 | 333 | 52 | 44 | Washington | 29 | 41 | 0.95 | 31 | 43
49 | West Virginia | 44 | 38 | 0.89 | 49 | 43 | 45 | West Virginia | 24 | 33 | 0.87 | 27 | 39
50 | Wisconsin | 203 | 48 | 0.94 | 215 | 50 | 46 | Wisconsin | 62 | 41 | 0.87 | 70 | 47
51 | Wyoming | 11 | 60 | 0.98 | 11 | 61 | 47 | Wyoming | 22 | 55 | 0.95 | 24 | 58
 | Maximum | 2098 | 74 | 1.22 | 1848 | 63 | | Maximum | 118 | 60 | 1.03 | 134 | 65
 | Minimum | 11 | 38 | 0.88 | 11 | 41 | | Minimum | 6 | 31 | 0.82 | 6 | 36
 | Range | 2087 | 36 | 0.34 | 1837 | 23 | | Range | 112 | 29 | 0.21 | 128 | 29

Total metropolitan | 13,803 | 50 | 1.02 | 13,561 | 49
Total nonmetropolitan | 1744 | 38 | 0.88 | 1987 | 43
US total | 15,548 | 48 | 1.00 | 15,548 | 48

Based on data from the Bureau of Economic Analysis, Regional Prices Branch.


total and per capita terms. For example, Alabama’s metro portion had a total income of $148 billion and a per capita income of $40,000, with adjusted values of $169 billion and $46,000, respectively, while the nonmetropolitan portion of the state had a total income of $38 billion and $33,000 per capita, valued at $46 billion and $40,000 per capita when adjusted by the price levels. The last lines show the price levels for the metropolitan portions (1.02) and nonmetropolitan portions (0.88) across all states. The weighted average for both is 1.00, with total personal income in the United States equaling $15,548 billion, or $48,451 per capita. This is the same for both nominal and adjusted income, by construction, as explained in the introductory paragraph. The difference in per capita incomes between the metropolitan and nonmetropolitan portions of states decreases after the RPP adjustment, from $12,000 (50 − 38) to $6,000 (49 − 43) in rounded values. This narrowing of the gap between metropolitan and nonmetropolitan portions when incomes are adjusted by the RPPs holds for all states except North Dakota and Utah, where per capita nonmetropolitan incomes are actually higher than those in the metropolitan portions of the state, and the gap increases.

Fig. 12.1A and B graph the relationship between the RPPs and total and per capita personal incomes for 2015 from Table 12.3. The price levels (RPPs/100) are plotted on the horizontal axis while incomes are on the vertical axis. The metro portions of states are shown as blue triangles, and the nonmetropolitan portions as red circles. The range of price levels in the metro portions of states is much larger than that of the nonmetropolitan portions: from 1.22 in Hawaii to 0.88 in Alabama for the metro portions, and from 1.03 in Connecticut to 0.82 in Alabama and South Carolina for the nonmetropolitan portions. In Fig. 12.1A, the red circles (nonmetropolitan) have a slightly downward trend, but in per capita terms, shown in Fig. 12.1B, this trend is upward, as one would expect. That is, price levels and per capita adjusted incomes move in the same direction. One exception seems to be Hawaii, where price levels are among the highest relative to other states, but both total and per capita incomes are among the lowest. Looking back at Table 12.2, Hawaii not only has the highest relative rent price level, it is among the highest for all other expenditure classes, and has the highest Food and overall Goods price level, nearly 10% more than the US average, likely due to higher transport costs from the mainland.
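The adjustment described in this section, nominal income divided by the relative price level (RPP/100), can be illustrated with a minimal sketch. The function and variable names below are illustrative assumptions rather than BEA code, and the inputs are the rounded values printed in Table 12.3, so the results can differ slightly from the published adjusted figures (about $168 billion here versus the $169 billion shown for Alabama's metropolitan portion), presumably because the table was computed from unrounded data.

```python
# Minimal sketch of the RPP adjustment: adjusted income = nominal income / (RPP / 100).
# Names are hypothetical; inputs are the rounded values printed in Table 12.3.

def adjust_income(nominal: float, price_level: float) -> float:
    """Deflate a nominal income by a relative price level (RPP / 100)."""
    if price_level <= 0:
        raise ValueError("price level must be positive")
    return nominal / price_level

# Alabama rows: total income in $billion, price level = RPP / 100.
al_metro_adjusted = adjust_income(148, 0.88)
al_nonmetro_adjusted = adjust_income(38, 0.82)

# Aggregate per capita incomes in $000s, taken from the last rows of the table.
nominal_gap = 50 - 38    # metropolitan minus nonmetropolitan, before adjustment
adjusted_gap = 49 - 43   # the same gap after the RPP adjustment

print(f"Alabama metro, adjusted total: {al_metro_adjusted:.1f} $billion")
print(f"Alabama nonmetro, adjusted total: {al_nonmetro_adjusted:.1f} $billion")
print(f"Per capita gap: {nominal_gap} -> {adjusted_gap} ($000s)")
```

Dividing by a price level below 1.00 raises the adjusted value, which is why the lower-priced nonmetropolitan portions gain relative to the metropolitan portions.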

[Figure 12.1, panels (A) and (B): horizontal axis, price level (RPP/100) with US = 1.00; vertical axes, adjusted total income ($billions) in (A) and adjusted per capita income ($thousands) in (B); metro and nonmetro portions of states plotted as separate series.]

FIGURE 12.1 (A) Regional price parities (RPPs) versus Adjusted Total Incomes, State metropolitan and nonmetropolitan portions, 2015. (B) RPPs versus Adjusted per capita Incomes, State metropolitan and nonmetropolitan portions, 2015. Based on data from the Bureau of Economic Analysis, Regional Prices Branch.

5. Concluding remarks
The RPPs currently reflect differences in the price levels of consumer goods and services. They are estimated from individual price observations in the CPI survey conducted by the BLS and from household level rent data in the ACS of the Census Bureau. A major constraint of the CPI survey is the sampling design that is limited to 38 urban and metropolitan index areas and targets time-to-time consistency rather than spatial consistency. On the other hand, the ACS has very detailed geographic identifiers, but the price observations are restricted to the housing sector, specifically rents, utility costs, and selected owner-occupied housing costs, such as insurance payments and taxes. The robustness of the RPPs would benefit from a place-to-place survey of the goods and services sampled by the CPI, particularly for hard-to-measure items in the service sector, such as education, food, and medical services, and in geographic areas that are sparsely populated and less well represented in the national surveys, such as rural areas. Augmenting the price observations, possibly by web-scraping and using third-party sources of information, could provide additional consistency checks on the RPPs.

Even with these constraints, the RPPs reflect the common perception of relatively high price levels in states with large metropolitan areas, such as New York, New Jersey, California, Maryland, Connecticut, Massachusetts, and the District of Columbia. The smaller states on the east and west coasts also have higher rents, for example, New Hampshire, Vermont, as well as Hawaii and Alaska (small only in terms of population, not size). As shown in Table 12.3 and Fig. 12.1B, the relative price levels tend to increase with per capita incomes, mirroring the tendency in international comparisons of real incomes and PPPs.

One important area for future research is the treatment of owner-occupied housing expenditures. There are no owner-occupied housing observations in the CPI, only imputed values derived from rental housing observations and adjustments for utility costs. These imputed values reflect the shelter flow-of-costs, a concept that has been extensively documented and explained elsewhere (Poole et al., 2005; BLS, 2011). Currently, we use the relationship between renters and owners found in the CPI files at the national level to impute owner-occupied expenditures to all households. Research is underway showing how the ACS public use file (PUMS) could be used to obtain detailed rental equivalence measures of owner-occupied housing expenditures and utility costs below the national level (Aten, 2017a), and alternative concepts, such as user costs and opportunity cost methods for owner-occupied homes, are being compared using the ACS microdata files as well as the PUMS files (Aten, 2018).

A separate issue with respect to housing expenditures is to reconcile BEA’s PCE weights in the national accounts with the BLS expenditure weights in the Consumer Expenditure (CE) survey.
A partial concordance was done by Blair in 2012 and showed that the national share of rents out of total expenditures is significantly lower in the PCE than in the CE, impacting the RPPs (Figueroa et al., 2014). Currently, we redistribute expenditures in the second-stage aggregation to match the national distribution of PCE, but it would be interesting to create a new set of RPPs based on a more detailed concordance at the item strata level in the first stage, and on subnational distributions of PCE.

Data availability
RPPs, adjusted incomes in constant dollars (real regional incomes), and implicit regional price deflators are available through the BEA website. Data are available for 2008 to 2015 for states, state metropolitan and nonmetropolitan portions, and metropolitan areas at www.bea.gov. To access the data, select the “Interactive Data” tab at the top of the homepage. At the next screen, select “GDP & Personal Income” under Regional Data. Data are available in two formats through these links:
- Begin using the data: interactive tables where users specify data type, region, and time period.
- Download complete data sets: flat files accessed through real personal income and RPPs menus.
For further information about these data, email the Regional Prices Branch at [email protected].

Acknowledgments
The author gratefully acknowledges the contribution of Eric Figueroa to the development, preparation, and production of the RPPs. This work would not be possible without the collaboration of the Bureau of Labor Statistics and the Census Bureau. In particular, we thank the staff of the Consumer Price Index (CPI) program in the Office of Prices and Living Conditions at BLS and the staff of the Social, Economic, and Housing Statistics Division of the Census Bureau for their technical and programmatic assistance.

Disclaimer
The BEA Regional Price Parity statistics are based in part on restricted access Consumer Price Index data from the Bureau of Labor Statistics. The BEA statistics expressed herein are products of BEA and not BLS.

References
Arndt, H.W., Sundrum, R.M., 1975. Regional price disparities. Bulletin of Indonesian Economic Studies 11 (2), 30–68.
Aten, B.H., 1999. Cities in Brazil: an interarea price comparison. In: Heston, A., Lipsey, R. (Eds.), International and Interarea Comparisons of Income, Output, and Prices, National Bureau of Economic Research, Studies in Income and Wealth, vol. 61. The University of Chicago Press.
Aten, B., 2005. Report on Interarea Price Levels, 2003. Working Paper 2005–11. Bureau of Economic Analysis.
Aten, B., 2006. Interarea price levels: an experimental methodology. Monthly Labor Review 129 (9). Bureau of Labor Statistics, Washington, DC, September.
Aten, B., 2008. Estimates of State and Metropolitan Price Parities for Consumption Goods and Services in the United States, 2005. Bureau of Economic Analysis.
Aten, B.H., 2017a. Rental Equivalence Estimates of National and Regional Housing Expenditures. Working Paper WP2017-5. BEA.
Aten, B.H., 2017b. Regional price parities and real regional income for the United States. Social Indicators Research 131, 123. https://doi.org/10.1007/s11205-015-1216-y.
Aten, B.H., 2018. Valuing Owner-Occupied Housing: An Empirical Exercise Using the American Community Survey (ACS) Housing Files. Working Paper WP2018. BEA.
Aten, B.H., Figueroa, E.B., 2014. Real personal income and regional price parities for states and metropolitan areas, 2008–2012. Survey of Current Business 94, 1–8.
Aten, B.H., Figueroa, E.B., 2015. Real personal income and regional price parities for state and metropolitan areas, 2009–2013. Survey of Current Business 96.
Aten, B.H., Heston, A., 2005. Regional output differences in international perspective. In: Kanbur, R., Venables, A.J. (Eds.), Spatial Inequality and Development. Oxford University Press.
Aten, B., Reinsdorf, M., 2010. Comparing the consistency of price parities for regions of the U.S. in an economic approach framework. In: 31st General Conference of the International Association for Research in Income and Wealth.
Aten, B., D’Souza, R., 2008. Regional price parities: comparing price level differences across geographic areas. Survey of Current Business.
Aten, B., Figueroa, E., Martin, T., 2011. Notes on Estimating the Multi-Year Regional Price Parities by 16 Expenditure Categories: 2005-2009. Bureau of Economic Analysis.
Aten, B.H., Figueroa, E.B., Martin, T.M., 2012. Regional price parities for states and metropolitan areas, 2006–2010. Survey of Current Business 92, 229–242.
Aten, B.H., Figueroa, E.B., Martin, T.M., 2013. Real personal income and regional price parities for states and metropolitan areas, 2007–2011. Survey of Current Business 93, 89–103.
Balk, B., 2009. Aggregation methods in international comparisons: an evaluation. In: Prasada Rao, D.S. (Ed.), Purchasing Power Parities of Currencies, Recent Advances in Methods and Applications. Edward Elgar Publishing.
Biggeri, L., Laureti, T., 2009. Are integration and comparison between CPIs and PPPs feasible? In: Biggeri, L., Ferrari, G. (Eds.), Price Indices in Time and Space. Springer.
Biggeri, L., Laureti, T., 2015. Sub-national PPPs: methodology and application by using CPI data. In: Paper Presented at the Workshop on Inter-country and Intra-country Comparisons of Prices & Standards of Living, Arezzo, Italy. September 2014. www.polo-uniar.it.
Blair, C., 2012. Constructing a PCE-Weighted Consumer Price Index. National Bureau of Economic Research (NBER) Working Paper (March). www.nber.org.
BLS Handbook of Methods, 2011. Bureau of Labor Statistics. https://www.bls.gov/opub/hom/pdf/cpihom.pdg.
Citro, C.F., Kalton, G. (Eds.), 2000. Small-Area Income and Poverty Estimates: Priorities for 2000 and beyond. National Academy Press, Washington, DC.
Deaton, A., 2003. Prices and poverty in India, 1987-2000. Economic and Political Weekly 38 (4), 362–368.
Deaton, A., 2006. Purchasing power parity exchange rates for the poor: using household surveys to construct PPPs. In: Research Program in Development Studies. Princeton University.
Deaton, A., Heston, A., 2010. Understanding PPPs and PPP-based national accounts. American Economic Journal: Macroeconomics 2 (4), 1–35.
Deaton, A., Dupriez, O., 2011. Spatial price differences within large countries. In: Research Program in Development Studies. Princeton University and World Bank working paper.
Deaton, A., Tarozzi, A., 2005. Prices and poverty in India. In: The Great Indian Poverty Debate. MacMillan, New Delhi.
Diewert, W.E., 1999. Axiomatic and economic approaches to international comparisons. In: Heston, A., Lipsey, R. (Eds.), International and Interarea Comparisons of Income, Output, and Prices, National Bureau of Economic Research, Studies in Income and Wealth, vol. 61. The University of Chicago Press.
Diewert, W.E., 2005. Weighted country product dummy variable regressions and index number formulae. Review of Income and Wealth 51 (4), 561–569.
Dunn, A.C., Liebman, E., Shapiro, A., 2012. Developing a Framework for Decomposing Medical-Care Expenditure Growth: Exploring Issues of Representativeness. Bureau of Economic Analysis.
Fenwick, D., O’Donoghue, J., 2003. Developing estimates of relative regional consumer price levels. Economic Trends 599, 72–83.
Figueroa, E., Aten, B., Martin, T., 2014. Expenditure Weights in the Regional Price Parities. Bureau of Economic Analysis.
Garner, T., Short, K., 2009. Accounting for owner-occupied dwelling services: aggregates and distributions. Journal of Housing Economics 18, 233–248.
Koo, J., Phillips, K., Sigalla, F., 2000. “Measuring regional cost of living” research department, federal reserve bank of Dallas. Journal of Business & Economic Statistics 18 (1), 127–136.
Kravis, I.B., Kenessey, Z., Heston, A., Summers, R., 1975. A System of International Comparisons of Gross Product and Purchasing Power. Johns Hopkins Press.
Kravis, I.B., Heston, A., Summers, R., 1982. World Product and Income, International Comparisons of Real Gross Product. Johns Hopkins University Press.
Martin, T., Aten, B., Figueroa, E., 2011. Estimating the Price of Rents in Regional Price Parities. Bureau of Economic Analysis. http://bea.gov/papers.
McCarthy, P., 2010. Asia and pacific region, sub-national purchasing power parities – case study for the Philippines. In: Paper Presented at the 2nd ICP Technical Advisory Group Meeting, Washington DC, February 17–19.
OECD/Eurostat, 2012. Eurostat-OECD Methodological Manual on Purchasing Power Parities. OECD Publishing, Paris.
Orshansky, M., 1969. How poverty is measured. Monthly Labor Review 92, 37.
Poole, R., Frank, P., Verbrugge, R., 2005. Treatment of owner-occupied housing in the CPI. Bureau of Labor Statistics. http://www.bls.gov/bls/fesacp1120905.pdf.
Rao, D.S.P., 2004. On the equivalence of weighted country-product-dummy (CPD) method and the Rao system for multilateral price comparisons. Review of Income and Wealth 51, 571–580.
Renwick, T., 2011. Geographic Adjustments of Supplemental Poverty Measure Thresholds: Using the American Community Survey Five-Year Data on Housing Costs. Working Paper. U.S. Census Bureau.
Renwick, T.J., Figueroa, E.B., Aten, B.H., 2017. Supplemental Poverty Measure: A Comparison of Geographic Adjustments with Regional Price Parities vs. Median Rents from the American Community Survey: An Update. Working Paper. BEA.
Roos, M., 2006. Regional price levels in Germany. Applied Economics 38 (13), 1553–1566.
Sergeev, S., 2004. The Use of Weights within the CPD and EKS Methods at the Basic Heading Level. Working Paper, Statistics Austria.
Short, K., 2015. The Supplemental Poverty Measure: 2014 (Report number P60-254). Census Bureau.
Short, K., O’Hara, A., 2008. Valuing housing in measures of household and family economic well-being. In: Annual Meeting of the Allied Social Sciences Associations. Society of Government Economists, New Orleans.
Summers, R., 1973. International price comparisons based upon incomplete data. The Review of Income and Wealth 19 (1).
Waschka, A., M, W., Khoo, J., Quirey, T., Zhao, S., 2003. Comparing living costs in Australian capital cities. In: A Progress Report on Developing Experimental Spatial Price Indexes for Australia. Australian Bureau of Statistics.
Wingfield, D., Fenwick, D., Smith, K., 2005. Relative regional consumer price levels in 2004. Economic Trends (615), 36–46.
World Bank, 2015. Purchasing Power Parities and the Real Size of World Economies. A Comprehensive Report of the 2011 International Comparison Program (ICP). The International Bank for Reconstruction and Development/The World Bank.

Chapter 13

Measuring prices and real household consumption of medical goods: service-based versus disease-based approaches
Ralph Bradley1, Brett Matsumoto2
1Retired, Littleton, NH, United States; 2Division of Price and Index Number Research, Bureau of Labor Statistics, Washington, DC, United States

1. Introduction Nominal US health care spending has grown from 5% of nominal GDP in 1960 to 17.9% in 2016. This is over a threefold increase. In 2017, the Bureau of Economic Analysis (BEA) reported that health care consumption expenditures were 22% of all Personal Consumption Expenditures (PCE). This has led to an ongoing debate about the reasons for both this extraordinary growth and extraordinary share of the economy. How much of this growth is attributable to health care inflation and how much of it is attributable to real health care consumption? Nominal measurement is relatively easy as one can aggregate all the expenditures made for health care goods and services. Real consumption estimation is more difficultdthe prices, the utilization, and the quality of medical goods and services are constantly changing and are not always easily observable. Measuring real health care consumption essentially comes down to forming an appropriate price index for deflating nominal health care expenditures. If published health care price indexes are upwardly biased, then we are actually getting more from our health care spending than the published statistics show and the reverse is true if published health care price indexes are downwardly biased. In either situation, biased health care price indexes will misinform the highly controversial and hotly contested national health care debate. Equally important is that as health care is such a large fraction of US Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00013-5 Copyright © 2019 Elsevier Inc. All rights reserved.

355

356 Handbook of US Consumer Economics

Gross Domestic Product (GDP) and personal consumption, a biased measure of real health care consumption will bias both the estimates for real personal consumption and real GDP. Presently, all Federal health care statistics are reported on a service basis and are broken into categories such as physician services, inpatient hospital services, and pharmaceuticals. Many studies, starting with Scitovsky (1967), show that relying on this approach leads to both mismeasured and less informative health care price indexes. The service-based approach does not allow us to analyze either the nominal or real expenditures on a disease basis and as long ago as 1967, it has been recognized that “. the average consumer of medical care is not as interested in the price of a visit or hospital day as he is in the total cost of an episode of illness.”1 When health care expenditures and price indexes are reported by disease, we show in this chapter that we can break down changes in nominal health care spending into pure price growth, disease prevalence growth, and real consumption growth. This allows us to better diagnose our health care economy. Relying solely on a service-based approach treats medical goods and services as consumer goods when they should be viewed as inputs into the healing of a disease, which is the ultimate consumption item that we should be measuring. This chapter reviews the issues related to measuring prices and real consumption in the medical sector, describes the weaknesses in the current service-based methods to measure real consumption, and finally outlines how the disease-based approach promises to provide a much better measure of how much real health care consumption our economy is receiving for the dollars spent on health care. We find that disease-based price indexes that account for utilization shifts grow more slowly than fixed-basket service price indexes. 
Unlike findings from earlier time periods, we show that rising disease prevalence is the main driver of growth in aggregate nominal medical spending.

Section 2 of this chapter establishes the microtheory behind the disease-based approach using the Grossman (1972) model, in which medical goods and services are inputs that provide healing from a disease. Section 3 describes the service-based methods currently used by Federal statistical agencies and documents their shortcomings. Section 4 reviews the previous literature showing that the disease-based approach can correct for the shortcomings of the service-based approach. Section 5 provides detail on the methods recommended by the Committee on National Statistics (CNSTAT) to construct disease-based price indexes in their publication, At What Price (2002). Section 6 describes the data set that is used both to generate disease-based price indexes and to decompose aggregate nominal spending growth for each disease into an inflation effect, a prevalence effect, and a real consumption effect. Section 7 presents the results. Finally, Section 8 describes issues with the current disease-based price index methods and outlines future work to improve them.

1. US Department of Health, Education and Welfare (1967).

Service-based versus disease-based approaches Chapter | 13

2. Basic theoretical framework

This section presents a simple model of household behavior to set up the theoretical foundations of a cost of living (COL) index for medical goods and services.

2.1 Utility and health

Following the Grossman (1972) model, individuals receive utility from consumption and health.2 There is no direct utility derived from the consumption of medical goods and services; individuals derive utility from them indirectly through their effect on health. At the start of period $t$, individual $i$ receives a health shock $\mu^d_{i,t}$ for a given disease $d$.3 Individuals then choose a medical care treatment bundle for each disease. The level of consumption of nonmedical goods is determined by the individual’s period budget constraint. Given the treatment bundles chosen, the individual achieves a health status given by the health production function:

$$h^d_{i,t} = f^d\left(Z^d_{i,t} \mid \mu^d_{i,t};\ \theta^d_t\right) \tag{13.1}$$

where $Z$ is a vector of medical care inputs. Let $H$ denote the vector of disease-specific $h$’s. The utility-maximizing individual chooses the level of health inputs subject to the current level of health technology $\theta^d_t$. Let $C$ denote the consumption of nonmedical goods and services. The price of medical goods and services is $p^z_t$ (written $p^{z^d}_t$ for the inputs used to treat disease $d$), and the price of nonmedical goods and services is $p^c_t$. Finally, let $Y_{i,t}$ denote the individual’s income. The individual’s problem is to maximize $u(C_{i,t}, H_{i,t})$ subject to the budget constraint $Y_{i,t} = p^c_t \cdot C_{i,t} + \sum_d p^{z^d}_t \cdot Z^d_{i,t}$. Statistical agencies can observe nominal health expenditures $\sum_i \sum_d p^{z^d}_t \cdot Z^d_{i,t}$ but are not able to easily measure $H$. One purpose of a medical price index is to deflate nominal health expenditures to recover the unobserved real household consumption of $H$.

2. Health plays multiple roles in the Grossman model. We present a consumption version of the model where health enters the utility function and ignore the health-as-investment feature of the model. Standard price index theory assumes a static decision problem, and the investment feature of the model makes the individual’s problem dynamic.

3. The random health shock is necessary to explain heterogeneity in medical consumption. At the individual level, the ex-post consumption will differ from the ex-ante expected level of consumption depending on the realization of the shock. At the aggregate level, the decision of how to handle the uncertainty (e.g., defining indexes as ex-post realized vs. ex-ante expected) should not make too much of a difference because the average ex-post consumption will be approximately equal to the average ex-ante expected consumption.

2.2 Medical price and cost of living indexes

In this section, we derive a medical price index using the economic (or COL) approach.4 Assume that individual preferences can be represented by a constant elasticity of substitution (CES) utility function. CES preferences have two useful features. The first is separability. In practice, statistical agencies form price indexes in two stages: lower-level indexes are created for item categories, and these are aggregated in a second stage to form an all-item index. The two-stage approach requires separability in preferences, which CES utility satisfies. The second feature is that CES preferences are homogeneous, which simplifies the COL index. The utility for individual $i$ in period $t$ is given by

$$u(C_{i,t}, H_{i,t}) = \left[ \left(\alpha_c C_{i,t}\right)^{\rho} + \sum_d \left(\alpha_{h^d} h^d_{i,t}\right)^{\rho} \right]^{1/\rho} \tag{13.2}$$

The COL index is defined as the ratio of the expenditure function in a comparison period to the expenditure function in the base period for a given level of utility, where the expenditure function is the minimum expenditure required to reach a given level of utility. This measures the change in expenditure that would be required to reach the original level of utility after a change in prices. For simplicity, assume the base period is period 0 and the comparison period is period 1. If the price of a unit of health stock is $p^{h^d}$, then the COL price index is given by

$$I\left(p^c_0, p^{h^d}_0, p^c_1, p^{h^d}_1\right) = \frac{\left[ \left(p^c_1 / \alpha_c\right)^{1-\sigma} + \sum_d \left(p^{h^d}_1 / \alpha_{h^d}\right)^{1-\sigma} \right]^{\frac{1}{1-\sigma}}}{\left[ \left(p^c_0 / \alpha_c\right)^{1-\sigma} + \sum_d \left(p^{h^d}_0 / \alpha_{h^d}\right)^{1-\sigma} \right]^{\frac{1}{1-\sigma}}} \tag{13.3}$$

where $\sigma = \frac{1}{1-\rho}$. The homogeneous utility function causes the reference level of utility to drop out of the COL index. This index cannot be calculated without additional assumptions because there is no market for health capital $h^d_t$ and no price $p^{h^d}_t$.5 The price of health capital can be defined implicitly from the medical goods and services that are inputs into the health production function. If we assume a CES production function

$$f^d\left(Z^d_t\right) = \left( \sum_k \left(\theta^d_{k,t} z^d_{k,t}\right)^{\gamma} \right)^{1/\gamma} \tag{13.4}$$

where $z^d_{k,t}$ refers to the $k$th element of $Z$, then a unit of $h^d$ has a price of

$$p^{h^d}_t = \left( \sum_k \left( p^{z_k}_t / \theta^d_{k,t} \right)^{1-\omega} \right)^{\frac{1}{1-\omega}} \tag{13.5}$$

where $\omega = \frac{1}{1-\gamma}$. Denote the proportion of individuals with disease $d$ as $\pi^d$; then the aggregate medical price index is given by

$$I\left(\{p^{z_k}_t\} \mid \forall k,\ t = 0, 1\right) = \frac{\sum_d \pi^d p^{h^d}_1}{\sum_d \pi^d p^{h^d}_0} \tag{13.6}$$

4. See Diewert (1987) for a discussion of the various approaches to price indexes.

5. If the price of health were observed, the COL index could be calculated even without knowing the parameters of the utility function as a Sato-Vartia index, which only requires information on the prices and expenditures in each period. This is because of the CES preferences, as shown by Sato (1976).
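As a rough numerical illustration of Eqs. (13.5) and (13.6), the sketch below computes the implicit price of health for one disease and the prevalence-weighted aggregate index. It assumes the usual CES relation $\omega = 1/(1-\gamma)$; the function names and all input values are hypothetical and chosen only to show the mechanics.

```python
from typing import Sequence


def ces_health_price(input_prices: Sequence[float],
                     theta: Sequence[float],
                     gamma: float) -> float:
    """Implicit price of one unit of health for one disease, as in Eq. (13.5).

    Assumption: omega = 1 / (1 - gamma), the usual CES elasticity of substitution.
    """
    if gamma >= 1.0 or gamma == 0.0:
        raise ValueError("gamma must be below 1 and nonzero")
    omega = 1.0 / (1.0 - gamma)
    exponent = 1.0 - omega
    total = sum((p / q) ** exponent for p, q in zip(input_prices, theta))
    return total ** (1.0 / exponent)


def aggregate_medical_index(prevalence: Sequence[float],
                            health_price_0: Sequence[float],
                            health_price_1: Sequence[float]) -> float:
    """Prevalence-weighted ratio of health prices across diseases, as in Eq. (13.6)."""
    numerator = sum(pi * p for pi, p in zip(prevalence, health_price_1))
    denominator = sum(pi * p for pi, p in zip(prevalence, health_price_0))
    return numerator / denominator


# Hypothetical example: input prices are unchanged, but technology (theta) doubles.
prices = [100.0, 50.0]
p_h0 = ces_health_price(prices, theta=[1.0, 1.0], gamma=0.5)
p_h1 = ces_health_price(prices, theta=[2.0, 2.0], gamma=0.5)
index = aggregate_medical_index([0.1], [p_h0], [p_h1])
print(p_h0, p_h1, index)
```

In this toy case the input prices do not move, yet the price of health falls because each input produces more health. A fixed-basket index of input prices would miss exactly this kind of change.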

To calculate this index, one must be able to estimate the time-varying parameters of the health production function. Beyond these data requirements, which are a major practical barrier, there are several reasons why the COL approach to price indexes may not be applicable to medical goods and services. First, the COL approach assumes that individuals choose the optimal level of medical care given their constraints. Medical goods and services are often selected by the individual’s doctor, which introduces a principal-agent problem. Second, individuals may not face the true marginal cost of treatment because of third-party payers for medical care (e.g., private insurance or government). Private insurance introduces additional complications due to adverse selection and moral hazard (Rothschild and Stiglitz, 1976; Arrow, 1963; Pauly, 1968, 1974; Akerlof, 1970). Finally, a major limitation of the COL approach to price indexes is that individuals are assumed to face a repeated static optimization problem.6 In medical care, the one-period static model assumes that current-period medical care is consumed because of its effect on current-period health. This ignores the medical-consumption-as-investment feature of the Grossman model. In a full dynamic model, the benefits of current medical consumption include the improvement of future health. The static framework may be sufficient to study treatment of acute conditions, but treatments of chronic conditions and preventative care are difficult to capture in a static framework.

6. Some researchers have extended the COL approach to price indexes to a dynamic framework. For some examples, see Alchian and Klein (1973), Pollack (1975).


2.3 Decomposing nominal expenditures

A benefit of disease-based price indexes is that they can be used to decompose total nominal expenditure to treat a disease. The growth in nominal expenditures on disease $d$ is given by

$$\frac{E^d_t}{E^d_{t-1}} = \frac{\sum_i p^{z^d}_t \cdot Z^d_{i,t}}{\sum_i p^{z^d}_{t-1} \cdot Z^d_{i,t-1}} = \frac{p^{h^d}_t}{p^{h^d}_{t-1}} \cdot \frac{\mathrm{Pop}_t}{\mathrm{Pop}_{t-1}} \cdot \frac{\pi^d_t}{\pi^d_{t-1}} \cdot \frac{\bar{h}^d_t}{\bar{h}^d_{t-1}} \tag{13.7}$$

where $\bar{h}$ denotes the average health outcome. The above identity is useful for determining the parts of nominal expenditure growth for treating disease $d$, $E^d_t / E^d_{t-1}$, from period $t-1$ to $t$ that come from inflation, $p^{h^d}_t / p^{h^d}_{t-1}$; US population growth, $\mathrm{Pop}_t / \mathrm{Pop}_{t-1}$; growth in the rate of disease prevalence, $\pi^d_t / \pi^d_{t-1}$; and real consumption per patient, $\bar{h}^d_t / \bar{h}^d_{t-1}$.7 The total number of individuals treated for disease $d$ in period $t$ is $N^d_t = \mathrm{Pop}_t \times \pi^d_t$. It is important to decompose $N^d_t$ into $\mathrm{Pop}_t$ and $\pi^d_t$ because it is useful to understand how $N^d_t$ is changing. If $\pi^d_t / \pi^d_{t-1}$ is greater than one, an increasing fraction of the population is either contracting or being diagnosed with disease $d$; in other words, the population is getting relatively sicker with disease $d$. Health care experts might then be motivated to redirect research into finding reasons for this increase in prevalence. Likewise, if inflation growth, $p^{h^d}_t / p^{h^d}_{t-1}$, is the key driver, then research is more effectively directed at finding the causes of this inflation growth rather than prevalence.8

Here we present a price index approach to decomposing the growth in nominal expenditures. Other studies use the growth in the average treatment cost per patient in place of $\left(p^{h^d}_t / p^{h^d}_{t-1}\right)\left(\bar{h}^d_t / \bar{h}^d_{t-1}\right)$, i.e., average cost growth $= \left(\sum_i p^{z^d}_t \cdot Z^d_{i,t} / N^d_t\right) \big/ \left(\sum_i p^{z^d}_{t-1} \cdot Z^d_{i,t-1} / N^d_{t-1}\right)$. Starr et al. (2014), Roehrig et al. (2009), Thorpe et al. (2004), and Bundorf et al. (2009) have done decompositions using the average cost approach. A limitation of this approach is that it does not separate price change from a change in quantity (output). When using a price index approach, it is possible to separate the two.

7. $p^{h^d}_t / p^{h^d}_{t-1}$ is the price index for disease $d$ with the base period $t-1$ and the comparison period $t$.

8. Even if inflation is the primary driver behind nominal expenditure growth, health care experts still might find that it is more effective to reduce nominal expenditure growth by attempting to reduce treatment prevalence.
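The identity in Eq. (13.7) can be illustrated with a short sketch. Because the average health outcome is not directly observed, the example backs out real consumption growth per patient as the residual once the other three factors are known. That residual treatment is an assumption made here for illustration, and all numbers are hypothetical.

```python
import math


def real_consumption_growth(expenditure_growth: float,
                            price_growth: float,
                            population_growth: float,
                            prevalence_growth: float) -> float:
    """Residual real consumption growth per patient implied by Eq. (13.7).

    All arguments are gross growth ratios (e.g., 1.08 for 8% growth).
    """
    return expenditure_growth / (price_growth * population_growth * prevalence_growth)


def log_contributions(price_growth: float, population_growth: float,
                      prevalence_growth: float, real_growth: float) -> dict:
    """Additive (log-point) contributions of each factor to expenditure growth."""
    return {
        "inflation": math.log(price_growth),
        "population": math.log(population_growth),
        "prevalence": math.log(prevalence_growth),
        "real consumption per patient": math.log(real_growth),
    }


# Hypothetical gross growth ratios for one disease between t-1 and t.
e_growth, p_growth, pop_growth, prev_growth = 1.08, 1.02, 1.01, 1.03
h_growth = real_consumption_growth(e_growth, p_growth, pop_growth, prev_growth)
parts = log_contributions(p_growth, pop_growth, prev_growth, h_growth)
for name, value in parts.items():
    print(f"{name}: {value:.4f} log points")
```

Taking logs turns the multiplicative identity into additive contributions, which makes it easier to say which factor accounts for most of the growth.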


The results from previous decomposition studies vary. Thorpe et al. (2004) conclude that the growing prevalence of chronic disease is the largest contributor to historical health care cost growth, while the other studies conclude that it is growth in the average cost of treating a patient. The studies cover different time periods, which may explain the different results. Starr et al. (2014) cover the longest period, from 1980 to 2006. None covers the period after 2008, when health care expenditure growth began to slow and move closer to real GDP growth.

3. The current service-based approach to medical measurement

All Federal statistical agencies report medical expenditures, employment, and price indexes by category/industry of medical goods and services, such as pharmaceuticals, hospitals, and physician services. Both the National Health Expenditure Accounts (NHEA) and the US Bureau of Economic Analysis (BEA) use a variety of data sources, such as the US Economic Census, that survey medical establishments to estimate national expenditures. Once nominal expenditures are estimated, the next challenge is to deflate these nominal levels into an estimate of real medical consumption. To do this, both BEA and the Bureau of Labor Statistics (BLS) generate price indexes for each type of medical good and service. BLS’s medical price indexes are computed from a series of household and medical establishment surveys, while BEA’s medical price indexes draw on a combination of statistics generated by other Federal agencies, including BLS’s medical price indexes. We review the current method of generating medical price indexes and then discuss what they do and do not measure.

3.1 Current methods

The estimation of medical prices in the Consumer Price Index (CPI) starts with two household surveys.9 The Consumer Expenditure Survey asks respondents how much they spent on different categories (called item strata) of consumption goods and services. These expenditures are used to weight the lower-level indexes corresponding to particular item strata when forming the aggregate all-goods index. The second survey, the Point of Purchase Survey, is used to select the outlets where price information will be collected. For each expenditure, say for a physician visit, the respondent provides the medical provider name, address, and the total amount paid from all payers. When this survey is completed, medical expenditures are aggregated by each outlet identified in the survey, and the expenditure share for each outlet is computed.

9. For more detail on BLS price index construction, see Chapter 17 of the BLS Handbook of Methods (Bureau of Labor Statistics, 2016).


The next step involves selecting medical outlets such as physician offices, hospitals, and pharmacies. The probability that a medical outlet mentioned in the household survey is selected is proportional to its medical expenditure share. If an outlet is selected, then a particular item with a fixed set of characteristics that is sold by the outlet is selected. For example, if a physician office has been selected, then examples of items are an office visit for a patient with a Humana employee group health plan or physician services for a heart transplant in a hospital setting for a patient with an individual insurance plan. If a pharmacy has been selected, an example of an item is a 30-day supply of Lovastatin with a particular National Drug Code for a Medicare enrollee. Once the item has been selected, its characteristics are fixed, and each month the same item with the same characteristics is repriced. The selected item stays in the sample for a maximum of 5 years. The characteristics are fixed so that any price change for this particular item cannot be the result of a change in characteristics or quality; it can only be attributed to inflation.

BLS uses these repricings to generate a monthly index for physician services, hospital services, pharmaceuticals, medical durables, and dental services for 38 areas. A national medical price index is then constructed by computing a weighted average of the various area service medical price indexes. The resulting price index prices a “fixed basket” of medical goods and services.

Insurance is priced indirectly in the CPI. Most of the weight for the expenditure on insurance premiums is reallocated to the individual medical goods and services. The medical component of the Producer Price Index (PPI) differs from the CPI in that government payers are included.
Given the prominence of government in the medical sector, the medical component of the CPI is weighted much lower (approximately 6.7%) than the medical component of GDP. The PPI is also restricted to domestic producers, which excludes medical goods (such as pharmaceuticals) that are imported. The BEA estimates the value of nominal medical expenditures as a part of the National Income and Product Accounts. Medical expenditures are a component of PCE. A variety of data sources are used to estimate medical expenditures.10 The primary data sources are the economic census and the retail trade surveys (and service surveys), which are conducted by the Census Bureau. These data are supplemented with other government data or private sector (or trade group) data. The nominal expenditures in each good or service category are deflated using a corresponding component index from the CPI or PPI. For example, pharmaceutical expenditures are deflated using the CPI for prescription drugs. The individual goods and services quantities and prices are aggregated to form the estimate for real GDP and the GDP deflator.
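The two mechanical steps described above, selecting outlets with probability proportional to expenditure share and deflating nominal expenditures with a component index, can be sketched as follows. This is a simplified illustration rather than BLS or BEA procedure: the sampling is done with replacement for brevity, and the outlet names, amounts, and function names are hypothetical.

```python
import random
from typing import Dict, List


def outlet_shares(expenditures: Dict[str, float]) -> Dict[str, float]:
    """Expenditure share of each outlet reported in the household survey."""
    total = sum(expenditures.values())
    return {name: amount / total for name, amount in expenditures.items()}


def draw_outlets(expenditures: Dict[str, float], n: int, seed: int = 0) -> List[str]:
    """Draw n outlets with probability proportional to expenditure.

    Simplifying assumption: sampling is with replacement.
    """
    rng = random.Random(seed)
    names = list(expenditures)
    weights = [expenditures[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


def deflate(nominal: float, index: float, base_index: float = 100.0) -> float:
    """Convert nominal expenditure to base-period dollars using a price index."""
    return nominal * base_index / index


# Hypothetical survey totals by outlet (dollars).
reported = {"clinic": 300.0, "hospital": 600.0, "pharmacy": 100.0}
shares = outlet_shares(reported)
sample = draw_outlets(reported, n=5, seed=1)
real_spending = deflate(nominal=110.0, index=110.0)
print(shares, sample, real_spending)
```

Because selection follows expenditure, outlets that account for more spending are more likely to be priced, but nothing in this step records which disease the spending treated.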

10. See Chapter 5 of Bureau of Economic Analysis (2014) for details.


The price indexes associated with the PCE differ in several ways from the aggregate CPI or PPI, even though the lower-level indexes are largely the same. The most significant difference is the scope of the indexes. The medical services in the PCE include goods and services paid for by the government and employers. Therefore, the weight of the medical sector in the PCE is much larger (approximately 20%) than in the CPI. The PCE deflator also uses a different index formula that allows the weights to differ in consecutive periods.11
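The index formulas behind these differences can be compared on a small example. The sketch below implements the textbook Lowe, Laspeyres, Paasche, Fisher, and Törnqvist formulas for two periods; it is not the agencies' production code, and the prices and quantities are hypothetical.

```python
import math
from typing import Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def lowe(p0, p1, q_ref) -> float:
    """Fixed-quantity index; q_ref may come from an earlier reference period."""
    return _dot(p1, q_ref) / _dot(p0, q_ref)


def laspeyres(p0, p1, q0) -> float:
    """Lowe index with base-period quantities."""
    return lowe(p0, p1, q0)


def paasche(p0, p1, q1) -> float:
    """Fixed-quantity index with comparison-period quantities."""
    return lowe(p0, p1, q1)


def fisher(p0, p1, q0, q1) -> float:
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


def tornqvist(p0, p1, q0, q1) -> float:
    """Geometric mean of price relatives weighted by average expenditure shares."""
    e0, e1 = _dot(p0, q0), _dot(p1, q1)
    log_index = 0.0
    for a, b, x, y in zip(p0, p1, q0, q1):
        avg_share = 0.5 * (a * x / e0 + b * y / e1)
        log_index += avg_share * math.log(b / a)
    return math.exp(log_index)


# Hypothetical data: the first good doubles in price and consumers substitute away.
p0, p1 = [1.0, 1.0], [2.0, 1.0]
q0, q1 = [10.0, 10.0], [5.0, 15.0]
L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher(p0, p1, q0, q1)
T = tornqvist(p0, p1, q0, q1)
print(L, P, F, T)
```

With substitution away from the good whose price rose, the fixed base-quantity index is highest and the two superlative indexes fall between the Laspeyres and Paasche values.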

3.2 What current price indexes measure and do not measure

The current medical price indexes tell us whether, on average, the cost to purchase a fixed basket of medical goods and services is rising. However, they do not tell us everything. They cannot tell us what part of nominal expenditure growth is coming from changing quality (characteristics), change in utilization, and change in disease prevalence. Under the current method, the household respondent is not asked about their diseases, and the medical outlets are not asked which diseases they treated or how their treatment protocols have changed. As treatments for diseases change over time, various inputs become more or less prevalent in the treatment. When considering the total price of treatment, substitution across different categories of medical goods and services can lead to a lower treatment price even though the prices of individual medical inputs increase.

To see what is not measured, consider the expected value of the BLS medical price index. Assume that the elements of $Z$, indexed by $k$, correspond to the item categories in the BLS medical price indexes. Let $m$ index the individual items within the service category $k$. There is a $\pi^d_0$ probability that an individual with disease $d$ is selected in the household survey in time period $t = 0$ (the base period). Let $\pi^d_{m|0}$ be the conditional probability that item $m$ is selected given that the individual has disease $d$ in period 0. The expected expenditure for an individual with disease $d$ on item $m$ in period 0 is $E^d_{m,k,0}$. The probability that an item is selected to be part of the basket of goods in period 0 is

$$\pi_{m,k,0} = \frac{E_{m,0}}{\sum_m E_{m,0}} = \frac{\sum_d \pi^d_0 \pi^d_{m|0} E^d_{m,k,0}}{\sum_m \sum_d \pi^d_0 \pi^d_{m|0} E^d_{m,k,0}} \tag{13.8}$$

11. The CPI uses a fixed-quantity Lowe index, where the quantity weights are updated every 2 years. The PCE deflator is based on Fisher’s ideal index, which is a chain-weighted index. A similar concept is employed in the chained CPI, which uses expenditure data in adjacent periods and the Tornqvist formula. The Fisher and Tornqvist indexes are superlative indexes and are considered the best approximation of the true COL index.


The expected price index for service $k$ in time $t$ is

$$I_{k,t} = \frac{\sum_d \sum_m \pi_{m,k,0}\, p^d_{m,k,t}}{\sum_d \sum_m \pi_{m,k,0}\, p^d_{m,k,0}} \tag{13.9}$$

Notice that this method integrates out the set of diseases twice: first when generating the probability that an item is selected, and second when computing the service index (this second integration only enters if the price of the item depends on the disease being treated). This method has the potential for mismatch because the probability that an item is selected does not fully depend on the disease of the respondent. For example, an individual with asthma may be selected in the Point of Purchase Survey, and the provider for this individual may be selected for the sample. When an item is then selected from the provider, it is possible that a procedure to treat diabetes is chosen. In expectation, this mismatch may not be a problem, but given the limited sample sizes in the price indexes, the treatments for certain conditions may not be represented in the index. The aggregation of the lower-level service price indexes into a single medical index uses the expenditure shares from the Consumer Expenditure Survey as weights. This introduces an additional potential for mismatch, as the items selected within a service price index may not reflect the treatments for the diseases of the individuals in the Consumer Expenditure Survey. Using the service price index for service $k$, the estimate of real consumption in time $t$ is

$$Z_{k,t} = \frac{\mathrm{Pop}_t \sum_d E^d_{k,t}}{I_{k,t}} \tag{13.10}$$

The expected decomposition of the growth in aggregate nominal expenditures $\left(E_{k,t} = \mathrm{Pop}_t \sum_d E^d_{k,t}\right)$ is

$$\frac{E_{k,t}}{E_{k,t-1}} = \frac{Z_{k,t}}{Z_{k,t-1}} \cdot \frac{I_{k,t}}{I_{k,t-1}} \tag{13.11}$$

where $Z_{k,t} / Z_{k,t-1}$ is the expected estimate of real consumption change and $I_{k,t} / I_{k,t-1}$ is the estimate of aggregate price change. Note that real output is defined in terms of the total inputs to the health production function. An improvement in medical technology (more health output for a given set of medical service inputs) will lead to a decrease in real output if the price of the inputs remains the same.

There are several reasons that this decomposition could be biased. First, this fixed-basket price index could suffer from substitution bias. This common source of bias in price indexes is fully described in the Boskin Commission report (Boskin et al., 1996). Second, medical technology and disease prevalence are constantly changing. This decomposition in time period $t$ is based on disease prevalence and treatment probability distributions in time 0. These distributions can and do change from period 0 to period $t$. Even if this decomposition is not biased, it does not fully inform us about all of the causes of real expenditure growth: we cannot determine the effects of the change in disease prevalence or the effects of technical innovation on real consumption growth $Z_{k,t} / Z_{k,t-1}$.
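The expected item-selection probabilities and the expected service index in Eqs. (13.8) and (13.9) can be traced through a toy example. The nested-dictionary layout, the disease and item names, and every number below are hypothetical; the sketch only mirrors the two sums over diseases in those equations.

```python
from typing import Dict

Nested = Dict[str, Dict[str, float]]  # disease -> item -> value


def item_selection_probabilities(prevalence: Dict[str, float],
                                 item_prob: Nested,
                                 expenditure: Nested) -> Dict[str, float]:
    """Expected probability that each item enters the basket, as in Eq. (13.8)."""
    weights: Dict[str, float] = {}
    for disease, pi_d in prevalence.items():
        for item, pi_m in item_prob[disease].items():
            weights[item] = weights.get(item, 0.0) + pi_d * pi_m * expenditure[disease][item]
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}


def expected_service_index(selection: Dict[str, float],
                           prices_0: Nested, prices_t: Nested) -> float:
    """Expected service price index, summing over diseases and items as in Eq. (13.9)."""
    num = sum(selection[m] * p for d in prices_t for m, p in prices_t[d].items())
    den = sum(selection[m] * p for d in prices_0 for m, p in prices_0[d].items())
    return num / den


# Hypothetical base-period inputs for one service category with two items.
prevalence = {"asthma": 0.1, "diabetes": 0.2}
item_prob = {"asthma": {"office_visit": 0.8, "lab_test": 0.2},
             "diabetes": {"office_visit": 0.5, "lab_test": 0.5}}
expenditure = {"asthma": {"office_visit": 100.0, "lab_test": 50.0},
               "diabetes": {"office_visit": 100.0, "lab_test": 50.0}}
prices_0 = expenditure
prices_t = {"asthma": {"office_visit": 110.0, "lab_test": 50.0},
            "diabetes": {"office_visit": 110.0, "lab_test": 50.0}}

selection = item_selection_probabilities(prevalence, item_prob, expenditure)
index = expected_service_index(selection, prices_0, prices_t)
print(selection, index)
```

The selection probabilities depend only on base-period prevalence and treatment patterns, so a later shift in which diseases are treated, or how, leaves this index unchanged.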

4. The disease-based approach

There are many reasons why controlling for disease and pricing the entire treatment of a disease is important. The production function for converting medical inputs $(Z^d_{i,t})$ into additional health $(h^d_{i,t})$ depends on the disease being treated. The medical inputs can also differ across diseases. The current service-based methods do not control for disease, as diseases are integrated out. Reporting medical expenditures and price indexes by disease can give us much better information about the factors behind aggregate health care expenditure growth.

Medical services are not final outputs from the point of view of the individual but instead are inputs into the production of health. When one is ill, a combination of these medical inputs, such as an office-based visit, lab tests, and prescriptions, is used to treat the disease according to a production function for that disease. The consumer desires the healing, or the addition to health capital $(h^d_{i,t})$, that comes from the utilization of these inputs. From a generic price index perspective, it is the healing of the disease that is the real consumption, but, unlike in other industries, there is no single firm that purchases these inputs to deliver the healing output to the consumer. Therefore, there is no observable market price for the delivery of this healing. Instead, these medical inputs are treated by statistical agencies as final outputs because their prices are easier to observe, and no connection is made between the input purchases and the final delivery of healing. In nonmedical industries such as the automobile industry, the producer purchases all the inputs and produces a final output such as an automobile. If the automobile industry were organized like the health care industry, the consumer would be buying the steel, plastic, and labor inputs to make her car.12

12. The consumer would also have to get a prescription or authorization from some “knowledgeable” provider.


4.1 Single-disease price indexes

As far back as 1967, there was a realization that medical goods and services were merely inputs and that constructing a separate index for each good and service would not tell us the “total price” of treating various diseases. Scitovsky (1967) is the earliest study that compares a price index based on the total cost of the disease to a price index based on pricing only services. Later studies began to look at constructing price indexes for single diseases. A common theme of these single-disease studies is that a new technology is introduced that lowers the cost of treatment, improves outcomes, or both. Traditional price indexes may overstate inflation in these instances, as a fixed-basket index (even with regular updates to the basket) will not be able to fully account for the substitution to the new technology. Disease-based price indexes, by capturing the full cost of treatment, are able to account for this substitution between different treatment options. If the new treatments also improve outcomes, this quality improvement will need to be accounted for in a true COL index.

Shapiro and Wilcox (1996) study cataract treatment and find that the CPI service price index does not account for the substitution of a lower-cost outpatient-based treatment for the higher-cost inpatient hospital treatment. The improvement in cataract surgery procedures also means that individuals do not wait for their vision to deteriorate as much as in the past before getting the procedure. To fully capture the value to the consumer of the new treatment, it is not enough to consider only the cost of treatment, the shorter treatment time, or the reduction in serious complications. By having the surgery earlier, individuals are able to have better vision for a much longer period. Although they recognized the issue of quality adjustment, they were not able to come up with an adequate solution.
However, just by defining the good as the treatment of cataracts and capturing the total cost of treatment over time, they find that the CPI service price index overstates inflation. In a series of papers, Berndt and coauthors construct alternative price indexes for the treatment of depression (Frank et al., 1998, 1999; Berndt et al., 2001a, 2001b, 2002). They conclude that current price indexes do not account for the ability to rely more on a new generation of antidepressants and rely less on office therapy visits. They define the health outcome as a successful treatment, and each treatment option (antidepressants, therapy, or both) achieves a successful outcome with some probability. Differences in quality are reflected in the probability of a successful outcome, and the price index measures changes in the expected cost of a successful treatment as individuals substitute to different treatment bundles. Cutler et al. (1998, 2001) show that the current hospital price indexes do not incorporate the recent improvement in heart attack treatments. The new heart attack treatments increase overall life expectancy. Forming a COL index for heart attack treatment requires placing a value on the nonmarket good of an


additional year of life. In forming their indexes, Cutler et al. (1998) use a range of values for an additional year of life. Although the prices of medical goods and services increased over the time period studied, a COL index for heart attack treatments decreased once the improvement in health outcomes is accounted for. Finally, the BLS has taken some steps to move toward the pricing of the treatment of a disease rather than pricing the individual inputs. The hospital components of the PPI and CPI consider the entire hospital bill for an inpatient stay to be the unit. This involves pricing at the level of the diagnostic-related group where available. Future work would be needed to extend this framework to include medical goods and services that are provided in other environments that also contribute to the treatment of a specific condition. The single-disease studies show different strategies to account for changes in quality. The issue of quality adjusting medical prices will be discussed further in Section 8.2. The first step is to create a general disease-based price index framework, which we describe in the next section. The issue of quality adjustment is one that is currently unresolved, both in the current price indexes and in the experimental disease-based price indexes.
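One way to see how the single-disease studies treat substitution and quality together is a stylized expected-cost calculation. The sketch below defines the price of a treated disease as expected cost divided by the expected probability of a successful outcome. This definition is an assumption made for illustration, not the estimator used in the studies cited above, and the treatment names, costs, and success probabilities are hypothetical.

```python
from typing import Dict, Tuple

# treatment name -> (share of patients, cost per patient, probability of success)
Bundle = Dict[str, Tuple[float, float, float]]


def expected_cost(bundles: Bundle) -> float:
    """Average treatment cost per patient across treatment options."""
    return sum(share * cost for share, cost, _ in bundles.values())


def expected_success(bundles: Bundle) -> float:
    """Average probability of a successful outcome across treatment options."""
    return sum(share * prob for share, _, prob in bundles.values())


def cost_per_success(bundles: Bundle) -> float:
    """Illustrative quality-adjusted price: expected cost per successful treatment."""
    return expected_cost(bundles) / expected_success(bundles)


# Hypothetical periods: therapy gets more expensive, but a cheaper and somewhat
# more effective drug treatment becomes available and is widely adopted.
period_0: Bundle = {"therapy": (1.0, 1000.0, 0.5)}
period_1: Bundle = {"therapy": (0.4, 1100.0, 0.5),
                    "drug": (0.6, 600.0, 0.6)}

service_price_change = 1100.0 / 1000.0  # fixed-basket view: therapy price only
treatment_cost_index = expected_cost(period_1) / expected_cost(period_0)
quality_adjusted_index = cost_per_success(period_1) / cost_per_success(period_0)
print(service_price_change, treatment_cost_index, quality_adjusted_index)
```

In this toy case the fixed-basket view shows a price increase, the treatment-cost index shows a decline once substitution is counted, and the decline is larger still after the improvement in outcomes is taken into account.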

4.2 General disease-based price indexes

The studies discussed in the previous section have led to a growing consensus that medical price indexes should change from being service-oriented to being disease-oriented. In the 2002 publication of At What Price, the CNSTAT makes the following recommendation:

BLS should select between 15 and 40 diagnoses from the ICD (International Classification of Diseases) chosen randomly in proportion to their direct medical treatment expenditures and use information from retrospective claims databases to identify and quantify the inputs used in their treatment and to estimate their cost. On a monthly basis, the BLS could reprice the current set of specific items (e.g., anesthesia, surgery, and medications), keeping quantity weights temporarily fixed. Then, at appropriate intervals, perhaps every year or two, the BLS should reconstruct the medical price index by pricing the treatment episodes of the 15-40 diagnoses, including the effects of changed inputs on the overall cost of those treatments. The frequency with which these diagnosis adjustments should be made will depend in part on the cost to BLS of doing so. The resulting medical CPI price indexes should initially be published on an experimental basis. The panel also recommends that the BLS appoint a study group to consider, among other things, the possibility that the index will “jump” at the linkage points and whether a prospective smoothing technique should be used.


After this recommendation was made, several studies have generated disease-based indexes. While they mostly conclude that current service-based medical price indexes are upwardly biased, they all use different methods and different data. Song et al. (2009) were the first to complete such a study. It follows the CNSTAT recommendation using insurance claims data for self-administered plans in three cities. They found that indexes generated using the CNSTAT recommendation were lower than the service price indexes, but the differences were not statistically significant. Aizcorbe and Nestoriak (2011) use a claims database to generate a price index for various disease episodes. They find that medical service price indexes are higher than the disease-based price indexes. Bradley et al. (2010) follow the method put forth by CNSTAT and use the Medical Expenditure Panel Survey (MEPS) to quantify the inputs used to treat each disease. Insurance claims data are expensive, and under current Federal budgetary constraints, it is not likely that the official price indexes will be able to utilize claims data. Although the MEPS has some limitations, it seemed to be the most cost-effective approach to accessing this input data. The MEPS input data are combined with CPI price indexes to generate the disease-based price indexes. The service-based price index does exhibit bias (on average 0.36% per year) when the total price paid is the measure. If the direct out-of-pocket price or the price corresponding to the scope of the CPI (out of pocket plus any out-of-pocket insurance premiums that go toward paying medical claims) is used, the bias can be negative or not significantly different from zero. Aizcorbe et al. (2011) construct various types of disease-based price indexes by running an insurance claims grouper through the MEPS physician, inpatient, outpatient, and emergency room event files.
They construct price indexes for the grouper episode disease classifications, the Clinical Code Classification, and at the chapter level of the ICD manual. Finally, Bradley (2013) shows how to construct timely disease-based price indexes with data from the BLS CPI databases and the MEPS database.

5. BLS experimental disease-based price indexes

In this chapter, we present the disease-based price indexes and nominal expenditure decompositions using the method of Bradley (2013). The BLS currently produces experimental disease-based price indexes using this method. Relative to other methods used to generate disease-based price indexes, the method of Bradley (2013) has the advantage of producing a timely index.[13] The BEA also produces an experimental disease-based price index as a part of its Healthcare Satellite Account (Dunn et al., 2015).

13. The experimental indexes are updated monthly and generally within a week of the release of the CPI at https://www.bls.gov/pir/diseasehome.htm.


The CNSTAT price index for disease d from month s to month t is

$$P^{d}_{t} = \frac{\sum_{k} p^{d}_{k,t}\, z^{d}_{k,y(t)}}{\sum_{k} p^{d}_{k,s}\, z^{d}_{k,y(s)}}, \tag{13.12}$$

where $y(\cdot)$ refers to the year of the month. The variable $z^{d}_{k,y(t)}$ is the average quantity of medical service k used to treat disease d in year y(t). Unlike a traditional fixed-basket index, different quantities enter the numerator and denominator of the index. CNSTAT recommended updating the quantities on a yearly basis, so within the year the quantities are held constant. In the first month of a new year when the quantities change, the index will jump because it will reflect not only the change in prices from s to t but also the change in quantities from y(s) to y(t). The index smooths this jump by spreading the quantity change over the entire year.[14]

The utilization data for the price index come from the MEPS. Price data from the MEPS need to be supplemented with a different data source because the MEPS is updated only once a year, and MEPS data are released with a lag of several years. The service-based price indexes from the CPI and PPI are used to impute the current prices. Let $I_{k,t}$ be the BLS price index for service k; then the price for period t is

$$p^{d}_{k,t} = p^{d}_{k,0}\,\frac{I_{k,t}}{I_{k,0}}, \tag{13.13}$$

where $p^{d}_{k,0}$ is the initial period price in the MEPS data for service k to treat disease d. The result is a timely index that is released on a monthly basis, which is a necessary condition for its potential future use in the official price indexes.

A particular medical service can be associated with the treatment of multiple diseases. The presence of comorbidities complicates the assignment of specific treatments to specific diseases. One version of the index makes a comorbidity adjustment on the basis of the relative utilization of a particular medical service to treat the different diseases. For example, if the average treatment bundle for disease A has two doctor visits and the average treatment bundle for disease B has one doctor visit, then a doctor's office visit associated with both diseases is assigned two-thirds to disease A and one-third to disease B.
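The proportional split in the doctor's visit example can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the BLS production code; the function name `allocate_shared_service` and the dictionary layout are assumptions introduced here.

```python
def allocate_shared_service(avg_utilization):
    """Split one shared service event across the diseases it is linked to.

    avg_utilization maps each disease to the average quantity of this
    service in that disease's typical treatment bundle (hypothetical input).
    Returns the fraction of the event assigned to each disease.
    """
    total = sum(avg_utilization.values())
    if total <= 0:
        raise ValueError("at least one disease needs positive average utilization")
    return {disease: qty / total for disease, qty in avg_utilization.items()}


# Disease A averages two doctor visits, disease B averages one.
shares = allocate_shared_service({"A": 2.0, "B": 1.0})
print(shares)
```

The shares always sum to one, so a shared visit is counted exactly once across the diseases it is attached to.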
If comorbidities increase over time, then this adjustment will lead to lower average utilization per disease. This decrease in utilization may or may not be attributable to improvements in treatment technology, and the unadjusted index may be preferable in this case.

The BEA disease-based price index is also based on the CNSTAT recommendation and tracks total expenditure by disease using MEPS

14. We produce both smoothed and unsmoothed indexes. For simplicity, we only present the smoothed indexes in this paper.


combined with private claims data. Although the two sets of indexes share a similar motivation and goal, there are some differences in practice. The BEA indexes are annual and are only available through the most recent MEPS year. The BLS indexes are produced at a monthly frequency and are timely (although still reliant on the not-so-timely MEPS data for utilization). The BEA indexes also do not attempt to decompose expenditure growth into pure price inflation and utilization growth.
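The mechanics of the BLS index, Eq. (13.12) with prices imputed as in Eq. (13.13), can be sketched as follows. This is a minimal illustration under stated assumptions, not the BLS implementation: the function names, the two service categories, and every number are hypothetical, and the within-year smoothing of quantity changes is left out.

```python
def impute_price(base_price, index_t, index_0):
    """Eq. (13.13): move a base-period MEPS price with a BLS service price index."""
    return base_price * index_t / index_0


def disease_price_index(base_prices, index_0, index_s, index_t, qty_s, qty_t):
    """Eq. (13.12): cost of the year-y(t) bundle at month-t prices divided by
    the cost of the year-y(s) bundle at month-s prices.

    Every argument is a dict keyed by medical service k.
    """
    cost_t = sum(
        impute_price(base_prices[k], index_t[k], index_0[k]) * qty_t[k]
        for k in base_prices
    )
    cost_s = sum(
        impute_price(base_prices[k], index_s[k], index_0[k]) * qty_s[k]
        for k in base_prices
    )
    return cost_t / cost_s


# Hypothetical inputs for one disease and two services.
base_prices = {"physician": 100.0, "inpatient": 10000.0}  # MEPS base-period prices
index_0 = {"physician": 1.00, "inpatient": 1.00}          # BLS index, base period
index_s = {"physician": 1.00, "inpatient": 1.00}          # BLS index, month s
index_t = {"physician": 1.10, "inpatient": 1.20}          # BLS index, month t
qty_s = {"physician": 4.0, "inpatient": 0.10}             # average use, year y(s)
qty_t = {"physician": 4.0, "inpatient": 0.05}             # average use, year y(t)

disease_index = disease_price_index(base_prices, index_0, index_s, index_t, qty_s, qty_t)
# Passing the year-y(s) quantities twice holds the bundle fixed.
fixed_quantity_index = disease_price_index(base_prices, index_0, index_s, index_t, qty_s, qty_s)
print(f"disease-based: {disease_index:.4f}, fixed quantities: {fixed_quantity_index:.4f}")
```

In this made-up case both service prices rise, but halving inpatient use lowers the cost of the episode, so the disease-based index falls below one while the fixed-quantity variant rises.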

6. Data

Health care utilization data for the disease-based price indexes are from the MEPS, which is conducted by the Agency for Healthcare Research and Quality (AHRQ). The MEPS started in 1996 and has a rotating panel structure. Households complete an initial interview on entering the panel. Over the course of the next 2 years, there are five follow-up surveys. Every year, approximately half of the sample is dropped and replaced by new households. In addition to basic demographic questions, individuals are asked about their illnesses and their medical care usage. The survey supplements the responses from the household with data from their medical providers.

Information about the individual's diagnoses and illnesses is contained in the Medical Conditions file. AHRQ staff convert the information about the illnesses into diagnosis codes. The Medical Conditions file is linked to Household Event files, which provide information on the medical care received for the conditions.

In defining the disease categories, similar diagnosis codes are combined into broader categories. MEPS aggregates similar ICD-9 diagnosis codes using Clinical Classifications Software (CCS) codes. We further aggregate the CCS codes based on the chapters of the ICD manual to generate a total of 18 disease categories. The limited sample size of MEPS prevents meaningful analysis at too fine a level of aggregation, particularly for conditions that are not common. Table 13.1 presents a list of the disease categories as well as the most frequently occurring diagnoses for each category based on CCS codes.

Every diagnosis in the Conditions file is linked to the services used to treat the diagnosis, and the entire bundle of treatments is a unique event in the Event files. The typical bundle of treatment for a given disease that is included in the price index is defined as the average utilization of the various medical services among individuals with a given condition. These average quantities are updated annually when a new year of MEPS data is released. Table 13.2 presents the average utilization by disease for the 2014 wave of MEPS. The services that enter the price index are inpatient hospitalization, outpatient visits, doctor's office visits, emergency room visits, home health care services, and prescription drugs. One limitation is that the prescription drug event files record only prescriptions filled by the individual, not whether the individual complies with the prescription.
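The definition of the typical treatment bundle as an average over individuals with a condition can be sketched as below. This is a simplified illustration: the record layout and the function name `average_utilization` are assumptions made here, the records are invented, and the sketch ignores the MEPS survey weights that an actual estimate would need.

```python
from collections import defaultdict


def average_utilization(person_records):
    """Average service counts per person with each condition.

    person_records is an iterable of (disease, {service: count}) pairs,
    one pair per person-condition (hypothetical layout).
    Returns {disease: {service: mean count}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    persons = defaultdict(int)
    for disease, counts in person_records:
        persons[disease] += 1
        for service, n in counts.items():
            totals[disease][service] += n
    return {
        disease: {service: total / persons[disease] for service, total in services.items()}
        for disease, services in totals.items()
    }


# Two invented respondents reporting the same condition.
records = [
    ("asthma", {"office_visit": 3, "rx": 4}),
    ("asthma", {"office_visit": 1, "rx": 2, "emergency": 1}),
]
bundle = average_utilization(records)
print(bundle["asthma"])
```

A service that a respondent did not use contributes zero to the average, which is why rarely used services such as emergency visits end up with small fractional quantities in the bundle.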

TABLE 13.1 Disease categories.

| Number | Disease category | Common diagnoses |
|---|---|---|
| 1 | Infectious and parasitic diseases | Viral infection; mycoses; other infections; bacterial infection; immunizations and screening |
| 2 | Neoplasms | Unspecified benign neoplasm; other nonepithelial cancer of the skin; neoplasm of unspecified nature; cancer of breast; cancer of prostate |
| 3 | Endocrine, nutritional, and metabolic diseases | Disorders of lipid metabolism; diabetes mellitus without complications; thyroid disorders; fluid and electrolyte disorders; other endocrine disorders |
| 4 | Diseases of the blood | Deficiency and other anemia; other hematologic conditions; coagulation and hemorrhagic disorders; diseases of white blood cells; sickle cell anemia |
| 5 | Mental disorders | Other mental conditions; anxiety, somatoform, dissociative, and personality disorders; affective disorders; preadult disorders; senility |
| 6 | Diseases of the nervous system | Otitis media; headache including migraine; other nervous system disorders; blindness and vision defects; other ear and sense organ disorders |
| 7 | Diseases of the circulatory system | Essential hypertension; coronary atherosclerosis and other heart disease; cardiac dysrhythmias; other and ill-defined heart disease; nonspecific chest pain |
| 8 | Diseases of the respiratory system | Other upper respiratory infections; upper respiratory disease; asthma; influenza; chronic obstructive pulmonary disease and bronchiectasis |
| 9 | Diseases of the digestive system | Intestinal infection; disorders of teeth and jaw; esophageal disorders; other disorders of stomach and duodenum; other gastrointestinal disorders |
| 10 | Diseases of the genitourinary system | Urinary tract infections; menopausal disorders; menstrual disorders; genitourinary symptoms and ill-defined conditions; other female genital disorders |
| 11 | Complications of pregnancy, childbirth, and the puerperium | Normal pregnancy and delivery; contraceptive and procreative management; spontaneous abortion; other complications of birth; other complications of pregnancy |
| 12 | Diseases of the skin | Other skin disorders; skin and subcutaneous tissue infections; other inflammatory condition of skin; chronic ulcer of skin |
| 13 | Diseases of the musculoskeletal system | Other nontraumatic joint disorders; spondylosis and other back disorders; other connective tissue disorders; osteoarthritis; rheumatoid arthritis and related disease |
| 14 | Congenital anomalies | Other congenital anomalies; cardiac and circulatory congenital anomalies; genitourinary congenital anomalies; nervous system congenital anomalies |
| 15 | Certain conditions originating in the perinatal period | Live birth; other perinatal conditions; short gestation or low birth weight |
| 16 | Injury and poisoning | Sprains and strains; other injuries and conditions due to external causes; contusion; open wounds of extremities; open wounds of head, neck, or trunk |
| 17 | Other conditions | Residual codes; administrative; medical examination; allergic reactions; other screening for suspected conditions |
| 18 | Residual and unclassified codes | These are cases where the Medical Expenditure Panel Survey (MEPS) did not assign a condition code |

Note: Common diagnoses for a disease category are the most frequently occurring Clinical Classifications Software (CCS) diagnosis codes in the MEPS Conditions file.


TABLE 13.2 Average utilization per disease in 2014.

| Disease | Doctor's visits | Emergency | Inpatient | Rx |
|---|---|---|---|---|
| 1. Infections | 3.30 (0.50) | 0.00 (0.00) | 0.00 (0.00) | 5.41 (0.66) |
| 2. Neoplasms | 2.77 (0.60) | 0.04 (0.02) | 0.02 (0.02) | 1.44 (0.52) |
| 3. Metabolic diseases | 4.47 (0.78) | 0.05 (0.03) | 0.13 (0.05) | 1.03 (0.11) |
| 4. Blood diseases | 6.55 (1.47) | 0.00 (0.00) | 0.07 (0.02) | 0.85 (0.16) |
| 5. Mental disorders | 2.62 (0.11) | 0.03 (0.01) | 0.03 (0.01) | 4.56 (0.09) |
| 6. Nervous system | 2.98 (0.16) | 0.00 (0.00) | 0.00 (0.00) | 1.08 (0.11) |
| 7. Circulatory system | 1.75 (0.17) | 0.15 (0.03) | 0.15 (0.03) | 1.86 (0.17) |
| 8. Respiratory system | 2.25 (0.31) | 0.29 (0.10) | 0.41 (0.10) | 2.83 (0.24) |
| 9. Digestive system | 1.68 (0.39) | 0.33 (0.03) | 0.31 (0.04) | 1.37 (0.09) |
| 10. Genitourinary system | 0.95 (0.04) | 0.17 (0.04) | 0.01 (0.01) | 0.91 (0.05) |
| 11. Pregnancy and childbirth | 4.41 (0.71) | 0.59 (0.06) | 0.06 (0.02) | 0.71 (0.12) |
| 12. Skin diseases | 1.58 (0.12) | 0.07 (0.01) | 0.02 (0.00) | 1.22 (0.04) |
| 13. Musculoskeletal system | 4.25 (0.16) | 0.05 (0.00) | 0.03 (0.00) | 1.35 (0.03) |
| 14. Congenital anomalies | 2.83 (0.54) | 0.04 (0.02) | 0.06 (0.02) | 0.48 (0.09) |
| 15. Perinatal conditions | 0.72 (0.18) | 0.03 (0.02) | 0.27 (0.09) | 0.30 (0.12) |
| 16. Injury and poisoning | 2.98 (0.13) | 0.35 (0.00) | 0.06 (0.01) | 0.71 (0.02) |
| 17. Other conditions | 1.63 (0.10) | 0.05 (0.00) | 0.02 (0.00) | 1.37 (0.03) |
| 18. Unclassified | 2.04 (0.05) | 0.07 (0.00) | 0.04 (0.00) | 1.80 (0.01) |

Note: Standard error in parentheses.

The MEPS data on utilization and expenditures are supplemented with the CPI and PPI service-based price indexes to create a timely price index. Specifically, we use the physician services and hospital components of the PPI and the pharmaceutical component of the CPI. The CPI hospital and physician indexes only include private payers, so the PPI is used to price these services to match the utilization data in the MEPS (which includes all services regardless of payer). The CPI pharmaceutical index is used because the PPI only includes domestically produced pharmaceuticals. The CPI includes all pharmaceuticals consumed domestically, but only includes private payers.


7. Results

7.1 Trends in utilization by disease

The disease-based price indexes price the entire episode of treatment. Changes in the price of the entire episode can be due to changes in utilization to treat a given disease or to changes in the prices of the individual goods and services. Therefore, changes in the average utilization of the different medical goods and services to treat a disease are a significant contributor to changes in the price index for that disease. Table 13.3 presents the change in utilization by disease and type of medical service over a 10-year period. The table reports the average utilization in 2014 as a percent of the utilization in 2004. Anything under 100% corresponds to a decline in average utilization. In general, utilization of inpatient hospital and physician services decreases for the majority of conditions. For emergency services, the average utilization decreases for about half of the diseases.

7.2 Disease-based price indexes

In this section, we present the disease-based price indexes and compare them to other medical price indexes. The individual disease indexes can be aggregated to form a single disease-based price index using the expenditures per disease as the weights in the aggregation. The disease-based indexes update the quantities and the expenditure weights using the most recent available year of MEPS data. An alternative is to hold the quantities fixed at the original base period levels. This alternative index is a Lowe index. It still aggregates over the different diseases to form a single index, but it holds the quantities that define the treatment for each disease constant at the base period levels. The Lowe index will generally overstate inflation if disease treatments shift to less expensive types of services (e.g., substituting outpatient care for inpatient hospitalization in the treatment of a disease).

Fig. 13.1 shows the disease-based price index, the disease-based Lowe index, the medical component of the CPI, and the PPI for the hospital industry from 2004 to 2017. The December 2003 value of all of the indexes is normalized to 1. The disease-based price indexes grow much less rapidly over this time period. The growth of the disease-based price indexes is about half as great as the growth in the disease-based Lowe index. This suggests that there is significant substitution to less expensive medical treatments. The Lowe index tracks fairly closely with the hospital industry component of the PPI. The close correlation is not coincidental because the PPI hospital data are used to update the prices of inpatient and outpatient hospital services in the disease-based price index. As the quantities are fixed in the Lowe index, all of the change over time is driven by changes in the medical service prices. Finally, the medical component of the CPI grows faster than all of the other indexes. This


TABLE 13.3 Utilization changes from 2004 to 2014.

| Disease | Inpatient hospital | Physicians | Emergency | Rx |
|---|---|---|---|---|
| 1. Infections | 85.81% | 80.34% | 104.48% | 93.45% |
| 2. Neoplasms | 66.04% | 84.05% | 78.59% | 99.49% |
| 3. Metabolic diseases | 57.92% | 73.62% | 78.30% | 98.62% |
| 4. Blood diseases | 67.83% | 81.65% | 76.95% | 109.53% |
| 5. Mental disorders | 69.95% | 94.04% | 73.64% | 103.60% |
| 6. Nervous system | 89.04% | 95.83% | 92.18% | 101.96% |
| 7. Circulatory system | 58.71% | 69.47% | 76.82% | 93.75% |
| 8. Respiratory system | 121.18% | 91.65% | 117.97% | 88.61% |
| 9. Digestive system | 96.58% | 85.64% | 124.40% | 108.04% |
| 10. Genitourinary system | 87.62% | 107.12% | 118.72% | 98.48% |
| 11. Pregnancy and childbirth | 92.20% | 93.81% | 109.96% | 92.56% |
| 12. Skin diseases | 108.65% | 93.96% | 144.29% | 93.10% |
| 13. Musculoskeletal system | 104.78% | 94.43% | 92.22% | 85.37% |
| 14. Congenital anomalies | 46.05% | 82.24% | 102.06% | 59.63% |
| 15. Perinatal conditions | 73.72% | 43.20% | 48.75% | 111.88% |
| 16. Injury and poisoning | 95.98% | 96.56% | 91.28% | 90.28% |
| 17. Other conditions | 97.37% | 130.33% | 104.77% | 82.73% |
| 18. Unclassified | 84.86% | 83.38% | 63.56% | 88.84% |

Notes: Reported value is 2014 utilization as a percentage of 2004 utilization.

difference is driven primarily by differences in the scope of the index. The CPI is limited to out-of-pocket expenditures, which increase relatively rapidly, while the other indexes include all sources of payment. Fig. 13.2 presents the individual disease indexes over time. For some relatively uncommon conditions, the small sample of the MEPS can generate significant variance in the average utilization from one year to the next. Even smoothing the change in utilization over the entire year leads to large spikes in


FIGURE 13.1 Medical price indexes.

FIGURE 13.2 Medical price indexes by disease defined at the level of ICD-9 chapter.

the series for some of the diseases. The overall trend is easier to interpret than the short-term fluctuations. Some conditions tend to have a relatively constant price of treatment over this period. The cost to treat the average neoplasm, mental disorder, or circulatory system disorder changes little over the period. Other diseases such as infections, respiratory system conditions, and digestive


system conditions are characterized by a steady increase in the average cost of treatment over the time period.

To get a better idea of what is driving the change in the disease-based price indexes, it is helpful to compare the disease index to the corresponding fixed-quantity Lowe index. Table 13.4 presents the change in the disease-based price indexes from 2004 to 2017 compared to the fixed-quantity Lowe indexes. The first two columns present the indexes without adjusting for comorbidities, while the last two columns adjust for comorbidities. The Lowe indexes increase around 30%-35% over the time period for each of the conditions. This reflects the change in the underlying service-based indexes over this period. The disease-based price indexes tend to increase less than the Lowe indexes. The disease-based indexes for diseases of the blood, respiratory system, skin, and congenital anomalies all grow faster than the corresponding Lowe indexes. The disease-based price indexes for neoplasms, metabolic diseases, circulatory system diseases, and musculoskeletal system diseases all grow much slower than the corresponding Lowe indexes. The divergence of these indexes shows the importance of controlling for substitution across different types of medical services in tracking the total price to treat different conditions over time.

It is possible to construct the disease-based price indexes at the level of individual diagnoses. Fig. 13.3 presents the indexes for a select group of diagnoses. These are all relatively common diagnosis codes and cover many of the broader diagnosis groups. Several of these diagnoses are designated as priority conditions in the MEPS. The MEPS questionnaire was changed after 2007 to better capture the prevalence of priority conditions. Therefore, the price index results are reported from 2008 to avoid capturing the change in the MEPS survey in the indexes. The indexes vary considerably year to year as the smaller sample size leads to greater variance in the estimation of average utilization. For many conditions, the price to treat the disease has fallen over this period. Although service prices increased, the decrease in average utilization was enough to decrease the total cost of treatment. The price index for AMI (acute myocardial infarction) decreases dramatically during the time period because of a sharp reduction in average inpatient utilization.

For some conditions, the average severity of the disease changes over time. This is most notable for influenza. A relatively severe flu season causes an increase in medical utilization and a spike in the price index. The 2012-2013 flu season was particularly severe and causes a spike in the flu price index.
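The comparison between a disease-based index and its fixed-quantity Lowe counterpart, used throughout this section, can be written out explicitly. The following is a sketch under the assumption that the Lowe index fixes quantities at a base year b and that the reference month s falls in that year, so that y(s) = b; the superscript "Lowe" and the subscript b are notation introduced here for illustration.

```latex
% Lowe index: prices move, quantities stay at the base-year bundle z_{k,b}
P^{d,\mathrm{Lowe}}_{t}
  = \frac{\sum_{k} p^{d}_{k,t}\, z^{d}_{k,b}}{\sum_{k} p^{d}_{k,s}\, z^{d}_{k,b}}

% With y(s) = b, Eq. (13.12) factors into a price term and a utilization term
P^{d}_{t}
  = \underbrace{\frac{\sum_{k} p^{d}_{k,t}\, z^{d}_{k,b}}{\sum_{k} p^{d}_{k,s}\, z^{d}_{k,b}}}_{\text{Lowe index}}
    \times
    \underbrace{\frac{\sum_{k} p^{d}_{k,t}\, z^{d}_{k,y(t)}}{\sum_{k} p^{d}_{k,t}\, z^{d}_{k,b}}}_{\text{utilization change valued at month-}t\text{ prices}}
```

Under that assumption, a disease-based index below its Lowe index means the second factor is below one, that is, the current bundle costs less than the base-year bundle when both are valued at current prices. The notes to Table 13.4 state that the Lowe basket there is based on 1999 utilization, so the factorization does not hold exactly for the reported figures.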

7.3 Decomposition of nominal expenditures

One of the most important uses of price indexes is in the calculation of real values. The disease-based price indexes can be used to decompose changes in nominal medical expenditures. Eq. (13.7) shows how aggregate nominal medical expenditures can be decomposed into population growth, disease prevalence growth, price growth, and real per capita utilization growth for the


TABLE 13.4 Disease-based price index growth from 2004 to 2017.

| Disease | Lowe index (without comorbidity adjustment) | Disease-based index (without comorbidity adjustment) | Lowe index (with comorbidity adjustment) | Disease-based index (with comorbidity adjustment) |
|---|---|---|---|---|
| 1. Infections | 1.3798 | 1.5396 | 1.3865 | 1.5239 |
| 2. Neoplasms | 1.3755 | 0.9779 | 1.3765 | 0.8435 |
| 3. Metabolic diseases | 1.4009 | 1.1736 | 1.4343 | 1.2041 |
| 4. Blood diseases | 1.3862 | 1.5674 | 1.4034 | 1.5489 |
| 5. Mental disorders | 1.3432 | 1.4390 | 1.3489 | 1.2835 |
| 6. Nervous system | 1.3253 | 1.2795 | 1.3373 | 1.2467 |
| 7. Circulatory system | 1.3909 | 0.9975 | 1.4056 | 0.9102 |
| 8. Respiratory system | 1.3882 | 1.602 | 1.3985 | 1.5331 |
| 9. Digestive system | 1.3968 | 1.4110 | 1.4091 | 1.4097 |
| 10. Genitourinary system | 1.3692 | 1.3164 | 1.3733 | 1.2589 |
| 11. Pregnancy and childbirth | 1.3638 | 1.1051 | 1.3637 | 1.0866 |
| 12. Diseases of the skin | 1.3528 | 1.6432 | 1.3620 | 1.4894 |
| 13. Musculoskeletal system | 1.3250 | 1.2953 | 1.3328 | 1.1634 |
| 14. Congenital anomalies | 1.3636 | 0.9950 | 1.3653 | 0.8437 |
| 15. Perinatal conditions | 1.4319 | 0.8877 | 1.4321 | 0.8861 |
| 16. Injury and poisoning | 1.3520 | 1.5802 | 1.3523 | 1.4672 |
| 17. Other conditions | 1.3663 | 1.4978 | 1.3832 | 1.4733 |
| 18. Unclassified | 1.3658 | 0.7597 | 1.3677 | 0.7057 |
| All disease | 1.3810 | 1.2839 | 1.3907 | 1.2188 |

Notes: The Lowe indexes use a fixed basket based on 1999 utilization. The reported values are the ratio of the year-end index value for 2017 to the year-end value for 2003.


FIGURE 13.3 Price indexes for specific diagnoses. Diseases defined at the level of CCS code.

treatment of the disease. Table 13.5 presents the results of this decomposition. The first column presents the change in nominal expenditures for each disease. The second column is inflation, which is the disease-based price index for each disease. The third column is the real expenditure, which is the nominal expenditure growth deflated by the inflation measure. Real expenditure is the product of the number of people with a given condition and the average real utilization used to treat the condition. The number of people with a condition is equal to the population times the prevalence. The growth rate in the number of people with a condition is the product of population growth and prevalence growth. Population growth is the same for all diseases at 8.95%, and the fourth column is the growth rate of disease prevalence. In general, the prevalence of the different diseases is increasing. This is due in part to the overall aging of the population. The last column is the per capita real output, which measures the real per person expenditure to treat each of the diseases. Although real expenditure for all diseases increases approximately 46% over the 10-year period, most of this growth is because of prevalence growth and population growth.

The per capita real output growth does not directly measure change in utilization, as the change in utilization to treat a disease is directly incorporated into the disease-based price indexes. As the utilization data enter with a lag, at a given point in time, the price index is based on utilization from 3 years prior. In the aggregate, the per capita utilization growth measures the increase in utilization that is not yet accounted for in the price index. The value of the

TABLE 13.5 Decomposition of nominal expenditure growth from 2004 to 2014.

| Disease | Nominal expenditure | Inflation | Real expenditure | Prevalence | Per capita output |
|---|---|---|---|---|---|
| 1. Infections | 2.802 | 1.539 | 1.821 | 1.224 | 1.366 |
| 2. Neoplasms | 1.512 | 0.897 | 1.686 | 1.326 | 1.168 |
| 3. Metabolic diseases | 2.281 | 1.105 | 2.064 | 1.558 | 1.216 |
| 4. Blood diseases | 4.302 | 0.867 | 4.960 | 1.333 | 3.416 |
| 5. Mental disorders | 2.505 | 0.964 | 2.600 | 1.389 | 1.717 |
| 6. Nervous system | 1.267 | 1.266 | 1.000 | 1.127 | 0.814 |
| 7. Circulatory system | 1.349 | 0.849 | 1.588 | 1.313 | 1.110 |
| 8. Respiratory system | 1.581 | 1.378 | 1.147 | 0.969 | 1.087 |
| 9. Digestive system | 1.474 | 1.281 | 1.151 | 0.855 | 1.235 |
| 10. Genitourinary system | 1.384 | 0.975 | 1.420 | 1.053 | 1.237 |
| 11. Pregnancy and childbirth | 1.394 | 1.183 | 1.178 | 0.871 | 1.241 |
| 12. Skin diseases | 1.432 | 1.295 | 1.106 | 1.142 | 0.889 |
| 13. Musculoskeletal system | 2.003 | 1.027 | 1.950 | 1.477 | 1.211 |
| 14. Congenital anomalies | 2.230 | 0.663 | 3.365 | 1.292 | 2.391 |
| 15. Perinatal conditions | 0.868 | 1.387 | 0.626 | 1.738 | 0.331 |
| 16. Injury and poisoning | 1.759 | 1.231 | 1.429 | 1.021 | 1.284 |
| 17. Other conditions | 1.447 | 1.404 | 1.031 | 1.129 | 0.838 |
| 18. Unclassified codes | 0.811 | 0.936 | 0.867 | 0.904 | 0.880 |
| All diseases | 1.616 | 1.106 | 1.461 | 1.172 | 1.144 |

Notes: Population grew 8.95% over the period. Nominal expenditure = real expenditure * inflation; real expenditure = population * prevalence * per capita output.

disease-based price indexes in 2014 uses the utilization data from the 2011 MEPS. When the 2014 price indexes are used to decompose 2014 nominal expenditures, the increase in average utilization from 2011 to 2014 will show up in the per capita utilization growth.

Variation in per capita utilization growth among diseases represents differences in medical service intensity across diseases. The disease-based price indexes may not fully capture utilization change if there is a change in medical service intensity. For example, if the number of procedures for an inpatient stay increases over time, then even if the number of inpatient stays remains constant, the expenditure on inpatient stays will increase. This will appear as a real expenditure increase in the decomposition, although some of the increase in the cost to treat a disease should be reflected in the disease-based price index. Defining medical services at a finer level (e.g., specific procedure) would allow for the per capita output growth term to be captured in the price index as a change in utilization or as a change in service price. Similarly, having a measure of service price by disease would also allow for a more accurate decomposition of nominal expenditure.

The decomposition of nominal expenditures can also be performed at the individual diagnosis level. Table 13.6 presents the results of the decomposition for the same set of diagnoses from the previous section. Nominal expenditure on all of the conditions increased with the exception of arm fractures. With the exception of arm fractures and AMI, some of the increase in nominal expenditures is because of a higher prevalence of the diagnoses.

The decomposition at the specific diagnosis level highlights some of the current limitations of disease-based price indexes. These limitations are most clearly seen in the case of hepatitis. Nominal expenditures for hepatitis increased dramatically over this period because of new, high-cost treatments. One limitation of the disease-based price indexes is that they are based on the underlying service price indexes, but the price of a particular service may vary by disease. The sample sizes in the CPI and PPI are too small to calculate service price indexes for each disease. For hepatitis, the average treatment becomes much more prescription drug intensive over this time period. Rather than reflecting the price change of the drugs to treat hepatitis, the service price index reflects the average prescription drug. If the inflation for hepatitis drugs is greater than the overall pharmaceutical price index, then the hepatitis disease-based price index is understated and real expenditure growth is overstated. Another problem is that the new treatment represented a major breakthrough and greatly improved outcomes. Quality adjusting medical price indexes for changes in outcomes is a major unsolved issue. These, and other, limitations will be discussed in greater detail in the next section, which discusses the future of disease-based price measures.
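The two identities stated in the notes to Tables 13.5 and 13.6 can be checked with a short calculation. The sketch below applies them to the "All diseases" row of Table 13.5; the function name and argument names are assumptions introduced here, and all inputs are growth factors (ratios of end-period to start-period values) taken from that table.

```python
def decompose_nominal_growth(nominal, inflation, population, prevalence):
    """Split nominal expenditure growth into its multiplicative pieces.

    nominal = real * inflation
    real    = population * prevalence * per_capita_output
    Returns (real expenditure growth, per capita output growth).
    """
    real = nominal / inflation
    per_capita_output = real / (population * prevalence)
    return real, per_capita_output


# "All diseases" row of Table 13.5, 2004 to 2014; population grew 8.95%.
real_growth, per_capita_growth = decompose_nominal_growth(
    nominal=1.616, inflation=1.106, population=1.0895, prevalence=1.172
)
print(round(real_growth, 3), round(per_capita_growth, 3))
```

Up to rounding, the results reproduce the real expenditure and per capita output columns of that row, which confirms that the per capita term is what remains after price, population, and prevalence growth are removed.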


TABLE 13.6 Decomposition of nominal expenditure growth for specific diagnoses from 2008 to 2014.

| Diagnosis | Nominal expenditure | Inflation | Real expenditure | Prevalence | Per capita output |
|---|---|---|---|---|---|
| HIV | 2.295 | 0.968 | 2.371 | 1.283 | 1.761 |
| Hepatitis | 13.693 | 0.948 | 14.447 | 1.567 | 8.787 |
| Breast cancer | 1.753 | 0.597 | 2.935 | 1.213 | 2.306 |
| Prostate cancer | 1.600 | 0.808 | 1.981 | 1.017 | 1.856 |
| Diabetes | 2.004 | 1.034 | 1.937 | 1.155 | 1.599 |
| Cataract | 1.422 | 0.988 | 1.439 | 1.275 | 1.076 |
| AMI | 1.277 | 0.345 | 3.700 | 0.931 | 3.789 |
| Heart failure | 1.473 | 0.788 | 1.869 | 1.943 | 1.890 |
| Pneumonia | 1.250 | 1.172 | 1.067 | 1.052 | 0.967 |
| Influenza | 4.415 | 0.872 | 5.063 | 1.575 | 3.064 |
| Arm fracture | 0.707 | 0.982 | 0.720 | 0.854 | 0.803 |

Notes: Population grew 4.91% over the period. Nominal expenditure = real expenditure * inflation; real expenditure = population * prevalence * per capita output.

8. Discussion and future work

Although moving toward a disease-based framework for measuring prices and output in the medical sector could greatly improve the official measures, there is still much work to be done before it would be feasible to switch from service-based to disease-based measures. In this section, we discuss some of the limitations of the current disease-based price indexes and describe ongoing research at the BLS to address these limitations.

8.1 Limitations of current disease-based measures

Many of the limitations of the current disease-based price indexes stem from limitations of the MEPS data. For conditions that are not common, the size of the MEPS sample will yield few observations with those conditions. One consequence of the small sample is that the estimate of average utilization will be noisy for these conditions. This noise will cause large swings in the index


when the utilization is updated with each new year of MEPS data. The MEPS sample size is also too small to calculate meaningful indexes for specific diseases and conditions within the broader disease categories. The calculation of average utilization within a broad category of diseases will miss a lot of the substitution that occurs at the specific disease level. Changes in average utilization over time could also be due to changes in the composition of diseases within the broader category. Another limitation is that the medical service categories are defined broadly. It is necessary to define the input categories in a way that matches with CPI and PPI product/industry categories. New medical technologies could generate substitution within categories, which would be missed by the disease-based price indexes. For example, the disease-based price indexes currently capture substitution from doctor’s visits to pharmaceuticals but would not capture the substitution resulting from a new generation of drugs replacing an older generation in the treatment of a disease. Also, the service prices from the CPI and PPI assume that the intensity of treatment for a particular service does not change over time. Changes in utilization intensity can impact how medical price indexes capture price changes (Dunn et al., 2014). Consider the situation where two procedures that used to be performed in separate doctor’s office visits are now performed in the same visit. The price of the visit with two procedures is greater than the price of the visits where only one procedure occurs. The service-based price indexes will hold the characteristics of the visit constant when repricing and will not treat the substitution as inflationary. The service price indexes that go into the calculation of the disease-based price indexes are not disease-specific. 
If the inflation in the service price index is primarily due to the treatment of a specific disease, then the disease-based price index will understate the amount of inflation for that disease and overstate the inflation for other diseases. For example, if the price of insulin increases and the price of all other pharmaceuticals remains constant, the CPI pharmaceutical index will increase by much less than the increase in the price of insulin. When constructing the disease-based index, the prescription drug component of diabetes expenditure will increase based on the increase in the CPI pharmaceutical index. The value of the index for other diseases that have any prescription drug utilization will also increase because of the increase in the CPI pharmaceutical index, although the prices of treatments for those diseases did not change. Calculating the service indexes at the level of disease would require a much larger sample of prices than what is currently used to calculate the CPI and PPI medical indexes. Ultimately, having disease-specific indexes at the level of a particular service category in the CPI or PPI would greatly improve the accuracy of the disease-based price indexes while still allowing them to be produced in a timely manner.
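The insulin example can be made concrete with a small numerical sketch. All numbers below are hypothetical and chosen only for illustration: insulin is assumed to carry 10% of the weight in the pharmaceutical index, its price is assumed to rise 50%, prescription drugs are assumed to be 40% of treatment expenditure for each disease, and diabetes drug spending is assumed to consist entirely of insulin. The function names are likewise illustrative and do not correspond to any BLS procedure.

```python
def pharma_index_change(weights, price_changes):
    """Weighted average price change across drug classes (fixed-weight index)."""
    return sum(weights[k] * price_changes[k] for k in weights)


def disease_rx_inflation(rx_share, price_change):
    """Contribution of prescription drugs to a disease's price index change."""
    return rx_share * price_change


# Hypothetical index weights and price changes (illustrative only).
weights = {"insulin": 0.10, "all_other_drugs": 0.90}
price_changes = {"insulin": 0.50, "all_other_drugs": 0.00}
cpi_rx_change = pharma_index_change(weights, price_changes)

# Hypothetical prescription drug shares of treatment expenditure by disease.
rx_share = {"diabetes": 0.40, "hypertension": 0.40}

# Assumption: diabetes drug spending is entirely insulin, and hypertension
# drug spending involves no insulin.
true_rx_change = {"diabetes": 0.50, "hypertension": 0.00}

for disease, share in rx_share.items():
    # Measured: the aggregate pharmaceutical index is applied to every disease.
    measured = disease_rx_inflation(share, cpi_rx_change)
    # Disease-specific: what a disease-level drug price index would show.
    actual = disease_rx_inflation(share, true_rx_change[disease])
    print(f"{disease}: measured {measured:.3f}, disease-specific {actual:.3f}")
```

Under these assumed numbers, the measured drug contribution is about 2% for both diseases, while a disease-specific index would show about 20% for diabetes and zero for hypertension. This is the understatement and overstatement described above.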

8.2 Quality adjustment

Medical technology generally improves over time, which allows a disease to be treated using fewer resources or with better outcomes at the same level of resources. If the prices of the medical goods and services stay the same and the quality improves, then the COL index has decreased. Separating the effect of quality change from pure price change is essential in forming a true COL index. In determining the cost to treat a disease over time, the quality-adjusted price index would hold outcomes constant for a given bundle of medical inputs. The papers that formulate single-disease price indexes take different approaches to this issue.¹⁵ Ultimately, none of the approaches used is likely to serve as a general solution to the issue of quality adjusting medical price indexes.

In the paper on depression treatments by Berndt and coauthors, each treatment has some probability of being successful. They incorporate differences in quality among different treatment options by allowing them to have different probabilities of success. Then, by pricing the expected cost of a successful treatment, they control for quality differences among the different types of treatments. This approach is useful when outcomes can be characterized in zero/one terms (successful/unsuccessful or treated/not treated). From a practical point of view, implementing this approach on a large scale to quality adjust medical price indexes is not feasible because of the difficulty in assigning probabilities to the success of different treatment bundles.

In Cutler et al. (1998), the quality of medical treatment for heart attacks is measured in terms of longevity. Quality change takes the form of additional years of life. In forming a COL index for heart attack treatments, they assume a value for an additional year of life and ultimately compute the index for a range of values.
Given the controversial nature of assigning a monetary value to a year of life, this approach would likely not be feasible for the purposes of official price statistics. Also, longevity may not be an appropriate outcome for all diseases, as changes in the treatment for many diseases have little to no impact on longevity.

The issue of quality adjustment for medical goods and services in the price indexes is a major unresolved problem and a potentially large source of bias in the indexes. Ongoing research at the BLS seeks to address this issue. In an ongoing project, we use Medicare claims data to track treatments and outcomes over time. By obtaining estimates of the parameters of the health production function (i.e., q in Eq. 13.1), it is then possible to adjust for quality improvements either through placing a value on the improvement in mortality or through changes in the cost of a successful treatment. One limitation is that the only health outcome observable in the data is mortality. Ideally, quality adjustments would be based not only on improved longevity but also on changes in the quality of life. The appropriate outcome when considering changes in quality may also depend on the disease being considered. It is likely that a comprehensive solution to the issue of quality adjusting medical prices will require a new source of data that links the full history of medical treatments, nonmedical care inputs into the health production function, and quality-of-life-based outcome measures.

15. See Hall (2017) for an overview of the literature on quality-adjusting medical prices.
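The idea of pricing the expected cost of a successful treatment, described in this section for the depression studies, can be illustrated with a minimal sketch. It is a simplification under stated assumptions and not the authors' implementation: each period has a single treatment bundle, the success probability is known, and the costs and probabilities below are hypothetical.

```python
def expected_cost_per_success(cost, p_success):
    """Cost of treatment divided by its probability of success."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return cost / p_success


def unadjusted_price_relative(cost_0, cost_1):
    """Price change between periods, ignoring any change in outcomes."""
    return cost_1 / cost_0


def quality_adjusted_price_relative(cost_0, p_0, cost_1, p_1):
    """Price change in the expected cost of one successful treatment."""
    return expected_cost_per_success(cost_1, p_1) / expected_cost_per_success(cost_0, p_0)


# Hypothetical inputs: cost rises 10%, success probability rises from 0.5 to 0.6.
cost_0, p_0 = 1000.0, 0.5
cost_1, p_1 = 1100.0, 0.6

raw = unadjusted_price_relative(cost_0, cost_1)
adjusted = quality_adjusted_price_relative(cost_0, p_0, cost_1, p_1)
print(f"unadjusted: {raw:.3f}, quality-adjusted: {adjusted:.3f}")
```

With these assumed inputs, the unadjusted price relative shows an increase while the quality-adjusted relative shows a decline, because the improvement in the success probability more than offsets the higher cost per treatment.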

9. Conclusion

Moving from measuring the medical sector on a service basis to a disease basis in the official government statistics presents many logistical and methodological challenges. Current methods and data collection are designed to produce medical price and output statistics on a service basis and are inadequate for producing disease-based measures. This chapter describes how the BLS constructs experimental disease-based price indexes using publicly available data. Results suggest that moving to disease-based measures could have a significant impact on our understanding of the medical sector of the economy. More research is required before the official statistics can be reported on a disease basis, and it may ultimately require the collection of additional data.

References

Aizcorbe, A., Nestoriak, N., 2011. Changing mix of medical care services: stylized facts and implications for price indexes. Journal of Health Economics 30 (3), 568–574.
Aizcorbe, A., Bradley, R., Greenaway-McGrevy, R., Herauf, B., Kane, R., Liebman, E., Pack, S., Rozental, L., 2011. Alternative Price Indexes for Medical Care: Evidence from the MEPS Survey. Bureau of Economic Analysis: Working Paper WP2011-01.
Akerlof, G., 1970. “The market for lemons”: qualitative uncertainty and the market mechanism. Quarterly Journal of Economics 74, 488–500.
Alchian, A., Klein, B., 1973. On a correct measure of inflation. Journal of Money, Credit, and Banking 5 (1), 173–191.
Arrow, K., 1963. Uncertainty and the welfare economics of medical care. American Economic Review 53 (5), 941–973.
Berndt, E., Busch, S., Frank, R., 2001a. Treatment price indexes for acute phase major depression. In: Cutler, D., Berndt, E. (Eds.), Medical Care Output and Productivity. University of Chicago Press, pp. 463–505.
Berndt, E., Cockburn, I., Griliches, Z., 2001b. Pharmaceutical innovations and market dynamics: tracking effects of price indexes for anti-depressant drugs. In: Brookings Paper on Economic Activity.
Berndt, E., Busch, S., Frank, R., 2002. The treatment of medical depression, 1991–1996: productive inefficiency, expected outcome variations, and price indexes. Journal of Health Economics 21, 373–396.
Boskin, M., Dulberger, E., Gordon, R., Griliches, Z., Jorgenson, D., 1996. Toward a More Accurate Measure of the Cost of Living. Final Report to the Senate Finance Committee.

Bradley, R., 2013. Feasible methods to estimate disease-based price indexes. Journal of Health Economics 32 (3), 504–514.
Bradley, R., Cardenas, E., Ginsburg, D., Rozental, L., Velez, F., 2010. Producing disease-based price indexes. Monthly Labor Review 133, 20–28.
Bundorf, K., Royalty, A., Baker, L., 2009. Health care cost growth among the privately insured. Health Affairs 28 (5), 1294–1304.
Bureau of Economic Analysis, 2014. Concepts and Methods of the U.S. National Income and Product Accounts.
Bureau of Labor Statistics, 2016. Handbook of Methods. Division of BLS Publishing, Office of Publications and Special Studies.
Cutler, D., McClellan, M., Newhouse, J., Remler, D., 1998. Are medical prices declining? Evidence from heart attack treatments. Quarterly Journal of Economics 113 (4), 991–1024.
Cutler, D., McClellan, M., Newhouse, J., Remler, D., 2001. Pricing heart attack treatments. In: Cutler, D., Berndt, E. (Eds.), Medical Care Output and Productivity. University of Chicago Press.
Diewert, E., 1987. Index numbers. In: Eatwell, J., Newman, P. (Eds.), New Palgrave: A Dictionary of Economics. Macmillan Press, pp. 767–780.
Dunn, A., Liebman, E., Shapiro, A., 2014. Implications of utilization shifts on medical-care price measurement. Health Economics 24 (5), 539–557.
Dunn, A., Rittmueller, L., Whitmire, B., 2015. Introducing the new BEA health care satellite account. Survey of Current Business 95 (1).
Frank, R., Busch, S., Berndt, E., 1998. Measuring prices and quantities of treatments for depression. American Economic Review 88 (2), 106–111.
Frank, R., Berndt, E., Busch, S., 1999. Price indexes for the treatment of depression. In: Triplet, J. (Ed.), Measuring the Prices of Medical Treatments. Brookings Institution Press.
Grossman, M., 1972. On the concept of health capital and the demand for health. Journal of Political Economy 80 (2), 223–255.
Hall, A., 2017. Adjusting the measurement of the output of the medical sector for quality: a review of the literature. Medical Care Research and Review 74 (6), 639–667.
Pauly, M., 1968. Economics of moral hazard: Comment. American Economic Review 58 (3), 531–537.
Pauly, M., 1974. Overinsurance and public provision of insurance: the roles of moral hazard and adverse selection. Quarterly Journal of Economics 88 (1), 44–62.
Pollack, R., 1975. Intertemporal cost of living index. In: Berg, S. (Ed.), Annals of Economic and Social Measurement, vol. 4(1). NBER, pp. 179–198.
Roehrig, C., Miller, G., Lake, G., Bryan, J., 2009. National health spending by medical condition, 1996–2005. Health Affairs 28, 358–367.
Rothschild, M., Stiglitz, J., 1976. Equilibrium in competitive insurance markets: an essay on the economics of imperfect information. Quarterly Journal of Economics 90 (4), 629–649.
Sato, K., 1976. The ideal log-change index number. Review of Economics and Statistics 58 (2), 223–228.
Scitovsky, A., 1967. Changes in the costs of treatment of selected illness, 1951–65. American Economic Review 42, 1182–1195.
Shapiro, M., Wilcox, D., 1996. Mismeasurement in the Consumer Price Index: An Evaluation. NBER Macroeconomics Annual.

Song, X., Marder, W., Houchens, R., Conklin, J., Bradley, R., 2009. Can a disease based price index improve the estimation of the medical CPI? In: Diewert, E., Greenless, J., Hulten, C. (Eds.), Price Index Concepts and Measurement. National Bureau of Economic Research, pp. 329–372.
Starr, M., Dominiak, L., Aizcorbe, A., 2014. Decomposing growth in spending finds annual cost of treatment contributed most to spending growth, 1980–2006. Health Affairs 33, 823–831.
Thorpe, K., Florence, C., Joski, P., 2004. Which medical conditions account for the rise in health care spending? Health Affairs 4, 437–445.
US Department of Health, Education and Welfare, 1967. A Report to the President of Medical Prices. US Government Printing Office.

Chapter 14

A brief history of the supplemental poverty measure¹

Thesia I. Garner¹, Liana E. Fox²
¹ Research Economist, Division of Price and Index Number Research, Bureau of Labor Statistics, U.S. Department of Labor, Washington, DC, United States
² Research Economist, Social, Economic and Housing Statistics Division, U.S. Census Bureau, Washington, United States

1. Introduction

Published annually since 2011 by the Census Bureau, with support from the Bureau of Labor Statistics (BLS), the Supplemental Poverty Measure (SPM) augments the official poverty measure by taking into account many of the government programs designed to assist low-income families and individuals that are not included in the current official poverty measure. In addition, SPM thresholds are designed to more closely reflect the current spending needs of individuals living in the United States. This measure originated with an interagency technical working group (ITWG) convened in late 2009 by the Chief Statistician of the Office of Management and Budget (OMB). The recommendations of the technical working group drew heavily on the National Academy of Sciences report, Measuring Poverty: A New Approach (Citro and Michael, 1995).

Unlike the official poverty measure, the SPM is focused on understanding how specific tax and transfer programs, in addition to cash income, affect individuals’ ability to acquire a very specific bundle of goods and services. In addition, and also unlike the official poverty measure, SPM poverty needs are represented by a set of three basic thresholds that separately reflect the spending needs of renters, owners with mortgages, and owners without mortgages, and these thresholds are adjusted to reflect price differences across areas. At the time of this publication, the production of SPM resources and poverty statistics is funded as an official product of the Census Bureau; however, the production of the SPM thresholds is funded as BLS research only.

In November 2011, the BLS released the first set of SPM thresholds and the Census Bureau released the first annual SPM report, which included SPM estimates for 2009 and 2010. Since then, SPM poverty statistics have been released each fall. The latest SPM estimates are for 2017.

This chapter begins with a brief history of the creation of the SPM, including the 1995 National Academy of Sciences Panel report and subsequent recommendations provided by the 2009 ITWG. This is followed by details on the current construction of the SPM threshold and resource measures. To see how the SPM changes the face of poverty in the United States, a snapshot of SPM poverty and related statistics is presented alongside a snapshot based on the official measure. In the final section of the chapter, we discuss tested changes to the thresholds and resources along with directions for future research.

1. This chapter was prepared for this volume particularly to promote and advance our understanding of poverty measurements in the United States. The chapter has undergone more limited reviews than official publications. Any views expressed on statistical, methodological, technical, or operational issues are those of the authors and not necessarily those of the Census Bureau or Bureau of Labor Statistics. Any errors or omissions are the sole responsibility of the authors. Thanks are extended to Kathleen Short, now retired, who provided much guidance in the development of the SPM while she was with the Census Bureau.
Handbook of US Consumer Economics. https://doi.org/10.1016/B978-0-12-813524-2.00014-7 Copyright © 2019 Elsevier Inc. All rights reserved.

2. History and background

2.1 The official poverty measure

The official poverty measure for the United States began with the development of a set of thresholds by a Social Security Administration economist, Mollie Orshansky, in the mid-1960s. Orshansky based her thresholds on the US Department of Agriculture’s (USDA) 1955 Household Food Consumption Survey data and USDA’s food plans. Using these data, Orshansky estimated that families of three or more people spent an average of one-third of their after-tax income on food. Following this, Orshansky set the poverty thresholds at a level of three times the cost of the USDA’s economy food plan, a diet designed for “temporary or emergency use when funds are low.”² The first Census Bureau report to use the Orshansky thresholds to measure poverty was released in 1967.³ In 1969, the Bureau of the Budget (precursor to the OMB) established Orshansky’s thresholds as the official poverty thresholds for the United States through Statistical Policy Directive 14, with updating of the thresholds accomplished through the use of the overall Consumer Price Index. The official poverty measure compares these thresholds to pre-tax cash income.

Poverty measurement in the United States has been controversial since the development of the official poverty measure in the mid-1960s. As criticisms of the official poverty measure grew over time, the pressure to create an alternative measure mounted. This led to considerable research focused on improving the measure, including a major 1976 Health, Education, and Welfare (HEW) task force and report. Another significant contribution was made by Patricia Ruggles with the publication of her book, Drawing the Line: Alternative Poverty Measures and Their Implications for Public Policy (Ruggles, 1990). Her findings resulted in Congressional hearings, which ultimately led to an appropriation for an independent study of poverty measurement, the National Academy of Sciences (NAS) Panel on Poverty and Family Assistance.

2. While most would consider these thresholds to be an absolute measure of poverty, Orshansky described her thresholds as “relatively absolute” measures of poverty, as they were based on spending and expected consumption behaviors at a particular point in time, but updated for changes in prices like an absolute measure (see Fisher, 1997 for a full history of the Orshansky thresholds).
3. Prior to the official designation by OMB, previous census publications used a “low-income” standard to examine families with incomes below $3000 in 1963. At that time, the low-income level was roughly 50% of median income.
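The arithmetic behind the official measure as described in this section (a threshold of three times the cost of the food plan, updated by the overall Consumer Price Index, and compared with pre-tax cash income) can be sketched as follows. The dollar amounts and index values are hypothetical, and the function names are illustrative rather than part of any official procedure.

```python
def orshansky_threshold(food_plan_cost, multiplier=3.0):
    """Threshold as a multiple of the food plan cost.

    The multiplier of three follows from food being about one-third of
    after-tax income for families of three or more people.
    """
    return multiplier * food_plan_cost


def update_for_prices(threshold, cpi_base, cpi_current):
    """Carry a base-year threshold forward using the overall CPI."""
    return threshold * cpi_current / cpi_base


def is_poor(pretax_cash_income, threshold):
    """Official-measure comparison: pre-tax cash income against the threshold."""
    return pretax_cash_income < threshold


# Hypothetical values for illustration only.
base_threshold = orshansky_threshold(food_plan_cost=1000.0)
current_threshold = update_for_prices(base_threshold, cpi_base=100.0, cpi_current=250.0)

print(base_threshold, current_threshold)
print(is_poor(7000.0, current_threshold), is_poor(8000.0, current_threshold))
```

Because the threshold is updated only for price changes, it does not move with changes in spending patterns or living standards, which is one of the criticisms taken up later in the chapter.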

2.2 National Academy of Sciences panel on poverty measurement and research that led to the SPM

The 1995 NAS Panel’s report, Measuring Poverty: A New Approach, included reviews and analyses of methods used to produce poverty measures and detailed suggested improvements for producing a new measure for the United States that would replace the official measure (Citro and Michael, 1995). The NAS Panel on Poverty and Family Assistance concluded:

“Our major conclusion is that the current measure needs to be revised: it no longer provides an accurate picture of the differences in the extent of economic poverty among population groups or geographic areas of the country, nor an accurate picture of trends over time. The current measure has remained virtually unchanged over the past 30 years. Yet during that time, there have been marked changes in the nation’s economy and society and in public policies that have affected families’ economic well-being which are not reflected in the measure. Improved data, methods, and research knowledge make it possible to improve the current poverty measure.” (Citro and Michael, 1995, p. 1)

The NAS panel identified six major shortcomings in the official poverty measure:

1. The official measure does not distinguish between the needs of workers and nonworkers.
2. The official measure does not take into account significant variations in medical care costs.
3. The thresholds are the same across the nation, although significant price variations across geographic areas exist for such needs as housing.
4. The family size adjustments in the thresholds are anomalous in many respects, and changing demographic and family characteristics (such as the reduction in average family size) underscore the need to reassess the adjustments.
5. Changes in the standard of living call into question the merits of continuing to use the values of the original thresholds updated only for inflation. Historical evidence suggests that poverty thresholds, including those developed according to “expert” notions of minimum needs, follow trends in overall consumption levels.
6. Family resources should be expanded from gross money income to reflect the effects of important government policy initiatives (such as taxes and in-kind benefits) that have significantly altered families’ disposable income and, hence, their poverty status. Moreover, the current poverty measure cannot reflect the effects of future policy initiatives that may have consequences for disposable income, such as changes in the financing of health care, further changes in tax policy, and efforts to move welfare recipients into the work force.

Following the release of the NAS Panel’s report, researchers at the Census Bureau and BLS began independent research to test and try to implement the Panel’s recommendations in a federal economic statistical environment. This research, representing a fully realized NAS poverty measure, was first presented at the 1997 Joint Statistical Meetings (JSM) and subsequently published by Garner et al. (1998).

Table. Research contributing to the supplemental poverty measure
Medical care: Doyle and Johantgen (1996); Doyle (1997); Banthin et al. (2001); Bavier (2001); Garner and Short (2002)
Housing: Betson (1995); Betson (2009); Garner and Rozaklis (1999, 2001); Garner (2005) (revised 2006)
Work-related expenditures: Short et al. (1996)
Equivalence scales: Betson (1996)
Geographic adjustments: Kokoski et al. (1994); Johnson et al. (1997); Renwick (2009a,b, 2010, 2011)
Resource unit, childcare: Short (2009, 2010)
Rental subsidies: Renwick (2009, 2010); Kingkade (2017)
Alternative specifications for thresholds: Garner (2006, 2009a,b); Garner and Betson (2010a,b)

Discussions of technical issues took place over the next several years at several workshops sponsored by the Brookings Institution Center on Children and Families and the University of Wisconsin Institute for Research on Poverty (IRP), as well as through another federal ITWG. These workshops and meetings focused on numerous issues that were unresolved in the NAS report, including the treatment of medical expenses, valuing rental subsidies for resources and thresholds, valuing home equity and the flow of services from owner-occupied housing in thresholds and resources, and the geographic adjustment of thresholds for price differences. There were also discussions of how to set the initial threshold, update the thresholds over time, define the reference unit, treat work-related expenditures subtracted from resources, and vary the equivalence scales. See Table, Research contributing to the supplemental poverty measure, for a more comprehensive list of research papers.

Building upon this research and these discussions, in 1999, the Census Bureau released a report presenting six sets of alternative poverty estimates for 1990–97. For this, various components of the NAS panel’s recommendations were implemented separately and jointly (Short et al., 1999a). Research that was conducted in support of this report was presented subsequently at the 1999 JSM and focused on valuing the need for shelter, medical out-of-pocket (MOOP) spending, alternative units of analysis, accounting for owner-occupied housing, and the production of separate thresholds for different housing status groups (Short et al., 1999b). It was during the 1999 JSM that outlays or out-of-pocket spending-based thresholds for owners with mortgages were first presented. Subsequent Census Bureau reports based on the NAS measure followed (e.g., Short, 2001).

In 2004, at the request of OMB, the Committee on National Statistics (CNSTAT) of the National Research Council convened a workshop to review research on alternative methods to measure poverty. The goal was to review the research that had been done since the release of the Panel’s report and to make recommendations for next steps, including reducing the number of alternative measures to produce. At the time of the 2004 workshop, over 50 research papers on experimental poverty measures had been written.
This work was produced by researchers at the Census Bureau, BLS, the Department of Health and Human Services, OMB, and the Social Security Administration, to name a few, and by researchers at think tanks and various universities. The primary topics presented and discussed at the workshop included setting and updating the thresholds, equivalence scales, geographic adjustment to thresholds, medical expenses, work-related and childcare expenses, and housing. (See Iceland, 2005 for a summary of the workshop.) After the workshop, Katherine Wallman, Chief Statistician at OMB, reconvened the ITWG on poverty, and research continued.

Shortly after the 2004 CNSTAT workshop, calls for improved poverty measurement came from several local and state governments. Probably the earliest and most influential was the one from New York City (NYC). In 2006, NYC Mayor Michael Bloomberg convened the Commission for Economic Opportunity, a task force charged with making recommendations to improve NYC’s measure of poverty. The Commission concluded that the current official measure was a poor gauge of the degree of economic deprivation in NYC and did not account for City programs to alleviate it. Mayor Bloomberg embraced these recommendations, and thus poverty measurement became a new project for the City’s Center for Economic Opportunity (NYC-CEO). Under the leadership of Mark Levitan, the Center was in close communication with researchers at the BLS and Census Bureau regarding the work that had been done on the NAS measure; its staff also convened workshops and
conducted their own research. For example, in November 2007, Levitan contacted Garner at BLS and Short at the Census Bureau to ask whether the NAS threshold and resource measures might be useful for measuring poverty in NYC. Levitan was particularly interested in the treatment of shelter, having read Garner’s (2005) work on the topic, as NYC has many housing programs. In January 2008, the CEO sponsored a Brookings Workshop to talk about methods and options. Six months later, Levitan contacted then BLS Commissioner Keith Hall to obtain NAS thresholds and expenditure shares for families with two adults and two children to be used in the development of a CEO measure. In August 2008, the first CEO Poverty Measure was produced and published (see NYC-CEO, 2008). The CEO sponsored another workshop at the Brookings Institution to present its report and gain feedback for next steps. The CEO continued to use a NAS-type poverty measure, with statistics published for 2009.

In 2008, Blank, who had served on the NAS Panel, delivered a key address, the Presidential Address to the Association for Public Policy Analysis and Management (Blank, 2008). In this address, she discussed the reasons why the official poverty measure was outdated, described why various efforts to improve it had failed, and made four recommendations to allow for improvements in a poverty measure for the U.S.:

1. assign a statistical agency the authority to develop an alternative measure of income poverty;
2. explicitly direct the statistical agency to provide a poverty definition that produces both a credible and coherent poverty threshold and a consistent and appropriate resource measure;
3. allow public programs to continue to use the OMB-defined poverty rate or multiples of the poverty guidelines as an eligibility cut-off, unless they choose to make changes; and
4. commission work to develop a list of key measures of economic deprivation beyond income poverty.

Also in 2008, Congressional staff from the Joint Economic Committee consulted with BLS and Census Bureau staff, as well as outside experts, to learn more about the research that had been conducted on NAS-based poverty measures. This led to the introduction of the “Measuring American Poverty Act of 2009,” or MAP, in the House of Representatives on June 17, 2009; a companion bill was introduced in the US Senate on August 6, 2009 (see U.S. House, 2009; U.S. Senate, 2009). The MAP was introduced to amend Title XI of the Social Security Act. According to the proposed legislation:

“... this Act is to provide for an improved and updated method for measuring the extent to which families and individuals in the United States have sufficient income to allow a minimal level of consumption spending that meets their basic physical needs, including food, shelter (including utilities), clothing, and other necessary items, in order to better assess the effects of certain policies and programs in reducing the prevalence and depth of poverty, to accurately gauge the level of economic deprivation, and to improve understanding of targeting of public resources, without directly affecting the distribution of, or eligibility for, any Federal benefits or assistance.” (p. 3, line 21, through p. 4, lines 1–8)

The MAP specified that modern poverty thresholds were to be based on a distribution of consumption expenditures that includes food, clothing, shelter, and utilities (FCSU), just like the NAS measure. The threshold was to be produced for a reference family and set equal to 120% of the 33rd percentile of the distribution, or of a limited band converging on this percentile, as opposed to a percentage of the median of FCSU expenditures. Four or more of the most recent years of Consumer Expenditure Survey (CE) data, or a combination with other data, were to be used to produce the thresholds. The thresholds were to be updated no less often than annually using this method (page 6, lines 5–20). Due to differences in the out-of-pocket expenditures of owners with and without mortgages, the MAP further specified that the calculation of the threshold “shall be made separately” for (1) families who own their primary residence and do not have a mortgage secured by the residence and (2) all other families such that they can “purchase similar quality shelter” (page 8, lines 1–17). The MAP did not become law; however, it did motivate discussions that led to additional research (see below) and ultimately to the production of the SPM.

After the introduction of the MAP legislation, there was a new round of research focused on accounting for owner-occupied housing and nonmarket rental shelter (e.g., Betson, 2009), considering cohabitation in the resource unit, better estimates of childcare expenditures as subtractions from income (Short, 2009, 2010), geographic adjustments (e.g., Renwick, 2009a,b), rental subsidies in resources (e.g., Renwick, 2009b, 2010), and alternative choices for the specification of the thresholds (e.g., Garner, 2009a,b; Garner and Betson, 2010a,b). The alternative threshold specifications tested included varying the point in the FCSU distribution from a percentage of the median to the 33rd percentile (as specified in the MAP), moving to the consumer unit versus the family as the unit of analysis, using a different number of years of CE data to produce the thresholds, varying the approach to updating the thresholds, and testing the impact of alternative definitions of shelter in the thresholds. Other work focused on the production of consistently defined thresholds and resources (Garner and Short, 2010b). Studies using the American Community Survey for geographic adjustments (Renwick, 2010, 2011) and the study of geographic distributions of poverty in the United States (Ziliak, 2010) also resulted. Throughout these years, Short (e.g., 2010, 2011) integrated alternative specifications of thresholds and resources and tested their impact on poverty estimates.
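The MAP threshold rule described in this section (120% of the 33rd percentile of reference-family FCSU expenditures, computed separately for owners without mortgages and for all other families) can be sketched as follows. This is an illustration under stated assumptions and not the procedure in the bill or at BLS: the expenditure values are synthetic, observations are unweighted, the percentile uses linear interpolation between order statistics (one of several conventions), and the function and group names are hypothetical.

```python
def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("values must be non-empty")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def map_style_threshold(fcsu, pct=33.0, multiplier=1.2):
    """Multiplier times the chosen percentile of FCSU expenditures."""
    return multiplier * percentile(fcsu, pct)


def thresholds_by_group(fcsu_by_group):
    """Apply the same rule separately to each housing-status group."""
    return {group: map_style_threshold(v) for group, v in fcsu_by_group.items()}


# Synthetic annual FCSU expenditures for reference families (illustrative only).
fcsu_by_group = {
    "owners_without_mortgage": [8000 + 100 * i for i in range(101)],
    "all_other_families": [10000 + 100 * i for i in range(101)],
}

result = thresholds_by_group(fcsu_by_group)
for group, threshold in result.items():
    print(f"{group}: {threshold:.2f}")
```

In practice the calculation would use survey weights and several years of CE data expressed in constant dollars; those steps are omitted here to keep the sketch short.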

2.3 The interagency technical working group to develop a supplemental poverty measure and early research

In late 2009, the ITWG on Developing a Supplemental Poverty Measure was formed by the OMB Chief Statistician. The Working Group included representatives from BLS, the Census Bureau, Council of Economic Advisers, Department of Commerce, Department of Health and Human Services, and OMB. The group was charged with developing starting points to allow the Census Bureau, working with BLS, to create an SPM. This group met several times per week over 2 months. The group decided that the recommendations included in the NAS Panel’s report, Measuring Poverty, would provide the framework to define the SPM. Previous research on the NAS measures and research conducted specifically for this working group also informed the working group’s deliberations. By early 2010, the working group produced a document with “observations about how to make a series of initial choices in the development of the SPM” (ITWG, 2010). These observations reflected discussions and recommendations of the entire working group. However, when there was no consensus, decisions were made by the Under Secretary for Economic Affairs in the US Department of Commerce (Rebecca Blank) and the Chief Statistician, OMB (Katherine Wallman). The ITWG suggestions were published in the Federal Register, and the Census Bureau and BLS reviewed comments from the public.⁴

The ITWG was more focused on creating a measure of poverty that estimated the impact of policies and programs on reducing poverty and less focused on creating a broader measure of well-being. The SPM was designed to provide a statistical measure of poverty, more of a post-tax and post-transfer measure. It would not be used to estimate eligibility for government programs. As guiding principles, “the Working Group placed value on: consistency between threshold and resource definitions; data availability; simplicity in estimation; stability of the measure over time; and ease in explaining the methodology” (ITWG, 2010, pp. 2–3). These priorities were similar to those of the NAS Panel. As noted by Blank (2011), proposing the creation of the SPM was a major step forward by the administration at that time in pushing for poverty statistics that better reflected the circumstances of individuals and families in the United States. Although not included in the 2011 federal budget, future budgets did include proposals for funding.
As noted earlier, at the time of this publication, the Census Bureau’s SPM work is specifically funded, while that of the BLS is funded as general research. ITWG initial choices in the development of the SPM included the following (ITWG, 2010):

SPM thresholds should:
• Use a reference sample that includes all family units with exactly two children.
• Include in the definition of “family unit” all related individuals who live at the same address and any coresident unrelated children cared for by the family, and any cohabitors and their children.
• Use a sample based on the most recent 5 years of available data on equivalized expenditures for the reference sample.

4. Federal Register notice (Vol. 75, No. 101, p. 29513) was issued on May 26, 2010, soliciting public comments regarding specific methods and data sources in developing the SPM (https://www.govinfo.gov/content/pkg/FR-2010-05-26/pdf/2010-12628.pdf).


• From the distribution of equivalized FCSU expenditures within the reference sample, select the dollar amount at the 33rd percentile of the distribution. The NAS recommends taking a range: the 33rd percentile is at the center of the range. Shelter expenditures should include all mortgage expenses.
• So far as possible with available data, the calculation of FCSU should include any in-kind benefits that are counted on the resource side for FCSU. This is necessary for consistency of the threshold and resource definitions.
• Adjust the thresholds for housing status, distinguishing renters, owners with a mortgage, and owners without a mortgage.
• To allow for basic expenditures outside of FCSU, multiply the estimated amounts on spending for FCSU by 1.2. The multiplier would allow for other expenditures that families must make.
• To define a threshold for families of different sizes, use the so-called “three-parameter equivalence scale” to adjust the reference thresholds for the number of adults and children in a family.
• Adjust the thresholds for price differences across geographic areas.

Family resources should:
• Include cash income, plus any Federal Government in-kind benefits that families can use to meet their FCSU needs, minus taxes (or plus tax credits), minus work expenses, minus out-of-pocket expenditures for medical expenses.
• Include all related individuals who live at the same address, any coresident unrelated children who are cared for by the family (such as foster children), plus cohabitors and their children. This is consistent with the reference sample for the thresholds.
• Subtract income taxes and payments for child support from income.
• Subtract from income work expenses that include both standard expenses associated with commuting and childcare.
  - For childcare expenses, the adjustment would be based on actual reported expenses.
  - For other work expenses, investigate the comparative advantages and disadvantages of trying to measure actual expenses versus assigning an average amount to all working adults.
  - Cap the level of total work expenses subtracted from any family’s resources by the earning level of the lowest-earning adult.
• Subtract MOOP expenses from income.

The SPM should be updated over time by:
• Calculating resources each year as new data are released on the income available to families in the most recent year.
• Updating techniques that impute the value of family unit resources, such as estimation of in-kind benefits, work expenses, taxes, etc. as often as possible. The measure should change smoothly, and this requires regular updating of as many components as possible.


• Recalculating thresholds each year by adding in the latest year of available data and dropping the oldest year of data, so that the thresholds are always based on the latest 5 years of expenditure data. One reason to utilize 5 years of data to calculate the thresholds is to reduce the risk that they might change significantly from year to year.

Is the SPM a relative measure?

Questions have arisen regarding whether the SPM is a relative measure, given that the procedure for updating the thresholds is what one might consider “quasi-relative” with respect to FCSU expenditures (for example, see Johnson and Smeeding, 2012; Moskowitz et al., 2010). By design, the SPM (and NAS) thresholds rise (or fall) in real terms as the general standard of living rises (or falls), unlike price-adjusted thresholds, but they do not rise (or fall) as rapidly as total (real) consumption or income. Paraphrasing Blank (2011), “expenditures can rise due to price increases or because overall incomes increase but over the long term, spending on necessities tends to rise more slowly than income. An advantage of updating the thresholds to reflect changes in spending results in thresholds that also reflect changes in American lifestyles. And thus, poverty and deprivation are related to overall social needs.”

Quasi-relative thresholds could change due to changes in living standards or income. Of particular concern is that, like any other relative measure, in a deep recession, SPM poverty thresholds might fall and result in declining poverty rates during a time of increasing economic hardship. The SPM attempts to mitigate this problem by using a five-year average and looking at changes in spending on FCSU rather than changes in overall consumption expenditures or income. The NAS Panel acknowledged that the updating method, using several years of FCSU spending data, could exacerbate the problem when there are downturns in the economy that last one or two years, but defended its position, noting that this updating method better represents levels of living than an absolute one based on standards set in the distant past. In their report, the NAS stated, “Thresholds that rise [or fall] in real terms will not necessarily result in a larger number or proportion of poor people compared to thresholds that are simply adjusted for price change” (Citro and Michael, 1995, p. 329). Thus, it is important that research be conducted to try to disentangle the impact of such changes.

Research examines how such changes might impact the thresholds. For example, how the thresholds might change during a recession was the focus of research by Garner and Gudrais (2012). Their work examined expenditure data that included the recession that began in December 2007 and ended in June 2009 (as defined by the NBER). In their study, these researchers examined quarterly FCSU spending over the 2005Q2-2010Q1 period, representing the five years of data that underlie the 2009 SPM thresholds. They found that quarterly FCSU spending was fairly constant within the 30th-36th percentile range of FCSU spending for the estimation sample upon which the thresholds are based. These results suggest that consumers at the lower end of the FCSU spending distribution might have little opportunity to change their spending or to substitute to lower quality or quantities of goods and services, at least in the short run.

More recently, ongoing research looks at the potential impact on the thresholds of a change in income. Using FCSU spending and after-tax income for 2007Q2-2012Q1, Garner estimated the income elasticity of FCSU spending to be 0.328 at the 33rd percentile for consumer units with two children (the SPM threshold estimation sample and distribution point for the threshold). Given a hypothetical 10% increase in after-tax income in 2011 (that could result from a decrease in income taxes), the FCSU threshold for 2011 would have increased by only 0.656%, since it is based on five years of expenditure data. As in the Garner and Gudrais (2012) study, this result suggests that changes in income, at least in the short run, have small effects on thresholds.

Another way to think about the possible relativity of the SPM is to estimate the income elasticity of the threshold. If the measure is relative, the income elasticity would equal one (see Kilpatrick, 1973 for a discussion). The income elasticity of the official poverty threshold is zero; this is because, in constant dollars, the official threshold does not change over time regardless of how income changes. To provide insight regarding this issue for the SPM, we estimated the resource elasticity of the SPM thresholds over the 2005-2016 time period. As noted earlier, there is no one SPM threshold for a year but three, one for each housing tenure group. So, for this exercise, for each year, we calculated a weighted average overall SPM threshold using the published housing tenure-specific thresholds and the CU housing tenure distributions (BLS, 2018); these are not geographically adjusted. Median resources are derived from the relative thresholds produced by Fox (2017a, Table 14.1). The thresholds and resources were for units with exactly two children, equivalized using a three-parameter equivalence scale to represent thresholds and resources for units with two children and two adults. Weighted average overall SPM thresholds and median resources were converted to constant dollars using the All Items CPI-U. These thresholds were regressed on resources. The resource elasticity of SPM thresholds over the 2005-2016 time period is 0.215. This means that for every 1.0% increase in median resources, SPM thresholds rose by 0.215%. For 2009-2013, the resource elasticity is 0.33, but for 2013-2016 it is only 0.025.

Referring to the NAS report, we note that the SPM is not a “fully relative measure, such as one-half median income or expenditures, that would update the thresholds for changes in [total] income or total consumption” and total consumption would include “luxuries as well as basic goods and services” (Citro and Michael, 1995, pp. 42-43). We refer to Fox (2017a) for a comparison and discussion of the relationship of SPM thresholds, relative thresholds, and anchor SPM thresholds and resulting poverty statistics.
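The arithmetic behind the two elasticity results in this sidebar can be illustrated with a minimal Python sketch. It rests on two assumptions that the text does not state: that a one-year spending change enters the threshold with an equal weight of one-fifth because the threshold averages five years of data, and that an elasticity can be read as the slope of a log-log least-squares fit. The function names and the synthetic series are hypothetical and are not the authors' code or data.

```python
import math


def threshold_pass_through(income_change_pct, elasticity, years_in_average=5):
    """Approximate % change in a threshold built from a multi-year average
    when income changes in only one of those years (equal weights assumed)."""
    one_year_spending_change = elasticity * income_change_pct
    return one_year_spending_change / years_in_average


def log_log_slope(x, y):
    """Least-squares slope of ln(y) on ln(x), interpreted as an elasticity."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var


# Figures quoted in the sidebar: elasticity 0.328 and a hypothetical 10% income rise.
print(round(threshold_pass_through(10.0, 0.328), 3))

# Synthetic (made-up) series built so that thresholds follow resources ** 0.215.
resources = [50000.0, 51000.0, 52500.0, 54000.0]
thresholds = [2400.0 * r ** 0.215 for r in resources]
print(round(log_log_slope(resources, thresholds), 3))
```

Under the equal-weight assumption, the first call reproduces the 0.656% figure quoted above (0.328 × 10% = 3.28%, spread over five years). The second call only shows that a log-log fit recovers the elasticity built into the synthetic series; it says nothing about the published estimates.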

• Recomputing adjustment factors used in the thresholds to calculate differences by housing status and for interarea price differences regularly. These factors should also be based on multiple years of data so that they change more smoothly from year to year.
• Weighing changes to the SPM against the effect on historical consistency. Consistency over time is a valuable characteristic so that, after an initial experimental period, any definitional changes to the measure should be weighed against the effect on historical consistency.

The ITWG emphasized that the SPM should be considered an experimental measure with the expectation that improvements would be introduced over time as new data, methods, and further research become available. The ITWG recommended that changes in the SPM measure should be decided upon in a process led by research methodologists and statisticians within the Census Bureau in consultation with BLS and other appropriate data agencies and outside experts, and that these changes would be based on solid analytical evidence.

Like the NAS measure, and unlike the official measure of poverty, the SPM is designed to assess needs as defined by a limited set of goods and services compared to available resources to meet those needs. Thresholds are to be based on the spending for FCSU and other basics such as household supplies, personal care, and nonwork-related transportation; thus, SPM thresholds are not designed to represent all goods and services. Also, only resources available to meet threshold needs are to be counted; resources are to be defined as after taxes, after work-related expenses, and after medical care needs are met. Table 14.1 includes the major characteristics of the SPM as compared to the official and the NAS measures of poverty.

Research began in earnest on the SPM in the spring of 2010 with results being presented shortly thereafter. The first attempt to produce SPM thresholds, resources, and poverty statistics was presented by Garner and Short (2010a) at the International Association for Research on Income and Wealth (IARIW); thresholds, resources, and poverty statistics were produced for 2008. In 2010, the CPS ASEC added questions on childcare expenditures, MOOP expenditures, and tenure status that were needed to implement the ITWG suggestions. There was additional research and analysis by Bavier (2010a,b), Garner (2010), Johnson (2010), and Short and Renwick (2010) presented at various professional economics, policy, and statistics conferences. The Congressional Research Service (Gabe, 2010) produced an excellent review of poverty measurement in the United States, including the SPM. The first Census Bureau report with SPM results based on the ITWG observations was published in the fall of 2011 with poverty statistics for 2010 (Short, 2011). In 2011, the University of Kentucky Center for Poverty Research sponsored a research forum on the “Cost of Living and the SPM.”5

5. A summary of the forum and the submitted working papers can be found at www.ukcpr.org/sites/www.ukcpr.org/files/Supplemental_poverty_measures.pdf.


TABLE 14.1 Poverty measure concepts operationalized: Official, NAS, and SPM.

Purpose
  Official poverty measure: Official poverty
  NAS: Replace official
  SPM: Supplement official

Defined where
  Official poverty measure: OMB Statistical Policy Directive No. 14
  NAS: Census Bureau in cooperation with BLS
  SPM: Census Bureau in cooperation with BLS

Estimation and reference sample for thresholds
  Official poverty measure: All families and unrelated individuals
  NAS: Families with two adults and two children
  SPM: Estimation sample: consumer units with two children. Reference sample: consumer units with two adults and two children

Poverty thresholds
  Official poverty measure: Three times the cost of a minimum food diet in 1963
  NAS: Based on 78%-83% of median expenditures on food, clothing, shelter, and utilities (FCSU) for families
  SPM: Based on average of 30th-36th percentile of expenditures on FCSU for reference sample

Accounting for shelter needs in thresholds
  Official poverty measure: Assumed to be in nonfood part of threshold
  NAS: Assumed to be accounted for in average across housing tenure groups; does not include mortgage principal payment in original NAS
  SPM: Accounted for in separate thresholds for owners with mortgages, owners without mortgages, and renters; includes mortgage principal payments

Threshold adjustments
  Official poverty measure: Vary by family size, composition, and age of householder
  NAS: Vary by family size and composition using two-parameter equivalence scale, as well as geographic adjustments for differences in housing costs
  SPM: Vary by resource unit size and composition using three-parameter equivalence scale, as well as geographic adjustments for differences in housing costs by tenure

Updating thresholds
  Official poverty measure: Consumer Price Index: all items
  NAS: Three-year moving average of expenditures on FCSU
  SPM: Five-year moving average of expenditures on FCSU

Poverty measurement units
  Official poverty measure: Families (individuals related by birth, marriage, or adoption) or unrelated individuals
  NAS: Same as the official measure
  SPM: Resource units (official family definition plus any coresident unrelated children, foster children, and unmarried partners and their relatives) or unrelated individuals (who are not otherwise included in the family definition)

Resource measure
  Official poverty measure: Gross before-tax cash income
  NAS: Sum of cash income plus noncash benefits that resource units can use to meet their FCSU needs, minus taxes (or plus tax credits), minus work expenses, medical expenses, and child support paid to another household
  SPM: Same as NAS

BLS, Bureau of Labor Statistics; NAS, National Academy of Sciences; SPM, Supplemental Poverty Measure.


3. Construction of the SPM thresholds

Consistent with the NAS Panel recommendations (Citro and Michael, 1995) and the suggestions of the ITWG (2010), the SPM thresholds are based on out-of-pocket spending on a basic set of goods and services that includes FCSU,6 and a small additional amount to allow for other needs as noted earlier. SPM thresholds produced by the BLS Division of Price and Index Number Research (DPINR) using 5 years of quarterly Consumer Expenditure (CE) Survey Interview data for consumer units with exactly two children are known as Research Experimental SPM thresholds.7 Expenditures are updated to annual threshold year dollars using the Consumer Price Index for All Urban Consumers-U.S. City Average (CPI-U). The thresholds are based on a range of FCSU expenditures centered on the 33rd percentile. Equivalence scales are used to convert the estimation sample FCSU expenditures to those of reference consumer units composed of two adults and two children.

6. The components of FCSU are defined here. Food expenditures are those for food at home and food away from home. Meals as pay are not counted nor are alcoholic beverages. Food expenditures are not expected to be exact but are collected through the use of a global question and refer to “usual weekly” expenditures. Clothing expenditures include those for all the goods and services identified as “apparel” by the CE Division of the BLS. Apparel includes clothing for girls and boys aged 2 to 15 years, women and men aged 16 years and above, and for children less than 2 years of age. This category also includes footwear and other apparel products and services such as jewelry, shoe repair, apparel laundry and dry cleaning, and clothing storage. Shelter includes expenses for owners and for renters. To create the shelter variable for the SPM thresholds calculation, shelter expenses are restricted to those for the consumer unit’s primary residence only. For renters, expenditures include those for rent paid, maintenance and repairs paid for by the renter, and tenants’ insurance. Rent as pay is not included, although no information on this rent is collected in the CPS for resources. For owners, shelter expenses include those for property taxes and insurance, maintenance and repairs, and, for those with mortgages, mortgage interest and principal payments. As for renters, all expenditures are restricted to those for the CU’s primary residence. Unlike for the expenses of renters and owners without mortgages, mortgage shelter expenditures reflect obligations, not necessarily what the consumer unit paid. The CE Survey collects information about the terms of the mortgage or mortgages on the primary residence. Then staff members at the BLS who work with the CE data calculate the obligated payments. If property taxes and insurance are included in the mortgage payment, these too are calculated by these staff members for the consumer unit. Utility expenditures are those for energy including natural gas, electricity, fuel oil, and other fuels; telephone services including land lines, cell service, and phone cards; and water and other public services such as trash and garbage collection, and septic tank cleaning. For owners, these are for the primary residence only. For renters, these are for any utilities for which they are obligated to pay with the exception of those for rented vacation homes. The amount recorded by the respondent is for what is charged or billed, not what the consumer unit necessarily pays. The exception regarding questioning for utilities is for telephone cards; consumer units are asked about the purchase price of prepaid telephone and cellular cards and their spending for using public telephones.

7. See https://stats.bls.gov/cex/ for information on the CE.


3.1 Equivalence scale

The equivalence scale used is one proposed by Betson (1996); parameters allow for the differing needs of adults and children and for economies of scale for consumption within the consumer unit. A distinguishing feature of the three-parameter equivalence scale is the adjustment for single parents; no adjustment for single parents was included in the two-parameter scale proposed by the NAS Panel. The three-parameter equivalence scale has been used in the production of the NAS (e.g., Garner and Short, 2010a,b) and SPM thresholds (for an early study, see Garner et al., 2011). Before estimating the SPM thresholds, the three-parameter equivalence scale is applied to expenditures made by consumer units with two children to convert them to expenditures for two adults with two children. The three-parameter equivalence scale, as used for the estimation of the SPM thresholds, is presented below.

$$\text{Single adults with children scale} = \left(1 + a + b\,(K - 1)\right)^{f}$$

$$\text{Multiple adults with children scale} = \left(A + b\,K\right)^{f}$$

where
a = parameter to account for the needs of the first child in a single-parent consumer unit (CU),
b = parameter to account for the needs of additional children in a single-parent CU or all children in multiple adult CUs,
f = parameter to account for economies of scale within the consumer unit,
A = number of adults within the consumer unit, and
K = number of children within the consumer unit.

The parameters a, b, and f were estimated by Betson to fit the cost of children literature. When rounded from Betson’s estimation, the parameters equaled 0.8, 0.5, and 0.7, respectively. To produce thresholds for resource units other than the reference unit composed of two adults and two children, the three-parameter equivalence scale is also used. However, when no children are present the following scales are applied:

$$\text{One adult scale} = 1$$

$$\text{Two adults scale} = 1.41$$

$$\text{With} > 2 \text{ adults without children scale} = A^{f}$$
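The scale defined above can be written as a short function. The following Python sketch uses the rounded parameter values given in the text (0.8, 0.5, and 0.7). The function names are hypothetical, and the helper that converts an amount to its two-adult-two-child equivalent assumes the usual ratio-of-scales approach, which the text implies but does not spell out.

```python
def equivalence_scale(adults, children, a=0.8, b=0.5, f=0.7):
    """Three-parameter equivalence scale with the rounded parameters from the text."""
    if adults < 1:
        raise ValueError("this sketch assumes at least one adult in the unit")
    if children == 0:
        # Special cases applied when no children are present.
        if adults == 1:
            return 1.0
        if adults == 2:
            return 1.41
        return adults ** f
    if adults == 1:
        # Single adult with children: the first child gets weight a, others weight b.
        return (1 + a + b * (children - 1)) ** f
    # Multiple adults with children: every child gets weight b.
    return (adults + b * children) ** f


def to_reference_unit(amount, adults, children):
    """Convert an amount for a given unit to its two-adult-two-child equivalent
    (assumes conversion by the ratio of the two scales)."""
    return amount * equivalence_scale(2, 2) / equivalence_scale(adults, children)


# Example: FCSU spending of a single parent with two children (made-up amount),
# expressed in two-adult-two-child terms.
print(round(to_reference_unit(15000.0, adults=1, children=2), 2))
```

Dividing by the unit's own scale and multiplying by the reference scale keeps the reference unit's amounts unchanged, which is a quick consistency check on any implementation.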

3.2 Threshold estimation

The range in the FCSU expenditure distribution that serves as the basis of the thresholds is the 30th-36th percentile. Equivalized FCSU expenditures are ranked from lowest to highest for all consumer units with two children. Within the 30th-36th percentile range, means of expenditures for FCSU and shelter plus utilities for each housing tenure group are estimated. These means are used to produce separate two-adult-two-child SPM thresholds for owners with mortgages, owners without mortgages, and renters. The equation used for estimation is presented below.

$$\text{SPM Threshold}_{Eh} = 1.2 \times \text{FCSU}_{E} - (S + U)_{E} + (S + U)_{Eh}$$

where
1.2 = multiplier used to account for expenditures for other basic goods and services like those for household supplies, personal care, and nonwork-related transportation.
FCSU, S, and U refer to the means of the sum of expenditures for food, clothing, shelter and utilities, and the shelter and utilities portions of FCSU, respectively, within the 30th to 36th percentile range of FCSU expenditures.
E refers to consumer units in the estimation sample within the 30th to 36th percentile range of FCSU equivalized expenditures.
h refers to one of the three housing tenure groups: owners with mortgages, owners without mortgages, or renters.

Experimental SPM thresholds for the reference consumer unit for 2009-17 are presented below.8 Also shown are standard errors and weighted distributions of consumer units by housing tenure each year. For use in poverty measurement, two-adult-two-child SPM thresholds require two additional adjustments: one to account for the needs of differing numbers of adults and children, and a second to account for differences in prices across areas. The same equivalence scale used to produce the initial set of thresholds is used to produce the expanded set of thresholds. The adjustment for differences in the prices across areas is described next.
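The threshold equation can be illustrated with a minimal Python sketch. The means used here are made-up round numbers, not BLS estimates, and the function and variable names are hypothetical. The sketch takes the percentile-band means as given and does not reproduce the ranking or weighting of the CE data.

```python
def spm_threshold(fcsu_mean, su_mean, su_mean_tenure, multiplier=1.2):
    """Tenure-specific threshold: multiplier * FCSU_E - (S+U)_E + (S+U)_Eh."""
    return multiplier * fcsu_mean - su_mean + su_mean_tenure


# Hypothetical means for units in the 30th-36th percentile band (not published figures).
fcsu_mean = 20000.0   # mean FCSU spending across all tenure groups
su_mean = 10000.0     # mean shelter plus utilities across all tenure groups
su_mean_by_tenure = {
    "owners_with_mortgages": 11000.0,
    "owners_without_mortgages": 7000.0,
    "renters": 10500.0,
}

thresholds = {
    tenure: spm_threshold(fcsu_mean, su_mean, su)
    for tenure, su in su_mean_by_tenure.items()
}
for tenure, value in thresholds.items():
    print(tenure, round(value, 2))
```

The equation swaps the overall shelter-plus-utilities mean for the tenure-specific mean, so the three thresholds differ only by the gap between each group's housing costs and the overall average.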

Two-adult-two-child BLS-DPINR research experimental supplemental poverty measure (SPM) thresholds (a)

Year                           2009     2010     2011     2012     2013     2014     2015     2016     2017
Owners with mortgages       $24,450  $25,018  $25,703  $25,784  $25,639  $25,844  $25,930  $26,336  $27,085
  Standard error               $242     $323     $347     $368     $289     $345     $297     $280     $276
  Share of sample             0.489    0.486    0.459    0.439    0.438    0.415    0.371    0.382    0.382
Owners without mortgages    $20,298  $20,590  $21,175  $21,400  $21,397  $21,380  $21,806  $22,298  $23,261
  Standard error               $335     $341     $298     $233     $337     $470     $417     $390     $471
  Share of sample             0.084    0.093    0.110    0.120    0.115    0.108    0.119    0.129    0.113
Renters                     $23,874  $24,391  $25,222  $25,105  $25,144  $25,460  $25,583  $26,104  $27,005
  Standard error               $345     $379     $378     $398     $400     $363     $282     $302     $263
  Share of sample             0.426    0.421    0.431    0.442    0.447    0.476    0.510    0.489    0.505

a. Based on out-of-pocket expenditures for food, clothing, shelter, and utilities. Shelter expenditures include those for mortgage principal payments. Food expenditures are assumed to implicitly include spending using SNAP benefits.

8. See https://www.bls.gov/pir/spmhome.htm for standard errors and expenditure shares by FCSU components.

3.3 Geographic adjustments

The American Community Survey (ACS) is used to adjust the FCSU thresholds for differences in prices across geographic areas. The geographic adjustments are based on five-year ACS estimates of median gross rents for two-bedroom units with complete kitchen and plumbing facilities. Separate medians are estimated for each of 260 metropolitan statistical areas large enough to be identified on the public-use version of the CPS ASEC file. For each state, a median is estimated for each nonmetropolitan area (47) and for a combination of all smaller metropolitan areas within a state (42). This results in 349 adjustment factors. For details, see Renwick (2011).9 Each year the Census Bureau publishes the adjusted thresholds on its SPM website.10

4. Resources

SPM resources include cash and noncash benefits with adjustments for reductions in resources due to meeting needs not accounted for in the thresholds. The primary source of data for the production of SPM resources is the Current Population Survey Annual Social and Economic Supplement (referred to henceforth in this chapter as the CPS). The noncash benefits added and the expenses subtracted are described next. While the original aim was to consistently include items in both the resource measure and the threshold, many of the noncash benefits listed below, with the exception of the Supplemental Nutrition Assistance Program (SNAP), are not included in the threshold as noted earlier.

5. Additions to income: noncash benefits

5.1 Supplemental Nutrition Assistance Program

SNAP benefits (formerly known as food stamps) are designed to allow eligible low-income households to afford a nutritionally adequate diet. In the CPS, respondents report if anyone in the household ever received SNAP benefits in the previous calendar year and, if so, the face value of those benefits. The annual household amount is prorated to SPM Resource Units within each household.

5.2 National School Lunch Program

This program offers children free school lunches if family income is below 130% of federal poverty guidelines, reduced-price school meals if family income is between 130% and 185% of the federal poverty guidelines, and a subsidized school meal for all other children.11 In the CPS, the reference person is asked how many children “usually” ate a complete lunch at school, and if it was a free or reduced-price school lunch. Since we have no further information, the value of school meals is based on the assumption that the children received the lunches every day during the last school year, which may overestimate the benefits received by each family. To value benefits, we obtain amounts on the cost per lunch from the Department of Agriculture Food and Nutrition Service, which administers the school lunch program. At the present time, there is no value included for government-provided school breakfasts or snacks.

9. Renwick et al. (2017) examined an alternative method of calculation for the geographic indexes using Regional Price Parities from the Bureau of Economic Analysis.

10. https://www.census.gov/data/tables/2017/demo/supplemental-poverty-measure/poverty-thresholds.html.

11. The poverty guidelines are issued each year by the Department of Health and Human Services (HHS). The guidelines are a simplified version of Census’s poverty thresholds used for administrative purposes, for instance, determining financial eligibility for certain federal programs. For more details and guidelines, see https://aspe.hhs.gov/poverty-guidelines.

5.3 Special Supplemental Nutrition Program for Women, Infants, and Children

This program is designed to provide food assistance and nutritional screening to low-income pregnant and postpartum women and their infants, and to low-income children up to the age of 5 years. Incomes must be at or below 185% of the poverty guidelines, and participants must be nutritionally at-risk (having abnormal nutritional conditions, nutrition-related medical conditions, or dietary deficiencies). The CPS asks whether anyone in the household received benefits from the Women, Infants, and Children Program (WIC) in the previous year. Lacking additional information, we assume 12 months of participation and value the benefit using program information obtained from the Department of Agriculture. As with school lunch, assuming yearlong participation may overestimate the value of WIC benefits received by a given SPM family. In these estimates, we assign WIC receipt to all children less than 5 years of age in a household where someone reports receiving WIC. If the child is aged 0 or 1 year, then we assume that the mother also gets WIC. If there is no child in the family, but the household reference person said “yes” to the WIC question, we assume there is a pregnant woman receiving WIC.
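The assignment rules in this section can be expressed as a small Python sketch. Several details are assumptions made for illustration only: the same monthly benefit value is used for every recipient (the text says only that values come from Department of Agriculture program information), at most one mother is added per family, and “no child in the family” is read as no child under 5. The function names are hypothetical.

```python
def assumed_wic_recipients(ages, household_reports_wic):
    """Number of people assigned WIC under the rules described in the text.

    ages: ages in years of everyone in the SPM family.
    """
    if not household_reports_wic:
        return 0
    children_under_5 = sum(1 for age in ages if age < 5)
    if children_under_5 == 0:
        # No eligible child but WIC reported: assume one pregnant woman.
        return 1
    # A child aged 0 or 1 implies the mother is also assigned WIC.
    mother = 1 if any(age <= 1 for age in ages) else 0
    return children_under_5 + mother


def annual_wic_value(ages, household_reports_wic, monthly_value_per_recipient):
    """Annual value assuming 12 months of participation for every recipient."""
    recipients = assumed_wic_recipients(ages, household_reports_wic)
    return 12 * monthly_value_per_recipient * recipients


# Example with a made-up monthly value of $40 per recipient.
print(annual_wic_value([0, 3, 30], True, 40.0))
```

Because participation is assumed to last the full year, the result is best read as an upper-end estimate, consistent with the caveat in the text.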

5.4 Low-Income Home Energy Assistance Program

This program provides three types of energy assistance. Under this program, states may help pay heating or cooling bills, provide allotments for low-cost weatherization, or provide assistance during energy-related emergencies. States determine eligibility and can provide assistance in various ways, including cash payments, vendor payments, two-party checks, vouchers/coupons, and payments directly to landlords. In the CPS ASEC, the question on energy assistance asks for information about the entire previous year. Many households receive both a “regular” benefit and one or more crisis or emergency benefits. Since Low-Income Home Energy Assistance Program (LIHEAP) payments are often made directly to a utility company or fuel oil vendor, many households may have difficulty reporting the precise amount of the LIHEAP payment made on their behalf.

5.5 Housing assistance

Households can receive housing assistance from a plethora of federal, state, and local programs. Federal housing assistance consists of a number of programs administered primarily by the US Department of Housing and Urban Development (HUD). These programs traditionally take the form of rental subsidies and mortgage interest subsidies targeted to very-low-income renters and are either project-based (public housing) or tenant-based (vouchers). The value of housing subsidies is estimated as the difference between the “market rent” for the housing unit and the total tenant payment. The “market rent” for the household is estimated using a statistical match with HUD administrative data. For each household identified in the CPS ASEC as receiving help with rent or living in public housing, an attempt is made to match on state, core-based statistical area, and household size. The total tenant payment is estimated using the total income reported by the household on the CPS ASEC and HUD program rules. Generally, participants in either public housing or tenant-based subsidy programs administered by HUD are expected to contribute the greater of one-third of their “adjusted” income or 10% of their gross income toward housing costs.12 See Johnson et al. (2010), for more details on this method. Initially, subsidies are estimated at the household level and prorated based on the number of people in the SPM family relative to the total number of people in the household if there is more than one SPM unit in a household. Housing subsidies are capped at the housing portion of the appropriate threshold minus the total tenant payment.
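The valuation steps described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the Census Bureau's implementation: all amounts are treated as annual, the cap is applied after proration, negative values are floored at zero, and the market rent and adjusted income are taken as given rather than derived from the HUD match and program rules. The function and parameter names are hypothetical.

```python
def housing_subsidy(market_rent, adjusted_income, gross_income,
                    housing_portion_of_threshold, unit_persons, household_persons):
    """Estimated annual housing subsidy for one SPM unit (illustrative sketch)."""
    # Tenant payment: the greater of one-third of adjusted income or 10% of gross income.
    tenant_payment = max(adjusted_income / 3.0, 0.10 * gross_income)
    # Household-level subsidy: market rent minus what the tenant is expected to pay.
    household_subsidy = max(0.0, market_rent - tenant_payment)
    # Prorate by the SPM unit's share of people in the household.
    unit_subsidy = household_subsidy * unit_persons / household_persons
    # Cap at the housing portion of the threshold minus the tenant payment.
    cap = max(0.0, housing_portion_of_threshold - tenant_payment)
    return min(unit_subsidy, cap)


# Made-up example: the cap binds for a household that is a single SPM unit.
print(housing_subsidy(10000.0, 12000.0, 15000.0, 9000.0, 4, 4))
```

The cap keeps the subsidy from exceeding the shelter need recognized in the threshold, so a unit in an expensive rental market is not credited with more housing resources than the threshold allows for.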

6. Subtractions from resources: necessary expenses 6.1 Taxes The NAS panel and the ITWG recommended that the calculation of family resources for poverty measurement should subtract necessary expenses that must be paid by the family. The measure subtracts federal, state, and local income taxes and Social Security payroll taxes (FICA) before assessing the ability of a family to obtain basic necessities, such as FCSU. Taking account of taxes allows us to account for receipt of the federal or state EITC and other tax credits. The CPS ASEC does not collect information on taxes paid but relies on a tax calculator to simulate taxes paid. These simulations include federal and state income taxes and FICA taxes.13 These simulations also use a statistical match to the IRS Statistics of Income public-use microdata file of tax returns.

12. HUD regulations define “adjusted household income” as cash income, excluding income from certain sources minus numerous deductions. Three of the income exclusions can be identified from the CPS ASEC: income from the employment of children, student financial assistance, and earnings in excess of $480 for each full-time student aged 18 years or above. Deductions that can be modeled from the CPS ASEC include $480 for each dependent, $400 for any elderly or disabled family member, childcare, and medical expenses.
13. Wheaton and Stevens (2016) compare the Census tax calculator to TAXSIM and the Bakija tax model and find consistency in tax estimates across the models.

6.2 Work-related expenses The NAS panel recommended deducting necessary work expenses, including both standard expenses associated with commuting and childcare necessary to enable a parent to work. Going to work and earning a wage often entails incurring expenses, such as travel to work and the purchase of uniforms or tools. For work-related expenses (other than childcare), the NAS Panel recommended subtracting a fixed weekly amount for each earner 18 years or over, set at 85% of median weekly expenses, as estimated from the SIPP. The number of weeks worked, reported in the CPS ASEC, is multiplied by this fixed weekly value for each person to arrive at annual work-related expenses. Edwards et al. (2014) examined an alternative method of valuing work-related expenses using the ACS. Another important part of work-related expenses is paying someone to care for children while parents work. To account for childcare expenses while parents worked, in the CPS, parents are asked whether or not they pay for childcare and how much they spent. The amounts paid for any type of childcare while parents are at work are summed over all children. The ITWG, following the recommendations of the NAS report, suggested capping the amount subtracted from income, when combined with other work-related expenses, so that these do not exceed total reported earnings of the lowest earning reference person or spouse/partner of the reference person in the family.14
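The arithmetic in this subsection can be illustrated with a short Python sketch. The names below are hypothetical, the median weekly expense is an invented input rather than a SIPP estimate, and the cap is implemented in the simplest way consistent with the description (combined expenses limited to the earnings of the lowest-earning reference person or spouse/partner).

```python
def other_work_expenses(weeks_worked: int, median_weekly_expense: float) -> float:
    """Annual non-childcare work expenses for one earner: weeks worked times
    a fixed weekly amount set at 85% of median weekly expenses."""
    return weeks_worked * 0.85 * median_weekly_expense


def capped_work_expenses(earner_weeks: list, median_weekly_expense: float,
                         childcare_paid: float, lowest_earnings: float) -> float:
    """Combined work and childcare expenses, capped at the earnings of the
    lowest earner (reference person or spouse/partner)."""
    other = sum(other_work_expenses(w, median_weekly_expense) for w in earner_weeks)
    return min(other + childcare_paid, lowest_earnings)


# Two earners (52 and 26 weeks), hypothetical median weekly expense of 40,
# 3,000 in childcare, lowest earner made 5,000: the cap binds.
print(capped_work_expenses([52, 26], 40.0, 3000.0, 5000.0))
# Same family, but the lowest earner made 20,000: the cap does not bind.
print(capped_work_expenses([52, 26], 40.0, 3000.0, 20000.0))
```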

6.3 Child support paid The NAS Panel recommended that, since child support received from other households is counted as income, child support paid out to those households should be deducted from the resources of those households that paid it. Without this subtraction, all child support is double counted in overall income statistics. Questions ascertaining amounts paid in child support are included in the CPS ASEC, and these reported amounts are subtracted in the estimates presented here.

14. Some analysts have suggested that this cap may be inappropriate in certain cases, such as if the parent is in school, looking for work, or receiving types of compensation other than earnings.

6.4 Medical expenses The ITWG recommended subtracting medical expenses from income, following the NAS Panel. While many individuals and families have health insurance that covers most of the very large expenses, the typical family pays the costs of health insurance premiums, copays, deductibles, prescription drugs, and other small fees out-of-pocket. Questions covering these medical expenses are included in the CPS, but do not include Medicare Part B premiums. The CPS ASEC instrument identifies when a respondent reported Social Security Retirement benefits net of Medicare Part B premiums. For these respondents, a Part B premium set at the standard amount per month is automatically added to income. Corrections for these applied amounts are discussed in Caswell and Short (2011) and applied here. To be consistent with what is added to the Social Security income in these cases, the same amount is added to reported premium expenditures.15 For the remaining respondents that report Medicare status, Medicare Part B premiums are simulated using the rules for income and tax filing status (Medicare.gov).16 Changes were made to the questions about health insurance coverage and medical expenses in the 2014 CPS ASEC. Details about those changes can be found in Janicki (2014).

7. SPM estimation Using the definitions above, researchers at the Census Bureau construct a comprehensive measure of resources, starting with cash income, adding noncash benefits, and subtracting necessary expenses. For each SPM resource unit (see the table of definitions presented earlier), resources are then compared to SPM thresholds that reflect adjustments for SPM resource unit size and composition and for geographic differences in the cost of housing across areas. If SPM resources fall below an SPM resource unit’s SPM threshold, all individuals in that resource unit are considered to be in poverty. If resources exceed the unit’s threshold, all members are considered “above poverty.”
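The comparison described in this section can be written as a minimal Python sketch. The class and field names are hypothetical and the figures are invented; the threshold is taken as already adjusted for unit size, composition, and geography. The text does not say how a unit whose resources exactly equal its threshold is treated, so the strict inequality below is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SPMUnit:
    cash_income: float
    noncash_benefits: float
    necessary_expenses: float
    threshold: float   # already adjusted for size, composition, and geography
    members: int


def spm_resources(unit: SPMUnit) -> float:
    """Cash income plus noncash benefits minus necessary expenses."""
    return unit.cash_income + unit.noncash_benefits - unit.necessary_expenses


def is_poor(unit: SPMUnit) -> bool:
    """All members share the unit's status; strict '<' is an assumption."""
    return spm_resources(unit) < unit.threshold


def poverty_rate(units: list) -> float:
    """Percent of people living in units whose resources fall below threshold."""
    poor = sum(u.members for u in units if is_poor(u))
    total = sum(u.members for u in units)
    return 100.0 * poor / total


units = [
    SPMUnit(20000, 5000, 3000, 25000, 1),   # resources 22,000: below threshold
    SPMUnit(40000, 0, 5000, 27000, 3),      # resources 35,000: above threshold
]
print(poverty_rate(units))
```

Because status is assigned at the unit level, the rate is computed over people rather than units, which is why each unit carries a member count.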

8. SPM poverty rates Fig. 14.1 shows a comparison of the SPM to the official poverty measure from 2009 to 2017 in terms of poverty rates. In 2017, the overall SPM rate was 13.9%. This was 1.6 percentage points higher than the official poverty rate of 12.3%. The SPM has ranged from 0.6 to 1.6 percentage points higher than the official measure since 2009. There are two SPM rates for 2013, one with and one without the implementation of the redesigned income questions in the CPS ASEC. Fig. 14.2 shows the poverty rate using both measures for children below 18 years and adults aged 65 years and over. While the overall SPM poverty rate is higher than the official poverty rate, the SPM shows lower poverty rates for children than the official measure. This finding is not surprising as many antipoverty programs captured by the SPM, but not the official measure, are targeted toward families with children (such as the EITC and SNAP). For the first time since 2010, in 2016 there was a statistically significant increase in SPM poverty for individuals aged 65 years and above. Fox and Pacas (2018) found that this increase was due to an increase in both deep and persistent poverty among people aged 65 years and older.

15. We make the simplifying assumption that respondents were insured by Medicare for the entire year.
16. The family income assumption is based on a rough estimate of eligibility and participation in at least one of the following programs: Qualified Medicare Beneficiary, Specified Low-Income Medicare Beneficiary, or Qualified Individual or Qualified Disabled and Working Individuals. We do not take into account the possibility of (state-specific) asset requirements.

FIGURE 14.1 Poverty rates using the official measure and the supplemental poverty measure (SPM): 2009–17. [Chart of two series, SPM and Official; vertical axis: percent, 0 to 18; horizontal axis: 2009 to 2017; data labels shown: 15.1, 14.5, 13.9, 12.3.] Includes unrelated individuals under the age of 15. Note: The data for 2013 and beyond reflect the implementation of the redesigned income questions. Source: U.S. Census Bureau, Current Population Survey, 2010–2018 Annual Social and Economic Supplements.

FIGURE 14.2 Poverty rates using the official measure and the supplemental poverty measure (SPM) for two age groups: 2009–17. [Chart; vertical axis: percent, 0 to 25; horizontal axis: 2009 to 2017.] Source: U.S. Census Bureau, Current Population Survey, 2010–2018 Annual Social and Economic Supplements.

FIGURE 14.3 Change in number of people in poverty after including each element: 2017 (in millions). [Bar chart with bars split by age group: under 18 years, 18 to 64 years, 65 years and over; horizontal axis: -35 to 15.] Labeled values:
Social Security: -27.0
Refundable tax credits: -8.3
SNAP: -3.4
SSI: -3.2
Housing subsidies: -2.9
Child support received: -1.0
School lunch: -1.2
TANF/general assistance: -0.5
Unemployment insurance: -0.5
Workers' compensation: -0.2
WIC: -0.3
LIHEAP: -0.2
Child support paid: 0.2
Federal income tax: 1.5
FICA: 4.7
Work expenses: 5.6
Medical expenses: 10.9
Note: For information on confidentiality protection, sampling error, nonsampling error, and definitions, see . Source: U.S. Census Bureau, Current Population Survey, 2018 Annual Social and Economic Supplement.

Finally, one of the important contributions of the SPM is that it allows an examination of the potential magnitude of the effect of tax credits and transfers in alleviating poverty. Fig. 14.3 shows the effect that various additions and subtractions had on the number of people who would have been considered poor in 2017, holding all else constant and assuming no behavioral responses. Additions and subtractions are shown for the total population and by three age groups. Social Security transfers and refundable tax credits had the largest impacts, preventing 27.0 million and 8.3 million individuals, respectively, from falling into poverty. Medical expenses were the largest contributor to increasing the number of individuals in poverty, pushing 10.9 million individuals into poverty.
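The "holding all else constant" calculation behind Fig. 14.3 can be sketched as follows. This is a toy illustration, not the Census Bureau's code: the record layout, element names, and amounts are hypothetical. For each element, the poverty count is recomputed with that single element left out of resources, and the reported effect is the actual count minus that counterfactual count, so benefits show up as negative numbers and expenses as positive ones.

```python
def poverty_count(units: list, skip: str = None) -> int:
    """Number of people in units whose resources fall below their threshold,
    optionally leaving one addition or subtraction out of resources."""
    count = 0
    for u in units:
        resources = u["cash"]
        for name, amount in u["additions"].items():
            if name != skip:
                resources += amount
        for name, amount in u["subtractions"].items():
            if name != skip:
                resources -= amount
        if resources < u["threshold"]:
            count += u["members"]
    return count


def element_effect(units: list, element: str) -> int:
    """Change in the number of poor people after including one element,
    with no behavioral response assumed."""
    return poverty_count(units) - poverty_count(units, skip=element)


# Two invented resource units.
units = [
    {"cash": 10000, "additions": {"SNAP": 3000},
     "subtractions": {"medical": 1000}, "threshold": 11000, "members": 2},
    {"cash": 20000, "additions": {"SNAP": 0},
     "subtractions": {"medical": 4000}, "threshold": 18000, "members": 3},
]
print(element_effect(units, "SNAP"))      # negative: SNAP keeps people out of poverty
print(element_effect(units, "medical"))   # positive: medical expenses push people in
```

Because each element is removed one at a time, the individual effects need not sum to the total difference between a cash-only measure and the full measure.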

9. Changes to the SPM since 2011 As noted earlier, the SPM is a work in progress, and improvements to the methods underlying the production of the measure and statistics are to be expected. Research on the estimation of the initial reference unit SPM thresholds is ongoing but has not resulted in changes to the thresholds published on the BLS website. The resource measure, however, has changed: the Census Bureau implemented several minor changes, which are detailed in the 2016 report.17

17. These changes involved the method for valuing WIC benefits, public housing benefits, and how official poverty status was assigned to unrelated children.

10. Extensions of the SPM Since 2011, researchers have extended SPM estimates to other data sets and to earlier historical periods. A workshop sponsored by the Urban Institute in 2011 emphasized the importance of extending SPM estimates to the ACS.18 Work at the Census Bureau has produced SPM estimates from the ACS and the Survey of Income and Program Participation. Outside researchers have created SPM-like estimates from the PSID (Kimberlin). Following the lead of the New York City CEO, Wisconsin, California, and Virginia published their own alternative poverty estimates. The Wisconsin Poverty Measure was developed by the IRP at the University of Wisconsin under the direction of Isaacs and Smeeding (see Isaacs and Smeeding, 2009 and Chung et al., 2012).19,20 The Wisconsin Measure thresholds are NAS-based, but resources are defined following the recommendations of both the NAS and ITWG for the SPM. Other states that have developed SPM-type measures include Connecticut, Georgia, Illinois, Massachusetts, Minnesota, New York, and Oregon. For California (Bohn et al. 2015) and Virginia (Cable, 2013),21 SPM-based poverty measures and statistics have been produced. The California Poverty Measure is jointly produced by the Stanford Center on Poverty and Inequality and the Public Policy Institute of California.22 The New York City CEO introduced an SPM-type measure in 2012 (NYC-CEO, 2012).23 The most recent CEO report was published in 2019 with SPM-based poverty statistics for 2005–17 (NYC-CEO, 2019). In addition to NYC, research has also produced city-level estimates for San Francisco. Many individuals and organizations have assisted in the development of the state and local area measures, most notably researchers from the NYC-CEO as noted previously, the Urban Institute (e.g., Zedlewski et al., 2010), Stanford Center on Poverty and Inequality, and University of Wisconsin IRP. As noted by others, there are many differences in the methodologies across the states and cities, but the resulting poverty measures yield similar findings to the SPM estimates released by the Census Bureau.

18. https://www.urban.org/sites/default/files/publication/27531/412396-workshop-on-state-povertymeasurement-using-the-american-community-survey.pdf.
19. The IRP continues to use a NAS-type measure with the threshold being the one identified as the FCSU from the Census Bureau web page which includes principal payments for mortgage payments. Another alternative includes medical expenses in the thresholds. See Wisconsin Poverty Report: Progress Against Poverty Stalls in 2016, Timothy M. Smeeding and Katherine A. Thornton, IRP, U Wisconsin–Madison, June 2018. https://www.irp.wisc.edu/wp/wpcontent/uploads/2018/06/WI-PovertyReport2018.pdf.
20. See http://www.irp.wisc.edu/research/povmeas/spm.htm for examples of some of this research.
21. The Virginia Poverty Measure is based on the SPM with one major difference in thresholds: medical costs, like those for food and shelter, are included in the thresholds. As noted by Cable (2013), this method has been endorsed by the Wisconsin Institute for Research on Poverty and the Urban Institute and is used in their local-level NAS-style poverty measures.
22. See https://inequality.stanford.edu/publications/research-reports/california-poverty-measure for information on the CPM.
23. Unlike the SPM thresholds produced by the BLS, the NYC CEO does not produce separate thresholds for renters, owners with mortgages, and owners without mortgages. Rather, a housing status adjustment is applied to all households that reside in “nonmarket rate” housing. For information on this, see Appendix B, Deriving a Poverty Threshold for New York City (NYC-CEO 2019a), and Appendix C, Adjustment for Housing Status (NYC-CEO 2019b).

Researchers at Columbia University developed a historical version of the SPM, extending SPM estimates back to 1967 (Fox et al., 2015). This research allows for longer-term analysis of poverty trends and the changing roles of antipoverty programs in the post-welfare-reform era. Wimer et al. (2016) also used these data sets to create an anchored SPM, setting poverty thresholds at 1967 and adjusting for inflation. This analysis allowed for a closer comparison to the official poverty measure, as living standards were held constant over time. This research was featured in a special chapter devoted to the 50th anniversary of the War on Poverty in the 2014 Economic Report of the President.24 A recent working paper by Fox (2017a) explored the impact of anchoring the SPM in 2009 and found that using anchored thresholds results in a smaller poverty rate decline compared to the SPM from 2009 to 2016 (0.5 percentage points vs. 1 percentage point).
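The idea of an anchored threshold mentioned above can be illustrated with a brief Python sketch. It assumes the simplest form of anchoring: the threshold is fixed in a base year and carried to other years using only a price index, so it reflects inflation but not changes in living standards. The function name and the index values are hypothetical, and the cited papers may implement the adjustment differently.

```python
def anchored_threshold(base_threshold: float, index_base_year: float,
                       index_target_year: float) -> float:
    """Carry a base-year threshold to another year using only price change."""
    return base_threshold * index_target_year / index_base_year


# Hypothetical base-year threshold of 10,000 and a price index that rises
# from 100 in the base year to 250 in the target year.
print(anchored_threshold(10000, 100, 250))
```

A threshold re-estimated each year from current expenditure data would move with spending patterns as well as prices, which is why anchored and non-anchored series can show different poverty trends.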

11. Ongoing research on the SPM As noted earlier, the BLS, Census Bureau, and the ITWG consider the SPM a work in progress and expect that there will be improvements to the measure over time. With this in mind, research continues at the BLS and Census Bureau on the SPM and continues to be presented in various venues.25 In addition to the joint research with BLS researchers, Census Bureau research has focused on methods to improve the resource measure and geographic adjustments applied to the SPM thresholds. External researchers have also contributed to the literature, leading to potential improvements in the SPM. At the Census Bureau, research has focused on alternative geographic adjustments (Renwick, 2009a,b; Renwick, 2011; Renwick et al. 2014), improving the estimation of housing subsidies (Renwick and Mitchell, 2015; Kingkade, 2017; Renwick, 2017), refining the treatment of MOOP spending (Caswell and Short, 2011), and alternative methods of estimating work-related expenses (Edwards et al., 2014; Mohanty et al., 2017). Research at the Census Bureau also examines the potential use of administrative records to account for underreporting of noncash benefits (Parker, 2011; Fox et al., 2017; Stevens et al., 2018).

24. https://obamawhitehouse.archives.gov/sites/default/files/docs/full_2014_economic_report_of_the_president.pdf.
25. For example, the UC Davis Conference 2014: https://poverty.ucdavis.edu/event/war-povertyconference and the Stanford University conference in 2016: https://inequality.stanford.edu/programs/conferences/measuring-poverty-21st-century.

At the BLS, several research efforts, some in coordination with colleagues at the Census Bureau, have been ongoing, although they have not been implemented for use in official publications. This research has focused on in-kind benefits, the inclusion of MOOP expenditures in thresholds, the expansion of the estimation sample, alternative equivalence scales, and the role of prices in the estimation of the initial SPM thresholds.

Research to include the value of in-kind benefits in the SPM thresholds has resulted in several studies (e.g., Garner and Gudrais, 2018; Garner et al., 2015, 2016; Garner and Hokayem, 2011, 2012). SNAP is assumed to already be included in food expenditures, as SNAP benefits delivered on a debit card can be used like other legal tender. However, in-kind benefits arising from the following are not currently being counted in the SPM thresholds: rental subsidies (public housing and rental assistance), WIC, National School Lunch Program (NSLP), and LIHEAP. The ITWG (2010) recommended that these benefits be included in the thresholds in order for thresholds and resources to be defined consistently.

Other research has focused on the inclusion of MOOP expenditures in SPM thresholds as opposed to being subtracted from resources. Following the NAS recommendations, MOOP expenditures are to be subtracted from resources for the SPM as well. However, subtracting MOOP expenditures from resources led to much debate early on with the NAS (see Citro and Michael, 1995). In response to demand, particularly by users of the NAS measure, thresholds that include MOOP expenditures, referred to as FCSUM, were estimated and researched (see Banthin et al., 2001). For many years now, due to increasing demand by the user community, the BLS has been producing FCSUM NAS thresholds; these are available on the Census Bureau website. Not surprisingly, with the production of the SPM, calls for including MOOP expenditures in the SPM thresholds also arose. In response, Garner et al. (2015) produced SPM thresholds that included MOOP expenditures; this research was presented at the ASSA 2014 meetings and at a 2015 Brookings Institution seminar. The authors concluded that the underlying SPM methodology, of basing the thresholds around the 33rd percentile of FCSUM expenditures, did not adequately account for medical care needs of consumers at the lower end of the spending distribution, as consumers in this range of expenditures were more likely to have no medical insurance or public insurance. In 2018, Fox and Garner produced preliminary thresholds following the SPM methodology but changed the range in the spending distribution. The idea is that such an approach might better support thresholds that include MOOP expenditures. Specifically, if the SPM thresholds were based on a percentage of the median, rather than around the 33rd percentile, MOOP would be more representative of medical care needs, as consumers in this range are more likely to have private health insurance. Researchers at Baruch College, CUNY, have published research using a health-inclusive version of the SPM to estimate the effects of health insurance on poverty under the Affordable Care Act. Their measure includes the need for health insurance in the thresholds and counts health insurance benefits as resources available to meet that need (Remler et al., 2017).

Along a similar line of research is a focus on the estimation sample upon which the SPM thresholds are based. This was debated by the ITWG in the winter of 2009–10 and continues to be debated. As noted earlier, the SPM estimation sample includes all consumer units with two children and any number of adults; NAS thresholds are based on the experience of consumer units with exactly two adults and two children. This change was made to better reflect differences in the living arrangement of people in the United States since the NAS report was released. Fox and Garner (2018) tested the impact on SPM thresholds and poverty statistics of expanding the estimation samples: first by including consumer units with any number of adults and children, and second by including all consumer units. Other earlier BLS research focused on expanding the number of years of CE data upon which the thresholds would be based. The NAS recommended using 3 years of data while the ITWG recommended using 5 years. Since SPM thresholds were first produced, 5 years of data have been used. The larger sample was expected to “increase the stability of the thresholds and to ensure that they move more slowly from year to year” (ITWG, 2010, p. 4). Yet, even with this sample expansion, the sample sizes underlying the SPM thresholds, particularly those for owners without mortgages, are relatively small.

For the SPM, a three-parameter equivalence scale is used; yet, there is a debate regarding whether this scale adequately accounts for the economies of scale in housing. This issue has most recently been addressed by Renwick and Garner (2017). In producing SPM thresholds and subsequent statistics, it is assumed that all consumer units, regardless of size and composition, devote the same fraction of their thresholds to housing (i.e., shelter and utilities). The implication of this assumption is that the implicit economies of scale for housing are the same as those for the thresholds as a whole. If, on the other hand, one assumes that housing expenditures are subject to greater economies of scale than are food and clothing, it would be reasonable to use a larger percentage to identify the housing portion of the thresholds for smaller families. This would have two consequences for SPM poverty statistics. First, the portion of the SPM thresholds subject to the geographic adjustment would be larger for smaller families, increasing thresholds for those who live in areas with housing costs greater than the national median and decreasing thresholds for those who live in areas with lower housing costs. Second, since the values of housing subsidies in SPM resources are capped at the housing portion of the thresholds, this would increase the value of housing subsidies for some smaller consumer units and could reduce their poverty rates. Renwick and Garner investigated the impact of varying the housing share of the SPM poverty thresholds, first directly, by varying the share of the thresholds for housing, and second indirectly, by applying different equivalence scales by consumer unit size to the housing share of the thresholds only.

Other research at the BLS focuses on the role of prices in the estimation of SPM thresholds. Garner and Munoz Henao (2018) developed quality-adjusted normalized prices using the CE data to test whether spatial differences in shelter and utility prices are embedded in the initial estimation of the two-adult-two-child SPM thresholds. Research suggests that such price differences exist (see Bishop et al., 2017). Garner and Munoz Henao applied these prices to the shelter and utilities component of the thresholds and reported that SPM thresholds are marginally higher for select geographic areas.
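The two consequences of a larger housing share discussed above can be made concrete with a small Python sketch. The formula is an assumed simple form in which only the housing portion of a threshold is scaled by a local housing cost index (1.0 meaning the national median); the function names, shares, and index values are hypothetical and are not taken from Renwick and Garner.

```python
def geo_adjusted_threshold(threshold: float, housing_share: float,
                           housing_cost_index: float) -> float:
    """Scale only the housing portion of the threshold by the local index."""
    housing_part = threshold * housing_share * housing_cost_index
    other_part = threshold * (1.0 - housing_share)
    return housing_part + other_part


def housing_subsidy_cap(threshold: float, housing_share: float,
                        tenant_payment: float) -> float:
    """Cap on subsidy value: housing portion of threshold minus tenant payment."""
    return max(threshold * housing_share - tenant_payment, 0.0)


threshold = 20000.0
for share in (0.5, 0.6):   # hypothetical housing shares for a small family
    high_cost = geo_adjusted_threshold(threshold, share, 1.2)
    low_cost = geo_adjusted_threshold(threshold, share, 0.8)
    cap = housing_subsidy_cap(threshold, share, 3000.0)
    print(share, round(high_cost), round(low_cost), round(cap))
```

Raising the share from 0.5 to 0.6 raises the adjusted threshold in the high-cost area, lowers it in the low-cost area, and raises the subsidy cap, which matches the direction of the effects described in the text.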

12. Future directions for SPM In 2016, OMB convened a new ITWG (SPM ITWG) to provide advice on challenges and opportunities brought before it by the Census Bureau and BLS concerning data sources, estimation, survey production, and processing activities for development, implementation, publication, and improvement of the SPM. Much like the original SPM ITWG, this group is composed of career federal employees representing their respective agencies and chaired by OMB. The agencies currently represented include the Bureau of Economic Analysis, the BLS, the Council of Economic Advisers, the Census Bureau, the Economic Research Service, the Food and Nutrition Service, the Department of Health and Human Services, the Department of Housing and Urban Development, the Internal Revenue Service, the National Center for Education Statistics, the National Center for Health Statistics, the OMB, and the Social Security Administration. Currently, this ITWG is reviewing potential changes to the SPM to implement in 2021, the 10-year anniversary of the first SPM report. A timeline for making changes is presented below. The ideas under consideration include a new estimation of work expenses and modifications to the thresholds, including updating equivalence scales, expanding the estimation sample, moving the base of the thresholds from the 33rd percentile to the median of the FCSU distribution discussed earlier (see Fox and Garner, 2018), and incorporating additional noncash benefits in the threshold, also discussed earlier (for example, see Garner et al., 2016). Before adopting any major changes, researchers at the Census Bureau and BLS will present results showing the need for and impact of such a change. Potential changes to the SPM will be presented and discussed at conferences and expert meetings and posted on the Census Bureau’s SPM website (www.census.gov/topics/incomepoverty/supplemental-poverty-measure.html). 
The ITWG will make the final decision on changes in September 2020, and these changes, if any, will be implemented in the September 2021 SPM report.
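The proposed move of the threshold base from around the 33rd percentile to the median can be illustrated with a toy Python calculation. Everything here is a sketch under assumptions: the 30th to 36th percentile band, the nearest-rank percentile rule, the 80% share of the median, and the expenditure values are all hypothetical choices for illustration, not the published BLS procedure.

```python
import statistics


def nearest_rank(sorted_values: list, p: int) -> float:
    """Nearest-rank percentile (p an integer from 1 to 100) of a sorted list."""
    n = len(sorted_values)
    rank = max(1, (p * n + 99) // 100)   # integer ceiling of p * n / 100
    return sorted_values[rank - 1]


def base_around_percentile(expenditures: list, low: int = 30, high: int = 36) -> float:
    """Mean expenditure within a band of percentiles (band is an assumption)."""
    values = sorted(expenditures)
    lo, hi = nearest_rank(values, low), nearest_rank(values, high)
    return statistics.mean(v for v in values if lo <= v <= hi)


def base_from_median(expenditures: list, share: float) -> float:
    """Base set to a fixed share of median expenditure (share is hypothetical)."""
    return share * statistics.median(expenditures)


# Toy FCSU expenditure distribution: 1, 2, ..., 100.
fcsu = list(range(1, 101))
print(base_around_percentile(fcsu))
print(base_from_median(fcsu, 0.8))
```

The two bases are drawn from different parts of the spending distribution, which is the point of the proposal: consumer units near the median may differ systematically (for example, in insurance coverage) from those near the 33rd percentile.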

13. Summary The SPM is the culmination of decades of research on alternative poverty measures in the United States. Released annually in addition to the official poverty measure since 2011, the SPM does not replace the official poverty measure and is not designed to be used for program eligibility or funding distribution. The SPM is designed to provide information on aggregate levels of economic need at a national level or within large subpopulations or areas. As such, the SPM provides an additional macroeconomic statistic for further understanding economic conditions and trends. This handbook chapter provides estimates of the SPM for the United States. The results shown illustrate differences between the official measure of poverty and a poverty measure that takes account of noncash benefits received by families and nondiscretionary expenses that they must pay. The SPM also employs a poverty threshold that is updated by the BLS with information on expenditures for FCSU. Results showed higher poverty rates using the SPM than the official measure for most groups, with children being an exception with lower poverty rates using the SPM. The SPM allows us to examine the effect of taxes, noncash transfers, and necessary expenses on the poor and on important groups within the population in poverty. In addition, the effect of benefits received from each program and taxes and other nondiscretionary expenses on SPM rates was examined. Finally, this chapter examined current research on potential future changes to the SPM.

References
Banthin, J., Garner, T.I., Short, K., January 2001. Medical care needs in poverty thresholds: problems posed by the uninsured. In: Agency for Healthcare Research and Quality, Paper Presented at the Annual Meeting of the Allied Social Science Associations (ASSA).
Bavier, R., December 1997. Updating the Poverty Thresholds With Expenditure Data. Poverty Measurement Working Paper. U.S. Census Bureau.
Bavier, R., March 2001. Do the Current Poverty Thresholds Include any Amount for Health Care. Poverty Measurement Working Paper.
Bavier, R., November 25, 2010a. From NRC to SPM – What Has Not Changed? Unpublished memo.
Bavier, R., December 15, 2010b. From NRC to SPM – What Has Not Changed? Unpublished memo.
Betson, D., 1995. The Effect of Homeownership on Poverty Measurement. Office of Assistant Secretary of Planning and Evaluation. Department of Health and Human Services. Unpublished manuscript.
Betson, D., 1996. Is Everything Relative? The Role of Equivalence Scales in Poverty Measurement. University of Notre Dame, Poverty Measurement Working Paper. U.S. Census Bureau.
Betson, D., October 2009. Homeownership and Poverty Measurement. Paper Prepared for Presentation at the Brookings Institution/Census Bureau Conference on Improved Poverty Measurement, Washington, DC.
Bishop, J., November 15, 2010. Discussant of Supplemental Poverty Measure Thresholds: Estimates for 2008. Presentation by Thesia I. Garner at the Society of Government Economists Annual Meeting, Washington, DC.
Bishop, John, Jonathan, Less, Zeager, Lester A., March 8, 2017. Improving the Supplemental Poverty Measure: Two Proposals. Unpublished manuscript available from the authors, Department of Economics, East Carolina University, Greenville, NC.
Blank, R.M., 2008. Presidential Address: How to Improve Poverty Measurement in the United States. Journal of Policy Analysis and Management 27 (2), 233–254.
Blank, R.M., Fall 2011. The Supplemental Poverty Measure: A New Tool for Understanding U.S. Poverty. Pathways.
Bohn, S., Danielson, C., Kimberlin, S., Mattingly, M., Wimer, C., 2015. The California Poverty Measure 2012 Technical Appendices. Stanford Center on Poverty and Inequality. https://inequality.stanford.edu/sites/default/files/CPM_2012_appendices.pdf.
Bridges, B., Gesumaria, R.V., 2013. The supplemental poverty measure and the aged: how and why the SPM and official poverty estimates differ. Social Security Bulletin 73 (4).
Bureau of Labor Statistics (BLS), September 2017. 2016 Supplemental Poverty Measure Thresholds Based on Consumer Expenditure Survey Data. Washington, DC. Available at: www.bls.gov/pir/spm/spm_thresholds_2016.htm.
Burtless, G., March 2, 1998. Possible Workshops on Experimental Poverty Measure. Memorandum to Kathy Short.
Cable, D., 2013. The Virginia Poverty Measure: An Alternative Poverty Measure for the Commonwealth. Weldon Cooper Center for Public Service, University of Virginia, Charlottesville, VA. https://demographics.coopercenter.org/sites/demographics/files/VirginiaPovertyMeasure_FullReport_May2013_0.pdf.
Caswell, K.J., Short, K.S., September 2011. Medical Out-Of-Pocket Spending Among the Uninsured: Differential Spending & the Supplemental Poverty Measure. SEHSD Working Paper Number 2011-24. U.S. Census Bureau.
Caswell, K., Short, K., August 2011. Medical Out-of-pocket Spending of the Uninsured: Differential Spending and the Supplemental Poverty Measure. Presented at the Joint Statistical Meetings, Miami, FL. U.S. Census Bureau. Poverty Measurement Working Paper.
Chung, Y., Isaacs, J.B., Smeeding, T.M., Thornton, K.A., May 2012. Wisconsin Poverty Report: Policy Context, Methodology, and Results for 2010, Part of the Wisconsin Poverty Project’s Fourth Annual Report Series. Institute for Research on Poverty, University of Wisconsin–Madison. https://www.irp.wisc.edu/wp/wp-content/uploads/2018/05/WIPovPolicyContextMethodologyResults_May2012.pdf.
Citro, C.F., Michael, R.T. (Eds.), 1995. Measuring Poverty: A New Approach. National Academy Press, Washington, DC.
Doyle, P., Johantgen, M., June 1996. The New Poverty Measure: Administrative Data as a Source of Medical Expenditures. Poverty Measurement Working Paper. U.S. Census Bureau.
Doyle, P., May 1, 1997. Who’s at Risk? Designing a Medical Care Risk Index. Available from the Census Bureau. https://www.census.gov/content/dam/Census/library/working-papers/1997/demo/doyle2.pdf.
Edwards, A., McKenzie, B., Short, K., January 2014. Work-related Expenses in the Supplemental Poverty Measure. Poverty Measurement Working Paper. U.S. Census Bureau.
Fisher, G.M., September 1997. The Development of the Orshansky Poverty Thresholds and Their Subsequent History as the Official U.S. Poverty Measure. SEHSD Working Paper. Fisher 1995 to 1999: https://www.census.gov/content/dam/Census/library/working-papers/1999/demo/fisher2.pdf.
Fox, L., Garner, T., 2018. Moving to the Median and Expanding the Estimation Sample: The Case for Changing the Expenditures Underlying SPM Thresholds. SEHSD Working Paper #: 201802.
Fox, L., Pacas, J., April 2018. Deconstructing Poverty Rate Among the 65 and Older Population: Why Has Poverty Increased Since 2015? SEHSD Working Paper Number 2018-13. U.S. Census Bureau.
Fox, L., Wimer, C., Garfinkel, I., Kaushal, N., Waldfogel, J., 2015. Waging war on poverty: poverty trends using a historical supplemental poverty measure. Journal of Policy Analysis and Management 34, 567–592.
Fox, L.E., Heggeness, M.L., Pacas, J., Stevens, K., December 2017. Using SNAP Administrative Records to Evaluate Poverty Measurement. SEHSD Working Paper Number 2017-49. U.S. Census Bureau.
Fox, L.E., December 2017a. Anchored and Relative: Supplemental Thresholds for the SPM. SEHSD Working Paper Number 2017-50. U.S. Census Bureau.
Fox, L., 2017b. Revising Official Poverty Estimates of Unrelated Children Under Age 15 in the Supplemental Poverty Measure Report. SEHSD Working Paper.
Fox, L., September 2017c. The Supplemental Poverty Measure: 2016. Current Population Reports, P60-261. U.S. Census Bureau. Available at: https://www.census.gov/content/dam/Census/library/publications/2017/demo/p60-261.pdf.
Fox, L., September 2018. The Supplemental Poverty Measure: 2017. Current Population Reports, P60-265. U.S. Census Bureau. Available at: https://www.census.gov/content/dam/Census/library/publications/2018/demo/p60-265.pdf.
Gabe, T., April 12, 2010. Poverty Measurement in the United States: History, Current Practice, and Proposed Changes. Congressional Research Services (CRS) Report for Congress (CRS Report No. R41187).
Garner, T.I., Betson, D., March 20, 2010. Housing and Poverty Thresholds: Different Potions for Different Notions. Paper Prepared for Presentations at the Midwestern Economics Association Meetings, Chicago, IL (an earlier version of this paper is available as BLS Working Paper 435, “Setting and Updating Modern Poverty Thresholds,” http://stats.bls.gov/osmr/abstract/ec/ec100030.htm, and was also presented at the Annual Meeting of the Allied Social Science Associations (ASSA) in Atlanta, Georgia, January 3, 2010, during the Society of Government Economists (SGE) session).
Garner, T.I., Betson, D., March 2010a. Setting and Updating Modern Poverty Thresholds. BLS Working Paper 435.
Garner, T.I., Gudrais, M., December 29, 2012. Maintaining Consumption Levels Over Economic Fluctuations and the Impact on Consumption vs. Spending-Based SPM Thresholds. Prepared for Presentation at the Annual Meeting of the Allied Social Science Associations (ASSA), San Diego, CA. Available at: https://www.bls.gov/pir/spm/garner_exp_cons_spm_assa_2_29_12.pdf.
Garner, T.I., Hokayem, C., September 2011. Supplemental Poverty Measure Thresholds: Imputing Noncash Benefits to the Consumer Expenditure Survey Using Current Population Survey – Parts I and II. SEHSD Working Paper Number 2011-27. U.S. Census Bureau.
Garner, T.I., Hokayem, C., July 2012. Supplemental Poverty Measure Thresholds: Imputing School Lunch and WIC Benefits to the Consumer Expenditure Survey Using Current Population Survey. BLS Working Paper 457. Available at: http://www.bls.gov/osmr/pdf/ec120060.pdf.
Garner, T.I., Short, K., November 1, 2001. Owner-Occupied Shelter in Experimental Poverty Measures. Paper Presented at the Annual Meeting of the Southern Economic Association, Tampa, Florida.
Garner, T.I., Short, K., March 2002. Personal Assessments of Minimum Income and Expenses: What do they tell us about ‘minimum living’ Thresholds and Equivalence Scales? BLS Working Paper 379.
Garner, T.I., Munoz Henao, J.D., 2018. Controlling for Prices Before Estimating SPM Thresholds and the Impact on SPM Poverty Statistics. Presented at the 2018 Society of Government Economists Annual Conference, Washington, DC. Federal Committee on Statistical Methodology Research and Policy Conference, Washington, DC, 2018 ASSA Annual Meetings in Philadelphia, PA and SEA Annual Meetings in Orlando, FL 2017.
Garner, T.I., Rozaklis, P., 1999. Accounting for Owner Occupied Housing in Poverty: Focus on Thresholds. 
American Statistical Association (ASA) Proceedings of the Section on Government Statistics and Section on Social Statistic. Garner, T.I., Rozaklis, P., January 6, 2001. Owner-occupied Housing: an Input for Experimental Poverty Thresholds. Paper Presented at Session Organized by the Society of Government Economists at the Annual Meeting of the Allied Social Sciences Associations. Garner, T.I., Short, K., May 2002. Experimental Poverty Measures Under Alternative Treatments of Medical out-of-pocket Expenditures: an Application of the Consumer Expenditure Survey. BLS Working Paper 358. Garner, T.I., Short, K.S., August 2010a. Combining Surveys for Poverty Measurement. Paper Presented at the 31st Conference of the International Association for Research on Income and Wealth. www.iariw.org/papers/2010/4dGarner.pdf. Garner, T.I., Short, K.S., June 2010b. Creating a consistent poverty measure over time using NAS procedures: 1996e2005. Review of Income and Wealth. Series 56, Number 2. Garner, T.I., Short, K., Shipp, S., Nelson, C., Paulin, G., March 1998. Experimental poverty measurement for the 1990s. Monthly Labor Review 121 (3), 39e61. Garner, T.I., Gudrais, M., Short, K., August 2015. Consistency in supplemental poverty measurement: adding imputed in-kind benefits to thresholds and impact on poverty rates for the United States. In: 2015 JSM Proceedings, Seattle, WA. Garner, T.I., Gudrais, M., Short, K., April 2016. Supplemental Poverty Measure Thresholds and Noncash Benefits available at: https://www.bls.gov/pir/spm/smp-thresholds-and-noncashbenefits-brookings-paper-4-16.pdf.

A brief history of the supplemental poverty measure Chapter | 14

423

Garner, T.I., Gudrais, M., Sepember 2018. Alternative Poverty Measurement for the U.S.: Focus on Supplemental Poverty Measure. BLS Working Paper 510. Available at: https://www.bls.gov/ pir/spm/alt-pov-spm-wp-510.pdf. Garner, T.I., August 10, 2005. Developing poverty thresholds 1993e2003. Presented at the conference in Minneapolis, Minnesota. In: 2005 Proceedings of the American Statistical Association, Social Statistics Section [CD-ROM]. American Statistical Association, Alexandria, VA (revised September 18, 2006). Garner, T.I., August 2009a. National Academy of Sciences–Based Poverty Thresholds: the Details of Alternatives and Choices in Specification. Presented at the Joint Statistical Meetings, American Statistical Association, Social Statistics Section, Washington, DC. Garner, T.I., October 2009b. Poverty Thresholds Alternatives/Choices. Presented at the Brookings Institution/Census Bureau Conference on Improved Poverty Measurement, Washington, DC. Garner, T.I., December 2010. Supplemental Poverty Measure Thresholds: Laying the Foundation. Available at: https://stats.bls.gov/pir/spm/spm_pap_thres_foundations10.pdf. Garner, T.I., 2010a. Moving to a Supplemental Poverty Measure (SPM): Research on Thresholds for 2008. Presentation at the Southern Economic Association Annual Meeting, Atlanta, GA, November 20, 2010 (Presentation available from: the author). Iceland, J., 2005. Experimental Poverty Measures Summary of a Workshop. The National Academy of Sciences Press, Washington, DC. Isaacs, J.B., Smeeding, T.M., April 2009. The First Wisconsin Poverty Report. https://www.irp. wisc.edu/resource/the-first-wisconsin-poverty-report/. ITWG, March 2010. Observations from the Interagency Technical Working Group on Developing a Supplemental Poverty Measure. Available at: www.census.gov/hhes/povmeas/methodology/ supplemental/research/SPM_TWGObservations.pdf. Janicki, H., September 2014. Medical Out-of-pocket Expenses in the 2013 and 2014 CPS ASEC. SEHSD Working Paper. 
Johnson, D.S., Smeeding, T.M., May 2012. A Consumer’s Guide to Interpreting Various U.S. Poverty Measures. University of Wisconsin-Madison Institute for Research on Poverty, Fast Focus (14). Johnson, D., Shipp, S., Garner, T.I., August 1997. Developing poverty thresholds using expenditure data. In: Proceedings of the Government and Social Statistics Section. American Statistical Association, Alexandria, VA, pp. 28e37. Johnson, P., Renwick, T., Short, K., 2010. Estimating the Value of Federal Housing Assistance for the Supplemental Poverty Measure. Poverty Measurement Working Paper. U.S. Census Bureau. Johnson, D., Winter 2010. Progress toward improving the U.S. Poverty measure: developing the new supplemental poverty measure. Focus 27 (2). Kilpatrick, R.W., 1973. The income elasticity of the poverty line. The Review of Economics and Statistics 55 (3), 327e332. Kingkade, W.W., September 2017. What are Housing Assistance Support Recipients Reporting as Rent? SEHSD Working Paper Number 2017-44 U.S. Census Bureau. Kokoski, M.F., 1994. Patrick Cardiff, and Brent Moulton, Interarea Price Indices for Consumer Goods and Services: an Hedonic Approach Using CPI data. BLS working paper 256, july, BLS Working Papers, Office of Prices and Living Conditions. U.S. Department of Labor, Washington, D. C. Macartney, S., July 2013. Estimating the Value of WIC Benefits for the Supplemental Poverty Measure. SEHSD Working Paper Number 2013-18. U.S. Census Bureau. Available at:

424 Handbook of US Consumer Economics https://www.census.gov/content/dam/Census/library/working-papers/2013/demo/wic-paperjuly2013.pdf. Mohanty, A., Edwards, A., Fox, L., November 2017. Measuring the Cost of Employment: Workrelated Expenses in the Supplemental Poverty Measure. SEHSD Working Paper Number 2018-43. U.S. Census Bureau. Moskowitz, D., Haskins, R., Smeeding, T.M., May 11, 2010. Is the Census Bureau’s Supplemental Poverty Measure a Relative Measure of Poverty? Brookings Institution Center on Children and Families. Available at: https://www.brookings.edu/wp-content/uploads/2016/06/0511_census_ haskins.pdf. New York City Center for Economic Opportunity (NYC-CEO), August 2008. The CEO Poverty Measure, A Working Paper. https://www1.nyc.gov/assets/opportunity/pdf/08_poverty_ measure_report.pdf. April 2013, The CEO Poverty Measure, 2005-2011, An Annual Report. https://www1.nyc.gov/ assets/opportunity/pdf/13_poverty_measure_report.pdf. New York City Center for Economic Opportunity (NYC-CEO), April 2019a, New York City Government Poverty Measure 2017, An Annual Report. Appendix B Deriving a Poverty Threshold for New York City https://www1.nyc.gov/assets/opportunity/pdf/NYCgovPoverty2019_ Appendix_B.pdf. New York City Center for Economic Opportunity NYC-CEO, April 2012. The CEO Poverty Measure, 2005-2010, A Working Paper. https://www1.nyc.gov/assets/opportunity/pdf/12_ poverty_measure_report.pdf. New York City Center for Economic Opportunity (NYC-CEO), April 2019b, New York City Government Poverty Measure 2017, An Annual Report. Appendix C Adjustment for Housing Status https://www1.nyc.gov/assets/opportunity/pdf/NYCgovPoverty2019_Appendix_C.pdf. Parker, J., September 2011. SNAP Misreporting on the CPS: Does it Affect Poverty Estimates? U.S. Census Bureau. SEHSD Working Paper Number 2012-01. Remler, D.I., Dorenman, S.D., Hyson, R.T., October 2017. Estimating the effects of health insurance and other social programs on poverty under the affordable care act. 
Health Affairs 36 (10). Renwick, T., Mitchell, J., November 2015. Estimating the Value of Federal Housing Assistance for the Supplemental Poverty Measure. SEHSD Working Paper Number 2016-01. U.S. Census Bureau. Renwick, T., Figueroa E., and Aten, B., “Supplemental Poverty Measure: Adjustments with Regional Price Parities vs. Median Rents from the ACS,” March 2014, Working Paper Number: SEHSD-WP2014-22. Available at: https://www.census.gov/library/working-papers/ 2014/demo/SEHSD-WP2014-22.html. Renwick, T., Figueroa, E., Aten, B., July 2017. Supplemental Poverty Measure: A Comparison of Geographic Adjustments with Regional Price Parities vs. Median Rents from the American Community Survey: An Update. Available at: www.census.gov/library/working-/2017/demo/ SEHSD-WP2017-36.html. Renwick, T., August 2009a. Alternative Geographic Adjustments of U.S. Poverty Thresholds: Impact on State Poverty Rates. Presented during the American Statistical Association Annual Meetings, Washington, DC. https://www.census.gov/content/dam/Census/library/workingpapers/2009/demo/geo-adj-pov-thld8.pdf. Renwick, T., August 2009b. Experimental poverty measures: geographic adjustments from the American community survey and BEA price Parities. In: 2009 Proceedings of the American Statistical Association, Social Statistics Section [CD-ROM]. American Statistical Association, Alexandria, VA pp.-pp. Presented at the conference in Washington, DC.

A brief history of the supplemental poverty measure Chapter | 14

425

Renwick, T., January 3, 2010. Improving the Measurement of Family Resources in a Modernized Poverty Measure. Paper Prepared for Presentations at the Allied Social Sciences Associations (ASSA) Meetings. Society of Government Economists (SGE) session, Atlanta, GA. Renwick, T., 2011. Geographic Adjustments of Supplemental Poverty Measure Thresholds: Using the American Community Survey 5-year Data on Housing Costs. SEHSD Working Paper Number 2011-21. U.S. Census Bureau. Renwick, T., 2017. Estimating the Value of Federal Housing Assistance for the Supplemental Poverty Measure. Eliminating the Public Housing Adjustment SEHSD Working Paper Number 2017-38. U.S. Census Bureau. Ruggles, P., 1990. Drawing the LinedAlternative Poverty Measures and Their Implications for Public Policy. The Urban Institute Press, Washington, DC. Semega, J., Fontenot, K., Kollar, M., September 2017. Income and poverty in the United States: 2016. Current Population Reports, P60-259, U.S. Census Bureau, U.S. Government Printing Office, Washington, DC. Shorst, K., Garner, T.I., August 2002. Experimental poverty measures under alternate treatments of medical out-of-pocket expenditures. Monthly Labor Review 3e13. Short, K., Renwick, T.J., 2010. Supplemental Poverty Measure: Preliminary Estimation for 2008. Paper Prepared for the 32nd Annual Research Conference of the Association for Public Policy Analysis and Management, Boston, MA, November 4-6, 2010. Short, K., Martina, S., Eller, T.J., August 1996. Work Related Expenditures in a new Measure of Poverty. SEHSD Working Paper Number 1996-15. U.S. Census Bureau. Short, K., Garner, T.I., Johnson, D., Doyle, P., 1999. Experimental Poverty Measures: 1990 to 1997, U.S. Census Bureau, Current Population Reports, Consumer Income, P60-205. U.S. Government Printing Office, Washington, DC. Short, K., Iceland, J., Bavier, R., Garner, T.I., Rozaklis, P., Hernandez, D.J., 1999b. Report on Experimental Poverty Measures: 1990 to 1997. JSM Proceedings. 
Short, K., 2001. Experimental Poverty Measures: 1999. Current Population Reports, P60-216. In: Consumer Income. U.S. Census Bureau, U.S. Government Printing Office, Washington, DC. Short, K., August 10, 2005. Estimating Resources for Poverty Measurement: 1993e2003, 2005 Proceedings of the American Statistical Association, Social Statistics Section [CD-ROM]. American Statistical Association, Alexandria, VA. pp.-pp. Presented at the conference in Minneapolis, Minnesota. Short, K., August 2009. Cohabitation and Child Care in a Poverty Measure, 2009 Proceedings of the American Statistical Association, Social Statistics Section [CD-ROM]. American Statistical Association, Alexandria, VA. pp.-pp. Presented at the conference in Washington, DC. Short, K., January 3, 2010. Experimental Modern Poverty Measures 2007. Presented in a Session Sponsored by the Society of Government Economists at the Allied Social Science Association Meetings, Atlanta, Georgia. http://www.census.gov/hhes/www/povmeas/papers.html. Short, K., November 2011. The Research Supplemental Poverty Measure: 2010. Current Population Reports, P60-241. U.S. Census Bureau. available at: www.census.gov/hhes/povmeas/ methodology/supplemental/research/Short_ResearchSPM2010.pdf. Short, K., 2011. Supplemental Poverty Measure: Preliminary Estimates for 2009. Paper Prepared for the ASSA Annual Meetings, Denver, CO. Short, K., November 2012. The Research Supplemental Poverty Measure: 2011. Current Population Reports, P60-244. U.S. Census Bureau. Available at: www.census.gov/hhes/povmeas/ methodology/supplemental/research/Short_ResearchSPM2011.pdf. Short, K., November 2013. The Research Supplemental Poverty Measure: 2012. Current Population Reports, P60-247. U.S. Census Bureau. available at: www.census.gov/prod/2013pubs/ p60-247.pdf.

426 Handbook of US Consumer Economics Short, K., October 2014a. The Supplemental Poverty Measure in the Survey of Income and Program Participation. Presented at APPAM Research Conference. available at: www.census. gov/hhes/povmeas/publications/spm2009.pdf. Short, K., October 2014b. The Supplemental Poverty Measure: 2013. Current Population Reports, P60-251. U.S. Census Bureau. available at: www.census.gov/content/dam/Census/library/ publications/2014/demo/p60-251.pdf. Smith, J.C., Medalia, C., 2014. Health Insurance Coverage in the United States: 2013. Current Population Reports, P60-250. U.S. Census Bureau, U.S. Government Printing Office, Washington, DC. Stanford University, https://inequality.stanford.edu/sites/default/files/PathwaysFall11.pdf. Stevens, K., Fox, L., Heggeness, Misty L., April 2018. Precision in Measurement: Using Statelevel SNAP Administrative Records and the Transfer Income Model (TRIM2) to Evaluate Poverty Measurement. SEHSD Working Paper Number 2018-15. The Measuring of American Poverty Act of 2009, MAP Act, H.R. 2909, Bill Introduced in the 111th U.S. Congress by Representative McDermott and a Companion Bill Introduced by Senator Dodd (S. 1625). U.S. Bureau of the Census, 2010. Poverty Thresholds Following NAS Recommendations. http:// www.census.gov/hhes/www/povmeas/tables.html. U.S. Department of Agriculture, April 2009. The Food Assistance Landscape: FY 2008 Annual Report. USDA Economic Research Service. Economic Information Bulletin No. 6-6. U.S. House of Representatives, June 17, 2009. 111th Congress (2009-2010), H.R. 2909 e Measuring American Poverty Act of 2009. https://www.congress.gov/bill/111th-congress/ house-bill/2909. U.S. Senate, August 6, 2009. 111th Congress (2009e2010), S. 1625 e Measuring American Poverty Act of 2009. https://www.govtrack.us/congress/bills/111/s1625. Wheaton, L., Stevens, K., April 2016. The Effect of Different Tax Calculators on the Supplemental Poverty Measure. 
Urban Institute available at: www.census.gov/content/dam/Census/library/ working-papers/2016/demo/Effect-of- Different-Tax-Calculators-on-the-SPM.pdf. Wimer, C., Fox, L., Garfinkel, I., Kaushel, N., Waldfogel, J., 2016. Progress on poverty? New estimates of historical trends using an anchored supplemental poverty measure. Demography 53 (4), 1207e1218. Zedlewski, S., Giannarelli, L., Wheaton, L., Morton, J., 2010. Measuring Poverty at the State Level. Urban Institute. https://www.urban.org/research/publication/measuring-poverty-statelevel/view/full_report. Ziliak, J.P., 2010. Alternative Poverty Measures And the Geographic Distribution of Poverty in the United States, A Draft Report Prepared for the Office of the Assistant Secretary for Planning and Evaluation. U.S. Department of Health and Human Services available from: the author at [email protected].

Index

Note: Page numbers followed by “f” indicate figures, “t” indicate tables and “b” indicate boxes.

A
Accurate measurement, 273, 323–324
  of economic activity, 273–274
ACS. See American Community Survey (ACS)
Actual modeling, 301
Ad hoc system, 289
Adjustable-rate mortgage payment (ARM payment), 150
AFS. See Alternative financial services (AFS)
Agency for Healthcare Research and Quality (AHRQ), 370
Aggregate economy, 7–8
Aggregate medical price index, 358–359
AHRQ. See Agency for Healthcare Research and Quality (AHRQ)
AHS. See American Housing Survey (AHS)
AI. See Artificial intelligence (AI)
Alibaba, 236
Alternative credit, 121–122
  users, 133–138
Alternative data, 280, 283–290
  household survey response rates, 284f
  microfoundations of macro, 287–290
  sources, 283, 285–286
  traditional vs. web-scale data, 283t
Alternative financial services (AFS), 121–122
Amazon, 236
American Community Survey (ACS), 331, 335, 405–407
American Housing Survey (AHS), 57–58
Annual Social Information System (RAIS), 103–104, 112, 114f
AP sample. See Area probability sample (AP sample)
AR model. See Autoregressive model (AR model)
Area probability sample (AP sample), 55
ARM payment. See Adjustable-rate mortgage payment (ARM payment)
Artificial intelligence (AI), 319

Asset
  concentration, 71–73
    share of assets held by asset percentile group, 72f
  and net worth, 207–211, 208t–210t
  price shocks, 86–89
Auto delinquency, 147
Auto sectors, 301
Autoregressive model (AR model), 290, 298–299
Average household portfolios composition, 58–65
  average composition of household assets, 64f
  mean asset holdings, 62t–63t
  percent of families that hold assets, 60t–61t

B
B2C. See Business-to-consumer (B2C)
Baby boom generation, 8–9
Baby boomers, 219
Bad economic time, 278–279
Balance sheet, 194–195, 201–211
Banco do Brasil (BB), 108
Bank credit, 122–123
  pawn credit as alternative to regular, 124–126
Bankruptcy Abuse Prevention and Consumer Protection Act (BAPCPA), 48–49
BEA. See US Bureau Economic Analysis (BEA)
Betson’s estimation, 404
Biased health care price indexes, 355–356
Big data, 3–6
Billion Prices Project, 5–6
Bins, 260–261
Blackbox model, 307–308
BLS. See Bureau of Labor Statistics (BLS)
Bootstrapping, 309–310
Borrower characteristics, trends in, 34–37
Borrowing behavior, 23


“Boskin Commission” report, 14
Bottom-up model, 307–308
Brazilian banking system, 114
Brazilian credit registry data, 16–17
“Buffer stock” model, 148
Bureau of Economic Analysis (BEA). See Bureau of Labor Statistics (BLS)
Bureau of Labor Statistics (BLS), 4, 274–275, 331, 355, 361–362, 367, 389–390, 418
  experimental disease-based price indexes, 368–370
Business
  assets, 78
  cycle chronologies, 273–274
  effects on economic behavior, 194
  modeling, 288–289
  ownership, 78–79
    share with business assets, 78f
    share with equity by birth-year/normal-income cohort, 79f
  share, 82–83
Business-to-consumer (B2C), 236

C
Caixa Economica Federal (CEF), 108
California Poverty Measure, 414–415
CAMS. See Consumption and Activities Mail Survey (CAMS)
Carrefour, 236
Cash infusion, 144
CCB. See Chase Consumer and Community Bank (CCB)
CCP. See Consumer Credit Panel (CCP)
CCS codes. See Clinical Classification codes (CCS codes)
CDC. See Centers for Disease Control and Prevention (CDC)
CDF. See Cumulative distribution function (CDF)
CE. See Consumer Expenditure (CE)
Census Bureau research, 415
Centers for Disease Control and Prevention (CDC), 298
Central Bank of Brazil, 98, 101–103
Central tendency, measure of, 260–261
CEO. See Commission for Economic Opportunity (CEO)
CES. See Closed-end second (CES); Constant elasticity of substitution (CES); Consumer Expenditure Survey (CES)

Chase Consumer and Community Bank (CCB), 167–168
Chase payment methods, 161–162, 167–168
Child support paid, 410
China’s consumer spending
  e-commerce, 233
    development, 235–238
    JD’s E-Commerce data, 237–238
    online retailers, 236
  patterns and key features, 238–248
    online consumer spending, by age cohort, 243–246
    online consumer spending, by festival, 238–239, 239f
    online consumer spending, by product, 240–243, 240f
    online consumer spending, by region, 246–248
  product industrial classifications, 254t–256t
  regional classification, 256t
  spending and regional income, 248–254
    estimation results, 250–254, 252t–253t
    trends and patterns against regional income, 248–250
Chinese Industry Classification System, 240
Clean data, 309–310
Clinical Classification codes (CCS codes), 370
Closed-end second (CES), 24
CNSTAT. See Committee on National Statistics (CNSTAT)
Cohort-specific demographic effects, 13–14
Cohorts, 73–85
  business ownership, 78–79
  business share, 82–83
  combined risk asset share, 85
  equity ownership, 79–80
  equity share, 83
  home-ownership, 80–81
  housing share, 83–84
  interpreting cohort figures, 75–76
  median household assets, 76–77
  mortgage holding, 81–82
  ownership of risky assets, 77
COL index. See Cost of living index (COL index)
Combined risk asset share, 85
Commission for Economic Opportunity (CEO), 393–394
Committee on National Statistics (CNSTAT), 356–357, 368, 393
  price index for disease, 369

8848 company, 236
Competitive market forces, 78
Conceptual mistakes, 319
Consistency, 7–8
Constant elasticity of substitution (CES), 358–359
Constant grappling, 273–274
Constant monitoring, 298
Consumer Credit Panel (CCP), 6, 9, 23–24, 30, 204–206
Consumer expectations
  data prediction, 263–266
  disentangling preferences and expectations, 257–258
  information content of probabilistic questions, 262–263
  integrating subjective expectations data into economic models of behavior, 266–269
  quantitative and probabilistic question formats, 259–261
  survey data on subjective expectations, 258–259
Consumer Expenditure (CE), 1–2, 403
Consumer Expenditure Survey (CES), 9–12, 162–163, 197, 229–230, 350–351, 361
  microdata, 4–5
Consumer financial behavior, 23
Consumer life cycle, 2
Consumer Price Index (CPI), 329–331, 335–336, 361, 373
  index, 14
  service price index, 366
Consumer Price Index for All Urban Consumers-U.S. City Average (CPI-U), 403
Consumer price index housing (CPI-housing), 283–284
Consumer spending, 195
Consumers responding to real income shocks
  consumer spending and decline of gas prices, 148–150
  consumption, investment, and mortgage resets, 150–153
  income fluctuations around mortgage defaults, 153
  JPMCI research on consumer spending responses to income and price changes, 143–153
  sampling criteria and data asset for study, 156, 158t–159t


  spending responses even to income and price changes, 142f
Consumption and Activities Mail Survey (CAMS), 162
Consumption behavior of millennials, 193
Consumption heterogeneity, 234, 247–248
Contiguous bins, 263
Continuous economic measurement, 273–274
Conventional economic theory, 124–126
Cost of living index (COL index), 357–359
Costco, 236
CPD-W approach. See Weighted country-product-dummy approach (CPD-W approach)
CPI. See Consumer Price Index (CPI)
CPI-housing. See Consumer price index housing (CPI-housing)
CPI-U. See Consumer Price Index for All Urban Consumers-U.S. City Average (CPI-U)
CPS. See Current Population Survey (CPS)
Credit bureau, 128–129
  data, 36–37
Credit cards, 38–39
  delinquency, 147
Credit demand factors, 207
Credit Information System (SCR), 101–103, 115
Credit supply, 207
Creditworthiness, awareness about, 136–138
Cross validation, 313–314
Cumulative distribution function (CDF), 261
Current Population Survey (CPS), 407–408
Current technology platforms, 324–325

D
Data
  availability, 4–5
  colonialism mitigating, 324–325
  curation, 296–298
  Data privacy, 164b
  job categories in employment sector, 297t–298t
  regulation, 325
  processing mistakes, 318–319
  refinement, 166–168
  releases predicting, 295–301
  revision problem, 281
DB. See Defined benefit (DB)
DC. See Defined contribution (DC)

Debt, 204–207
  trends in, 38–43
Decomposing borrowing cycle, 28–34
Defined benefit (DB), 54
Defined contribution (DC), 59
  pension plans, 59
Delinquencies, 47–49
Delta modeling, 299
Demographics, 194
  comparison by generation, 198–201
  racial composition of population by generation, 199f
Density forecasts, 269
Department of Agriculture Food and Nutrition Service, 407–408
Diary Survey, 162–163
Digital world, 323
“Discouraged” borrowers, 135
Disease-based approach, 356, 365–368. See also Service-based approach
  general disease-based price indexes, 367–368
  single-disease price indexes, 366–367
Disease-based price indexes, 356, 374–382, 378t
Division of Price and Index Number Research (DPINR), 403
Double Ninth Festival, 237
Dragon Boat Festival, 237
Dynamic stochastic general equilibrium (DSGE), 289

E
E-commerce, 235–236
eBay, 236
Economic
  agents, 273–274
  census surveys, 362
  characteristics, 194
  data releases and prediction, 274–275
  intuition, 286
  models, 273–274
  theory, 279–280
  trends, 194
Economy, 274
Educational attainment for individuals, 200f, 201–211
  assets and net worth, 207–211
  debt, 204–207
  income, 201–203
Elasticity of intertemporal substitution (EIS), 268

ELIs. See Entry level items (ELIs)
Employers, 175
Employment situation, 306
Ensembling, 309–310, 315
Entry level items (ELIs), 332–333
Equifax, 204–206
Equifax Risk Score, 25
Equifax/Consumer Credit Panel (Equifax/CCP data), 230
Equity, 86
  ownership, 79–80, 83f
  share, 83
Equivalence scale, 404
“Excess sensitivity” of planned consumption, 268–269
Extensive margins, 59

F
FA. See Financial Accounts (FA)
Family resources, 397
FCSU. See Food, clothing, shelter, and utilities (FCSU)
FCSUM, 416
Federal health care statistics, 356
Federal housing assistance, 408–409
Federal Reserve Bank of Atlanta, 259
Federal Reserve Bank of New York (NY Fed), 258–259
Federal Reserve Board (FRB), 55
FICA taxes, 409
Financial Accounts (FA), 53–54, 57f
Financial assets, 8–9
Financial crisis (2008), 23, 27, 77, 207
Financial market assets, 54, 65
  share of assets, 70f
Financial planning, 176–177
  firms, 175
  models, 162
Financial vulnerability, 55, 85–91
Fisher’s ideal index, 363
“Fixed basket” of medical goods and services, 362
Flow of Funds, 25–26
Food, clothing, shelter, and utilities (FCSU), 395
Food spending, 228
Forecast, 280
Forecast uncertainty, 260–261
Fortune, 222–223
Fourth quarter (Q4), 275
Framework for alternative data, 290–295

FRB. See Federal Reserve Board (FRB)
FRBUS macroeconomic model, 8–9

G
GAIA index, 333
Gas prices, consumer spending and decline of, 148–150, 149f
GDP. See Gross domestic product (GDP)
Geary indexes, 333–334
Geary price indexes (PGeary), 334
Generation X, 194–196, 199, 206, 219
Generational consumption patterns, empirical assessment of, 214–219
  regression analysis of total spending by generation, 217t
Geographic adjustments, 405–407
GFT. See Google Flu Trends (GFT)
Global financial crisis (2007), 194
“Go-Go” phase, 165
Google Flu Trends (GFT), 14, 298
Government-controlled banks, 108–113
Great Depression, 194
Great Recession, 23
Greatest Generation, 213, 216–218
Gross domestic product (GDP), 97, 233, 274–275, 278f, 355–356
  of Tibet, 246
Grossman model, 359

H
HAMP. See Home Affordable Modification Program (HAMP)
Health, 357–358
Health, Education, and Welfare (HEW), 391
Health and Retirement Study (HRS), 162
Health care
  price indexes, 355–356
  spending, 141–145, 169b–170b
Hedonic methods, 15
HELOCs. See Home Equity Lines of Credit (HELOCs)
HEW. See Health, Education, and Welfare (HEW)
HHS. See US Department of Health and Human Services (HHS)
Home Affordable Modification Program (HAMP), 152–153
Home economics, 2–3
Home Equity Lines of Credit (HELOCs), 25–27, 39–44, 40f
Home-ownership, 80–81


  homeowner share by birth-year/normal-income cohort, 80f
Homogeneous utility function, 358–359
Household balance sheet, 10
  asset price shocks, 86–89
  health of, 85–91
    effect on wealth from hypothetical price changes, 87t–88t
    LTV, 89f–91f
    PIR, 89f–91f
    risk from asset price shocks, 85–86
  income price shocks, 86–89
  trends in vulnerability across time, 89–91
Household debt and credit
  data, 24–28
    household debt in historical context, 26f
    housing price change, 27f
  decomposing borrowing cycle, 28–34
  perspectives on current household debt, 43–49
    change in debt composition, 43–44
    delinquencies, 47–49
    implications of change in debt composition, 44–46
  trends in borrower characteristics, 34–37
  trends in debt, 38–43
Household debt and recession in Brazil
  aggregate view, 99–101, 99f
    household debt and GDP growth, 100f
  characteristics, 104–113
    composition, 104–108
    credit growth across income distribution, 112–113
    government-controlled banks and tale of booms, 108–112
  novel data set, 101–104
  potential causes of household debt boom, 114–117
    institutional reforms and domestic programs, 115–117
    international factors, 117
    macroeconomic context, 114–117
Household portfolio composition trends
  asset concentration, 71–73
  average household portfolios composition, 58–65
  cohorts, 73–85
    business ownership, 78–79
    business share, 82–83
    combined risk asset share, 85
    equity ownership, 79–80
    equity share, 83, 83f

    home-ownership, 80–81
    housing share, 83–84
    interpreting cohort figures, 75–76
    median household assets, 76–77
    mortgage holding, 81–82
    ownership of risky assets, 77
  financial vulnerability, 55, 85–91
  health of household balance sheet, 85–91
    asset price shocks, 86–89
    income price shocks, 86–89
    risk from asset price shocks, 85–86
    trends in vulnerability across time, 89–91
  household portfolios, 65–71
    across time, 69–71
  SCF data, 55–58
    household finance research, 58
    wealth measurement, 56–58
  shocks, 85–91
Household portfolios, 65–71
  average assets and debts of asset groups, 67t
  share of assets in household asset portfolios, 66f
    unrealized capital gains, 71f
  share of debt, by asset percentile groups, 69f
  across time, 69–71
Households, 58
  Event files, 370
  expectations, 257
  finance, 8–10
    research, 58
  financial assets, 58, 73–74
  nonfinancial assets, 58
  saving decisions, 74
  spending in CE survey, 212–213, 215t
Housing, 65, 80, 85–86, 170, 301
  assistance, 408–409
  market assets, 54
    share of assets, 70f
  share, 83–84
Housing and Urban Development (HUD), 408–409
HRS. See Health and Retirement Study (HRS)
HRS/CAMS survey, 163, 174
HSBC, 108
HUD. See Housing and Urban Development (HUD)

I
IARIW. See International Association for Research on Income and Wealth (IARIW)
ICP. See International Comparison Program (ICP)
IJC. See Initial jobless claims (IJC)
Immigrant status, awareness about, 136-138
In-sample error (in-sample MSE), 310
in-sample MSE. See In-sample error (in-sample MSE)
Income, 201-211 fluctuations around mortgage defaults, 153 groups, 79-80 price shocks, 86-89 real income by year and generation, 202t regression analysis of income by gender/family status and generation, 204t replacement models, 168
Income-based repayment policies, 147
Individual and sole proprietor (INSOLE), 55-56
Individual-specific empirical CDF, 261
Inflation, 377-382 effect, 356-357 growth, 360
Influential data points, 313-314
Information content of probabilistic questions, 262-263
Initial jobless claims (IJC), 306
Inpatient hospital services, 356
INSOLE. See Individual and sole proprietor (INSOLE)
Institute for Research and Poverty (IRP), 392-393
Institutional actions, 321
Integrated Public Use Microdata Series (IPUMS), 4
Intensive margins, 59
Interagency technical working group (ITWG), 389-390, 418-419 to developing SPM and early research, 395-400
Internal consistency, 309
International Association for Research on Income and Wealth (IARIW), 400
International Comparison Program (ICP), 329
Internet explorer, 305-306
Interpretable model, 307-308
Interpreting cohort figures, 75-76
Interview Survey, 162-163
Intuitive psychological variables, 290

Investment, 150-153
IPUMS. See Integrated Public Use Microdata Series (IPUMS)
IRAs. See Self-directed retirement accounts (IRAs)
IRP. See Institute for Research and Poverty (IRP)
Italian Survey of Investment in Manufacturing, 261
ITAU, 108
Item strata, 361
ITWG. See Interagency technical working group (ITWG)

J
J.P. Morgan expenditure model, 168-170
JD.com, 240
JingDong (JD), 233 E-Commerce data, 237-238 market share, 237f
Job finding expectations, 267
Job loss, 145-148
Joint Statistical Meetings (JSM), 392
JPMorgan Chase Institute (JPMCI), 141 research on consumer spending responses to income and price changes healthcare spending and tax refunds, 143-145 around job loss and expiration of unemployment insurance benefits, 145-148
JSM. See Joint Statistical Meetings (JSM)

K
Keynesian approach, 287-288

L
Lantern Festival, 237
Lasso techniques, 313-314, 316f
Leverage ratio, 68
Life cycle asset accumulation processes, 54
LIHEAP. See Low-Income Home Energy Assistance Program (LIHEAP)
Likert scale, 259
Linear regression, 313 framework, 216 linear regression-based techniques, 315 model, 201, 203
Liquidity constraints, 124-126
Live production results, 315-325 accurate measurement, 323-324

mitigating data colonialism, 324-325 prediction, 318-319 public benefits of microfoundations of macro, 321-323
Loan-to-value ratios (LTVs), 33, 55, 89f-91f
Log-normal distribution, 261
Long-term care costs, accounting for, 174-175
Low-Income Home Energy Assistance Program (LIHEAP), 408
Low-income households, 82
Lowe index, 374, 377
Lower-income cohorts, 77
LTVs. See Loan-to-value ratios (LTVs)

M
Machine learning (ML), 292, 307
Macro data, 275-279 increased noise in times of low growth, 278-279 revision problem, 275-278
Macro forecasting alternative data, 283-290 classification of, 294t framework for, 290-295 microfoundations of macro, 287-290 auto sectors, 301 correlation between curated search features for economic categories and traditional economic data, 302t-304t data curation, 296-298 economic data releases and prediction, 274-275 housing, 301 importance of macroeconomic measurement and prediction, 273-274 live production results, 315-325 accurate measurement, 323-324 archetypes of desired skills, 320t mitigating data colonialism, 324-325 prediction, 318-319 public benefits of microfoundations of macro, 321-323 typical model development loop, 321-323 macro data, 275-279 increased noise in times of low growth, 278-279 revision problem in traditional data, 275-278 modeling differences, 298-300

Macro forecasting (Continued) nonfarm payrolls, 301-315 blackbox model, 307-308 bootstrapping, 309-310 bottom-up model, 307-308 clean data, 309-310 ensembling, 309-310 internal consistency, 309-310 interpretable model, 307-308 model overconfidence metric, 310-313 modeling noise in small data sets, 309 shrinkage, 309-310 techniques, 311t-312t top-down model, 307-308 predicting data releases, 295-301 real-time macro data with less noise, 279-283 nowcasting, 279 pyramid-like framework of nowcasting, 279-283 retail, 301
Macroeconomic measurement and prediction, 273-274 stocks and bonds in NBER recession vs. expansions, 276t US data releases revision percentage, 277t
Macroeconomy, 257-258
Mainstream credit, 121-122
Measuring American Poverty Act of 2009 (MAP), 394-395
Measuring Poverty, 395-396
Median household assets, 76-77 median assets, 76f
Medical Conditions file, 370
Medical Expenditure Panel Survey (MEPS), 368, 370, 373
Medical expenditures, 362
Medical expenses, 410-411
Medical goods, 356-357 average utilization per disease, 373t basic theoretical framework, 357-361 decomposing nominal expenditures, 360-361 medical price and COL indexes, 358-359 utility and health, 357-358 BLS experimental disease-based price indexes, 368-370 current service-based approach to medical measurement, 361-365 data, 370-373

decomposition of nominal expenditures, 377-382, 380t-381t disease categories, 371t-372t disease-based approach, 365-368 disease-based price indexes, 374-377 limitations of current disease-based measures, 383-384 medical price indexes, 376f quality adjustment, 385-386 trends in utilization by disease, 374 utilization changes, 375t
Medical out-of-pocket spending (MOOP spending), 393 expenditures, 416
Medical price, 358-359
Medicare Part B premiums, 410-411
MEPS. See Medical Expenditure Panel Survey (MEPS)
Michigan Survey of Consumers, 258-259
Microfoundations of macro, 287-290 public benefits of, 321-323
Microtheory, 356-357
Millennials, 193-195 comparison of consumption behavior by generation, 211-222 empirical assessment of generational consumption patterns, 214-219 household spending in CE survey by age and generation, 212-213 comparison of demographics by generation, 198-201 educational attainment for individuals, 201-211 generations and review of research on age, generations, and economic decisions, 195-198 unique consumption basket, 219-222 vehicle purchases, 222-228
ML. See Machine learning (ML)
Model development loop, 318
Model overconfidence metric (MOM), 310-313
MOOP spending. See Medical out-of-pocket spending (MOOP spending)
Mortgages, 24 debt, 26-27 holding, 81-82 risky asset share, 82 income fluctuations around mortgage defaults, 153 resets, 150-153
Motor vehicles, 222-223

N
NAS. See National Academy of Sciences (NAS)
National Academy of Sciences (NAS), 391 panel on poverty measurement and research, 391-395
National Bureau of Economic Research (NBER), 274, 274f
National Health Expenditure Accounts (NHEA), 361
National Income and Product Accounts (NIPA), 53-54
National medical price index, 362
National School Lunch Program (NSLP), 407-408, 416
NBER. See National Bureau of Economic Research (NBER)
New York City (NYC), 393-394
NFP. See Nonfarm payrolls (NFP)
NHEA. See National Health Expenditure Accounts (NHEA)
NIPA. See National Income and Product Accounts (NIPA)
Nominal expenditures, decomposition of, 360-361, 377-382, 380t-381t growth for specific diagnoses, 383t
Noncash benefits of SPM, 407-409 housing assistance, 408-409 LIHEAP, 408 National School Lunch Program, 407-408 SNAP, 407 supplementary nutrition program for WIC, 408
Nonfarm payrolls (NFP), 274-275, 300f, 301-315 blackbox model, 307-308 interpretability versus flexibility tradeoff, 308f bootstrapping, 309-310 bottom-up model, 307-308 clean data, 309-310 ensembling, 309-310 internal consistency, 309-310 interpretable model, 307-308 model overconfidence metric, 310-313 MOM metric for particular model of NFP, 314t modeling noise in small data sets, 309 shrinkage, 309-310 top-down model, 307-308
Nonlinear technique, 315
Nowcasting, 279-280

Atlanta Fed’s GDP nowcast, 282f estimating economic cycle turning points, 282f implementation of nowcasting framework, 280f New York Fed’s GDP nowcast, 281f pyramid-like framework of, 279-283
NSLP. See National School Lunch Program (NSLP)
NY Fed. See Federal Reserve Bank of New York (NY Fed)
NYC. See New York City (NYC)

O
Office of Management and Budget (OMB), 389-391, 418
Official poverty measure, 390-391
OMB. See Office of Management and Budget (OMB)
On exactitude of science, 289-290
One-time surveys of expectations, 259
Online consumer spending by age cohort, 243-246 by festival, 238-239, 239f by product, 240-243, 240f by region, 246-248
Organizational mistakes, 319
Ownership of equity, 79
Ownership of risky assets, 77

P
Panel Study of Income Dynamics (PSID), 201, 229
Pawn brokers, 128-129
Pawn credit as alternative to regular bank credit, 124-126
Pawnbroking works, 127-128
Payday lending, 123-124
Payment-focused mortgage debt reduction, 153
Payment-to-income ratio (PIR), 89f-91f
PCE. See Personal Consumption Expenditures (PCE)
PDF. See Probability density function (PDF)
PDF-implied mean or median, 260-261
People retirement age, 179-181
Per capita real output, 377-382
Permanent income hypothesis (PIH), 3-4, 13-15, 141, 143, 148
Personal Consumption Expenditures (PCE), 331-332, 355, 363

PGeary. See Geary price indexes (PGeary)
Pharmaceuticals, 356
Phased retirement hypothesis, 165
Physician services, 356
PIH. See Permanent income hypothesis (PIH)
PIN. See Power Information Network (PIN)
PIR. See Payment-to-income ratio (PIR)
Point of Purchase Survey, 361
Policy makers, 273-274
Population, 194
Portfolio composition, 53
Postretirement investment, 176
Poverty thresholds, 390-391
Power Information Network (PIN), 223
PPI. See Producer Price Index (PPI)
PPPs. See Purchasing power parities (PPPs)
Prefectural cities in China, 249
Prevalence effect, 356-357
Price of health capital, 358-359 index approach, 360
Private insurance, 359
Probabilistic question formats, 259-261
Probability density function (PDF), 260-261
Probit regressions, 133
Producer Price Index (PPI), 362, 373
PROEF. See Program for Strengthening of Federal Financial Institutions (PROEF)
PROER. See Program of Incentives for Restructuring and Strengthening National Financial System (PROER)
PROES. See Program of Incentives for Reduction of State Role in Banking Activity (PROES)
Program for Strengthening of Federal Financial Institutions (PROEF), 114
Program of Incentives for Reduction of State Role in Banking Activity (PROES), 114
Program of Incentives for Restructuring and Strengthening National Financial System (PROER), 114
Prosperous Retirement, The (Stein), 165
PSID. See Panel Study of Income Dynamics (PSID)
Purchasing power parities (PPPs), 329

Q
Quality adjustment, 385-386
Quantitative question formats, 259-261
Quick warning, 285

R
RAIS. See Annual Social Information System (RAIS)
RAND’s American Life Panel, 258-259, 262-263
Random forest-bootstrapping nonlinearity, 315
Random health shock, 357-358
Rational expectations, 258
Rationality in consumer credit market accessing to mainstream credit for all Swedes vs. alternative credit users, 133-138 awareness about creditworthiness and immigrant status, 136-138 decision trees, 134f data, 128-129 empirical implementation, 133 pawn credit as alternative to regular bank credit, 124-126 pawnbroking works, 127-128 summary statistics, 129-133, 131t-132t variable description, 129t-130t
Real consumption effect, 356-357
Real expenditure, 377-382
Real-time macro data, 279-283 nowcasting, 279 pyramid-like framework of, 279-283
Regional price parities (RPPs), 329, 331, 337-338 adjusted personal incomes for metropolitan and nonmetropolitan portions, 343-348 regional price parities for states, 338-343, 339t-342t for states and metropolitan areas, 334-337 in United States, 329-330 price levels for CPI areas, 332-334
Registry of Credit Risk (CRC), 115
Regular bank lending policy, 123-124
Repeated sampling. See Bootstrapping
Replacement rate, 175-176
Research Experimental SPM thresholds, 403
Response rates, 56
Retail, 301 trade surveys, 362
Retirement, 177-179 data sources, 162-164 CE survey, 162-163 Chase data, 163-164 HRS/CAMS survey, 163 further research, 189

implications for employers, 188 ideas for firms, 188-189 life cycle implications of spending for firms providing financial planning, 176-177 of spending for plan providers and employers, 175-176 life cycle of spending, 166-170 accounting for long-term care costs, 174-175 data refinement, 166-168 generational view, 171-187 investable wealth levels, 171-174 J.P. Morgan expenditure model, 168-170 phased retirement hypothesis, 165 planning, 161 quantitative evidence of spending reductions in retirement, 165-166 retirement transition period, 166 shifting into, 177 average spending year before vs. year after retirement, 181-183 distribution of changes in spending year before vs. year after retirement, 183-184 evidence of retirement spending surge, 184-185 people retirement age, 179-181 retirement transition period data filter, 177 spending volatility, 185-186 beyond transition phase, 186-187 transition period, 166 data filter, 177
Revisions, 275 problem, 275-278
Risk from asset price shocks, 85-86
Risky asset share, 82 share with mortgage on primary residence, 81f
RPPs. See Regional price parities (RPPs)

S
SCE. See Survey of Consumer Expectations (SCE)
SCF. See Survey of Consumer Finances (SCF)
Schooling decisions, 257
SCR. See Credit Information System (SCR)
Self-directed retirement accounts (IRAs), 59

Service-based approach, 356. See also Disease-based approach current service-based approach to medical measurement, 361-365 current medical price indexes, 363-365 current methods, 361-363
Shocks, 85-91, 275
Shrinkage, 309-310, 313-314
Silent Generation, 195-196, 199, 213, 216-218
Simulation methods, 310
Single canonical stylized model, 148
Single-disease price indexes, 366-367
SNAP. See Supplemental Nutrition Assistance Program (SNAP)
SOI. See Statistics of Income (SOI)
SPM. See Supplemental Poverty Measure (SPM)
Spring Festival, 237, 239
Standard Euler equation, 269
Standard price index theory, 357-358
Statistical analysis techniques, 279-280
Statistical learning mistakes, 319
Statistics of Income (SOI), 55-56
Streetlight effect, 1
Student loans, 39b-43b
Subjective expectations. See also Consumer expectations integrating subjective expectations data into economic models of behavior, 266-269 survey data on, 258-259
Supplemental Nutrition Assistance Program (SNAP), 407
Supplemental Poverty Measure (SPM), 389-400 change in number of people, 413f changes to, 413 estimation, 411 extensions, 414-415 future directions for, 418-419 ITWG to developing SPM and early research, 395-400 NAS panel on poverty measurement and research, 391-395 necessary expenses, 409-411 child support paid, 410 medical expenses, 410-411 taxes, 409 work-related expenses, 410 noncash benefits, 407-409 official poverty measure, 390-391 poverty measure concepts, 401t-402t

Supplemental Poverty Measure (SPM) (Continued) poverty rates, 411-413 using official measure and, 412f research, 415-418 resources, 407 and poverty statistics, 390 thresholds construction, 403-407 equivalence scale, 404 geographic adjustments, 405-407 threshold estimation, 404-405
Supplementary nutrition program for WIC, 408
Survey of Consumer Expectations (SCE), 4-5, 258-259, 264 labor market expectations data, 264 respondents, 265-266
Survey of Consumer Finances (SCF), 9, 55-58, 57f, 197, 229 household finance research, 58 wealth measurement, 56-58
Survey of Economic Expectations, 258-259
Survey of Household Income and Wealth, 258-259
Survey of Professional Forecasters, 258-259
Synthetic cohort technique, 13

T
T-Mall, 237
Tangible personal property, 127-128
Taxes, 409 refunds, 143-145
Taylor rule, 307-308
Technological improvements, 293
Three-parameter equivalence scale, 397, 404, 417-418
Threshold estimation, 404-405
Tomb-Sweeping Day, 237
Top-down model, 307-308
Tornqvist index, 363
Traditional affordability metrics, 153
Traditional data, 283-284 sources, 285-286
Traditional economic metrics, 274-275
Traditional financial plans, 162
Transaction accounts, 65
Triennial SCF, 207
Two-adult-two child SPM thresholds, 405

U
UI. See Unemployment insurance (UI)
Uncorrelated noise, 286-287

Understanding Consumption (Deaton), 3-4
Unemployment insurance (UI), 145-147 benefits expiration, 145-148
US Bureau of Economic Analysis (BEA), 329-330, 361-362
US consumer, empirical analysis of big data, 3-6 consumer spending and aggregate economy, 7-8 household finance, 8-10 international perspectives, 15-17 measurement issues, 14-15 responding to shocks, 10-11 spending over life cycle, 11-14
US Department of Agriculture (USDA), 390-391
US Department of Health and Human Services (HHS), 174
US household balance sheets, 8-9
USDA. See US Department of Agriculture (USDA)
User-generated online data, 293-295
Utility, 357-358

V
Vehicle purchases, 222-228, 224f spending on food and housing, 228
Visualization, 324
Volatility, spending, 185-186 beyond transition phase, 186-187
Vulnerability across time, trends in, 89-91

W
Walmart, 236
Wealth measurement, 56-58
Web-scale data, 283
Weighted country-product-dummy approach (CPD-W approach), 332-333
WIC Program. See Women, Infants, and Children Program (WIC Program)
Wisconsin Measure thresholds, 414-415
Within-group inequality, 235
Women, Infants, and Children Program (WIC Program), 408 supplementary nutrition program for, 408
Work-related expenses, 410

Z
Zhuoyue.com, 236