Understanding Records, Second Edition: A Field Guide to Recording Practice 9781501342387, 9781501342370, 9781501342417, 9781501342400

The revised edition of Understanding Records explains the musical language of recording practice in a way any interested


English Pages [243] Year 2019


Table of contents :
Title Page
Copyright Page
Academic context
A quick note on organization
A final note on how to use this field guide
Chapter 1: Tracking (Making Audio Signals)
Chapter 2: Mixing (The Space of Communications)
Mixing and the multitrack paradigm
Signal processing
Chapter 3: Mastering (The Final Say)
Mastering: A brief history
Finalization: Mixing/mastering
Mastering as arm’s-length peer review
The art of mastering: Making a start
The art of mastering: EQ
The art of mastering: Dynamics
The art of mastering: Other tools, other approaches
Mid-side processing: Determining width and depth
Mid-side processing: Corrective and creative
De-noising: Single-ended and otherwise
A quick note on the peculiar art of remastering
The transfer process: Balancing for output(s)
Conclusion: Mastering as a value added for other services
Coda (Fade-Out)
Chapter 1
Chapter 2
Chapter 3



Understanding Records, Second Edition A Field Guide to Recording Practice Jay Hodgson

BLOOMSBURY ACADEMIC Bloomsbury Publishing Inc 1385 Broadway, New York, NY 10018, USA 50 Bedford Square, London, WC1B 3DP, UK BLOOMSBURY, BLOOMSBURY ACADEMIC and the Diana logo are trademarks of Bloomsbury Publishing Plc First published in the United States of America 2010 This edition published 2019 Copyright © Jay Hodgson, 2019 For legal purposes the Acknowledgments on p. vi constitute an extension of this copyright page. Cover design by Avni Patel Cover image © gremlin/Getty Images All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers. Bloomsbury Publishing Inc does not have any control over, or responsibility for, any third-party websites referred to or in this book. All internet addresses given in this book were correct at the time of going to press. The author and publisher regret any inconvenience caused if addresses have changed or sites have ceased to exist, but can accept no responsibility for any such changes. A catalog record for this book is available from the Library of Congress. ISBN: HB: 978-1-5013-4238-7 PB: 978-1-5013-4237-0 ePDF: 978-1-5013-4240-0 eBook: 978-1-5013-4239-4 Typeset by Deanta Global Publishing Services, Chennai, India Companion website: jayhodgsonmusic.com To find out more about our authors and books visit www.bloomsbury.com and sign up for our newsletters.

Contents
Acknowledgments
Tracking (Making Audio Signals)
Mixing (The Space of Communications)
Mastering (The Final Say)
Coda (Fade-Out)
Notes
Index

Acknowledgments
I owe a heartfelt debt of gratitude to many people who contributed directly to this book. First, and foremost, I would like to thank Steve MacLeod for reworking all of the figures, and for recording the vast majority of videos, featured throughout this book. Steve, who runs News Knowledge Media, is one of the hardest working and most talented filmmakers and photographers I have had the pleasure of working with; I look forward to many future collaborations! I would also like to thank Dr. Matt Shelvock, aka kingmobb, for agreeing to collaborate with me on the “putting it together” track for this book, “Lonesome Somedays,” and for patiently teaching me the ins-and-outs of modern cloud-based production and the streaming business. Our chats are some of the most enjoyable, and yet also some of the most edifying, ones I have had in years. Thanks Matt! I also owe thanks to a few musicians who provided me with exclusive videos and interviews for this book. I’m humbled by their willingness to participate, and I owe them a debt I will have to struggle to repay over the years. Specifically, I’d like to thank Alastair Sims, Alex Chuck Krotz, Kevin O’Leary, Dr. Matt Shelvock, Dylan Lauzon, Russ Hepworth-Sawyer and Nate Baylor. It is a personal belief that academics must begin to recognize the expertise of people who work in creative fields, and to do this, they must find a way to include the people who make the music they study in the conversations they have about that music. My sincere thanks go out to those noted above for helping me to make this happen. Finally, I’d like to thank Hannah Buckley, who provided a crucial final copyedit on this manuscript before I submitted it, and who graciously allowed me to use her music for some audio and video examples. And my gratitude, of course, extends to David Barker, the commissioning editor for the first iteration of this project, and to Leah Babb-Rosenfeld, who commissioned this update.
Thanks are also due to my musical “fellow travelers”: John “Chico” Barrett, Chris Campbell, Mark Collins, John Keogh, Richard Corner, Donna MacKenzie, Ludvig Girdland, Billy Mohler, Rohin Khemani, Natural Najera, Ramsey Roustum, Dave Goldyn, Mike Preis, Ryan Sackrider, Rollin Weary, Bruno Canale, John McDermott, Bill Bridges, Donald MacDonald, Bob Toft, and everyone else who has shared their love for music with me. And thanks are due to those who put up with me in musical creativity: Noah Pred, Rick Bull, Dylan, Nate, Alastair, Alex, Matt, Steve, Hannah, Dan, Jeff, Lucas, Lia, Jordan, Mark, Mike, Ryan, Rollin, Rohin, Billy, Johnny, Craig, Ramsey, Ludvig, and anyone else I may have missed. And, of course, thanks are due to my family and friends, especially Eva, James, and Alec—love you to no end!



Most people do not hear what they think they hear when they listen to a record. Perhaps you fall into this category and don’t realize it. Let’s try an experiment, to see if this is the case. I have prepared a playlist on Spotify, called “Understanding Records: Introduction,” which has a few tracks in it that I produced, mixed, mastered, performed, and/or worked on in some other musical capacity, so I know precisely what was done to make them. I am going to ask that you listen to these tracks and while you listen simply ask yourself: What do I think I’m hearing? No subscription is needed to access this playlist, nor any other playlist mentioned in this field guide. You can listen to any of the playlists I reference in this book using the Spotify app, and though the service will likely ask you to consider a paid subscription from time to time, you do not need one to access this material. To find the specific playlist I mentioned above, simply enter “Understanding Records: Introduction” into the app’s search tab, and the playlist should appear. If that doesn’t work, you can always navigate to the website for this book (www.jayhodgsonmusic.com), where you will find a hyperlink to take you to each of the playlists and videos I note throughout. As you listen to the first track—Nikki’s Wives’ “Hunting Season” (2018)— you might notice: female vocals, a pitch-shifted vocal line beneath that, electric bass, kick, snare, claps, toms, and so on. Whether or not you can identify every single sound, or effect, that you hear on these tracks is unimportant at this point, and completely irrelevant to the experiment. For now, I just want you to concentrate on what you think you are hearing. The next track in the playlist may be easier for some listeners to analyze. It is a string quartet I composed and recorded in Chicago a few years ago, called “String Quartet No. 1: Gutei’s Finger” (2017). 
You may be able to identify the various instruments of the quartet—a cello, a viola, and two violins—or you may just hear “strings,” and that is the extent of your knowledge of orchestral instruments. Whatever you hear is entirely fine. Perhaps you hear some of the reverb we applied to the tracks during mixdown, but whether or not you can hear things like reverb at


this point is unimportant. In any event, this field guide is designed to help you train your ears so you can hear things like reverberation, so if you don’t hear it now, come back and listen to this track again once you have worked your way through all the material in this field guide, and see if you can hear it then. Again, at this point, all I want you to do is listen to these tracks and consider: What do I think I’m hearing? Once you have listened through the playlist, and made your mental notes, ask yourself: How many different sounds have I heard? Most tracks had a kick drum, but each kick drum sounded different. Many of the tracks had vocal parts, and percussion parts of some sort or another, but they all sounded somehow different, too, sometimes even at different points in the same song. These are the particulars, the details, of the craft that I call “Recording Practice,” that is, making and hearing recorded musical communications. This field guide is devoted to helping readers learn to hear such particulars. What makes a kick drum cut through a mix? Why do some tracks groove so hard that everything seems to sway in time, but only when the kick drum comes in? Why does one track sound so loud, and another so quiet, when my iPhone is set to the same volume? When you finish this field guide, you should be able to answer these questions without trouble. And if you make records (or hope to someday), reading this book will help you to hear the various places your recorded musical communications can and should go, and it will give you some helpful tools for executing your audile imaginings. On a deeper level, this field guide is designed to clarify just what it is that recordists make, and listeners hear, when they engage in recorded musical communications. That is to say, this field guide is meant to clarify the material substance of Recording Practice. When you listened through the playlist mentioned above, you undoubtedly noted a number of different sounds. 
You heard kick drums, violins, electric basses, vocals, and so on. And yet you didn’t actually hear those sounds—you didn’t hear acoustic phenomena. What you actually heard was a single acoustic phenomenon, a single sound, produced by speaker and headphone technology, designed to trick your hearing apparatus into believing it detects the presence of various acoustic phenomena. In other words, you heard a single sound that only sounds like all those other sounds. Recording Practice is simply the art of crafting such sonic representations. Why does this matter? The reason is simple. Whether we analyze records for a living (as musicologists and music theorists, and professors working in the

field of popular music studies, claim to) or make them (as I myself do, and as I teach my students to), we should know exactly what it is that we make and hear when we make and hear recorded musical communications, and that is a single acoustic phenomenon, a single sound, designed to sound like other sounds. This field guide is meant to help people hear “the screen” when they listen to a record, the same way cinematographers see “the screen” when they design the optics of a movie. At the very least, this field guide will help analysts and recordists alike to stop mistaking the subject of recorded musical communications—the sounds that recordists represent through their recording practices, which I explain in this field guide—for communication itself. Recording Practice is the art of using sound to represent sound(s). It is my ultimate hope that this field guide will help people hear this happen.

Academic context
It is generally agreed that what people do to make and hear a record, a process I call “Recording Practice,” is different from what they do to make and hear a live musical performance. However, almost nothing has been written about Recording Practice which treats it as a totally unique way of musically communicating. Besides a few notable exceptions, analysts and historians have considered Recording Practice nothing more unique than a technological adaptation of “live” performance practices. Not surprisingly then, even the most fundamental of record-making techniques—say, compressing and equalizing an incoming audio signal—remains conspicuously absent from the lion’s share of published research on music. This leaves researchers and practitioners with a very limited collection of writings to consult, should they want to understand what people actually hear when they listen to recorded music. The current dearth of professional research on specific record-making techniques used on commercially available records likely has much to do with the technical nature of Recording Practice. Musicians and fans often describe record making as a process which is as mystifying and compelling as, say, building a bilge pump. Consequently, a kind of defensive snobbery tends to dominate critical, historical, and academic commentary on Recording Practice, the vast majority of which treats Recording Practice as only a technical support, something like building scaffolds, for the “true arts” of performance and composition. My first goal in writing this field guide is to change that attitude.


Recording Practice is a technical affair, to be sure. But it is also immensely artistic. As Dave “Hard Drive” Pensado explains: [Recording Practice] is unique, in that you have to be technical, and creative, at the same time. It’s hard to imagine [Pablo] Picasso designing the first computer. It’s also hard to imagine Bill Gates painting “Guernica.” And as a recordist, you have to be Bill Gates for thirty seconds, then Picasso for thirty seconds. You’re constantly shifting back and forth.1

Recording Practice is a complete, self-sufficient musical language. The likes of Glenn Gould and the Beatles said as much when they retired from concert performance altogether in the mid-1960s. Their controversial absences from the concert stage loudly proclaimed that, unlike before, success in musical communications could be had just by making records. These musicians maintained that the technical and aesthetic considerations that guided musical communications for centuries before should no longer apply. Most crucially, performability should no longer play a guiding role in record making. According to Virgil Moorefield, the Beatles’ Revolver marks the dramatic turning point in this historical development: From this album [read Revolver] on . . . The Beatles would experiment with abandon. They dispensed with the concept of “realism,” or what could be called “figurative” recording, often constructing instead a virtual or imaginary space unconfined by what is possible in the “real” world of live performance on conventional instruments. For [producer George] Martin and The Beatles, the performability of Revolver and the groundbreaking records that followed it was not a concern, not only because The Beatles were not obliged to perform live (they retired from touring in 1966), but because the records themselves succeeded in shifting the audience’s expectations from a replicated concert experience to something more.2

Recording Practice—this “something more”—is now the dominant language in musical communication. However, that language remains a total mystery to many listeners. This field guide is designed to rectify this state of affairs. Working their way through this field guide, any interested reader can learn to recognize and reproduce the most fundamental musical terms that recordists use to communicate now. Drawing on records that shaped the postwar pop soundscape, and ensuring that each cited track is available on all modern streaming services, but most especially on Spotify, this field guide

(i) explains the fundamental terms of Recording Practice, presented in chronological record-making sequence—that is, in the order they typically arise during a conventional record production; (ii) explains and elucidates the techniques that recordists use to create those terms; (iii) provides original audio examples designed to clarify the musical techniques and procedures which those terms operationalize; (iv) describes, and locates on a number of hit records, common musical uses for those terms; and, finally, (v) situates them within the broader record-making process at large. This field guide is by no means exhaustive. Recording Practice is an immense, and immensely complicated, topic which requires volumes of encyclopedic exposition to comprehensively elucidate. The musical techniques I survey in this field guide are simply those that recur most often on modern pop records. In so doing, they comprise a fundamental musical lexicon or a basic musical vocabulary. Aside from a few pioneering exceptions, though, this lexicon remains notably absent from professional research on popular music history and practice. Surprisingly, these musical terms are also absent from the vast majority of audio-engineering textbooks currently on the market, which usually only sketch the technical details of Recording Practice without explicitly referencing any of the aesthetic programs that recordists deploy their musical practice to service. Gaps in knowledge are to be expected in a field as young and diffuse as popular music studies. Numerous insightful and challenging analyses of record making have indeed emerged in the last two decades, but these studies typically address the analytic priorities and concerns of disciplines which are not primarily interested in musical technique per se, like cultural studies, sociology, media studies, cultural anthropology, and political economy.
Correspondingly, Recording Practice usually fails to register in these disciplines as a fundamentally musical affair. A straightforwardly technical perspective on Recording Practice has only very recently begun to emerge, as the taboo on studying popular musics (and pop recording practices), which once gripped university music departments, gradually slackens. However, even as musicologists and music theorists turn their analytic attentions to pop records, they remain largely fixated on musical details that can be notated—formal contour, harmonic design, pitch relations, metered rhythms, and so on—even though many of these authors reject notation as an analytic tool. In other words, no matter how avant-garde their methods, many commentators


still typically fixate on the very same musical details that musicologists and music theorists traditionally analyze. They simply disagree over how best to interpret those details. Or they study products of musical processes as though they were the processes themselves; to study songwriting, they look at songs, and to study record production, they look at records. Recording Practice itself only registers in a small, albeit growing, collection of articles and books. It is within this growing collection that the following field guide should be situated, academically speaking. This field guide is designed to demonstrate, describe, and elucidate the fundamentally musical nature of Recording Practice, which traditional analytic modes have only heretofore described as an extramusical conveyance or reproduction of “live” performance practices. At the same time, this book provides a broader aesthetic orientation for the technical procedures that audio-engineering textbooks detail. In so doing, this field guide should provide future recordists with a useful aesthetic and technical orientation to the immensely complicated craft they have determined to learn.

A quick note on organization
This field guide was designed to introduce the particulars of Recording Practice to a general (albeit research-minded) readership. Analysts, historians, and practitioners alike should find information in this field guide to help them in their endeavors. I adopted the organizational approach of sequential entries, each of which is concerned with some technique or sub-technique of the broader record-making process at large. I divide record making into three roughly chronological phases: (i) tracking, (ii) mixing and signal processing, and (iii) mastering. I then divide each of these phases into a series of component, constitutive “meta-techniques” and “sub-techniques.” For instance, I divide the tracking phase into three meta-techniques, namely, “transduction,” “direct-injection,” and “sequencing.” Embedded within each of these broader meta-techniques is a galaxy of constitutive sub-techniques, each of which requires analytic attention as both a singular procedure and a component of Recording Practice in general. From a practical perspective, such divisions will always be artificial. Each time recordists select a particular microphone to record a particular sound source, for instance, they filter the frequency content of that sound source in particular ways; in so doing, they equalize and mix their records, even at this very early

stage. Recording Practice is an entirely holistic procedure, after all. Tracking, mixing, signal processing, and mastering cannot be separated—not in practice, at least. They are easily excised in theory, though, because each procedure is tailored to produce a different result. During the tracking phase, for instance, recordists generate raw audio signals which they later massage into final form using a variety of mixing, signal-processing, and mastering techniques. Using signal processing, recordists filter and refine the “raw audio signals” they collect during tracking, and they spatially organize the component tracks of a multitrack production into well-proportioned shapes. In mastering, recordists apply a finishing layer of audio varnish to their mixes, to ensure that they sound optimal on a variety of playback machines and in a variety of different formats. I saw no reason whatsoever to “dumb down” the explanations for the procedures offered in this field guide. Though I have done my absolute best to provide simple, easy-to-understand explanations for each of the concepts and techniques surveyed in the following pages, I also struggled to ensure that I did not over-simplify anything in so doing. Recording Practice is an artistic and technical affair. Thus, readers interested in learning anything about Recording Practice should prepare to consider a number of unambiguously technical concepts and procedures, which inhere in every recorded musical communication. As Dave “Hard Drive” Pensado so aptly put it, the most successful recordists are equal parts Bill Gates and Pablo Picasso—neither the technical nor the artistic disposition should dominate. I should also note the threefold selection criterion used to choose the musical examples cited throughout this field guide. First, whenever possible, I sought verification (in print) that the tracks cited feature the musical techniques I say they feature. 
Then I considered whether they provide the clearest possible illustration of those techniques. Finally, I verified that the tracks are readily available for streaming on most major streaming services, but especially on Spotify. I also intentionally culled examples from as random a sample set as I could muster to emphasize the fact that all of the musical terms in this field guide are, unless otherwise noted, deployed by recordists working in any musical genre.

A final note on how to use this field guide
This field guide is designed to be a multisensory learning experience, useful for musicians, recordists, historians, and analysts alike. As such, while they work


their way through this field guide, readers should do their very best to listen as much as they read. Original tracks have been created to demonstrate many of the audio concepts I consider in this field guide; playlists have been created on Spotify, which has an unlimited free-use term fettered only by advertising, so readers can hear these concepts in professional practice, and exclusive videos, and print interviews, featuring successful recordists I have had the honor of working with over the years, are also featured throughout this field guide. Prompts are given in the body of the text when this material should be consulted, and information is provided about which musical features readers should focus on while they listen. Readers should endeavor to listen carefully to the commercial tracks and original audio examples cited throughout, and precisely when they are cited, and to view the videos when prompted. Again, it is crucial that readers listen as much as they read while working through this field guide. “The most important music of our time is recorded music,” notes Alexander Case, whose Sound FX: Unlocking the Creative Potential of Recording Studio Effects was a primary inspiration for this book. “While the talented, persistent, self-taught engineer can create sound recordings of artistic merit, more productive use of the studio is achieved through study, experience and collaboration.”3 I would simply add that listening, too, is a useful and instructive tool. To make records, and to understand the musical language of Recording Practice, recordists and analysts alike should first learn to hear Recording Practice. This field guide is meant to help readers of all disciplinary backgrounds and expertise do just that.


Tracking (Making Audio Signals)

When recordists track a record, they create the raw audio signals which they later shape into a so-called “master cut,” ready for streaming and physical release, using a range of signal processing, mixing, and mastering techniques. The material form this raw audio signal takes depends completely on the technology recordists use to create and store it. If they use an Edison cylinder phonograph, for instance, recordists create raw audio signal in the form of bumps and pits on a wax cylinder. If they use a computer, on the other hand, the raw audio signal they make takes shape as a digital sequence of binary code. The number of separate audio signals, called “tracks,” which can be combined into a “master cut” also depends on technology. Early acoustic devices, like the phonograph and the gramophone, had only one track available for recording. Tape machines, which are electromagnetic devices, eventually expanded that number from two to sixty-four (and more) available tracks. And now, with the advent of digital-audio (computer-based) recording technology, recordists can create and combine a theoretically unlimited number of tracks. Regardless of the technology they use, though, recordists have only three fundamental techniques at their disposal for tracking records, namely (i) transduction, (ii) direct-injection (DI), and (iii) sequencing. Transduction remains the most common of these techniques and, accordingly, warrants immediate attention. A brief explanation of DI and sequencing techniques follows.

Transduction
Transduction, or the conversion of one kind of energy into another kind of energy, is the technical basis of tracking. The three most iconic recording technologies—microphones, headphones, and speakers—are, in fact, transducers. Microphones transduce (convert) acoustic energy into alternating current, which is electrical energy, while headphones and loudspeakers transduce electrical energy back into acoustic energy.


A number of variables mediate every transduction. First among these mediating variables are microphones. Each microphone has a biased way of hearing sound, which recordists call its “frequency response.” Some microphones exaggerate, while others attenuate, certain frequencies, while still others only transduce sounds coming from particular directions. Variations in frequency response are vast, and seemingly endless. Given the crucial role that microphones play in the tracking process, it should come as no surprise to discover that these variations guide the microphone selection and placement processes in their entirety. To understand the almost total influence frequency response exerts over the tracking process—and, in turn, to understand the crucial role that microphone selection plays in determining the final audible character of tracks on a record—readers must first understand what it is that frequency response shapes, namely, a so-called “sound source.” Thus, I will explain the concept of a sound source before I elucidate the microphone selection process. Readers who are already familiar with the physics of sound should feel free to skip ahead to the following section of this field guide, headed “Microphone selection, I.”

Sound source, soundwave, and waveform
A sound source is a vibrating physical object—no more, no less. As objects vibrate, their back-and-forth motions displace the air molecules surrounding them, forcing them to compress (bunch together) and rarefy (thin apart) in recurring patterns called “soundwaves.” The vibrational energy of the sound source is rapidly conveyed from air molecule to adjacent air molecule in the form of a soundwave until it reaches, for my purposes, human ears or a microphone. The tympanic membrane in the human ear vibrates in sympathy with soundwaves, which is to say, the membrane moves back and forth at a rate which is directly proportional to the vibration of the sound source itself. Through a complex electrochemical process, this sympathetic back-and-forth motion creates a sequence of electrical impulses which, in turn, the brain interprets as sound. With microphones, a diaphragm rather than a tympanic membrane sways sympathetically with the changes in air pressure that a soundwave creates. This sympathetic motion is then translated into alternating positive and negative charges of electrical current, which is transmitted by cables through routing technology (e.g., mixing consoles), processing technology (e.g., compressors and equalizers (EQs)), and, ultimately, to storage technology (e.g., tape machines and computers).


Recordists visualize soundwaves as “waveforms” (see Figure 1.1). Waveforms graph the changes in air pressure that characterize a soundwave. The horizontal axis of a waveform represents time, while the vertical axis represents changes in air pressure. Therefore, the pushing-and-pulling energy of a soundwave is flipped onto its side in a waveform graph and is represented as an up-and-down motion along the vertical axis. Upward motion along the vertical axis of a waveform represents compressions of air molecules, while downward motion represents rarefactions. The horizontal axis delineates the time it takes for each successive compression and rarefaction to occur. The vertical expanse of a waveform, which recordists call “amplitude,” delineates the total displacement of air molecules a soundwave generates. As such, amplitude represents the total displacement power of a soundwave. This displacement power is usually measured in decibels (dB), which is a unit of measurement named after an often overlooked pioneer of the recording process: Alexander Graham Bell (this is why the “B” in “dB” is usually capitalized). A decibel is one-tenth of a “bel,” with the bel defined as the common logarithm of the ratio of two powers. Accordingly, when one power is ten times another, it is said to be 1 bel, or 10 dB, more powerful. A 100 horsepower (hp) engine is 10 dB more powerful than a 10 hp engine, for instance, while a soundwave ten times more powerful than another is said to be 10 dB louder.
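
The decibel arithmetic can be made concrete with a short calculation. What follows is a minimal sketch, not a tool used in this field guide: it relies on the standard definition of the bel as the common (base-10) logarithm of a ratio of two powers, with one decibel equal to one-tenth of a bel, and the function names are hypothetical ones chosen for illustration.

```python
import math


def power_ratio_to_bels(power, reference):
    """Return the level difference in bels: log10 of the power ratio."""
    return math.log10(power / reference)


def power_ratio_to_db(power, reference):
    """Return the level difference in decibels (ten decibels per bel)."""
    return 10.0 * power_ratio_to_bels(power, reference)


# A 100 hp engine compared with a 10 hp engine: a tenfold power ratio.
engine_bels = power_ratio_to_bels(100.0, 10.0)  # 1 bel
engine_db = power_ratio_to_db(100.0, 10.0)      # 10 dB

# Doubling the power adds roughly 3 dB.
doubling_db = power_ratio_to_db(2.0, 1.0)
```

Because the scale is logarithmic, each further tenfold increase in power adds another 10 dB rather than multiplying the figure.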

Figure 1.1  A waveform representation of the changes in air pressure produced by plucking the bottom string of an acoustic guitar.


Understanding Records, Second Edition

Figure 1.2  A close-up of the waveform from Figure 1.1, with peak compression and rarefaction amplitudes noted.

Contrary to popular belief, amplitude doesn’t measure “loudness.” Amplitude roughly equates with “volume,” which designates, for my purposes right now, the peak dB a waveform is set to reach. Loudness, on the other hand, is a subjective impression of a sound created by a number of characteristics, not least of which is so-called “average amplitude,” or the amount of time a waveform spends at or near its peak volume. If a sine waveform and a square waveform are both set to peak at the very same volume (see Figure 1.3), the human ear nonetheless hears the square waveform as louder. This is because the square waveform spends almost all of its time at or near peak volume, while the sine waveform only quickly slides by that mark. Recordists refer to the difference between the peak amplitude of a soundwave and its average amplitude as its “crest factor.” “Transient,” or percussive, sounds usually exhibit a greater crest factor, that is, they exhibit a greater difference between their peak and average amplitudes, than do sustained sounds. Waveforms which spend all or most of their time at peak amplitude, on the other hand, like the so-called “square wave” in Figure 1.3, exhibit practically no crest factor at all.
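The difference between a sine wave and a square wave can also be checked numerically. The sketch below is a rough illustration only: it treats “average amplitude” as the RMS level (as Figure 1.4 does), builds one idealized cycle of each waveform, and reports crest factor in dB. The sample count and function name are assumptions of mine.

```python
import math

def crest_factor_db(samples):
    """Peak level minus RMS level, in dB (20*log10 because these are amplitudes)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

N = 1000  # samples in one cycle (arbitrary, chosen for illustration)

# Both waveforms peak at exactly the same level (1.0).
sine = [math.sin(2 * math.pi * n / N) for n in range(N)]
square = [1.0 if n < N // 2 else -1.0 for n in range(N)]

sine_crest = crest_factor_db(sine)
square_crest = crest_factor_db(square)

print(f"sine crest factor:   {sine_crest:.2f} dB")
print(f"square crest factor: {square_crest:.2f} dB")
```

The idealized square wave sits at its peak for the whole cycle, so its peak and RMS levels coincide, while the sine wave’s RMS level falls about 3 dB below its peak.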

Demonstration tracks and playlists: Spotify

At this point in the chapter, I would like to direct readers to the various playlists on Spotify, which I reference throughout this book. These playlists contain audio



Figure 1.3  A “sine wave” (above) and a “square wave” (below).

Figure 1.4  Crest factor refers to the distance between a waveform’s peak amplitude (peak level) and its average amplitude (RMS level).



examples designed to demonstrate and aurally concretize the concepts I explore in the following pages, or comprise commercially released tracks that nicely demonstrate how some concept or technique has played out in the creative practice of recordists. I have used Spotify to house these playlists for a number of reasons, but the most salient is that its freemium model allows for free use without any term limits, and thus I can be reasonably certain that readers will be able to avail themselves of these audio materials for as long as Spotify continues to operate. As I noted in the introduction for this book, it is crucial, to my mind, that readers listen to the audio examples—and, as I’ll note later, view video materials on YouTube—when prompted in the reading. Indeed, the audiovisual materials I have created for this book are not meant as accompaniment but, rather, as crucial parts of a broader whole. It is only by learning to hear musical techniques at work that the concepts which underpin them take on any creative meaning, and this is the only meaning that has relevance, as far as I am concerned. Given that this is the first time readers are asked to navigate their way to one of these playlists in the body of this field guide, I will explain how they should interpret the suggestion. In the entry below, you will see a designation for a “Playlist Title.” This is the title of the playlist you are asked to find on Spotify, and if you enter it into Spotify’s search engine, it will show up first on the list that Spotify suggests (if, for some reason, the playlist doesn’t appear in the search engine, you can always find a direct link to these playlists at www.jayhodgsonmusic.com). Then there is a designation for one or more tracks, the number representing where in the playlist they are located.
Thus, in the entry below, readers are asked to navigate to the playlist titled “Understanding Records, Chapter One: Audio Concepts & Demonstrations.” They are furthermore asked to navigate to track 1, titled “Sine Wave & Square Wave.” Finally, a brief explanation notes that listeners will hear a sine wave and a square wave playing the same pitch, and peaking at the same level, and that they should consider which strikes them as subjectively louder.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 1 Sine Wave & Square Wave
On this track, readers hear a sine wave, followed by a brief silence, and then a square wave. Both waveforms play the same fundamental pitch, and at the same peak level (“volume”) of −0.3 dBFS. Listeners should consider which waveform strikes them as subjectively louder, despite the fact that they reach the same “volume” of −0.3 dBFS. The process repeats a number of times.



Frequency

Frequency measures the number of times per second that the changes in air pressure which define a soundwave recur, and it is usually expressed in Hertz (Hz) and kiloHertz (kHz). A soundwave which repeats 20 times per second is said to have a frequency of 20 Hz, while a soundwave which recurs 20,000 times per second is said to have a frequency of 20 kHz. Humans are only equipped to hear frequencies more or less within this range. Barring variations in biology, frequencies below 20 Hz and above 20 kHz are actively rejected by the human ear. Though it is technically reductive to say so, for the purposes of this field guide it will suffice to say that the human ear subjectively interprets frequency as pitch, just as it is sufficient to say that the human ear subjectively interprets average amplitude as loudness. Soundwaves of equal volume, which recur with greater rapidity than others, sound like they have a “higher” pitch to human ears. If a motor idles at a rate of 1,200 rotations per minute (rpm), for instance, which translates into a rate of 20 rotations per second, it generates a soundwave with a frequency of 20 Hz. Compared to the soundwave created by a motor which idles at a rate of, say, 2,400 rpm, the motor which idles at 1,200 rpm produces a lower sounding pitch.
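The motor example reduces to simple arithmetic, sketched below in Python. The helper names are hypothetical, and the octave calculation relies on the standard musical convention that doubling a frequency raises the pitch by one octave, which the passage above does not itself state.

```python
import math

def rpm_to_hz(rotations_per_minute):
    """Rotations per minute converted to cycles per second (Hz)."""
    return rotations_per_minute / 60.0

def octaves_between(lower_hz, higher_hz):
    """Pitch distance in octaves; each doubling of frequency is one octave."""
    return math.log2(higher_hz / lower_hz)

slow_motor_hz = rpm_to_hz(1200)
fast_motor_hz = rpm_to_hz(2400)
interval = octaves_between(slow_motor_hz, fast_motor_hz)

print(f"1,200 rpm -> {slow_motor_hz:.0f} Hz")
print(f"2,400 rpm -> {fast_motor_hz:.0f} Hz")
print(f"pitch difference: {interval:.0f} octave")
```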

Figure 1.5 Keys on a standard acoustic piano keyboard and their corresponding frequencies, spanning slightly more than an octave above middle-C (256 Hz).



Wavelength

Soundwaves occur—or, better yet, they recur—in both time and space. The physical distance a soundwave must travel to complete one cycle of compressions and rarefactions (i.e., its wavelength) is inversely related to its frequency. Higher pitched, and thus more rapidly recurring, frequencies require shorter wavelengths, while lower pitched, less rapidly recurring, frequencies require longer wavelengths. Table 1.1 lists the wavelengths of common musical sounds, expressed in feet. Even just a cursory gloss through this table should convince the most stubborn of skeptics that a massive variance in wavelength inheres across the audible spectrum. Though the concept of wavelength may seem completely academic, it actually plays an important and thoroughly practical role in the tracking process. Wavelength is a core player in the creation of so-called “standing waves” and “room modes,” for instance, which acoustically bias a room into exaggerating the amplitude of only certain frequencies, usually in the lower end of the frequency spectrum. When recordists track in a room that acoustically exaggerates, say, 384 Hz, each time an electric bassist plucks that frequency, it registers at a higher volume than other frequencies from the electric bass. This presents myriad problems during mixing and mastering. Faced with a spiking volume each time the bassist plucks 384 Hz, recordists have no choice but to somehow minimize that specific frequency each time it sounds, without disfiguring the broader dynamic and frequency balance of the track at large. Wavelength also plays a crucial role in determining something called “phase coherence” in recording, which I examine next.

Table 1.1  Frequency (in Hertz) and wavelength (in feet) for common musical sounds

Sound                         Approximate frequency   Wavelength (ft)
Lowest audible frequency      20 Hz                   56.5
Lowest note on a piano        28 Hz                   40.4
Kick drum fundamental         60 Hz                   19.0
Concert tuning note (A440)    440 Hz                  2.57
Highest note on a piano       4.18 kHz                0.27
Vocal sibilance (s)           5 kHz                   0.23
Cymbal sizzle                 16 kHz                  0.07
Highest audible frequency     20 kHz                  0.05
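The relationship behind Table 1.1 can be sketched as wavelength = speed of sound ÷ frequency. The Python below is only an approximation: the speed of sound (about 1,130 feet per second in room-temperature air) is an assumption of mine, chosen because it reproduces the table’s 20 Hz entry, so other rows may differ slightly from the printed values through rounding.

```python
SPEED_OF_SOUND_FT_PER_S = 1130.0  # assumed value for room-temperature air

def wavelength_ft(frequency_hz):
    """Distance, in feet, covered by one full cycle at the given frequency."""
    return SPEED_OF_SOUND_FT_PER_S / frequency_hz

sounds = [
    ("Lowest audible frequency", 20.0),
    ("Concert tuning note (A440)", 440.0),
    ("Highest audible frequency", 20000.0),
]

for label, frequency in sounds:
    print(f"{label:28s} {frequency:8.0f} Hz  {wavelength_ft(frequency):6.2f} ft")
```

Halving the frequency doubles the wavelength, which is why the lowest audible sounds span tens of feet while the highest span a fraction of an inch.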



Phase coherence, phase interference, and comb filtering

When a soundwave recurs with the same characteristic shape for an extended period of time, it is called a “periodic soundwave.” The “period” of a periodic soundwave is the amount of time it takes to complete one cycle, which is to say, one time through the pattern of compressions and rarefactions that defines it. The “phase” of a periodic soundwave designates a physical place in that cycle, and it is measured in degrees, with 0° and 360° marking the beginning and ending of a full cycle, and 180° marking the halfway point. When recordists transduce the same sound source using two microphones, for instance, and the resulting waveforms captured by each microphone exhibit the same shapes throughout their cycles, the two signals are said to be “in phase.” When there are discrepancies in the waveforms, the signals are said to be “out of phase.” In fact, the concept of “phase” underwrites many tracking procedures. Any time recordists use more than one microphone to track the same sound source, they risk inducing so-called “phase interferences,” that is, tonal distortions created by recording the same soundwave but at different points in its phase, so the resulting signals misalign when combined. For example, one microphone may pick up a soundwave while it is in its compression phase at particular frequencies, whereas a differently placed microphone picks up the same waveform while it is rarefying at the same frequencies. Thus, the resulting combined waveform would be tonally deficient in the specific frequency region where they overlap while moving in opposite directions, as the opposing forces counteract and, in the process, do something like “turn down the volume” where the waveforms oppose.
This “weakening” of a waveform in particular regions, when it is induced by discrepancies in phase, is called “destructive phase.” In extreme cases, destructive phase can produce a peculiar kind of tonal distortion called “comb filtering.” Alexander Case explains this phenomenon nicely: Combining a musical waveform with a delayed version of itself radically modifies the frequency content of the signal. Some frequencies are cancelled, and others are doubled. The intermediate frequencies experience something in between outright cancellation and full-on doubling. .  .  . Taking a complex sound like guitar, which has sound energy at a range of different frequencies, and mixing in a delayed version of itself at the same amplitude, will cut certain frequencies and boost others. This is called comb filtering, because the alteration in the frequency content of the signal looks like teeth on a comb.1
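Case’s description can be sketched numerically. When a signal is summed with an equal-level copy of itself delayed by some time, the gain at each frequency works out to 2·|cos(π × frequency × delay)|, so complete cancellations fall at odd multiples of 1 ÷ (2 × delay). The Python below is a minimal illustration of that idealized case; the 1 ms delay and the function names are assumptions of mine, not values taken from any recording.

```python
import math

def comb_gain(frequency_hz, delay_s):
    """Linear gain when a signal is summed with an equal-level delayed copy."""
    return abs(2.0 * math.cos(math.pi * frequency_hz * delay_s))

def notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies, up to max_hz, at which the two copies cancel completely."""
    notches = []
    k = 0
    while True:
        frequency = (2 * k + 1) / (2.0 * delay_s)
        if frequency > max_hz:
            break
        notches.append(frequency)
        k += 1
    return notches

delay = 0.001  # a hypothetical 1 ms delay between the two copies
notches = notch_frequencies(delay)

print(f"first notches (Hz): {[round(f) for f in notches[:4]]}")
print(f"gain at 500 Hz:  {comb_gain(500.0, delay):.3f} (cancelled)")
print(f"gain at 1000 Hz: {comb_gain(1000.0, delay):.3f} (doubled)")
```

The evenly spaced notches are the “teeth” of the comb; shortening the delay spreads them further apart, while lengthening it packs them closer together.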



Figure 1.6  A comb filter, so named for its resemblance to the teeth on a comb. Each dip in the comb filter, which recordists call a “notch,” represents an attenuation of the input signal at a particular frequency.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 2. Comb Filter
This track begins with white noise. A comb filter is slowly introduced, and moved up and down the audible spectrum.

Far from purely academic considerations, the related concepts of phase, phase interference, and comb filtering direct recordists while they track, especially when they use more than one microphone to do so. In fact, recordists have devised a number of tracking techniques specifically built on the concept of phase interference. Bob Ezrin offers a good example: On an emotional level, it’s difficult to get vocalists to deliver exciting performances if they feel uncomfortable wearing headphones. I’d typically have them sing without headphones and let them monitor the tracks through some huge speakers. To diminish leakage from the rhythm tracks into the vocal mic, I’d use the old trick of putting the speakers out of phase. Somewhere between those two speakers will be a point at which the sound-pressure level is close to



zero. All you have to do is have someone move the mic around while you listen in the control room for the spot where there’s almost no leakage. Leave your mic in that spot. Now your vocalist can stand in front of this great big PA system and feel the music surround them, just as if they were singing onstage.2
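The null that Ezrin describes can be approximated with a little trigonometry. If two speakers play the same tone with opposite polarity, and both arrive at the microphone at the same level, the residual level depends only on the difference between the two path lengths: 2·|sin(π × frequency × path difference ÷ speed of sound)|. The Python sketch below rests on those simplifying assumptions (equal levels, a single tone, no room reflections), and its speed of sound and distances are hypothetical.

```python
import math

SPEED_OF_SOUND_FT_PER_S = 1130.0  # assumed value for room-temperature air

def residual_gain(frequency_hz, path_difference_ft):
    """Linear level left over when a tone meets its polarity-inverted copy.

    Assumes both copies reach the microphone at equal level; the only
    difference between them is the extra distance one of them travels.
    """
    delay_s = path_difference_ft / SPEED_OF_SOUND_FT_PER_S
    return abs(2.0 * math.sin(math.pi * frequency_hz * delay_s))

# Microphone exactly equidistant from both speakers: complete cancellation.
at_null = residual_gain(1000.0, 0.0)

# Microphone 0.05 ft (a little over half an inch) off the null.
low_leak = residual_gain(100.0, 0.05)
high_leak = residual_gain(5000.0, 0.05)

print(f"at the null:          {at_null:.3f}")
print(f"100 Hz, 0.05 ft off:  {low_leak:.3f}")
print(f"5 kHz, 0.05 ft off:   {high_leak:.3f}")
```

Under these assumptions, low frequencies stay well cancelled a short distance from the null while high frequencies leak back in quickly, which is one reason the microphone position has to be found by ear.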

Microphone selection, I

When recordists transduce a sound source, they use one or more microphones to transduce its vibrational energy into an audio signal. Thus, understanding how microphones “hear” is a crucial step in understanding how recordists create the sounds on any given record. Experienced recordists do not randomly select microphones and hope it all turns out for the best, after all. Rather, they select a microphone and place it in relation to a sound source using a multifaceted and highly nuanced musical reasoning. In fact, microphone selection is usually an extremely personalized technique. Despite the proscriptions of audio-engineering textbooks, there is no single correct way to select, let alone use, a microphone. To make appropriate microphone selections—to choose the microphone which best relates a sound source to the peculiar needs of a particular musical project—recordists educate themselves about the unique frequency response characteristics of each microphone in their arsenal. Over the course of their careers, they will usually develop a short list of “go-to” microphones they regularly use to track certain sound sources. Bruce Swedien, whose production and engineering credits include records by top-tier Top 40 pop vocalists—like Michael Jackson (Off the Wall, Thriller, Bad, Dangerous), Donna Summer (Donna Summer), Roberta Flack (I’m the One), and Jennifer Lopez (Rebirth)—starts the microphone selection process with the short list given in Table 1.2 as his guide.

Table 1.2  Bruce Swedien’s list of “go-to” microphones, related to vocal timbres3

Vocal characteristics                          Microphone(s)
Well-rounded, naturally good-sounding voice    Neumann U-47, Neumann U-87, Neumann U-67, Telefunken 251, AKG C-12, or Sony C800-G
Thin, weak voice                               RCA 44BX
Loud, brassy voice with good projection        Shure SM-7, AKG 414 EB, or RCA 44BX
Good voice, but too sibilant                   Neumann U-47, Neumann U-67, RCA 44BX, or Shure SM-7



Operations principles

Regardless of the sound source, or of the specific models of microphone in their “go-to” list, modern recordists chiefly use only one of three different kinds of microphone to transduce: (i) dynamic (or moving coil) microphones, (ii) condenser (or capacitor) microphones, and (iii) ribbon microphones. Each of these microphones has a unique “operations principle”: each transduces sound using unique materials configured to function in particular ways. Though recordists can use microphones however they see fit—including as drum sticks, billy clubs, and chop-sticks—most microphones are manufactured to transduce sound in very specific ways, which recommend them for a limited set of uses. A vintage ribbon microphone, for instance, will often break if presented with overabundances of sound pressure; thus, its specifications recommend uses other than tracking a heavy-hitting drummer like the Who’s Keith Moon, Tool’s Danny Carey, or Led Zeppelin’s John Bonham from a close proximity. Dynamic microphones, on the other hand, are prone to inertia and thus are not suitable for recording quiet, high-pitched material from a great distance. They might be suitable for recording Carey’s kick drum, then, but maybe not a flute solo from the back of a massive concert hall. Indeed, to make the best microphone selection, recordists have only one technique at their disposal, namely, to arm themselves with as much information as they can muster about the unique operations principle of each microphone in their toolbox. To understand microphone selection, then, we will have to do the same.

Dynamic (or moving coil) microphones

Dynamic (or moving coil) microphones function like miniature speakers put in reverse.4 Inside every dynamic microphone is a magnet, and suspended inside its magnetic field is a coil of wire, called a “voice coil,” with a diaphragm attached to its end. Soundwaves vibrate the diaphragm back and forth, which sympathetically pushes and pulls the coiled wire back and forth inside the microphone’s magnetic field. This back-and-forth motion, in turn, generates an electrical current that is eventually converted into an audio signal. One of the most celebrated frequency response profiles in the pop world belongs to the Shure SM57 dynamic microphone. This microphone follows the operations principle detailed in Figure 1.7. Figure 1.8 reproduces frequency response specifications for the SM57, which Shure provides with each unit in its packaging. Distinguishing traits of the SM57’s frequency response contour include a sharp bass roll-off beginning at roughly 200 Hz and an upper-midrange



Figure 1.7 The operations principle of a common dynamic (or moving coil) microphone.

Figure 1.8 Frequency response specifications for the Shure SM57 dynamic microphone, provided by Shure with each unit in its packaging.



peak centered at about 6 kHz. Just as human ears tend to exaggerate the upper-midrange (1–5 kHz) of what they hear and reject frequencies roughly below 20 Hz and above 20 kHz, the Shure SM57, when plugged in fresh from the box, (i) sharply attenuates all incoming sound below 200 Hz and above about 12 kHz; (ii) exaggerates frequencies between roughly 2 kHz and 7 kHz; and (iii) rejects everything under 40 Hz and above about 15 kHz. The SM57 shares the basic shape of its frequency response with most other dynamic microphones. Because they are typically manufactured to withstand especially high sound-pressure levels (SPLs), and rugged uses which could easily destroy the delicate internal circuitry of, say, a vintage RCA 44 ribbon microphone, dynamic microphones remain the industry standard for a number of live and close-mic applications. The bass roll-off that characterizes the SM57’s frequency response is a manufactured bias, designed to attenuate “proximity effect” exaggerations of bass frequencies, which every kind of unidirectional microphone generates when placed in close proximity to a sound source. At the same time, the SM57’s exaggerated peak, between about 2 kHz and 7 kHz, emphasizes the attack transients of most rhythm and lead instruments in a pop arrangement, and thus enhances their clarity on record.

Condenser (or capacitor) microphones

Condenser microphones use two conducting plates to transduce the vibrational energy of a soundwave into electrical current. These conducting plates comprise a capacitor, that is, an electronic mechanism designed to store energy in the form of an electrostatic field. One of the plates in the capacitor, known as the “backplate,” never moves. The other plate, called the “front-plate,” functions like a diaphragm, vibrating sympathetically with changes in air pressure around it. When excited by a soundwave, the distance between the backplate and the front-plate changes, which in turn modifies the capacitance of the capacitor. When the front-plate moves toward the backplate, the microphone’s capacitance increases and a charge current registers; when the front-plate moves away from the backplate, the capacitance decreases and a discharge registers. Recordists typically prize condenser microphones for their neutral and expansive frequency response. Condenser microphones usually exhibit a faster response to attacks and transient detail in a sound source, and they usually add the least coloration. They also tend to transduce a broader frequency range than do other microphones. This all said, though, condenser microphones aren’t



Figure 1.9 The operations principle of a common condenser (or capacitor) microphone.

completely neutral. No microphone transduces sound in an entirely neutral way across the audible spectrum. Condenser microphones are simply the most neutral of all the microphones that recordists presently use. Figure 1.10 below reproduces frequency response specifications for Neumann U47 and U48 condenser microphones, set for a variety of directional response patterns (see “Directional Response” section). I use these two microphones as examples because they are among the most celebrated of condenser microphones in use today. A quick scan of their frequency response contours demonstrates that deviations from 0 dB (SPL) remain minimal, rather than absent, across the audible spectrum. Two steep 10 dB (SPL) roll-offs, beginning at 250 Hz and again at 12 kHz, characterize the U48’s frequency response, and multiple ±6 dB (SPL) variances obtain between about 2 kHz and 15 kHz with both the U47 and the U48. Most condenser microphones have a frequency response contour which closely resembles that of the U47 and U48. Manufacturers use extremely light and thin materials for the diaphragms in condenser microphones. Thus, the operations principle of a condenser microphone is far less prone to inertia



Figure 1.10  Frequency responses of the Neumann U47 and U48 large-diaphragm condenser microphones, set for three directional responses (top-to-bottom: omni-, uni-, and bidirectional).

(i.e., stasis or lack of movement) than are the coiled wires in a dynamic microphone, even if those coiled wires can handle much higher SPLs without distorting. Not surprisingly, then, condenser microphones transduce a much more expansive array of frequencies, and with greater accuracy and efficiency, than do dynamic microphones; in addition, they maintain that accuracy from greater distances. This makes condenser microphones an ideal choice for tracking (i) sound sources comprising energy spanning the audible spectrum; (ii) sound sources which require tracking from a distance; and (iii) reverberations.
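The capacitor behavior at the heart of the condenser microphone, where capacitance rises as the front-plate approaches the backplate and falls as it retreats, can be sketched with the textbook formula for an ideal parallel-plate capacitor in air: capacitance = ε0 × plate area ÷ gap. The Python below is a rough illustration only; the plate diameter and gap are hypothetical round numbers, not the specifications of any actual microphone.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space, in farads per meter

def capacitance_farads(plate_area_m2, gap_m):
    """Ideal parallel-plate capacitance, treating the air gap as free space."""
    return EPSILON_0 * plate_area_m2 / gap_m

# Hypothetical capsule: 25 mm diameter plates separated by 25 micrometers.
plate_area = math.pi * (0.025 / 2.0) ** 2
rest_gap = 25e-6
excursion = 1e-6  # assumed diaphragm movement of one micrometer

at_rest = capacitance_farads(plate_area, rest_gap)
pushed_in = capacitance_farads(plate_area, rest_gap - excursion)
pulled_out = capacitance_farads(plate_area, rest_gap + excursion)

print(f"at rest:    {at_rest * 1e12:.1f} pF")
print(f"pushed in:  {pushed_in * 1e12:.1f} pF (capacitance increases)")
print(f"pulled out: {pulled_out * 1e12:.1f} pF (capacitance decreases)")
```

Because capacitance varies inversely with the gap, even a tiny diaphragm movement registers as a measurable change, provided the diaphragm is light enough to follow the soundwave.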

Large- and small-diaphragm condensers

The size of the diaphragm in a condenser microphone plays a crucial role in determining its frequency response. Small-diaphragm condenser microphones, that is, condenser microphones with a front-plate diameter of less than about two-and-a-half inches, do not respond to lower frequencies with the same level of detail as do large-diaphragm condenser microphones, though they typically



respond faster, and with greater accuracy, to frequencies above roughly 3 kHz (they also tend to produce a slight bump in their frequency response between 2 kHz and 7 kHz). Small-diaphragm microphones can handle higher SPLs than their large-diaphragm counterparts, however, given that their smaller diaphragms tend also to be stiffer. Thus, recordists use small-diaphragm condenser microphones to track often impulsive sound sources, which contain highly detailed high-frequency content, like cymbals, hi-hats, and steel-string acoustic guitars, but which do not necessarily require the same extended bass response as their large-diaphragm counterparts. They use large-diaphragm condenser microphones to transduce sound sources comprising an obviously expansive array of frequencies, like lead vocals, classical guitars, drum kits, and reverberations. Small-diaphragm condenser microphones usually come as a “matched pair,” that is, as a packaged tandem with uniform outputs. This is because many small-diaphragm condenser microphones are designed for stereo-mic applications, which is to say, “tandem arrays.” The so-called “x-y array,” for instance, is a common technique for recording acoustic guitars in rock and pop (see Figure 1.12). And it is not uncommon to see small-diaphragm condenser microphones spaced in an “a-b array” (see Figure 1.13) over the drum kit, though large-diaphragm condensers are more likely to be found there nowadays. In any event, when they first learn to use stereo arrays for tracking, novice recordists are typically alerted to the strength of the resulting signal in the center channel. Most stereo arrays do not point directly at the sound source, and, as such, they

Figure 1.11  A large-diaphragm condenser microphone and a matched pair of small-diaphragm condenser microphones.



tend to produce a weaker “center image” than would a microphone pointed directly at the source. This can be advantageous—for example, a weakened center image might carve out space for a pop vocal in the center of a mix—but it also means that recordists need to choose a stereo array with some idea of the final mix already in mind.

Figure 1.12  An “x-y array,” using a matched pair of small-diaphragm condensers on an acoustic guitar.

Figure 1.13  An “a-b array,” using a pair of large-diaphragm AKG 414c condenser microphones.



Ribbon microphones

Ribbon microphones are actually a class of dynamic microphone. However, the “dynamic” label has become so associated with moving coil microphones that the two have become synonymous. In fact, the operations principles for dynamic and ribbon microphones are remarkably similar: both use magnets, magnetic fields, and metallic elements to generate electrical current, which is to say, both transduce based on the principle of electromagnetism. With ribbon microphones, however, it is a thin piece of corrugated metal, called “the ribbon,” which hangs suspended inside the microphone’s magnetic field (see Figure 1.14). Soundwaves push against the ribbon, prodding it to vibrate sympathetically with their vibrational energy. This mechanical energy—the vibrational energy of the ribbon—is then converted into magnetic energy and in turn into the electrical current of audio signal. The metallic ribbon element in a ribbon microphone is thinner and lighter than the diaphragm and coiled wire used to make a dynamic microphone, though it is much longer and more prone to inertia. And the ribbon element is also much thicker than the ultra-thin diaphragm used in most condenser microphones. Thus, the frequency response of a typical ribbon microphone is the least expansive of the three modern microphone types. Ribbon microphones remain notably “dark,” or insensitive to high frequencies, relative to both small-diaphragm and large-diaphragm condenser microphones, and to most moving coil microphones (due to their smaller diaphragms).

Figure 1.14  The operations principle of a common ribbon microphone.



Figure 1.15  Front and rear frequency responses for the bidirectional AEA R84 ribbon microphone. Readers should compare the contour below to those in Figures 1.8 and 1.10.

Ribbon microphones are the most fragile microphone type, though their fragility is often exaggerated in pedagogy. Nonetheless, they do remain exceedingly rare in personal, or “project,” recording environments. This is, in part, due to their perceived fragility. This rarity is also, however, a product of their meager outputs. Most project studios feature cheaper preamplifiers than their professional counterparts, and they simply cannot amplify the ribbon mic’s output to a workable level without inducing obfuscating amounts of “self-noise,” that is, noise generated by the recording setup itself. “Active” ribbon microphones, like the Royer R-122V, condition their output so this self-noise issue becomes less of an obstacle for recordists, but they tend to be fairly expensive, and savings remain the raison d’être for project recording. Likewise, though more recent generations of ribbon mic have become more rugged than their antique ancestors, modern ribbon mics are still not ideal for recording sounds that exhibit higher-than-average SPLs, and many will still overheat—and may even see their thin ribbon elements melt—if connected to an external power source.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 3 Microphone Shootout
This track comprises the same thirty-second guitar performance recorded by four different microphone “types,” from the same position, and through



the same signal chain. The performance was recorded in a hemi-anechoic chamber, which is to say, an environment that aggressively minimizes any room reflections above 60 Hz, and nothing was done to “treat” the resulting recordings (i.e., no EQ and no gain matching applied; nothing done to address noises in the line brought about by the “shootout” setup), as the point of this demonstration is to provide concretizing audio information that readers can use to help hear the most obvious differences between the microphone “types” I note above. Listeners will thus hear the same performance, with negligible room acoustics, as recorded by (i) an SM57 dynamic microphone, (ii) an AKG 414c large-diaphragm condenser microphone, (iii) a Schoeps small-diaphragm condenser microphone, and (iv) an ART AR5 ribbon microphone. Note that the last track will feature much more indirect signal, as the AR5, being a ribbon microphone, is by necessity bidirectional (see “Directional Response” section).

Microphone selection, II

Though it would be reasonable to assume that a jagged frequency response contour is somehow deficient—that a neutral contour is more desirable—no frequency response is inherently better than another except given a particular application. For instance, the fact that ribbon microphones transduce a stunted range of frequencies, while condenser microphones transduce an expansive range, simply recommends both microphones for distinct gamuts of uses. Bruce Swedien provides an exceedingly clear example: I learned long ago that using ribbon mics in the initial recording of percussion tracks definitely makes life easier when it comes to mastering a recording. Listen carefully to Michael [Jackson] and his brothers playing glass bottles on “Don’t Stop ‘Til You Get Enough.” . . . I used all ribbon microphones[—]RCA DX77s and RCA 44BXs. The heavy mass of the ribbon element, suspended in the magnetic field of a ribbon mic, makes it impossible for a ribbon mic to trace the complete transient peak of a percussive sound such as a glass bottle. If I had used condenser microphones, with the condenser mic’s ability to transduce the entire transient peak of the bottles, the bottle would have sounded great, played back from tape in the control room, but when it came time to master, such an incredible transient peak would have minimized the overall level . . . of the entire piece of music. In other words, condenser mics would have compromised the dynamic impact of the sonic image of the entire piece of music.5
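Swedien’s point about transient peaks and overall level can be illustrated with a toy calculation. The Python sketch below invents two versions of the same signal, one with a tall transient spike and one with that spike rounded off, brings both up to the same peak level of −0.3 dBFS, and compares their average (RMS) levels. Every number in it is an assumption made for illustration; none is measured from the recording Swedien describes.

```python
import math

def peak(samples):
    return max(abs(s) for s in samples)

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_to_peak(samples, target_dbfs=-0.3):
    """Scale the samples so that their peak sits at target_dbfs."""
    target = 10.0 ** (target_dbfs / 20.0)
    scale = target / peak(samples)
    return [s * scale for s in samples]

def to_dbfs(level):
    return 20.0 * math.log10(level)

N = 1000
# Sustained "body" of the sound: a quiet sine wave (invented values).
body = [0.25 * math.sin(2 * math.pi * 5 * n / N) for n in range(N)]

# Same body with a single-sample spike standing in for the attack transient.
full_transient = [1.0] + body[1:]      # transient captured in full
rounded_transient = [0.4] + body[1:]   # transient rounded off

full_rms_db = to_dbfs(rms(normalize_to_peak(full_transient)))
rounded_rms_db = to_dbfs(rms(normalize_to_peak(rounded_transient)))

print(f"average level, full transient:    {full_rms_db:.1f} dBFS")
print(f"average level, rounded transient: {rounded_rms_db:.1f} dBFS")
```

With the peak ceiling fixed, the version with the taller transient ends up with the lower average level, which is the trade-off Swedien describes hearing at the mastering stage.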

Bob Ezrin’s microphone selection for Alice Cooper’s vocals on “I’m Eighteen” provides another good example of this principle. When Ezrin selected the Shure



SM57 to record Cooper’s lead vocals, the young Canadian producer was fully aware that, according to him, all the rules about recording vocals “the correct way” said that he should have selected a large-diaphragm condenser microphone instead. However, the condenser microphone’s neutral frequency response wound up being, ironically, entirely too neutral for Cooper’s voice: the evenness of the frequency response refused to yield the compelling, gut-wrenching rock vocal sound that Ezrin and Cooper agreed was needed. The SM57’s jagged frequency response, made all the more jagged through compression and equalization, simply related Cooper’s vocals to the particular needs of “I’m Eighteen” better than any other microphone they tried. “When I listened to the tracks,” recalls Ezrin, “they didn’t sound real to me. I went reaching for whatever I could find to give [Cooper’s] vocals some sense of power and space. The only things available were EMT plate reverbs and tape echo. . . . [So I] used a Shure SM57 to record Alice [Cooper’s] voice. The trick to using the SM57 on vocals is compressing it to even the sound out and getting gross with the equalization. I would dial in some real tough-ass midrange and a lot of top end.”6

Conventional microphone selections for tracking a drum kit reveal a slightly more complicated dynamic at work in the selection process. While recordists still base their microphone selections on the frequency response they think will best relate the drum kit’s component parts to the broader needs of the project at hand, and while directional response plays an important role as well (see the “Directional response” section), the physical properties of the sound source itself play an obviously determinant role. Snare drums, kick drums, floor toms, and mounted toms present extremely high SPLs, which can easily damage the transducing mechanisms on ribbon and condenser microphones, especially when clumsily placed too close to the source. Recordists thus have little choice but to select dynamic microphones for many of these components, though from personal experience I can say that a vogue for recording the kick drum with a large-diaphragm condenser mic placed in tandem with a large-diaphragm dynamic mic has certainly taken hold in the last few decades. When it comes to tracking cymbals and reverberations, however, the relatively low SPL demands of those sources allow recordists to select whichever frequency response they desire, and they often opt for large-diaphragm condensers to most accurately capture the transient detail of those tracks.

Table 1.3  Microphone selections for drum tracks on celebrated records. Readers unfamiliar with the brands listed below should simply note the overlap between microphone selections.

Band, track (producer)

Bryan Adams, “Cuts Like a Knife” (producer: Bob Clearmountain)
Kate Bush, “Wuthering Heights” (producer: Jon Kelly)
Derek and the Dominos, “Layla” (producer: Tom Dowd)
Devo, “Whip It” (producer: Robert Margouleff)
Dire Straits, “Money for Nothing” (producer: Mark Knopfler)
The Knack, “My Sharona” (producer: Peter Coleman)
Madness, “Our House” (producers: Clive Langer, Alan Winstanley)
Madonna, “Like a Virgin” (producers: Nile Rodgers/Madonna/Stephen Bray)
The Pixies, “Monkey Gone to Heaven” (producer: Gil Norton)
The Police, “Every Breath You Take” (producer: Hugh Padgham)

hi-hats: Neumann KM84; toms: Sennheiser 421
Neumann KM84; Shure SM57 (top) and SM58 (bottom); Shure SM57; Shure SM57; Shure SM57
Altec 633; Electro-Voice RE20
Neumann FET 47; AKG D12
toms: Sennheiser 421; toms: Sennheiser 421/hi-hats: Neumann KM84; toms: Sennheiser 421
Sennheiser 421; Shure SM57; Shure SM57
toms: Sennheiser 421
hi-hats: Neumann KM84
hi-hats: Sony ECM51
toms: AKG D19
Neumann FET 47; AKG D12
hi-hats: AR 6451
top: Shure SM57/bottom: AKG 451; AKG D19
Neumann U87 and U47; Coles 4038
Neumann U47
Neumann U87
Neumann KM84
Neumann U87
AKG 414
Neumann U47
Coles 4038
Neumann U87

Table 1.3  Continued

Band, track (producer)

Sex Pistols, “Anarchy in the UK” (producer: Chris Thomas)
The Smiths, “The Queen Is Dead” (producers: Morrissey/Johnny Marr)
The Staple Singers, “I’ll Take You Higher” (producer: Terry Manning)
The Stone Roses, “Fools Gold” (producer: John Leckie)
The Strokes, Is This It (producer: Gordon Raphael)
The Who, “Who Are You” (producers: Jon Astley/Glyn Johns)

hi-hats: Neumann KM84; toms: Sennheiser 421
Shure SM57; Shure 545; Shure SM57; Shure SM57; Shure SM57
AKG D12; Neumann FET 47; AKG D12; Shure Beta 58; AKG D30
toms: Sennheiser 421
hi-hats: Shure SM57
toms: Sennheiser 421
toms: Neumann U67
Shure SM57
Neumann U87 & AKG 414; Audio-Technica 4033A; Neumann U87
Neumann U87
Neumann U87
Neumann U87


I asked Kevin O’Leary, a good friend of mine who works in Toronto as an audio and mix engineer, and producer, with credits that any recordist would kill for (including work on records by Three Days Grace, Shawn Mendes, Walk off the Earth, Billy Talent, The Tragically Hip, among others), how he chooses a microphone for recording particular sound sources. He was gracious enough to offer the following exclusive comments: In an ideal world, when choosing microphones I would like to be able to audition a number of mics on each source and select the best option after listening to all of them. In the real world, this is not often possible. As an engineer, it is my job to get everything sounding great as quickly as possible on most sessions. It is rare that sessions will have the budget to spend time comparing various microphones. To accommodate these sessions where we move very quickly, it is important to consider a number of things when choosing microphones. First, I find it very valuable to talk with either the artist, producer, or both before the start of a recording session, in order to fully understand what the vision for the project is. Understanding what the final product is intended to sound like is critical in choosing an appropriate microphone. If the artist has any prior recordings, I will listen to them to get an idea of what they normally sound like. I will also often ask the artist what they liked and disliked about previous recordings that they have done. All of this insight can help to inform my microphone choices. After gaining an understanding of the desired product, I can make an informed choice as to what microphones may work best. This is largely based off of the experience that I have working with many different microphones. Having spent countless hours in the studio, I have come to know how specific mics tend to respond to different sources. 
Based on this knowledge I will choose the microphones that I think are most likely to achieve the desired results for the recording. When the band arrives and sets up their gear in the studio, I will listen to what they sound like in the room. Every player and instrument is unique, and sometimes things will sound a little bit different than what you are expecting. In these cases, I won’t hesitate to make a decision to swap out one of the microphones I had chosen ahead of time. The final step in the process comes after actually hearing how everything sounds coming out of the speakers. If certain instruments aren’t sitting the way I would like, or are competing too much with other instruments, I won’t hesitate to swap out a microphone quickly to see if it can be improved. The one time that I will always make every effort to audition multiple microphones is on a vocal. Despite having a lot of experience hearing different vocalists sing through different microphones, I am often surprised by the microphone that is chosen after doing a proper shootout of a few good microphones that I think will work well. A specific microphone could, for example, really bring out the sibilance in the voice of one singer, and sound too harsh, whereas with a different singer it may downplay that sibilance and actually warm up the overall sound. It is important when auditioning different microphones to make sure that you are listening back to each at the same volume, otherwise you are likely to just choose the loudest microphone.
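The level-matching step O’Leary describes can be sketched in a few lines. The example below is a minimal illustration, not his actual workflow: it assumes each take is available as a list of floating-point samples, the function names (`rms`, `match_level`) are hypothetical, and RMS level is used only as a rough proxy for perceived loudness. It scales a second take so that its RMS level equals that of a reference take before the two are compared.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_level(reference, candidate):
    """Scale `candidate` so its RMS equals the RMS of `reference`.

    Returns the scaled samples and the gain that was applied, in decibels.
    """
    ref_rms = rms(reference)
    cand_rms = rms(candidate)
    if ref_rms == 0 or cand_rms == 0:
        raise ValueError("cannot level-match a silent take")
    gain = ref_rms / cand_rms
    return [s * gain for s in candidate], 20 * math.log10(gain)

# Hypothetical takes: the second microphone came back twice as loud.
mic_a = [0.1, -0.2, 0.3, -0.1]
mic_b = [0.2, -0.4, 0.6, -0.2]
matched_b, gain_db = match_level(mic_a, mic_b)
print(f"gain applied to mic B: {gain_db:.2f} dB")
```

Once both takes sit at the same level, any preference between them is more likely to reflect tone than loudness.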

Directional response

Another important component of a microphone’s operating principle, which plays a crucial (if often overlooked) role in the microphone selection process, is “directional response.” Every microphone is manufactured to hear sound in only certain directions, and directional response describes how it does so. There are three basic directional responses: (i) omnidirectional microphones, which hear equally well in all directions; (ii) bidirectional microphones, which hear only in front and behind; and (iii) unidirectional microphones, which hear in front, and to their sides, but only very slightly behind. Given the heart-like shape of their directional response, unidirectional microphones are often called “cardioid” microphones. Hyper-cardioid and super-cardioid microphones simply extend the basic cardioid directional pattern behind, to different degrees.

Figure 1.16  Common directional response profiles. From left to right: omnidirectional, bidirectional, unidirectional (cardioid), hyper-cardioid, and super-cardioid.
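These directional responses are commonly idealized as first-order polar patterns of the form a + (1 − a)·cos θ, where θ is the angle of arrival measured from the front of the microphone. The sketch below uses that textbook idealization rather than measurements of any real microphone; the coefficient for the super-cardioid pattern is the usual approximate figure, and the names are illustrative.

```python
import math

# Textbook first-order coefficients "a" in: sensitivity = a + (1 - a) * cos(theta).
PATTERNS = {
    "omnidirectional": 1.0,
    "cardioid": 0.5,
    "super-cardioid": 0.37,  # approximate
    "hyper-cardioid": 0.25,
    "bidirectional": 0.0,
}

def sensitivity(pattern, angle_deg):
    """Relative sensitivity (1.0 = on-axis) at `angle_deg` off the front axis.

    A negative value means the sound is picked up with inverted polarity.
    """
    a = PATTERNS[pattern]
    return a + (1.0 - a) * math.cos(math.radians(angle_deg))

for name in PATTERNS:
    front, side, rear = (sensitivity(name, deg) for deg in (0, 90, 180))
    print(f"{name:16s} front={front:+.2f} side={side:+.2f} rear={rear:+.2f}")
```

In this idealization the bidirectional pattern has no sensitivity at 90 degrees and the cardioid has none at 180 degrees; these nulls are what a recordist aims at a source they want to reject.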



Recordists have devised a number of practical applications for the concept of directional response. Norman Smith, for one, devised an interesting use for the bidirectional response to overdub vocal parts for the Beatles in the early 1960s. The Beatles continued to record their vocal and instrumental parts live in the studio after they adopted four-track technology in 1964, but producer George Martin took to reserving an extra track for overdubs. As Ringo Starr remembers: Most of our early recordings were on three tracks, because we kept one for overdubs. [This] kept us together as a band—we played and played and played. If one of them could sing it, the four of us could play it until the cows came home. There was none of this, “Well, we’ll put the bass on later, or the guitars.” We put most of it on then and there [in the studio “live” room], including the vocals.7

Amazingly, the Beatles didn’t use headphones to track until they recorded Revolver in 1966. To overdub vocal parts, band members had to stand in front of a massive RLS10 “White Elephant” speaker which output reference tracks to sing along with. This setup could easily have induced any number of phase interference issues. The microphone used to capture John Lennon’s and Paul McCartney’s vocal overdubs for “Money,” for instance, might also have captured the reference tracks that Lennon and McCartney sang along with, but delayed by a few milliseconds and, thus, slightly out of phase with the original recording. To guard against this possibility, Norman Smith selected either a Neumann U47 or a U48 large-diaphragm condenser microphone, set for a bidirectional response, and he placed the microphone directly between both singers, and directly perpendicular to the White Elephant speaker (see Figure 1.17), knowing that the bidirectional response would actively reject the speaker output. In this case, the bidirectional response also provided the added benefit of allowing the band to record two distinct vocal parts onto the same track, at a time when track space was at a premium.

Figure 1.17  Diagram of Norman Smith’s placement scheme. Set for a bidirectional response, the microphone (usually a Neumann U47 or U48) captured two distinct vocal performances onto the same track and actively rejected reference tracks output by the White Elephant speaker.

Sound doesn’t just travel in a straight line; rather, it spreads in numerous directions at once. Smith’s microphone selection and placement simply could not reject all that the White Elephant speaker emitted. Some of the reference tracks from the speaker inevitably leaked onto the overdubbed tracks. Thus, those overdubbed tracks introduced a slightly delayed version of the reference tracks which, combined with the originals, produced subtle tonal distortions and some degree of comb filtering. As Kevin Ryan and Brian Kehew explain:

The amount was very small, but if multiple overdubs took place, the leakage “stacked up” and was accentuated. It could also sometimes reveal itself as a very subtle phasing effect. This was the result of multiple “copies” of the same sound playing back simultaneously. The rhythm track, for instance, was recorded onto Track 1 of the tape. If a vocal was then overdubbed onto Track 3, a small amount of rhythm track leakage from the White Elephant speaker was recorded on Track 3 along with the vocals. If yet another vocal was overdubbed on top of that onto Track 4, the leakage was recorded once more. . . . Upon mixing, these recordings of the leakage could interact with each other, as well as the original rhythm track. Whenever multiple occurrences of the same sound are played back in near-perfect alignment, a subtle “swishing,” phasing effect can result. The effect is subtle, but headphone listening can reveal it to be quite audible in a number of tracks.8
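The comb filtering that delayed leakage produces follows from simple arithmetic. When a signal is summed with a copy of itself delayed by τ seconds and scaled by a gain g, the combined magnitude response is |1 + g·e^(−j2πfτ)|, with notches at odd multiples of 1/(2τ). The sketch below assumes an idealized single delayed copy; the delay and leakage values are invented for illustration and are not measurements from the Beatles sessions.

```python
import cmath
import math

def comb_gain_db(freq_hz, delay_s, leak_gain):
    """Level change (dB) at `freq_hz` when a signal is summed with a copy
    of itself delayed by `delay_s` and scaled by `leak_gain`."""
    response = 1 + leak_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    # Floor the magnitude so a perfect cancellation does not break log10.
    return 20 * math.log10(max(abs(response), 1e-12))

def notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies (Hz) at which the delayed copy arrives half a cycle late."""
    notches = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2 * delay_s))
        k += 1
    return notches

delay = 0.002  # hypothetical: 2 ms of extra travel time
leak = 0.25    # hypothetical: leakage roughly 12 dB below the original
print(notch_frequencies(delay, max_hz=2000))
print(round(comb_gain_db(250.0, delay, leak), 2))  # at the first notch
print(round(comb_gain_db(500.0, delay, leak), 2))  # at the first peak
```

With weak leakage the notches are shallow, which fits the “very small” amount Ryan and Kehew describe; each further overdub adds another delayed copy and deepens the effect.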

The “swishing” effect which Ryan and Kehew describe in the passage above is clearly audible on the stereo mix for “I Wanna Be Your Man,” from the Beatles’ With the Beatles. Readers can isolate the overdubbed vocal tracks by listening to only their right speaker or headphone: the band’s overdubbed vocals were mixed to the right side of the stereo plane while the bed tracks were mixed to the left. Listening to just the right side of the mix for “I Wanna Be Your Man” reveals Ringo Starr’s lead vocals, John Lennon’s and Paul McCartney’s backing (and doubling) vocal tracks, and the reference tracks coming from the White Elephant, and at 0:20 into the track, an unmistakably clear example of leakage phasing is audible on the drum fill which sounds then. Paul McCartney’s vocal overdubs for “Little Child,” also on the stereo release for With the Beatles, reveal similar leakage phasing throughout, as do vocal overdubs for “All My Loving” and “Devil in Her Heart.”

Playlist Title: Understanding Records, Chapter One: Phasing & the Early Beatles
A playlist comprised of tracks noted in the paragraph directly above. Remember to isolate the left channel or right channel, which can be done simply by only listening to the corresponding “left” or “right” headphone.
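For readers who would rather isolate a channel in software than by removing one headphone, the idea can be sketched as follows. The example assumes stereo audio stored as interleaved samples (left, right, left, right, and so on), which is a common layout for uncompressed PCM audio; the function name is hypothetical.

```python
def isolate_channel(interleaved, channel):
    """Return only the 'left' or 'right' samples of interleaved stereo audio."""
    if len(interleaved) % 2:
        raise ValueError("interleaved stereo needs an even number of samples")
    offsets = {"left": 0, "right": 1}
    if channel not in offsets:
        raise ValueError("channel must be 'left' or 'right'")
    # Every second sample, starting at the chosen channel's offset.
    return list(interleaved[offsets[channel]::2])

# Four stereo frames: the left channel carries 1..4, the right carries 10..40.
frames = [1, 10, 2, 20, 3, 30, 4, 40]
left_only = isolate_channel(frames, "left")
right_only = isolate_channel(frames, "right")
print(left_only, right_only)
```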

Microphone placement

Directional response provides a conceptual bridge between microphone selection and microphone placement. When recordists select a microphone, they do so with a basic concept of how they will place that microphone already in mind. And they generally follow that concept through to success or failure. If the initial placement fails to yield a desirable sound, as is often the case, recordists adjust the microphone’s placement incrementally until they find a distance from the sound source that works. They may also deploy different directional responses to emphasize different aspects of the sound source. Placed in close proximity to a sound source, the unidirectional response, for instance, will exaggerate its bass content, creating a so-called “proximity effect,” while an omnidirectional response will provide a more even response. Similarly, the bidirectional and omnidirectional responses capture more reverberations than does the unidirectional response. Recordists can thus deploy microphone placement and directional response in tandem to refine the frequency response of selected microphones which, in turn, significantly alters the audible character of tracks on a record.



Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 4. Directional Response and Proximity Effect
The same thirty-second guitar performance as was heard in Track 3 of this playlist is heard here. However, this time, listeners hear three iterations, each recorded using the same microphone, from an almost identical position about 5 inches from the 12th fret, pointed in to face the sound hole, but set for different directional responses. The first iteration was recorded using a cardioid response, with the next two using a bidirectional and an omnidirectional response, respectively. Listen for the “proximity” effect in the first sample, and the increasing level of room noise, and/or the level of the “lead” signal, in the next two iterations.

Close-mic placements

Through microphone placement, recordists refine the sonic image that any given frequency response initially yields. Close-mic placements, that is, transduction using microphones placed within roughly 3 to 12 inches of a sound source, produce a distinctly “dry” (non-reverberant) sound, and, again, directional microphones tend to exaggerate the low-frequency content of a sound source when placed in close proximity. Seemingly paradoxically, close-mic techniques can also add a significant amount of high-frequency “edge” to tracks, depending on how the microphone is angled. This is because (i) air absorption plays a relatively negligible role in the transduction; and (ii) if the mic is pointed “on-axis,” which is to say, directly at the center of, say, the speaker cone on an amp, the higher-frequency content blasts directly from the tweeter to the diaphragm with only a few inches of air to diminish it.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 5. On-axis and off-axis placement
A brief electric guitar part is heard twice, once with an SM57 pointed on-axis, directly at the amplifier’s tweeter, and then with the SM57 pointed off-axis. Listen for how pronounced the high-frequency content is in each. If you are having trouble hearing a difference, focus on how pronounced the picking sounds are in each track.

The sound of close-mic tracking has become ubiquitous in the modern pop soundscape. Electric guitars, acoustic guitars, electric and acoustic basses, acoustic pianos, lead vocals, percussion instruments, drum kits (the gamut of instruments used on popular music records) are now, as a rule, tracked via close-mic placements. Precisely how close a microphone needs to be to a sound source to qualify as a close-mic placement, however, remains an entirely variable measure. In this respect, close-mic tracking remains a bit of a fuzzy science. The precise distance between sound source and “close mic” varies according to the peculiar demands of each particular sound source, and the specific needs of the production at hand. Close mics are usually placed within a foot of vocalists, for instance, but within inches of drum heads. And this variability seems likely to remain a defining characteristic. In Recording Practice, and especially in pop Recording Practice, product far outweighs process: pop recordists in particular remain almost pathologically open to breaking common practice techniques for tracking, especially if they sense that doing so will produce an aesthetically superior product.

Figure 1.18  An SM57, placed off- and on-axis to record a Peavey 250 guitar amplifier.

Close-mic placements are easiest to hear on lead-vocal tracks. In fact, lead vocals are almost uniformly tracked for pop productions using close-mic placements now. Clear examples of close-mic vocal tracks abound in modern pop. The playlist noted below features a number of clear examples from Elliott Smith’s eponymous debut album for Kill Rock Stars and Either/Or. Smith’s vocals are always produced by close-mic placements on his early folk releases; furthermore, they are rife with both the unstrained character of close-mic lead-vocal tracks and extra-musical sounds that characterize the close-mic technique, like heaving breaths, lip smacks, loud plosives, exaggerated sibilance, nose sniffs, and so on. For a more conventional Top 40 close-mic vocal sound, readers should consult Lily Allen’s vocals in the verses of “The Fear”; Fergie’s vocals during the choruses of “Glamorous”; Imogen Heap’s lead vocals on “Hide and Seek”; Madonna’s spoken verses on “Erotica,” and, arguably, Samuel Beam’s whispered vocals for Iron & Wine’s cover version of the Postal Service’s “Such Great Heights.”



Playlist Title: Understanding Records, Chapter One: Close-Mic Placements
A playlist of obviously pronounced close-mic placements. The list begins with a number of tracks from Elliott Smith’s early folk records which, to my ear, were always characterized by close-mic placements, especially on his almost whispered lead vocals.

Distance-mic placements

Despite their ubiquity, close-mic techniques are not the only placement option open to recordists when they track. In fact, recordists regularly forgo close-mic placements altogether, especially if doing so seems likely to produce an aesthetically superior product. Rather than position microphones within inches of instruments, recordists may instead opt to place them anywhere beyond about 3 feet from the sound source. These distance-mic placements register significant amounts of room ambience and reverberations alongside the direct sound of the sound source itself, and they also produce tracks which obviously lack the same high-frequency “edge” that close-mic placements produce, because air absorption plays a more obvious role in the transduction. Aside from its earsplitting feedback, which made the record so critically controversial in the mid-1980s, the Jesus and Mary Chain’s Psychocandy is notable for the almost total absence of close-mic placements throughout. Every sound on the record is obviously, and completely, the product of distance-mic placements, save the Reid brothers’ laconic vocal tracks (the tracks are, however, treated with reverb set for cavernously long decay rates). Other exceptions to the predominance of close-mic tracking include Robert Plant’s lead-vocal tracks on a number of cuts from Led Zeppelin’s Physical Graffiti, including “Custard Pie,” “Trampled under Foot,” “The Wanton Song,” “Boogie with Stu,” and “Sick Again”; Tom Waits’s lead vocals on many of his celebrated “junkyard” records (i.e., Swordfishtrombones, Rain Dogs, Frank’s Wild Years, Bone Machine, and The Black Rider); and, finally, Mick Jagger’s lead vocals throughout the Rolling Stones’ Exile on Main St., but especially on “Rip This Joint,” “Sweet Virginia,” “Sweet Black Angel,” “Loving Cup,” “Ventilator Blues,” “I Just Want to See His Face,” “Stop Breaking Down,” and “Soul Survivor.”

Playlist Title: Understanding Records, Chapter One: Distant Placements (Vocals)
A playlist comprised of obviously “distant” placements for vocals. Contrast the vocals heard on this list with those on the preceding “close-mic” playlist.



The critical distance

The specific ratio of direct sound to ambient sound on distance-mic tracks depends on a number of mediating variables. Frequency response, directional response, and microphone placement all play an important role in determining this ratio, as do the physical properties of the sound source itself, and the acoustic character of the tracking environment. That said, proximity usually plays the deciding role. When we are in a reverberant environment, the sound we hear is a combination of direct sound and reverberations. The mix between the two depends on how close the listener is to the sound source and on the room’s acoustics. The distance at which the two become equal, that is, at which the direct sound and its reverberations reach the same level, is called “the critical distance.” Critical distance plays an important role in microphone placement techniques. It is clearly evident in the placement technique that Bob Ezrin developed for tracking David Gilmour’s acoustic guitars on Pink Floyd’s The Wall, for instance:

Whenever I recorded an acoustic guitar, the first thing I would do is stand in front of the guitarist and listen to the performance. Then I’d plug one ear—because most microphones only hear from one pinpoint source—and move around, listening with my open ear, until I found the spot where the guitar sounded the best. In that sweet spot, I would really pay attention to the tonal contour of the guitar. Is it rich in midrange? Does it have a big boomy sound on the bottom and very little presence? The trick, then, was not to use a microphone that had the same tonal qualities as the guitar. For example, a Neumann U47 would sound too muddy on a guitar with a boomy low end.9

Bruce Swedien describes a similar practice. Rather than position the microphone in relation to the sound source, however, Swedien works the other way around, moving singers a few inches away from microphones with each successive overdubbed take. Recording choral vocals, for instance, Swedien explains: My first mic choice would be high-quality, good condition large-[diaphragm] condenser microphones, such as the AKG 414 EB, or a matched pair of Neumann M149s. These two beautiful microphones . . . are superb for high quality vocal recording. I would then ask the singers to step back from the mic about two feet or so and record a double of the original part. By having the singers step back from the mic during this vocal pass, in order to keep the track levels consistent, we are forced to raise the volume level of the mics on this pass, thus giving greater acoustical support for the sound. Finally, I will normally mix these four tracks in the final mix in the same proportion on the same side of the stereo panorama as they occurred in the performance.10



Swedien used this technique on a number of Michael Jackson’s most commercially successful records. In Swedien’s words: Michael [Jackson] is such an expert at doubling background and other vocal parts that he even doubles his vibrato rate perfectly! . . . I’ll have him double the same track at the same position at the mic. After that track, I’ll have him step back two paces and record a third pass of the same melody with the gain raised to match the level of the previous two. That raises the ratio of [reverberations] to direct sound. Blended with the first two tracks, this has a wonderful effect. Finally, I might even have him step back further and record a stereo pass of the same line. You can hear this technique in action for yourself on Michael’s background block-harmony-vocals on the song “Rock with You” on Michael’s Off the Wall album, or on the choir in the song “Man in the Mirror” on Michael’s Bad album. This technique tricks the ear into perceiving a depth field that isn’t really there [through the addition of discrete “early reflections” (see “Reverb” in the chapter “Mixing”)].11
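Swedien’s step-back technique and the critical distance described earlier can be put in rough numbers. The sketch below leans on two textbook idealizations that the text itself does not state: the direct sound falls by about 6 dB per doubling of distance while the reverberant level stays roughly constant throughout the room, and the critical distance can be estimated as 0.057·√(Q·V/RT60) for a room of volume V (in cubic meters), reverberation time RT60 (in seconds), and source directivity Q. The room figures and function names are hypothetical.

```python
import math

def critical_distance_m(volume_m3, rt60_s, directivity_q=1.0):
    """Estimate the distance (m) at which direct and reverberant levels are equal."""
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)

def direct_to_reverb_db(distance_m, crit_m):
    """Direct-to-reverberant ratio in dB; 0 dB at the critical distance."""
    return 20 * math.log10(crit_m / distance_m)

# Hypothetical live room: 400 cubic meters with a 0.8 s reverberation time.
crit = critical_distance_m(400.0, 0.8)
print(f"critical distance: {crit:.2f} m")
for d in (0.3, 0.6, 1.2, 2.4):
    print(f"{d:.1f} m: {direct_to_reverb_db(d, crit):+.1f} dB direct-to-reverberant")
```

Under these assumptions each doubling of distance lowers the ratio by about 6 dB, which is why stepping back from the microphone while raising the gain brings up the sound of the room.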

Room mics

Should close-mic and distance-mic placements fail to yield a sufficiently powerful and engaging sound on their own, recordists may opt to deploy both placements in tandem. Because it captures a considerable amount of room ambience, the distance mic in a tandem placement is usually called a “room mic,” and it is typically placed far enough past the critical distance in a room that the room’s ambience and reverberations transduce at a volume equal to, if not greater than, that of the sound source itself. This tandem technique is everywhere apparent in the modern pop soundscape, and it is now the industry standard for tracking rhythm guitars in rock. Jimmy Page made extensive use of this technique throughout the late 1960s and 1970s. The rhythm guitar part on Led Zeppelin’s “Communication Breakdown” provides a celebrated example. Placing a small 10 W Supro practice amplifier in what Page once called “a little tiny vocal booth kind of thing,” the guitarist-cum-producer then placed a dynamic microphone roughly 6 inches from the amplifier with its diaphragm pointed directly on-axis, and a room mic about 10 feet away. He then “summed” (blended) both signals onto the same track, producing the distinctive electric guitar tone which permeates the record. In Page’s words:

There’s a very old recording maxim which goes, “Distance makes depth.” I’ve used that a hell of a lot on recording techniques with the band [Led Zeppelin] generally, not just me. You’re always used to [recordists] close-miking amps, just putting the microphone in front, but I’d have a mic in the room as well, and then balance the two tracks; because really, you shouldn’t have to use an EQ in the studio if the instruments sound right. It should all be done with the microphones.12

Rick Rubin and Ryan Hewitt used the room-mic technique to track John Frusciante’s electric guitar parts for the Red Hot Chili Peppers’ Blood Sugar Sex Magik. Positioning a Shure SM57 dynamic microphone within inches of Frusciante’s amplifier, and angled directly on-axis to add some high-frequency edge to the track, Hewitt then added a room mic further back in the room, specifically, a Royer R-121 ribbon microphone placed exactly 15 feet away from the amplifier.13 The same technique was also used to track Noel Gallagher’s lead guitar on Oasis’ “Champagne Supernova.” Gallagher tracked the part alone in a cavernous live room, with multiple room mics distributed throughout, some of which were spaced more than 30 feet from the amplifier. The tandem technique is also clear in Billy Corgan’s distorted guitar tone on the heavier cuts off Siamese Dream and Mellon Collie and the Infinite Sadness, that is, “Cherub Rock,” “Today,” “Bullet with Butterfly Wings,” “Zero,” and “Fuck You (An Ode to No One).”

Playlist Title: Understanding Records, Chapter One: Massive “Room” Guitar Tones
A playlist comprised of tracks which feature massive electric guitar parts, made either with use of a room mic or by emulating use of a room mic through modeling technology.
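Because a room mic sits farther from the source than the close mic, its signal arrives later. The sketch below estimates that lag from the distances quoted above, assuming a speed of sound of 343 m/s (dry air at about 20 °C). Real rooms add reflections that complicate the picture, and the function name is illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees Celsius
METERS_PER_FOOT = 0.3048

def arrival_lag_ms(close_ft, room_ft):
    """Extra time (ms) sound takes to reach the room mic versus the close mic."""
    extra_m = (room_ft - close_ft) * METERS_PER_FOOT
    return 1000.0 * extra_m / SPEED_OF_SOUND_M_S

# Page's setup as described: close mic about 6 inches away, room mic about 10 feet away.
page_lag = arrival_lag_ms(0.5, 10.0)
# A room mic 15 feet back, with the close mic taken as 3 inches away (an assumption).
far_lag = arrival_lag_ms(0.25, 15.0)
print(f"{page_lag:.1f} ms, {far_lag:.1f} ms")
```

Lags of this size are long enough to be heard as added depth when the two signals are blended, which is one way to read the maxim “Distance makes depth.”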

Room sound

When recordists add a room mic to a mix, they also add the acoustic character of the tracking environment. British and American records produced throughout the 1960s and 1970s often sound so different, in fact, because recordists working in both countries often took completely different approaches to the craft of microphone placement. American recordists were famously fond of close-mic techniques during this time, meaning they were less eager to feature as much “room sound” in their mixes as were their British counterparts. This was hardly an unbreakable rule, of course, but it does provide an interesting window into the sonics of records produced on either side of the Atlantic during this period.



Table 1.4 lists a number of celebrated tracks produced in New York City’s most famous recording studios, between 1954 and 1995. Though they did not do so to the same degree as their British counterparts, recordists working in these American studios nonetheless often placed room mics while they tracked. The goal was to capture as much of the studio’s sonic signature as possible, that is, as much of the unique tonal coloration which each studio’s peculiar acoustics contributed to a record, without disfiguring the impact and presence of the broader production. They then blended the signal from the room mic with the more present tracks derived from close-mic placements. The idiosyncratic acoustics of each studio environment listed in Table 1.4 (what recordists would call their “room sounds”) thus played a crucial role in shaping the sound of records made there. In so doing, these “room sounds” played a critical role in shaping the tone of the American Top 40 in the decades immediately following the Second World War.

Table 1.4  Celebrated pop records and the studios in New York City where they were tracked

Studio: Artist, track
A&R Studio 1: Quincy Jones, “Soul Bossa Nova” (1962); Stan Getz & Astrud Gilberto, “Girl From Ipanema” (1964); Peter, Paul and Mary, “I Dig Rock and Roll Music” (1967); Van Morrison, “Brown Eyed Girl”
Allegro Sound: The 4 Seasons, “Rag Doll” (1964); The Critters, “Younger Girl” (1966); Tommy James and the Shondells, “Mony, Mony” (1968)
Atlantic Studio 1: Big Joe Turner, “Shake Rattle & Roll” (1954)
Atlantic Studio 3: Aretha Franklin, “Baby I Love You” (1967); Cream, “Sunshine of Your Love” (1968)
Bell Sound/Hit Factory: Dionne Warwick, “Walk on By” (1964); B. J. Thomas, “Raindrops Keep Falling on My Head” (1969); Kiss, “Rock and Roll All Nite” (1975)
Columbia Studio A: Simon & Garfunkel, “Homeward Bound” (1966); The Lovin’ Spoonful, “Summer in the City” (1966)
Columbia Studio B: Big Brother and the Holding Company, “Piece of My Heart” (1968); Simon and Garfunkel, “The Only Living Boy in New York” (1970)
Columbia 30th Street: Dave Brubeck, “Take Five” (1961); Tony Bennett, “I Left My Heart in San Francisco” (1962); Sly and the Family Stone, “Sing a Simple Song” (1969)
Electric Lady Studios: Stevie Wonder, “Superstition” (1972); David Bowie, “Fame” (1975); Weezer, “Say It Ain’t So” (1995)
Power Station/Avatar: Bruce Springsteen, “Hungry Heart” (1980); David Bowie, “Let’s Dance” (1983); Madonna, “Like a Virgin” (1984)
Record Plant: Alice Cooper, “School’s Out” (1972); John Lennon, “Just Like Starting Over” (1980)
Sundragon Studio: The Ramones, “Blitzkrieg Bop” (1976); Talking Heads, “Psycho Killer” (1977)
Talentmasters Studio: James Brown, “It’s a Man’s Man’s Man’s World” (1966); The Capitols, “Cool Jerk” (1966); The Who, “I Can See for Miles” (1967)

Playlist Title: Understanding Records, Chapter One: New York Room Sounds
A playlist comprised of tracks noted in Table 1.4, in the order they are listed. Please note that some of the noted tracks are not available on Spotify. In those cases, their position in the order has simply been skipped.

Of course, given the prevalence of the project paradigm in modern record production, room-mic placements may have become, for all intents and purposes, an endangered species in tracking. Restricted by the limitations of the project environment, which is usually neither “tuned” (acoustically treated) nor large enough to accommodate the distances that room-mic placements require, project scenarios often leave little choice but to use signal processing to simulate the acoustic information which a room mic adds. Kevin O’Leary (Three Days Grace, Shawn Mendes, Walk off the Earth) had the following to say about his own microphone placement technique for recording pianos:

Often when recording a grand piano I find that I am looking to get a little bit more vibe than I am able to get from exclusively traditional microphone placements. However, I still want more openness and clarity than I could get from recording an upright piano. To achieve this, I will couple a more traditional setup of a stereo pair (over either the hammers or the soundboard, depending on what kind of sound we are looking for), with a third microphone. The third microphone is a dynamic mic, often an SM57 placed under the piano pointed up towards the soundboard. This microphone will get heavily compressed and blended in with the stereo pair to taste. It adds a unique flavour and can help fill out the mids and warm up the overall sound of the piano. One thing to be aware of with this technique is the phase relationship between the top microphones and the under microphone. Depending on the positioning of the under mic, you might not get a great phase relationship by simply flipping the phase 180 degrees one way or the other. You may find it necessary to move the microphone around a little bit to find the best possible phase relationship.


Understanding Records, Second Edition

Direct-injection

Recordists may forego microphones altogether when they track, and use so-called “direct-injection” (DI) technology instead. Rather than route the output of an electric instrument to an amplifier, which is then transduced, recordists simply patch the instrument directly into preamplifier circuits on a mixing console, into signal-processing devices, or into analog-to-digital converters, depending on what resources they have at their disposal, and from there, through to storage. However, if the output signal of the “direct” instrument fails to achieve so-called “line level,” that is, if the instrument output remains too meager in level to be of any practical use to recordists, or if its output impedance does not match the impedance requirements of the input device, recordists may introduce a DI box as an intervening stage (“impedance” measures the level of opposition to the flow of electrons in a given system, or, for our purposes, the strength of the output of an instrument, measured in terms of ohms or “Z”; thus, high-impedance, or “high-Z,” instruments, like electric guitars, output signal at a weaker strength than do their “low-Z” counterparts, and they thus require preamplification to reach “line level”).14 Tracks registering below “line level” require too much amplification to reach a workable dynamic. Recordists boost the self-noise of the recording system over the audio itself when they try to pump such signals to workable levels. These tracks thus require an intervening stage, namely, some kind of DI processing. Because “line-level” instruments like samplers, sequencers, and synthesizers do not require DI, the process remains most common for recording so-called “high-Z” or high-impedance instruments like electric guitars and electric bass. Whether or not they require a DI box—and it should be noted that any digital interface worth using nowadays has onboard DI circuitry built-in—DI techniques produce a distinct sound.
Most obviously, DI techniques produce an extremely “dry,” or non-reverberant, tone. They also produce a significant amount of high-frequency edge, because air absorption plays no role in the transduction. While some recordists complain that DI tracks sound “too dry,” or “too edgy,” DI techniques remain standard for tracking electric guitars and electric bass. The leeway that DI affords recordists for processing tracks later on in the (multitrack) production process, when the requirements of the mix are more fully evolved, is simply too beneficial to ignore. Moreover, DI allows recordists to track without generating “bleed” (leakage) between tracks. This can be extremely useful for bands whose music benefits from a primarily



live rather than a sequentially compartmentalized multitracking process, but who lack access to sufficiently large live rooms (or budgets) for recording. Though they did not invent the practice, the Beatles were one of the first rock bands to regularly feature DI tracks on their records. In fact, the band’s seminal Sgt. Pepper’s Lonely Hearts Club Band was one of the first records to feature DI tracking techniques throughout. Following the lead of Peter Brown and Peter Vince, who used DI technology to track electric bass parts for records by the Hollies and the Shadows in 1966, almost every electric bass track on Sgt. Pepper’s was recorded via DI. Rather than patch Paul McCartney’s electric bass into an amplifier, as convention dictated, engineer Ken Townsend routed the bass through a DI box that he himself had made, and from there onto an open track on the mixing console. As Townsend himself recalls:

One of the most difficult instruments to record was the bass guitar. The problem was that no matter which type of high quality microphone we placed in front of the bass speaker it never sounded back in the Control Room as good as in the studio. One of the problems could well have been that the Altec monitors in the Control rooms were “bass light,” so no matter what you did, it made little difference to what you heard. However, although you couldn’t hear it, there was bass on the tapes, and so the records were much better than you imagined at the time. . . . Most of my experimental work on bass guitars came on the Hollies and The Beatles sessions, and it is not easy to remember with which I came up with the idea of feeding the output from the bass guitar straight into the mixing console. I named this method “Direct-injection.”15

McCartney’s DI electric bass tracks are most prominent on Sgt. Pepper’s in “Lucy in the Sky with Diamonds,” “When I’m Sixty Four,” “Lovely Rita,” and “A Day in the Life.” McCartney would also direct-inject his electric bass tracks for “Only a Northern Song” and “I Me Mine,” and George Martin would do the same to create the searing and heavily “fuzzed” lead-guitar tone which introduces the Beatles’ “Revolution.” More recently, the Dave Matthews Band made compelling use of the DI tone on their second major label release, Crash. Cuts like “Two Step,” “Too Much,” and “Tripping Billies” prominently feature a direct-injected acoustic-electric guitar throughout. The album’s opening track, “So Much to Say,” offers the clearest example. The track begins with only an acoustic guitar tracked using conventional means (i.e., a small-diaphragm condenser microphone placed close to the guitar’s 12th fret). At 0:46 on the track, producer Steve Lillywhite then pans a direct-injected acoustic-electric guitar track to the right side of the stereo spectrum.



The transduced acoustic guitar track in turn pans hard left, creating a clear tonal dichotomy between the ultra-dry and ultra-edgy tone of Matthews’s direct-injected guitar on one side and the ambience-infused sound of his transduced acoustic guitar on the other. R&B singer Adele’s “Daydreamer,” the opening track on the album 19, provides another exceptionally clear example: the track is solely comprised of a direct-injected acoustic-electric guitar and Adele’s vocals.

Playlist Title: Understanding Records, Chapter One: DI
A playlist comprised of obviously direct-injected bass, rhythm, and/or lead-guitar tracks.

DI-transduction tandems

Some recordists find the DI tone simply unacceptable. The dryness of DI tracks is overly jarring, they complain. The high-frequency edge they add to a production is earsplitting and, moreover, the attack transients are grossly exaggerated. To get a sense of this complaint, readers are encouraged to compare the picking sounds on the acoustic guitar track which opens Dave Matthews Band’s “So Much to Say” with those on the DI track that enters at 0:46. To remedy concerns about the tone of DI tracks, recordists often deploy DI and transduction techniques in tandem. The “dry” (DI) signal and the “wet” (transduced) signals are either (i) routed onto two separate channels, and then balanced to form a composite image in mixdown or (ii) blended immediately onto a single track. This allows recordists direct access to the transient detail and clarity of the “dry” track, while also providing access to the ambience captured on the transduced track, which can be useful for mitigating the abrasive “edge” of the dry track. Once again, the Beatles were among the first to deploy a DI-transduction tandem on record. After they recorded Sgt. Pepper’s, the band regularly direct-injected and transduced Paul McCartney’s electric bass tracks (this development was no doubt inspired by the band’s adoption of eight-track tape in 1968). The tandem technique was used to record electric bass parts for Beatles tracks like “Blue Jay Way,” “Baby You’re a Rich Man,” “Hello, Goodbye,” and almost every electric bass part on The Beatles (aka “The White Album”), Abbey Road, and Let It Be (except “I Me Mine”). Of course, once the Beatles used the technique so regularly it could not be long before it became standard in the record industry. Nowadays, when they are not sequenced, electric bass parts are often tracked via either (i) DI or (ii) DI-transduction tandems. Clear illustrations of the tandem



tone can be heard in the electric bass on Devo’s “Whip It,” Dire Straits’ “Money for Nothing,” Duran Duran’s “The Reflex,” and the Stone Roses’ “Fool’s Gold.” “DI boxes, though they don’t always sound pleasant, do have their place in the studio,” notes Kevin O’Leary in another exclusive interview conducted for this book:

They can be useful both sonically and functionally. When recording a bass guitar I will always record a DI. A good DI box on a bass will normally sound very good and offers you a lot of options down the road when mixing. This is not to say that I would not record an amp as well. If there is a good amp available, I will record that in addition to the DI when possible. With electric guitars I do not always record a DI, but often will. One of the main reasons that people record a DI with an electric guitar is to re-amp the guitar later if necessary. I am a big fan of getting good sounds and committing to them when recording, so I don’t normally use DIs for this purpose. I will record a DI in addition to the amp if the guitar tones are fairly distorted and are likely to need editing later on. It is much easier to find the transients and edit properly on a clean DI than it is on a heavily distorted guitar track. DI boxes can be useful sonically on electric guitars as well. They can add a very particular vibe to clean guitars that can work well in some cases. They can also sometimes add a nice clarity and subtle bite when sparingly mixed in with heavier sounding guitar amps. I rarely record a DI with acoustic guitars. However, if the guitar has a pickup, I will normally plug one in and have a listen. Although I hardly ever like what I hear with acoustic DIs, every once in a while it will actually complement the microphones nicely when blended in. It doesn’t take long to set up, and can be worth it to have a listen, even if you only end up recording it once in a while.

Playlist Title: Understanding Records, Chapter One: DI-Transduction Tandems
A playlist comprised of DI-transduction tandems noted in the section directly above.

Sequencing

The final tracking procedure which I examine in this field guide is sequencing, that is, operating a sequencer to create one or more component tracks in a multitrack production. The first sequencers to achieve any kind of prominence were computerized keyboards attached to synthesizer modules produced by Moog, ARP, Roland, and Buchla. These original sequencers could control up



to sixteen “events,” or pitches, on a connected keyboard, at any given time. And they were most often used to create rhythmic “ostinatos” (sequences of pitches repeated with rhythmic regularity). Recordists would input pitches into the sequencer and, with a few more button pushes, the resulting sequences were (i) looped; (ii) sped up and slowed down; and (iii) tonally processed, using a connected synthesizer module. Moreover, even as the sequence sounded, recordists could “modulate” (modify) the frequency and amplitude of each event in the sequence, and perform other adjustments, using specific dials on the connected module. Pink Floyd’s “On the Run,” the second track on the band’s seminal Dark Side of the Moon album, provides a clear record of early sequencing practice in the progressive rock genre. Roger Waters, who was responsible for the majority of sequencing work on Pink Floyd’s records throughout the 1970s, used a Synthi-A to craft the sequence that courses throughout “On the Run.” The sequence fades in at 0:12 of “On the Run.” For the next 4 seconds, between 0:12 and 0:16, Waters subtly modulates the sequence, adjusting the cut-off frequency of a low-pass filter on the connected module, moving it up and down the audible spectrum (low-pass filters mute all frequencies above a selected cut-off frequency). Waters then repeats the process between 0:51 and 1:04, and again between 1:25 and 1:37. From 1:10 to 1:18, Waters modulates the amplitude of the sequence, creating the rhythmic trembling known as tremolo. And then between 1:45 and 2:20, he modulates its frequency and amplitude in turn. He modulates frequency from 1:45 to 1:50 and 2:09 to 2:15, and amplitude from 2:01 to 2:08 and from 2:18 to 2:20. 
Waters continues to modulate the sequence until a plane crash sounds at 3:03, hurried steps cross the stereo spectrum, and “On the Run” very slowly cross-fades into “Time,” the album’s third cut.16 A recent documentary film about the making of DSOM—titled, appropriately enough, The Making of the Dark Side of the Moon—provides footage of Roger Waters while he sequenced “On the Run.” What is surprising about the footage, from a modern perspective, is that Waters performed the Synthi-A precisely as he would have performed, say, the electric bass. Tracking before the emergence of modern “workstation” sequencers, which allow users to program modulations using “track automation,” Waters had no choice but to treat the Synthi-A as he would any other instrument, specifically, as a sound-producing technology, which, despite its sequencing capacities, required human intervention. In fact, the Synthi-A did not generate a single pitch of “On the Run.” Sequencers themselves don’t actually generate sounds. What a sequencer does is provide some connected sound-producing mechanism with “control” information,



which tells that mechanism what sounds it should generate, at what volume, and for how long. Viewed from this perspective, sequencing looks much less like a species of performance practice and much more like a kind of musical programming.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 6. Sequencing and Modulation
A brief sequence of ten “events” (or pitches) loops five times, before its frequency and then amplitude are “modulated” in turn.
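The notion that a sequencer supplies “control” information rather than sound can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model of the Synthi-A or of any particular sequencer: the `Event` fields, the MIDI-style note numbers, and the `loop_sequence` helper are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One sequenced 'event': control data only, no audio."""
    pitch: int       # assumed MIDI-style note number (60 = middle C)
    velocity: int    # assumed loudness scale, 0-127
    duration: float  # length in beats


def loop_sequence(events, repeats):
    """Return (start_beat, event) pairs for `repeats` passes through `events`."""
    schedule = []
    beat = 0.0
    for _ in range(repeats):
        for event in events:
            schedule.append((beat, event))
            beat += event.duration
    return schedule


# A four-event ostinato, looped five times. A connected sound module
# (not modeled here) would turn each entry into an audible pitch.
ostinato = [
    Event(57, 100, 0.5),
    Event(60, 90, 0.5),
    Event(64, 90, 0.5),
    Event(60, 80, 0.5),
]
schedule = loop_sequence(ostinato, repeats=5)
```

Nothing in `schedule` is sound; it only says what should sound, how loudly, and for how long, which is why the same sequence can drive entirely different instruments.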

The workstation paradigm

Early sequencers only stored and triggered sequences of pitches. Today, however, sequencers—or, as modern software sequencers are called, “digital-audio workstations” (DAWs)—store, trigger, sequence, and process (i) control information for virtual synthesizers; (ii) samples; and (iii) digital-audio tracks. Moreover, modern DAWs usually feature so-called “native” (built-in) analog-to-digital, and digital-to-analog, conversion capacities, which means that they can store, process, and render transduced and direct-injected audio information as well as more traditionally sequenced data. Most records are made using DAWs now. Even if a band chooses to record entirely using analog technology, the mastering process requires conversion through some DAW for the tracks to be available for digital playback (i.e., CD or untethered downloading) and streaming. Most commonly, recordists work on the ProTools, Logic or Ableton DAWs. These workstations function exactly like early sequencers once did, but with signal processing, audio sampling, software synthesis, and editing capabilities thrown into the mix. These modern workstations also store transduced and direct-injected audio information, meaning they also function much like tapes did. Perhaps unsurprisingly, many recordists have heralded the emergence of the software DAW as the simultaneous death of the analog paradigm in Recording Practice. Sequencing, these recordists claim, constitutes an entirely new paradigm for record production. Rather than concentrating their creative efforts on tracking and mixing, recordists now track only to create raw audio they will later shape in the DAW.
In other words, while the analog paradigm tended to focus on getting sounds right at the “point-of-entry,” that is, while the analog paradigm held tracking as the primary compositional activity of record production, the sequencing (or “workstation”) paradigm makes editing, signal processing, and mixing the crucial compositional activities.



Whether or not sequencing indeed represents a massive paradigm shift in record production—and compelling arguments can be made either way— remains to be seen. What has certainly accompanied the DAW to the musical fore are certain tracking and processing procedures which can only be achieved using personal computers. I call these procedures “step inputting,” “real inputting,” and “digital-audio” recording. Before I can survey these techniques, though, I should first explain how it is that digital technology “hears” in the first instance.

Digital-audio: How computers hear

Recordists call the changes in electrical current that microphones produce, and that audio-tape registers, “analog,” because the changes are directly proportional, which is to say, they are directly analogous to the changes in air pressure that produce them. However, every single thing that happens in a computer boils down to a sequence of 1s and 0s. For sound to exist on a computer, then, it has to be represented as numbers. In most cases, this “translation”—from sound into numbers—is achieved using an analog-to-digital converter (A/D or ADC), which measures incoming voltages from an analog device and converts those voltages into discrete numeric values. Translation in the other direction, that is, from numbers back into sound, requires a digital-to-analog converter (D/A or DAC), which translates the discrete numeric values in a digital-audio file back into analog voltages. Digital-audio mediations do not end at AD/DA conversion, however. To be of any use to a workstation, sound must pass through two more layers of data processing, namely (i) sampling and (ii) quantization. Just as a movie doesn’t actually consist of moving pictures, a digital record doesn’t actually consist of moving sounds. Movies and digital records rely on a kind of stop-motion animation to achieve the illusion of motion. Movies are actually rapid sequences of single photos, called “frames,” projected onto cinema screens at rates of roughly 24, 25, and 30 frames per second. Likewise, digital records are actually rapid sequences of sonic snapshots, called “samples,” transduced in such rapid succession that listeners hear a continuous sequence of sounds, rather than a series of discrete digital-audio recordings. To achieve this illusion across the full range of human hearing, samples must be taken at a rate at least twice the highest frequency to be captured, as the famed engineer Harry Nyquist determined decades ago. Since human hearing extends to roughly 20 kHz, this means a staggering rate of more than 40,000 samples per second.
Therefore, digital-audio technology must be able to create samples with equal rapidity. In fact, a sample



rate of 44,100 samples per second (44.1 kHz) remains the industry standard for professional and amateur digital-audio systems. This is likely a holdover from the earliest days of the compact disc, though, when digital technology simply could not sample any faster; sample rates of up to 192 kHz can, and often do, figure in the recording process now. Whatever the sample rate, each time a digital-audio device samples a sound source it defines what it hears as a particular binary value. In other words, samples are “quantized,” or digitally mapped, to discrete digital values. During quantization, samples are converted into digital “words,” which are discrete sequences of binary digits (“bits”), each of which is as long as the “bit depth” of the digital-audio device itself allows. When a device has a bit depth of only 1 bit, for instance, samples are quantized to a value of either 1 or 0. Such a limited bit depth measures only the presence or absence of sound. A bit depth of 2 bits, on the other hand, doubles the number of available quantization levels to four, in which case the device can begin to register gradations of volume; a bit depth of 3 bits generates eight quantization levels; and so on.

Figure 1.19  A 2-bit resolution with four quantization levels and a 3-bit resolution with eight quantization levels. The most commonly used resolutions are 16 bits, and 24 bits, which offer 65,536, and 16,777,216 quantization levels, respectively.17
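The arithmetic behind sampling and quantization can be sketched briefly in Python. This is a simplified illustration, not a description of how any real converter works: the function names are invented here, and the sketch assumes samples are expressed as floating-point values between -1.0 and 1.0 and mapped to evenly spaced levels.

```python
import math


def quantization_levels(bit_depth):
    """Number of discrete values available at a given bit depth."""
    return 2 ** bit_depth


def sample_sine(frequency_hz, sample_rate_hz, count):
    """Take `count` snapshots of a sine wave at the given sample rate."""
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(count)
    ]


def quantize(sample, bit_depth):
    """Map a sample in [-1.0, 1.0] to the index of the nearest level.

    Assumes evenly spaced levels; index 0 is the lowest level.
    """
    levels = quantization_levels(bit_depth)
    clipped = max(-1.0, min(1.0, sample))
    return round((clipped + 1.0) / 2.0 * (levels - 1))


# One hundred snapshots of a 440 Hz tone at the CD sample rate,
# each mapped to one of the eight levels a 3-bit device allows.
snapshots = sample_sine(440.0, 44_100, 100)
words = [quantize(s, bit_depth=3) for s in snapshots]
```

Each added bit doubles the number of levels, which is why 16 bits yield 65,536 levels and 24 bits yield 16,777,216.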



Increasing bit depth increases resolution, which is the number of quantization levels a digital-audio device can use to map, or quantize, sounds to discrete binary values. Compact discs require a 16-bit resolution even though most digital-audio devices now allow recordists to track at resolutions of 32 bits or more. Though this may seem like yet another purely academic consideration, discrepancies in resolution can lead to a number of practical problems. CDs and CD players, rare as they are nowadays, comprise a 16-bit system, for instance. When recordists track for compact disc using modern 24- or 32-bit systems, they have to reduce what they record to a resolution of 16 bits, typically adding low-level noise called “dither” as they do so. Far from a merely procedural consideration, this reduction can induce a number of distortions.18
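What reducing a 24-bit recording to 16 bits involves can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure any particular workstation uses: the function name is hypothetical, samples are assumed to be signed integers, and the dither is assumed to be triangular noise spanning one 16-bit step in either direction, which is only one of several schemes in common use.

```python
import random


def reduce_bit_depth(samples_24bit, dither=True, seed=0):
    """Reduce signed 24-bit integer samples to signed 16-bit integers.

    With dither=True, low-level triangular noise (an assumption of this
    sketch) is added before rounding, so that rounding errors behave
    like noise rather than tracking the signal.
    """
    rng = random.Random(seed)
    step = 1 << 8  # 2**(24 - 16): 24-bit values per 16-bit step
    reduced = []
    for sample in samples_24bit:
        # The difference of two uniform values has a triangular distribution.
        noise = (rng.random() - rng.random()) * step if dither else 0.0
        value = round((sample + noise) / step)
        # Clamp to the range a signed 16-bit word can hold.
        reduced.append(max(-32768, min(32767, value)))
    return reduced


plain = reduce_bit_depth([0, 256, -256, 8_388_607], dither=False)
dithered = reduce_bit_depth([25_600] * 1000)
```

Without dither, every sample simply snaps to the nearest 16-bit value; with it, a sample may land one step above or below, trading a small amount of noise for rounding errors that no longer follow the signal.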

Workstation capacities

The workstation paradigm has been embraced by so many recordists that the capacities of DAWs themselves now underwrite modern Recording Practice in its entirety. This is not to say that recordists only use workstations to track their records now. In fact, some recordists conspicuously, and vocally, avoid the workstation paradigm altogether. Tool, for instance, when recording the album 10,000 Days, used reel-to-reel tape technology and analog processors exclusively. Nonetheless, the album was, and has since been, mastered for digital formats like CD, untethered downloading and streaming, using a DAW. Similarly, when Pearl Jam released Riot Act, they marketed the record as an “anti-ProTools statement.” As drummer Matt Cameron put it:

This [Riot Act] is definitely our [read: Pearl Jam’s] anti-ProTools record. To us, a song like “Can’t Keep” is proof that it’s more interesting hearing musicians in a room playing hard, with the tempo fluctuating as the band heats up. Perfection is boring.19

Producer Steve Albini, who has been anything but shy about his own antidigital point of view, recently echoed Cameron’s concerns. As Albini sees it, the emergence of the workstation paradigm is a lamentably amateurish development in Recording Practice driven by non-technical people, wanting to get involved in recording, without investing time, and money, in equipment, and learning about it. Most small



semi-professional studios are run this way. . . . There is a lot of use of ProTools in professional studios, but this is mostly for the special effects it allows, not for sound quality. These special effects soon fall out of fashion, and I don’t think this trend will define studios permanently. Do you remember when real drummers were told to sell their drum kits, because drum machines were going to take over? That same false prophesy is happening with regard to analog tape machines and ProTools now. It is a trend, and it will have some permanent impact, but it doesn’t replace analog systems, which are more durable, sound better and are more flexible.20

Musical luddism aside, most recordists see at least some value in the workstation paradigm, even as they acknowledge that a vast and quantifiable discrepancy exists between the sounds that analog and digital technologies can produce and more importantly between the kinds of workflows both paradigms encourage. During the mid-2000s, the number of workstation users increased exponentially. Not surprisingly, then, during that same period the functional capacities of workstations themselves came to exert an increasingly profound influence on the way that pop records were made. Records are now often made through a combination of transduction, DI, and sequencing techniques; and though I risk invoking the ire of a great many practicing recordists in saying so, I do not think it matters all that much which particular workstation recordists ultimately choose to adopt. Almost every DAW shares most of its functionality with every other workstation. Just as Les Pauls and Stratocasters are neither better nor worse than the other—they are different guitars preferred by players for different purposes—so, too, are DAWs basically the same. All that really changes from DAW to DAW is “workflow,” that is, the sequence of actions required to make use of their capacities, and each DAW features certain exclusive synthesis, sampling and/or signal-processing capacities, which some recordists covet over the unique capacities of other DAWs. Most DAWs are organized around the metaphor of a multitrack mixing console. The so-called “arrange window” in a DAW, which is where the lion’s share of work is done, features a series of channel strips, or tracks, organized sequentially as on a multitrack mixing console (see Figure 1.20). 
Each track can be assigned to (i) a General MIDI (GM) instrument (though, in my experience, this has become a moot feature now due to lack of use); (ii) a software synthesizer; (iii) a sampler; (iv) a drum machine; or (v) audio, which means it will be used to store transduced or direct-injected audio recordings. Each channel strip, in turn,



features a set of open “plug-in” slots, reserved for algorithmic signal processors (e.g., compression, delay, etc.), and a set of open “send” slots reserved for bussing (i.e., auxiliary sends). A recordist need only assign a plug-in to a particular track, or bus a track to a send channel with processors active, to significantly alter a track’s dynamic and timbral content.

Figure 1.20  Typical channel strips on a digital mixing board. In this case, the channel strips are from a LogicX session file.



Once they have assigned an instrument to a channel strip—and, if they so desire, signal-processing plug-ins—recordists can then input musical information. They do so by (i) step inputting; (ii) real inputting; and (iii) transduction (i.e., transduced and direct-injected audio). I will now explain each of these techniques, in turn.

Step inputting

Step inputting encapsulates all that modern sequencing, especially computer-based sequencing, requires of recordists. To step input musical information so it becomes part of the sequence of musical events that a workstation triggers on playback, recordists enter “ticks” into either a “matrix editor” (aka a MIDI grid) or a “step sequencer.” The matrix editor is usually accessible in the arrange window of a workstation, while the step sequencer is usually located on the instrument interface itself. Matrix editors divide the audible spectrum into pitches, which are stacked vertically, usually according to the metaphor of a piano keyboard, and rhythmic subdivisions (see Figure 1.21). Recordists step input ticks, that is, vertical and horizontal values, into the matrix editor to indicate where in the audible spectrum, at what volumes and for how long, the assigned instrument should generate sound. If the matrix editor doesn’t suit their workflow preferences, however, recordists might instead opt to use a step sequencer (see Figure 1.22). Most virtual drum machines are equipped with a step sequencer: a row of rectangular slots, flipped onto their side, representing eighth, sixteenth, and thirty-second note subdivisions of the global tempo.

Figure 1.21  The matrix editor (aka MIDI grid) in LogicX, with four ticks (entries) of varying length and velocity (volume) inputted.



Figure 1.22 The step sequencer on LogicX’s Ultrabeat virtual drum machine. Horizontal position designates when, in a 4/4 bar, a sample should sound, and vertical height designates the volume.

Drum machines (and samplers) are usually armed with libraries of (i) “hits” (or impulse samples) comprised of single-component musical units like a single kick drum hit or a rim shot, which recordists sequence together to create broader musical patterns like beats, grooves, riffs, and fills; (ii) “loops,” which are longer samples usually comprised of already-sequenced beats and riffs that can be used to create larger arrangements; and, finally, (iii) “open” slots, which recordists can fill with self-produced samples. Recordists select a sample, and a time in the sequence for that sample to trigger, by inputting ticks on the step sequencer. If they want a kick drum sample to sound on beats 1 and 3 of a common time (4/4) measure, for instance, and a snare drum sample to sound on beats 2 and 4, recordists need only select the desired samples from the sample library and tick the corresponding slots on the step sequencer (see Figure 1.23).

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 7. Triggering Samples
A drum machine sequence loops five times. The same sequence is then heard a number of times, but now triggering different samples after every five loops.
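The kick-and-snare example described above can be sketched as a simple grid in Python. This is an illustration only, not the data format of Ultrabeat or any other drum machine: the sixteen-slot bar, the `make_pattern` and `triggers` helpers, and the sample names are assumptions made for this example.

```python
STEPS_PER_BAR = 16  # assumed: sixteenth-note slots in one 4/4 bar


def make_pattern(beats, steps_per_bar=STEPS_PER_BAR, beats_per_bar=4):
    """Return a row of slots with a tick (1) on each listed beat (1-based)."""
    steps_per_beat = steps_per_bar // beats_per_bar
    row = [0] * steps_per_bar
    for beat in beats:
        row[(beat - 1) * steps_per_beat] = 1
    return row


def triggers(pattern):
    """List (step, sample_name) pairs in playback order."""
    events = []
    for step in range(STEPS_PER_BAR):
        for name, row in pattern.items():
            if row[step]:
                events.append((step, name))
    return events


# Kick on beats 1 and 3, snare on beats 2 and 4.
pattern = {
    "kick": make_pattern([1, 3]),
    "snare": make_pattern([2, 4]),
}
playback = triggers(pattern)
```

Swapping one snare sample for another would change only the name attached to a row, not the ticks in it, which mirrors the way the step values stay put when a different sample is chosen from the library.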

Real inputting

“Real inputting” provides a DAW with the same information as step inputting. When recordists real input a sequence, however, they perform that sequence in real time on a connected instrument, usually a keystation (once called a “MIDI controller”), while the workstation is set in record mode. Recordists typically opt to real input their sequences when step inputting creates an overly precise sequence of sound events, or, alternatively, when the sequence they want to input is very complex, and step inputting thus presents an overly time-consuming option.



Figure 1.23  A sample library menu for snare hits on LogicX’s Ultrabeat virtual drum machine. The horizontal and vertical values in the step sequencer do not change when recordists choose a different sample, only the sample which those values trigger.

In fact, recordists have recently begun to openly rebel against the mechanical precision of step inputting, opting to intentionally misalign the tempo of component samples with the global tempo of tracks. They may also use differing swing-quantization settings for various sequences in a multitrack production, which algorithmically misaligns the rhythmic symmetry of input ticks; or they can subdivide the beat in their workstation according to variously misaligned global tempo schema. Flying Lotus’s 1983, Los Angeles, Cosmogramma, Reset, and
L.A. EP 1x3 and Prefuse 73’s Preparations and Everything She Touched Turned to Ampexian clearly illustrate this principle at work in the modern pop soundscape. Though I could inventory some techniques here that recordists have used to generate the distinctive feel of tempo misalignments, I feel they are best understood through demonstration. Matt Shelvock (aka kingmobb) has prepared an exclusive video for this book, which demonstrates how he uses “tempo misalignment” techniques in his creative practice. Before proceeding any further, readers are urged to navigate to the YouTube channel I have created to house the video material I reference in this book, which you can find by visiting jayhodgsonmusic.com, to watch Matt demonstrate the nuances of his craft (the video is titled: “kingmobb explains tempo misalignments”). For the remainder of this book, I will indicate where and when readers should view video demonstrations of concepts and techniques precisely as I indicate where and when they should consult playlists on Spotify. All video demonstrations can be found, in the order they appear in this field guide, on a playlist titled “Jay Hodgson: Understanding Records Videos.” If you enter this playlist title into YouTube’s search engine, it will show up first on the list of results. As such, you will only see a “Video Title” designated in this text, and then a brief explanation of who the video features and what they demonstrate. If this doesn’t work, you may always navigate to the companion website for this book, jayhodgsonmusic.com, and follow the links to these videos from there. As with the Spotify playlists featured throughout this book, readers are urged to view these videos precisely when they are indicated, and before reading further. Video Title: kingmobb Explains Tempo Misalignments Matt Shelvock, aka kingmobb, takes us on a brief tour of some tempo misalignment techniques he has used on many of his commercial releases.

Video Title: kingmobb Explains Sample Curation Matt Shelvock, aka kingmobb, explains his sample-curation practice in one of many exclusive videos made for this book. A crucial, and often overlooked, component of sample-based music production, we will hear where Matt gets many of his samples from, how he curates them, and to what musical ends.

Playlist Title: Understanding Records, Chapter One: Tempo Misalignments A playlist comprised of tracks which clearly, and famously, feature tempo misalignments. The playlist begins with tracks by J. Dilla, Flying Lotus, Prefuse 73, and kingmobb.



After they real input control information into a workstation, recordists can always use the matrix editor to revise the recorded performances. Among other things, recordists can adjust (i) pitch information; (ii) timing information; (iii) events in the sequence itself (i.e., they can delete ticks and move them to different horizontal locations); (iv) velocity (i.e., volume) information; and (v) which virtual instrument, or samples, the sequence triggers. In doing this, recordists use the matrix editor to adjust only those participatory discrepancies which they deem overly distracting, leaving intact the general impression of a live performance which real inputting conveys.

Figure 1.24  Matrix editor “before” and “after” tempo quantization, which automatically aligns tracks to absolute horizontal (rhythmic) values.

Figure 1.25  Matrix editor “before” and “after” tempo quantization, which automatically aligns tracks to absolute horizontal (rhythmic) values.
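The quantization shown in the matrix editor figures can be approximated as follows. This is a minimal sketch under stated assumptions: onsets are given in beats, the grid is a fixed subdivision, and the `strength` parameter is an invention of the sketch (many workstations offer something comparable, but no specific DAW's behavior is being reproduced here). At full strength every event snaps to the grid; at lower strength some of the performed timing survives.

```python
def quantize(onsets, grid=0.25, strength=1.0):
    """Pull each onset (in beats) toward the nearest grid line.

    grid is the spacing of grid lines in beats (0.25 = sixteenth notes in 4/4).
    strength=1.0 snaps fully; smaller values keep part of the played timing.
    """
    result = []
    for t in onsets:
        nearest = round(t / grid) * grid
        result.append(t + (nearest - t) * strength)
    return result


# A loosely played quarter-note figure (hypothetical values, in beats).
played = [0.02, 0.98, 2.07, 2.96]

snapped = quantize(played, grid=1.0)                    # "after" quantization
half_snapped = quantize(played, grid=1.0, strength=0.5)  # halfway to the grid
```

Editing by hand in the matrix editor, as described above, amounts to applying this kind of correction selectively, to some events and not others.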



Digital-audio

Alongside their sequencing, synthesis, and processing capacities, DAWs also provide (i) a storage site for digital-audio tracks produced by transduction and DI; and (ii) a suite of post-production signal processors called plug-ins (I explain signal processing in the next chapter). Arming an audio track in the arrange window of the DAW routes all incoming audio onto that track. As it records audio data, the workstation renders it in waveform format, within that armed (recording) track. What results is a so-called “audio object,” as seen in Figure 1.26. Once the workstation renders an audio object, recordists can re-tailor it to suit the needs of any given project. They can resize the audio object, for instance, should they want a longer or shorter duration. Recordists can also move the resized audio object back and forth within the track-slot, should they desire a different start and stop time. Finally, recordists can copy, cut, and paste an audio object, or a segment of an audio object, as they might a word or phrase in a word processor, and they can loop the resulting (pasted or copied) audio objects so that they repeat however often they see fit. Some workstations also allow recordists to force warp an audio object so that a defined region of the object triggers at a faster or slower rate than the rest of the object.

Figure 1.26  An “audio object” in LogicX. Note that the “audio object” begins shortly after the downbeat of measure 1 and ends on beat 3 of measure 2.

Figure 1.27  The same audio object as in Figure 1.26, resized to end precisely on the downbeat of measure 2.

Figure 1.28  The resized audio object from Figure 1.27, moved to beat 1 of measure 1.



Figure 1.29  The audio object from Figure 1.28, but with beat 1 cut and pasted to the downbeat of measure 2, copied to beats 2 and 3 of measure 2, and then looped five times.
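The resize, move, and loop operations shown in Figures 1.26 through 1.28 can be modeled with a small data structure. The sketch below is an assumption-laden illustration: the `AudioObject` class, its fields, and the helper functions are hypothetical and do not correspond to any DAW's internal format. Positions are counted in beats from the downbeat of measure 1 (so the downbeat of measure 2 is beat 4.0), and the starting position of 0.5 is an arbitrary stand-in for "shortly after the downbeat."

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AudioObject:
    start: float         # position on the timeline, in beats from measure 1
    length: float        # duration, in beats
    offset: float = 0.0  # where playback begins inside the source recording

    @property
    def end(self):
        return self.start + self.length


def resize(obj, new_end):
    """Trim or extend the object so that it ends at new_end."""
    return replace(obj, length=new_end - obj.start)


def move(obj, new_start):
    """Slide the object along the track without changing its length."""
    return replace(obj, start=new_start)


def loop(obj, times):
    """Return `times` back-to-back copies, beginning with the original."""
    return [replace(obj, start=obj.start + i * obj.length) for i in range(times)]


# Starts shortly after the downbeat of measure 1, ends on beat 3 of measure 2.
take = AudioObject(start=0.5, length=5.5)
resized = resize(take, new_end=4.0)   # now ends on the downbeat of measure 2
moved = move(resized, new_start=0.0)  # now begins on beat 1 of measure 1
loops = loop(moved, times=5)
```

Because each operation returns a new object rather than altering the recording itself, the sketch also reflects the non-destructive character of this kind of editing: the underlying audio is untouched, and only its placement on the timeline changes.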

Indeed, modern DAWs allow recordists to transform the audio they record in a galaxy of ways. In the chapters that follow, I study a number of the most crucial tools and techniques that comprise this galaxy, and I explore some of the aesthetic programs recordists serve in deploying them.

Video Title: Jay Hodgson Surveys the Fundamental Capacities of a Modern DAW In this video, I demonstrate some of the core constitutive techniques that make up the “workstation paradigm” of recorded musical communication, especially those which I survey above, as they are used in specific musical circumstances. I use LogicX, which is my preferred DAW for most creative processes, but all of the techniques I survey can be done on most professional workstations.

A quick note on editing

Given how ubiquitous audio editing has become in the last decade, I should quickly address the process here. In fact, I would say that audio editing now structures the tracking process completely. I was recording bass at Noble Street in Toronto a few years back, working with a platinum-certified engineer, when this structure first revealed itself to me. Once upon a time, when I started recording in the early 1990s, everyone looked to the producer after every take. The producer would either nod their approval or motion for another take. It was really that simple. I remember a session I worked on with Johnny “Natural” Najera, in the mid-1990s, which to my mind perfectly encapsulated this ethos. The session was for a student production project—we were both enrolled at the Berklee College of Music, in Boston, at the time—which tasked students with reproducing, as faithfully as possible, the sound of the Animals’ “House of the Rising Sun.” I thought it would be an easy and quick session—the guitar part was certainly undemanding, I thought—and I had even made plans to meet up with some friends a couple of hours after. But I quickly had to cancel those plans. I didn’t leave the session, in fact, until early the next morning, sweaty and exhausted as if I’d just been for a jog.



I should have known then that Johnny would go on to win Grammys, play a Super Bowl halftime show, and even record with the likes of Usher, Pharrell, Diplo, Kylie Minogue, Snoop Dogg, and so many other legends. On the session I worked for him, Johnny was relentless, as every great producer should be. He spent what seemed like hours working with each musician, not just nailing down tones and parts, but teasing out extremely subtle nuances in our performances. To be frank, after that session, I heard nuances in my playing that I had never taken the time to notice before, and I was extremely grateful. More importantly, when we left that session, the record was basically done. All that was needed was a bit of leveling, and the track was ready for mixing. I don’t think that sort of session goes on much anymore. Now things seem to be endlessly open and provisional, especially at the tracking stage. Recordists have become almost pathologically unwilling to commit at point-of-entry, deferring any definitive decisions for a later stage in the process. Indeed, recordists now seem to pursue something like “optimal provisionality” when tracking, meaning that parts are recorded and shaped to suit a later post-production process, whereas post-production was once only a way of finalizing sounds and performances that musicians nailed down during tracking. Perhaps this is a generational observation: Is it only those of us who have experience working with tape that see things this way? And, to be clear, I am not complaining; this is simply the way things are done now. Working with tape, the producer’s role was often simply to take responsibility for a session, to be a reassuring presence meant to let musicians know that someone, somewhere, had an eye on how all the various parts they were recording would later relate. This was clearly what Johnny was doing on that session way back when we were students at Berklee.
On the session at Noble Street, however, it was the young editing engineer who ran the show, sprawled as he was across the couch at the back of the control room, with a vanilla milkshake in one hand and his other hand lazily tapping at the keys of a MacBook balanced on his lap. Every so often, he would look up from his computer and nod his approval. When he did that, though, it was to say that the audio material had been captured optimally for editing later on, that is, that it was optimally provisional.

Comping, timing, and tuning

In fact, to my mind the terms “post-production” and “editing” are misnomers now. Modern recordists often deploy editing and tracking simultaneously
(or, at least, in tandem) now. In this way, the distinction between “production” and “post-production” is artificial, as they are really connected parts of a broader procedural “whole.” This said, recordists may still edit for effect long after tracking is over, or they might edit to correct marginal errors that do not warrant retaking, as part of a process usually called “timing and tuning.” The list of motivations for editing audio is vast. What ultimately matters for a field guide such as this is simply that we recognize editing—which is nonlinear, non-veridic (even if done to produce veridic sounds), and aimed at reshaping tracked performances to suit “control room” evaluations—as a musical technique which is, for the most part, wholly unique to Recording Practice. In fact, many editing techniques allow recordists to transform the performances they capture during tracking, if not to idealize them. The simplest example of this practice is likely “comping.” When they “comp,” recordists assemble, or “comp,” a composite part from many takes, choosing only the best portions of each. In this way, “comping” resembles a kind of curatorial process, whereby only the most interesting, or musically superior, bits of each take are used for the final mix, with the portions that do not pass muster simply being discarded. Editing is also often done to correct minor timing errors and to fix any tuning issues a performance may have. Most DAWs can be used to fix timing errors without needing to access any outside plug-ins. Simply cutting, dragging, and sometimes resizing audio objects, along with deft fading and cross-fading, will easily achieve the desired effect. With tuning, however, most recordists now turn to plug-ins like Celemony Melodyne, which give them unprecedented control over nearly every melodic component of a track, from vibrato to pitch itself. 
The fact that LogicX now features tuning capabilities on par with Melodyne’s is indicative of the centrality of this sort of audio editing in modern recorded musical communications.

Video Title: Alastair Sims Explains Editing, pt 1 (“Vocal Comping”), pt 2 (“Vocal Tuning”), and pt 3 (“Editing Guitar”). Alastair Sims, who has a gold record and a platinum certification for his work as an audio editing engineer on releases by the Tragically Hip and Rush, has graciously prepared three exclusive videos for this book, which provide a cursory overview of many of the audio editing techniques he uses when he works, including comping, timing, and tuning.
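The logic of comping described above, in which a composite part is assembled from the best portions of many takes, can be reduced to a simple selection procedure. The sketch below is illustrative only: the take names, the division of the part into three phrases, and the numeric ratings are all made up, with the ratings standing in for judgments that in practice are made by ear in the control room.

```python
# Hypothetical ratings: one score per phrase, for each of three takes.
ratings = {
    "take_1": [7, 4, 9],
    "take_2": [8, 6, 5],
    "take_3": [6, 9, 8],
}


def comp(ratings):
    """Pick the highest-rated take for each phrase and return the composite plan.

    If two takes tie on a phrase, the take listed first is kept.
    """
    n_phrases = len(next(iter(ratings.values())))
    composite = []
    for phrase in range(n_phrases):
        best_take = max(ratings, key=lambda take: ratings[take][phrase])
        composite.append((phrase, best_take))
    return composite


composite = comp(ratings)
```

Here the composite draws phrase 0 from `take_2`, phrase 1 from `take_3`, and phrase 2 from `take_1`; the unused portions of each take are simply left out, as in the curatorial process the text describes.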



Teo Macero and Bitches Brew

Of course, editing is by no means exclusively digital; it is not even particularly modern. Recordists have edited tracks for decades now. Though it is certainly not the first example, Teo Macero’s production—or, perhaps better, his post-production—of Miles Davis’s Bitches Brew offers an instructive example. Tape splicing and looping play the central role on most of the tracks on that record, and in many ways editing itself is the compositional activity listeners hear. Bitches Brew is non-veridic in its entirety, that is, the record does not sound like it could be, or like it was, achieved through a “live” performance. Nonetheless, and despite the controversy that has dogged it since its release, Bitches Brew remains one of the best-selling and critically celebrated offerings in Davis’s decades-spanning oeuvre. The record earned the trumpeter a Grammy nomination, an unexpected placing on the Billboard Top 40 record chart, the first gold certification of his career and, as noted, the contempt of many critics who saw the record as, in one anonymous reviewer’s words, “a nearly fatal commercial dive.” Those who valued Bitches Brew, however, considered it a paradigm shift in how jazz records should be made. Davis’s and Macero’s embrace of editing techniques on the record, like tape splicing and looping, struck many as “game changers” for the jazz genre (these techniques were already de rigueur in rock by 1969, of course). Though listeners hear a composite of ensemble performances on Bitches Brew, the editing techniques that characterize the record create a different musical focus than can be heard on more traditional jazz fare. For instance, “Pharaoh’s Dance,” the first track on Bitches Brew, is typical of the album. It is, according to critic Bob Belden, “a composite composition” more than a traditional jazz track. A glance at the edit slate for “Pharaoh’s Dance” would have explained why, I’m sure. Editing techniques, like tape splicing and looping, provide the compositional focus for the track, and they remain clearly audible throughout. Whereas Macero’s and Davis’s contemporaries in jazz typically used editing to perfect the performance they mixed their records to convey, Macero aggressively spliced, looped, and edited “Pharaoh’s Dance” into an entirely non-veridic state. A composite of thirty-five edits of material culled from three days of jamming in Columbia Studio B, “Pharaoh’s Dance” was, in fact, completed by Macero two days after tracking had concluded, and the very last note had already reached tape.



“Pharaoh’s Dance” obviously progresses according to the peculiar logic of “post-production.” The traditional chronology of a jazz performance—that is, head (melody), improvisations, head (recapitulation)—is refigured each time Macero introduces another splice or loop. Fourteen obvious tape splices can be heard in the first three minutes alone, and, as each splice rudely interrupts the performance preceding it in medias res, attention is drawn away from the traditionally valued instrumental prowess and complexities of Davis’s and his crew’s performances to the way that Macero constantly assembles and reassembles those performances into a seemingly never-ending sequence of composite musical permutations. In this respect, the record looks ahead to what Greg Milner once called “the Protools world,” which is to say, the workstation paradigm of Recording Practice, which is “all about the arrangement, the orchestration, the mix . . . and not so much about playing and recording.”21

Playlist Title: Understanding Records, Chapter One: Tape Splicing & Miles Davis A playlist comprised of tracks from Miles Davis’s so-called “first electric period,” which spans roughly from 1968 to his initial retirement in 1975, that feature overt tape splicing.

Video Title: Understanding Records, Putting It All Together: Chapter One—Lonesome Somedays, pt 1 (“Tracking Guitars”) and pt 2 (“Tracking Vocals”) To see how all of the techniques I survey in this chapter currently cohabit the modern recordist’s toolbox, I have prepared a short video which surveys the tracking process for a song I wrote for this book, called “Lonesome Somedays.” There are in fact two versions of this song available on Spotify. The first features just an acoustic guitar and my vocals, and this is released under the heading “Lonesome Somedays (solo).” The video noted above surveys the process for tracking these two instruments. The other version of this song, simply titled “Lonesome Somedays,” features the traditional instrumentation heard in the solo version of this track, but now augmented by modern sequencing composed by my good friend and frequent music collaborator, Matthew Shelvock. By listening to the original solo version first, readers can get a sense of how added sequencing changes their experience of the song. Readers can also see how Matt sequenced parts for this track in his other videos for this book (he uses his work on this track as examples in all of his videos throughout this book). Later in this book, readers will be prompted to view videos explaining how this track was processed, mixed, and mastered.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 8. Lonesome Somedays (solo) Track: 9. Lonesome Somedays Both versions of “Lonesome Somedays” which were recorded to demonstrate how the various audio concepts and techniques surveyed in this book come together as musical practice.


Mixing (The Space of Communications)

When tracking is over, mixing begins. When they mix, recordists organize a record’s component tracks into particular spatial arrangements. Recordists usually liken this “spatial arrangement” to a three-dimensional canvas, which requires careful balancing front-to-back, side-to-side, and top-to-bottom. Moreover, every record is mixed. Even when recordists do not consciously mix a record—even in those incredibly rare cases when projects move directly from tracking to mastering—records always present some spatial arrangement of sound more than just sounds per se. Therefore, they are mixed. This has ever been the case. Recordists mixed records long before multitrack mixing consoles made market, of course. Phonograph cylinders and gramophone discs were clearly mixed. The musicians who made them simply arranged themselves into often awkward formations around the recording bells of acoustic recording technology while they recorded, and, in so doing, they mixed their records. Alternatively, famous pioneers of recording, like Fred Gaisberg, guided them, sometimes by the elbow, toward and away from the gramophone’s recording horn while they sang and played. When Recording Practice began, then, producers mixed the performances they recorded precisely as they recorded them.1 The emergence of mixing as a “separate” or “discrete” phase in record production came almost fifty years after the likes of Fred Gaisberg, with the emergence of multitrack technology in the mid-1950s and, with that, the multitrack paradigm of record production. In fact, mixing arguably first emerged as a musical competence in its own right to “manage,” that is, to aesthetically cope with, some of the more overwhelming complexities that the multitrack paradigm suddenly interjected into the production process. In this chapter, I briefly survey this historical emergence, trace the aesthetic contours of the three-dimensional canvas each mix construes, and introduce readers to the primary toolbox recordists use to create a mix, namely, signal processing.



Mixing and the multitrack paradigm

Though recordists were initially slow to take advantage of the possibilities that multitrack technology offered, by about 1967 most successful pop recordists insisted that at least two 4-track tape machines were needed to make a commercially viable release. A seemingly insatiable hunger for more and more tracks had seized the industry by then, which led to the development of a mixing technique known variously as “mixing as you go,” “reducing,” “bouncing,” “bouncing down,” and “4 to 4,” depending on who you asked. Whatever it was called—“bouncing” seems to have won the day—this practice represents the first multitrack mixing technique in the modern “separate phase” sense, insofar as it was a technique crucial to the production process that remained entirely the purview of recording engineers. This said, it is important to keep in mind that “bouncing” nevertheless remained wholly ingrained within the tracking process, a part and parcel of audio capture, during its commercial heyday in the late 1960s and early 1970s.

Bouncing

To clear track space for the ever growing number of overdubs which recordists suddenly required in the late 1960s, mix engineers “bounced” tracks from one 4-track tape machine onto one or more open tracks on another connected machine. Bouncing, however, inevitably led to a notable degradation in sound quality. Given the low-frequency bias of tape in general, the high-frequency content of a track was successively eroded with each successive bounce. Consequently, bouncing became a mixing skill in and of itself. By 1967, capable mix engineers could bounce tracks without sacrificing sound quality, such that any lost high-frequency content could easily be recouped through overdubbing and signal processing. Interestingly, though it sounds like a logistical nightmare, many recordists continue to describe the era of bouncing as a kind of “golden age” for mixing in general. As Andy Johns recalls:

You know why Sgt. Pepper’s sounds so good? You know why Are You Experienced? sounds so good—almost better than what we can do now? It’s because, when you were doing “4-to-4” (bouncing from one 4-track machine to another), you mixed as you went along. There was a mix on two tracks of the second 4-track machine, and you filled up the open tracks and did the same thing again. Listen to “We Love You” by the [Rolling] Stones. Listen to Sgt. Pepper’s [by the Beatles]. Listen to “Hole In My Shoe” by Traffic. You mixed as you went along. Therefore, after you got the sounds that would fit with each other, all you had to do was adjust the melodies. Nowadays, because you have this luxury of the computer and virtually as many tracks as you want, you don’t think that way anymore.2

By 1967, the process Johns describes was arguably industry standard in the rock world. Not surprisingly then, bouncing eventually came to drive the creative process for many artists, with recordists rationalizing their communications into discrete steps to create the ever more detailed arrangements that bouncing enabled. In fact, bouncing arguably came to serve as the creative impetus for most hit acid rock records produced in the late 1960s and early 1970s. To create Jimi Hendrix’s “Are You Experienced?,” for instance, engineers Eddie Kramer and George Chkiantz bounced the track no less than three times. The first take featured four tracks recorded live off the floor, specifically (i) Hendrix’s electric guitar; (ii) Noel Redding’s electric bass; and (iii–iv) a stereo reduction of Mitch Mitchell’s drum part. To clear room for the track’s characterizing “backward elements” (i.e., reverse-tape special effects) and a piano, Kramer and Chkiantz then bounced Hendrix’s guitar track to an open slot on a connected tape machine, and they reduced Redding’s bass and Mitchell’s drums to the second open track, which left tracks three and four open for overdubs. To make space for Hendrix’s lead vocal, a final reduction mix was then prepared. Track one on the third tape was still reserved for Hendrix’s electric guitar, and track two still had the reduction mix of Redding’s electric bass and Mitchell’s drums, but track three reduced the “backward element” and piano tracks, which left track four open for Hendrix’s vocal.3

Figure 2.1  Reduction mixes for the Jimi Hendrix Experience’s “Are You Experienced.”
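The sequence of reductions described above can be traced with a short sketch. It models each 4-track tape as a mapping from track number to the parts that track carries, and a bounce as a plan that says which source tracks are combined onto which destination track. The function name, the labels, and the assumption that the backward elements and piano occupied tracks three and four respectively on the second tape are all illustrative; the text specifies only that both were overdubbed onto the two open tracks. The sketch tracks bookkeeping only and says nothing about the high-frequency loss each bounce incurred.

```python
def bounce(tape, plan):
    """Reduce one 4-track tape onto a fresh 4-track tape.

    plan maps a destination track number to the list of source track
    numbers whose contents are combined onto it. Unlisted destination
    tracks are left open for overdubs.
    """
    new_tape = {n: [] for n in range(1, 5)}
    for dest, sources in plan.items():
        for src in sources:
            new_tape[dest].extend(tape[src])
    return new_tape


# First tape: guitar, bass, and a stereo pair of drums, live off the floor.
tape_1 = {1: ["guitar"], 2: ["bass"], 3: ["drums L"], 4: ["drums R"]}

# First reduction: guitar alone on track 1, bass and drums combined on track 2.
tape_2 = bounce(tape_1, {1: [1], 2: [2, 3, 4]})
tape_2[3].append("backward elements")  # overdub (track assignment assumed)
tape_2[4].append("piano")              # overdub (track assignment assumed)

# Final reduction: overdubs combined on track 3, leaving track 4 for the vocal.
tape_3 = bounce(tape_2, {1: [1], 2: [2], 3: [3, 4]})
tape_3[4].append("lead vocal")
```

Reading `tape_3` from track one to track four gives the final layout the paragraph describes: guitar, the bass-and-drums reduction, the backward elements with piano, and the lead vocal.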



Another group that made copious use of bouncing was the Beatles, of course. By 1967, the band had made bouncing a core component of their tracking and mixing processes. And when simple bouncing would no longer suffice, the band had their producer George Martin sync two 4-track tape machines together so all eight tracks from both machines could be reduced onto open tracks on another 4-track machine, leaving even more tracks available for overdubbing. Psychedelic tracks like “I Am the Walrus” arguably pushed this mixing practice to an absurd and complicated limit: [George] Martin decided that Take 20 had been the best take . . . so a reduction mix was made, bouncing the strings and horns together in order to free up some space. . . . The [laughing] voices [heard in the song’s conclusion] were recorded onto track 3 of Take 25. The orchestra and chorale recordings on Take 25 had to be merged with The Beatles’ instruments and vocal on Take 17. Track 2 of Take 17 had been left free for this purpose. What Martin would ultimately do is manually sync Take 17 with Take 25, and bounce the orchestra and choir on Take 25 down to track 2 of Take 17. All subsequent mixing could then be done without having to sync the two machines. And, if necessary, the bounce could be done in pieces (this appears to have been done: close listening shows that the orchestra is slightly out of sync with the rhythm track on some sections and not in others, likely the result of having been bounced in pieces).4

Figure 2.2  Reduction mixes used to mix the Beatles’ “I Am the Walrus.”



Playlist Title: Understanding Records, Chapter Two: Bouncing A playlist comprised of tracks made by “bouncing down,” beginning with those noted in the section directly above, in the order noted. Listeners should pay attention to the composite image they hear when they listen to the tracks, and consider how that image was slowly pieced together through superimposed musical layers. It can be extremely instructive to listen to the Beatles’ “I Am the Walrus,” for instance, with an eye on the flowchart of bounces noted in Figure 2.2.

The structure of mixing: Past-tense auditory narratives

As complicated a procedure as bouncing could be, mixing has arguably grown exponentially more complicated since the time of 4-track tape. Modern recordists now deploy a number of specialized tools and techniques to fuse the component tracks in their multitrack productions into coherent three-dimensional shapes. I examine many of these in what follows, but readers should be aware that mountains of printed research have been published, and too many tutorial videos to count have come online, each dedicated to explaining one engineer or another’s personal approach to mixing. Even someone who has devoted decades of their life to studying the craft will inevitably come up short, then, and miss a few of these techniques, when asked to describe the process comprehensively. Moreover, a formalized “best-practice” protocol for mixing remains notably absent from even the most parochial of recording textbooks, and thankfully so! I would argue that no such protocol should exist. Mixing remains stubbornly entrepreneurial, and firmly individualized. And every mix decision should ultimately be contingent on, if not wholly determined by, the unique aesthetic needs of each particular project. This all said, a structure indeed underlies mixing. Whenever they mix a record—regardless of what tools they use, how they use them, what market values guide their work, and so on—recordists shape an auditory perspective on sound(s) more than just sound(s) per se. In fact, you don’t actually hear sounds like a kick drum, a synth bass, electric guitars, singing, and so on, when you listen to a record. What you hear is a single sound, produced by speakers and/or headphones, which only sounds like a kick drum, a synth bass, an electric guitar, and singing. When you listen to Led Zeppelin’s “When the Levee Breaks,” for instance, you don’t actually hear drums, bass, electric guitars, blues harp, and plaintive vocal wailing.
Likewise when you hear, say, Deepchild’s classic “Neukoln Burning,” or Julian Bream’s iconic rendition of Heitor Villa-Lobos’s
“Prelude No. 3,” you don’t hear anything but a single sound, produced by speakers and/or headphones, which was carefully designed to trick your auditory apparatus into believing it detects the presence of, say, a kick drum, a synth bass, slowed vocal samples or sequences run up and down the neck of a classical guitar at breakneck speed. What you hear when you listen to these records—what you hear when you listen to any record—is a single sound that only sounds like other sounds, that is, a past-tense auditory narrative which relates how a sequence of sounds was once (ideally) heard. And “mixing” is simply the name we give to the act of composing such narratives. It is important to keep in mind that every listener is situated the same by the auditory narrative a mix construes, regardless of where or how they listen to it. There is simply no getting past mixes, just as there is no getting around screens when we watch movies. Listeners must hear the Sex Pistols perform from a front-row-and-center perspective when they listen to “Anarchy in the UK,” to name an arbitrary example. They must hear Johnny Rotten sing at the front-and-center with the drum kit behind him, while double-tracked electric guitars play to the left and right, and the bass sounds in the center, somewhere in between. And you cannot change this. You can walk to the left or right of your speakers, or right between and through them, and everything in the recording stays stubbornly put wherever it is mixed to be. You might even put on a pair of headphones and run two city blocks, but you will never get any closer to or further from, say, the snare drum. This can only be so if the record presents a past-tense auditory (not acoustic) narrative, if it presents a past-tense representation of how some sounds were once ideally heard, and not the sounds themselves. So why does this matter? Why should anyone care that we habitually mistake representations of sounds for the sounds they represent? 
What does it matter that recorded musical communications are comprised of auditory rather than acoustic information? I would argue that when we overlook this basic fact, we mistake the subject of recorded musical communications—the sounds and performances recordists represent when they make their records—for communication itself. In the process, we erase the artistry and craft of Recording Practice altogether from our hearing. This has always struck me as something like studying fashion photography but only in terms of how models pose, or, perhaps more cynically, like admiring the performance practice of Daffy Duck. At the very least, it is an inaccurate depiction of recorded musical communication which overlooks the artistry of mixing completely.

Mixing (The Space of Communications)


The soundbox

To grasp the parameters of a mix, that is, to understand what comprises the auditory narrative a mix construes, I consider it most helpful to use an analytic tool that engineers developed decades ago to explain the craft to beginners. I learned this tool, for instance, in my very first recording class at Berklee College of Music, more than twenty years ago now. Recently, Alan Moore and Ruth Dockwray named this pedagogical tool a “soundbox”:

When a stereophonic track is heard through headphones or over loudspeakers, the image of a virtual performance is created in the mind. This virtual performance, which exists exclusively on the record, can be conceptualized in terms of the “sound-box” . . . a four-dimensional virtual space within which sounds can be located through: lateral placement within the stereo field; foreground and background placement due to volume and distortion; height according to sound vibration frequency; and time.5

This is not merely an academic construct. In fact, the soundbox circulated as an instructive concept in engineering circles long before any academic ever sought to name it. Accordingly, it may be better to hear from a working engineer on the matter. I asked Alex Chuck Krotz to explain how he conceives of the space of a mix, or how he might describe “the soundbox” to interested readers. I would argue that Alex is more than qualified to speak in this regard. He is an audio engineer with an enviable list of credits, which includes work on records by the likes of Drake, Three Days Grace, Big Wreck, Billy Talent, Shawn Mendes, and numerous other household names. As Krotz explained:

Mixing is the shaping of sound to craft a sonic landscape that’s best for a song. It’s very similar to a photograph of someone against a horizon landscape. There is the sky, the ground, the subject of focus front and center, and other smaller elements such as trees that surround and support the subject. Mixing is taking all the elements of the song, and making them fit into one beautiful landscape. There are many different ways to mix. Even with different approaches, the intent of creating a sonic landscape that is interesting to the listener remains the same. Let’s use the photograph analogy to highlight different mix components. In a mix there is a lead vocal; this can be compared to the subject of the picture. The person that is front and center, is like the voice you want to hear loud and clear, so the audience can easily understand what is being sung. All other elements in the mix are support. Drums, guitars and background vocals, are all used to fill out the ground, sky, and trees of the picture. These elements need to be placed around the subject so the listener can distinctly hear each element. To do so, width, depth and height of the landscape need to be created. To create width, panning is used to move the elements left and right, spreading them out, so everything can be heard and nothing is hidden behind the vocal. Depth can be compared to the trees in this picture. Some appear to be closer, while others appear further in the distance. Volume and effects (such as reverb and delay) are used to help create that sense of depth. The more reverb and/or quieter volume, makes things feel further in the background, while less effects and/or louder volume, brings things more in the forefront. Think about someone in a cave. The closer they are, the less reverb is heard from the cave and the louder the direct sound. However, the further inside the cave they get, the direct sound lowers in volume and the reverb becomes more prominent. These are the same natural concepts used in mixing to create the illusion of depth and width between two speakers. The element of height in a mix is a simple concept, however, it tends to be less obvious.

Alex will explain more about “height” in the subsection headed “Vertical plane” below. For now, I should simply note that the “landscape” (or soundbox) which Alex describes above affords burgeoning mix engineers a means to conceptually objectify the aural perspective each mix construes. This, in turn, helps them begin to grasp the “anchor points” and other spatial formulae which structure modern recorded musical communications.

Soundbox: Basic anatomy

There are three “parts” of a soundbox, at least as I teach it to beginners. I call these “parts” the horizontal plane, proximity plane, and vertical plane. I consider each of these, in turn, directly below. And I explain some of the most common techniques recordists use to shape and refine each “part” along the way. I examine signal processing, which is how a soundbox is ultimately shaped, in much greater detail in the second half of this chapter. What immediately follows is a broad portrait of mixing, drawn as recordists aesthetically conceive it. It is not a technical manual. Mixing isn’t just signal processing, after all. It is, rather, signal processing for a very specific purpose, namely, to shape a soundbox. Thus, what follows directly below is meant to help readers learn to hear that soundbox, which they must do before they can even begin to understand how it is shaped.

Figure 2.3  A “soundbox” diagram, from David Gibson’s The Art of Mixing (2005).

Horizontal plane

The horizontal plane runs side-to-side across a stereo mix. As is evident from listening to any record released in the last few decades, for any market, a number of spatial patterns recur in modern recorded musical communications. Most mixes are based on a simple “anchor point” formula, in fact. Kick drum, bass, “lead” instruments like vocals and soloing synths or guitars, and snare (and/or clap in electronic genres) tend to run directly up the center of a mix. Accompaniment tracks almost always wind up situated along the periphery, and somewhere between the front and the back; lead tracks, especially lead vocals, are almost always mixed to sound front-and-center. Recordists are aware that listeners expect “anchor” tracks to be centered, so when they position them elsewhere in the mix, they do so to achieve some psychoacoustic or aesthetic effect. They should, however, be very careful in so doing. One of the main reasons that anchor points exist is technical; specifically, anchor points ensure that mixes don’t create overly broad speaker excursions, or jagged laneways on vinyl, which can damage playback equipment.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 10. Anchor Points
A phrase sounds twice, with anchor points set firmly: kick, snare, and bass directly in the middle. When the phrase repeats, however, the anchor points are not set; the kick is sent hard left, and the snare and bass sent hard right. Then the phrase repeats with anchor points set “correctly” once more.

Playlist Title: Understanding Records, Chapter Two: Anchor Points
A playlist comprised of original stereo mixes for the Beatles’ classic 1968 “white” album, alternating with the recent “2018 mix” versions of each track. Listeners will thus hear “Dear Prudence” in its originally mixed form, which had the bass panned hard left, guitar hard right, and the vocals and drums centered; then they will hear the “2018 mix” version, which has guitars spread evenly across the stereo spectrum, bass moved to center (if slightly to the right of center), and the vocals and drums in places similar to in the original stereo mix. The “2018 mix” versions almost unanimously “correct” the original stereo mixes in terms of reinstating their anchor points. Listeners are encouraged to note the most striking differences between the original stereo mixes and the 2018 mixes that they hear.

Mono’s demise: Stereo-switching

The horizontal plane, and the mixing techniques which construct it, is a relatively recent addition to Recording Practice. Into the 1960s, in fact, long after stereophonic technology came to market, stereo mixing remained a largely neglected craft, especially in British recording studios. Records were made primarily for mono (monaural) reproduction, that is, for transduction via sound systems with only one channel regardless of the number of loudspeakers used. Thus, any positioning along the horizontal plane was done in tracking, through microphone placement. Mixing was still, for all intents and purposes, leveling the relative volumes of tracks, the majority of its aesthetic determinations being still very much holistically ingrained within the tracking process. Things might well have continued this way to this day, had it not been for the rise of stereophonic reproduction in the late 1960s. In fact, despite the bitter protests of many mono enthusiasts, by the end of the 1960s stereophonic reproduction had usurped monaural reproduction as the new paradigm in pop. Recording Practice became, suddenly, stereophonic in nature. The abruptness of this transition cannot be over-emphasized, in my opinion. Stereo technology presented a massive rupture in the way that recorded musical communications were made and heard, and recordists had little choice but to accommodate its peculiar dual-hemisphere demands. In 1967, for instance, when George Martin and the Beatles completed work on Sgt. Pepper’s Lonely Hearts Club Band, stereo was still so peripheral that neither the band nor their producer felt it necessary to attend stereo mixing sessions for the album. It was left to engineers Chris Thomas and Geoff Emerick to complete the stereo master for that record (Chris Thomas, incidentally, would go on to mix Pink Floyd’s Dark Side of the Moon and produce the Sex Pistols’ Never Mind the Bollocks). However, only two years later, in 1969, when it came time to release Abbey Road, a mono mix was never even attempted. Once all that mattered to the band and its producer, mono had simply become irrelevant after only two short years.

Stereo mixes made during the mid-1960s document the abrupt and awkward transition from mono to stereo which occurred at the time. When stereo recording first became widespread, recordists had only a three-way switch available on their mixing consoles for moving tracks along the horizontal plane. Thus they could only assign tracks to the left channel, to the right channel, or to both combined in the center. And, in fact, this “stereo-switching” is apparent everywhere on the Billboard charts throughout the mid- and late 1960s. While these tracks may sound only historically quaint now, they are extremely significant in terms of historical record, insofar as they audibly encapsulate the very moment in time when mixing became an aesthetic practice in its own right, something recordists do after tracking, using mixing consoles and signal processors.

As usual, it was the Beatles who made the most daring use of stereo-switching initially. Throughout the stereo mix for “A Day in the Life,” for instance, the final track on Sgt. Pepper’s Lonely Hearts Club Band, John Lennon’s and Paul McCartney’s lead-vocal tracks are switched to every possible position along the horizontal plane. During the first two strophes, Lennon’s vocals are switched to the right extreme.
During the third strophe, however, Lennon’s vocal track is switched to the center and, then, to the left; astute listeners may notice that, as Lennon’s vocal track reaches center along the horizontal plane, it dips in volume. Meanwhile, during the bridge, McCartney’s vocals are switched to the right. For the final strophe, Lennon’s vocals are switched to the leftmost position along the horizontal plane.
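The three-way switching described above can be sketched in a few lines of Python. This is a hypothetical illustration, not a model of any particular console: the names `SWITCH_POSITIONS` and `switch_mix` are invented here, tracks are plain lists of sample values, and the center position is assumed simply to send a track to both channels at full level.

```python
# Hypothetical three-position "stereo switch": each track is routed to the
# left channel, the right channel, or both (center). Gains are assumptions.
SWITCH_POSITIONS = {
    "L": (1.0, 0.0),  # left channel only
    "C": (1.0, 1.0),  # both channels, assumed at full level
    "R": (0.0, 1.0),  # right channel only
}


def switch_mix(tracks):
    """Sum (samples, position) pairs into a (left, right) pair of lists."""
    length = max((len(samples) for samples, _ in tracks), default=0)
    left = [0.0] * length
    right = [0.0] * length
    for samples, position in tracks:
        left_gain, right_gain = SWITCH_POSITIONS[position]
        for i, sample in enumerate(samples):
            left[i] += sample * left_gain
            right[i] += sample * right_gain
    return left, right


# Usage: one track switched right, one switched left, one in the center.
left, right = switch_mix([
    ([0.5, 0.5], "R"),
    ([1.0, 2.0], "L"),
    ([0.25, 0.25], "C"),
])
```

With only three table entries to choose from, there is no way to place a track partway between the center and either extreme, which is the limitation the switched mixes of the period make audible.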

Panning and masking

It has been decades since recordists situated tracks along the horizontal plane with the flick of a switch. Modern recordists now use so-called “pan pots,” that is, panoramic potentiometers, usually located directly above the volume faders on hardware and software mixing consoles. Pan pots split audio signals into left and right channels, each equipped with its own discrete gain (volume) control.


Figure 2.4  The pan pots on my Toft ATB-08 mixing board.

Signal passes through both channels at an equal volume while the pan pot points directly north. Twisting the pot to the left or to the right, however, attenuates the input signal as it passes through the opposite channel. Twisting a pan pot to the left, for instance, attenuates the input signal in the right channel, making it sound like the panned track is gradually moving to the left side of the stereo plane, while twisting a pan pot to the right does the opposite. Thus, tracks seem to move in the direction that recordists point the pan pots on a mixer, even though recordists actually attenuate those tracks on the opposite side of the horizontal plane.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 11. Panning
To demonstrate this concept, I have panned the lead-vocal part on a sample from a track I am currently producing from hard left to hard right and back, again and again, over the course of the sample.
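The behavior of a pan pot as described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the circuit of any real mixer: the function names are hypothetical, the pan position is assumed to run from -1.0 (hard left) through 0.0 (directly north) to +1.0 (hard right), and the attenuation is assumed to be linear. A second function shows a constant-power (sine/cosine) law, a common alternative, purely for comparison.

```python
import math


def pan_gains_attenuate_opposite(pan):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    At center both channels pass the signal at equal (full) level; turning
    toward one side only attenuates the opposite channel.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie between -1.0 and 1.0")
    left_gain = 1.0 - max(pan, 0.0)   # reduced only when panning right
    right_gain = 1.0 + min(pan, 0.0)  # reduced only when panning left
    return left_gain, right_gain


def pan_gains_constant_power(pan):
    """Alternative law: the squared gains always sum to 1."""
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # 0 (hard left) to pi/2 (hard right)
    return math.cos(angle), math.sin(angle)


def pan_sample(sample, pan, law=pan_gains_attenuate_opposite):
    """Split one mono sample into a (left, right) pair."""
    left_gain, right_gain = law(pan)
    return sample * left_gain, sample * right_gain


# Usage: a pot turned halfway to the left halves the right channel only.
print(pan_sample(1.0, -0.5))
```

Under the first law, nothing is ever added to the side a track "moves" toward; the impression of motion comes entirely from what is taken away on the other side.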

Recordists pan tracks for any number of reasons. Sometimes dynamic panning is done to add motion and interest to a dull mix. The slowly panning hi-hat tracks which introduce “Elephant Stone (re-mix),” track number three on Silvertone Records’ twentieth-anniversary reissue of the Stone Roses’ eponymous debut, provide a particularly clear example of this. The drummer Reni’s hi-hat tracks slowly pan back and forth across the horizontal plane during the first minute of the track, for no apparent reason, except to add some ear-catching excitement to an otherwise stagnant mix. Other times, tracks are panned to direct listeners to certain musical events in a mix that might otherwise pass by unnoticed. In these cases, soloing instruments are usually panned to the front-and-center, only to pan back to the horizontal periphery where they resume their accompaniment role once lead vocals re-enter.

This all said, the most common use for panning is simply to combat “masking.” Co-located signals—that is, signals distributed to the same position along the horizontal plane of a mix—always risk masking (obfuscating) one another, especially when they share similar spectral profiles. A number of instruments involved in a typical pop production, for instance, present overlapping frequencies at various spectral regions. When they perform simultaneously, and in the same horizontal regions, they compete to be heard. For example, hi-hats often mask snare hits; rhythm guitars usually mask lead vocals; the bass will mask the kick drum; and so on. While some recordists have managed to use masking to their advantage—think of Phil Spector’s notorious “Wall of Sound” production aesthetic (the Ronettes’ “Be My Baby” should suffice to illustrate it)—masking that obscures a part is usually anathema to modern recordists, something to be avoided whatever the cost.
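The two conditions for masking named above, co-location along the horizontal plane and overlapping spectral profiles, can be expressed as a rough check in Python. This is only a sketch under loud assumptions: real masking is a psychoacoustic phenomenon that a pair of numbers cannot capture, the function name and the `pan_tolerance` parameter are hypothetical, and the frequency ranges in the usage example are made-up placeholders for illustration, not measurements of any instrument.

```python
from itertools import combinations


def masking_candidates(tracks, pan_tolerance=0.1):
    """Flag pairs of tracks that sit close together and overlap in frequency.

    `tracks` maps a track name to (pan, low_hz, high_hz), where pan runs
    from -1.0 (hard left) to +1.0 (hard right).
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(tracks.items(), 2):
        pan_a, low_a, high_a = a
        pan_b, low_b, high_b = b
        co_located = abs(pan_a - pan_b) <= pan_tolerance
        overlap_hz = min(high_a, high_b) - max(low_a, low_b)
        if co_located and overlap_hz > 0:
            flagged.append((name_a, name_b))
    return flagged


# Usage with placeholder values (illustrative only, not measured data).
centered = {
    "kick": (0.0, 40.0, 120.0),
    "bass": (0.0, 50.0, 250.0),
    "hi-hat": (0.6, 3000.0, 12000.0),
}
print(masking_candidates(centered))  # [('kick', 'bass')]

# Moving the bass away from the kick removes the flag in this toy model.
spread = dict(centered, bass=(0.5, 50.0, 250.0))
print(masking_candidates(spread))  # []
```

The second call shows, in miniature, why panning is the first remedy for masking: changing only the position of one track removes the conflict without touching its spectral content.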

Proximity plane

The proximity plane runs from the front-and-center to the auditory horizon at the very back of a mix. Recordists, who also call this plane the “fore-aft” of a mix, use a number of techniques, often in tandem, to situate tracks along it. That said, proximity remains an entirely relational acoustic construct: tracks only sound closer and farther relative to other tracks in a mix. To push a track back in a mix, recordists attenuate its volume and its high-frequency content relative to the volume and equalization of tracks situated nearer to the fore. “Perspective is all about contrast,” Paul White (2009) explains, “so while sounds can be pushed back by making them less bright and more reverberant, they must be balanced by brighter, drier sounds at the front of the mix.” Roey Izhaki concurs:

All depth considerations are relative. We never talk about how many meters away [a sound is]—we talk about in front or behind another instrument. The [proximity plane] of a mix starts with the closest sound and ends at the farthest sound. The depth field is an outsider when it comes to our standard mixing objectives. We rarely talk about a balanced depth field since we do not seek to have our instruments equally spaced in this domain. Coherent depth field is a much more likely objective. . . . A classical concert with the musicians walking back and forth around the stage would be chaotic. Likewise, depth variations are uncommon, and usually instruments move back and forth in the mix as a creative effect, or in order to promote importance (just as a trumpet player would walk to the front of the stage during his solo).6
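The basic move described in this section, pushing a track back by lowering its level and dulling its high-frequency content relative to other tracks, can be sketched in Python. This is a minimal illustration and not any recordist's actual signal chain: the function name `push_back`, the default values of -6 dB and 4,000 Hz, and the choice of a simple one-pole low-pass filter are all assumptions made here, and reverberation is left out entirely.

```python
import math


def push_back(samples, sample_rate, gain_db=-6.0, cutoff_hz=4000.0):
    """Return a quieter, duller copy of `samples` (a list of floats).

    gain_db lowers the overall level; a one-pole low-pass filter with the
    given cutoff attenuates high-frequency content.
    """
    if sample_rate <= 0 or cutoff_hz <= 0:
        raise ValueError("sample_rate and cutoff_hz must be positive")
    gain = 10.0 ** (gain_db / 20.0)  # decibels to linear amplitude
    # Smoothing coefficient for y[n] = y[n-1] + a * (x[n] - y[n-1]).
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    output = []
    y = 0.0
    for x in samples:
        y += a * (x - y)
        output.append(gain * y)
    return output


# Usage: a rapidly alternating (bright) signal loses far more level than
# a steady (low-frequency) one, on top of the shared 6 dB reduction.
bright = [1.0 if i % 2 == 0 else -1.0 for i in range(400)]
steady = [1.0] * 400
print(max(abs(v) for v in push_back(bright, 48000)[-50:]))
print(push_back(steady, 48000)[-1])
```

Because proximity is relational, the processed track only reads as "farther back" when it is heard against brighter, louder tracks left at the fore.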

Auditory horizon

All proximity plane motion occurs along a front-to-back (fore-aft) continuum. The furthest distance away which a track can be pushed before it slips out of earshot is called the “auditory horizon.” Behind the auditory horizon of a mix is silence. If a track fades in, for instance, it begins its trek toward the mix from behind the auditory horizon, that is, from a distance too far away to be heard by the aural perspective a mix construes. The amount of silence before the horizon is breached—and, then, before the track achieves full audibility—represents a certain distance the recorded performance must travel to be heard. If a track fades out, on the other hand, it ends its trek past the auditory horizon, beyond earshot. Fading in and fading out have become particularly popular devices in modern pop. In lieu of an introduction or some final cadence, recordists slowly push and pull tracks ever closer and farther along the proximity plane of a mix until they emerge into, or slip beyond, its earshot. In fact, the approach that recordists take to fading can play a crucial, if often unremarked, role in defining their style on record. Fading is extremely common in progressive and album-oriented rock, for instance. Bands like Pink Floyd, ELP, and, more recently, Explosions in the Sky constantly fade tracks, pushing and pulling the component tracks on their records into, and out of, earshot. Meanwhile, garage rock bands like the Stooges, the Strokes, Wolfmother, the Hives, and the White Stripes—and Top 40 acts like Fergie, Katy Perry, and Justin Bieber—rarely fade tracks, if at all.
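A fade-in or fade-out of the kind described above amounts to a gain envelope that rises from, or falls to, silence. The following Python sketch assumes a simple linear ramp measured in samples; the function names are hypothetical, and real fades are often curved rather than linear.

```python
def fade_gains(num_samples, fade_in=0, fade_out=0):
    """Return one gain value per sample.

    The first `fade_in` samples ramp up from silence (0.0) and the last
    `fade_out` samples ramp back down to silence; everything in between
    stays at full level (1.0).
    """
    gains = [1.0] * num_samples
    for i in range(min(fade_in, num_samples)):
        gains[i] *= i / fade_in                     # emerge into earshot
    for i in range(min(fade_out, num_samples)):
        gains[num_samples - 1 - i] *= i / fade_out  # slip out of earshot
    return gains


def apply_fade(samples, fade_in=0, fade_out=0):
    """Scale each sample by the matching gain from fade_gains."""
    gains = fade_gains(len(samples), fade_in, fade_out)
    return [s * g for s, g in zip(samples, gains)]


# Usage: ten samples with four-sample fades at either end.
print(fade_gains(10, fade_in=4, fade_out=4))
# [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0]
```

In the terms used above, a gain of 0.0 stands for a position behind the auditory horizon, and the length of the ramp stands for the distance the track travels before it is fully audible.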

Balancing depth

Recordists typically follow conventions established in live performance when they balance tracks along the proximity plane of a mix. As I have already noted, lead tracks tend to occupy the front-and-center of a modern pop mix, though it is the kick drum, and electric bass, which usually occupy the fore alongside vocals in reggae, dub, hip-hop, and most electronic dance music productions. At the very back of a pop mix are the drum kit and, sometimes, synth pads and percussion; the kick drum and snare, however, are often pushed ahead of the rest of the drum kit, occupying roughly the same stratum as the rhythm guitar. In fact, nowadays, many pop productions feature the kick drum at the front-and-center, competing with the vocals for the lead position in the mix.

To further elaborate on the notion of depth in mixing, I asked my good friend Alastair Sims to provide some commentary on how he imagines, and achieves, the depth dimension. Here is what he told me:

The idea of depth in mixing is something that is discussed frequently. I work almost entirely in the box and most of the time with virtual instruments and digital sources. Don’t get tricked into thinking “depth” is only possible with analogue gear. When talked about between engineers, it seems to have this mythical status. But in practice I find it much simpler, and view it more as a mindset or thought process you have to go through. For me, the key is figuring out the proper balance for a project. When I say balance I mean volume and frequency balance. Simply put there are 3 places something can be placed in a mix in the foreground (think vocals), middle ground (think rhythm section) and background (think ambiance, percussion, and other chordal elements). For 90% of modern music, the vocal will be in the foreground, with the rhythm section (drums, bass, and sometimes guitar) just behind creating the foundation. The remaining chordal, ambient and percussive elements fill in the spaces and musical gaps. Using this as a template where you have different “layers” of sonic texture is for me how I create depth. These 3 layers primarily are volume based. The loudest element is in the foreground and quietest is in the background.
But there are two other methods for putting an instrument or mix element into one of these positions. Frequency is an important one. Something with more high frequency naturally seems to push its way to the front of a mix, whilst something with more low mids and lows can fall more into the background. Lastly the use of effects such as reverb and delay will not only add depth to a sound but will also pull it back a little bit in the “picture” of the mix. This process can even begin before mixing. When producing or building a track having in mind that there should always be these three main layers is essential. There may not always be a lead vocal in the project you’re working on, so what are you going to have in the foreground? Or better yet what is most important and what will you be showcasing? Perhaps you’re working on a

project with only voice and acoustic guitar, there are a number of techniques to build up depth. You have the vocal in the foreground, the guitar in the middle, and perhaps reverbs and delay as the layer gluing everything together and stretching into the background. Alternatively, if you are not only mixing it, but recording and producing it you could add background vocals to fill in space in the background. There are endless ways to create depth, but the key for creating it is remembering that you can’t have everything fighting for attention in the foreground. You’ll have to pick where things sit in the soundscape and that means some parts you love are going to be pushed back into the background. At first this can be disheartening if you have built an interesting sound, or written a great little melody. But remember that the overall purpose of a mix is to create one sonic “picture” to support the main musical or lyrical idea. You do this by having a middle ground and background that leave space and prop up the elements in the foreground.

Vocal priority

Regardless of where they situate tracks along the proximity plane, recordists “layer depth”—they position tracks along the proximity plane—to prioritize the components of a multitrack production. Recordists balance the “depth dimension” of a mix to establish a pecking order between tracks, in other words. Usually it is the vocals, or a soloing lead instrument, which gets prioritized in this way. Bob Dylan’s Time Out of Mind provides an interesting, and ever-evolving, approach to the problem of vocal-track proximity. Over the course of the record’s eleven tracks, producer Daniel Lanois presents a remarkably varied set of proximity planes; the mix for each track offers a totally different conception of where the component instruments in a typical folk-rock arrangement should be situated, at least in terms of their proximity to the fore. “Love Sick,” for instance, which opens Time Out of Mind, positions Dylan’s lead-vocal track far in front of its accompaniment, even as the next track on the album, “Dirt Road Blues,” buries his vocals further back in the mix. “Not Dark Yet,” “Standing in the Doorway,” and “Make You Feel My Love” feature standard folk-rock proximity planes, with vocals far in front and pads far behind, while “Cold Irons Bound” and “Can’t Wait” position Dylan’s vocals somewhere between front-and-center and buried. This all said, an even more varied approach to vocal proximity can be heard on a number of Led Zeppelin records released between 1970 and 1975.

Vocal priority as marketing

In fact, vocal priority is a market-driven phenomenon more than anything else. I remember the first time I realized this very clearly. During my first year as a student at Berklee College of Music, in Boston, I met some phenomenal musicians and producers. It was a stroke of sheer luck that my roommate, Rohin Khemani, a good buddy from high school, was such a “hot shot” drummer, because he was always in demand for session work, and this brought him into contact with most of the serious students at the college. I owe Rohin a lot, actually. He threw together the first band I had at Berklee, while I was holed up in our room practicing and writing all day, and he took a very active role in getting us gigs. Back then, in the early 1990s, you had to make a demonstration recording to hand around to the people who booked clubs and cafes if you wanted to play there, and once we thought we were ready, we opted to record the demo in a friend’s living room rather than waste money on studio time for what was basically going to be a musical business card. The recording was incredible, in my opinion. Indeed, what a future talent pool Rohin gathered there! Rohin himself would go on to record and tour the world with Red Baraat; Ludvig Girdland, our violinist and my very good friend, would record with the likes of the Eels and ELO, and perform on soundtracks for movies starring Robert Redford, Morgan Freeman, Jack Nicholson, Diane Keaton, and Keanu Reeves; and Billy Mohler, who played bass, would produce and record with the likes of the Calling, Macy Gray, Limp Bizkit, Lee “Scratch” Perry, Wayne Shorter, Awolnation, and Lady Gaga, among others.7 You’re probably wondering the same thing I’ve been wondering since that day: What was I doing in that band? As I said, everything sounded great in the living room while we were recording. But we had very little experience mixing a record, and we couldn’t figure out why the track sounded off when we would listen back to various takes. I remember all of us gathered around the little Mackie mixing board, which was connected to an ADAT machine, arguing over where the violins should be panned, how high the snare should be in the mix, whether my vocals needed another take, when Billy finally spoke up. “We’re mixing it like a jazz record!” he said. “It needs to be pop.” What he meant was that we had to give my vocals top priority in the mix. Slowly we pumped them up by a dB or two, and then another, and then another, until, finally, the mix sounded like the pop record it was supposed to be, and not a jazz-fusion jam session.


The demo we made that day got us some gigs, and we wound up being a very busy and somewhat popular band for a while in Boston. That said, the only copy of that demo I know to exist now sits on an office shelf here at Western University, behind me as I write this. To be honest, I couldn’t say for sure if anybody has ever actually heard that demo. As I said, we did well as a band and played hundreds of gigs over the years, but our sound evolved very quickly from what we captured that day. On that session, though, my bandmates gave me an education in mixing I may never have learned anywhere else, namely, that vocal proximity can play the key role in marketing a track to different listenerships. As I tell my students whenever I can: there’s no such thing as a bad session—every session is an opportunity to grow!

Playlist Title: Understanding Records, Chapter Two: Vocal Priority
A playlist to demonstrate the wide variance that can be found in “vocal priority” on a number of recent rock and folk mixes. Listeners are encouraged to compare and contrast the vocal priority on each mix, and to compare and contrast the aesthetic effect of those varying priorities.

Figure 2.5 The cover for the cassette version of the Jay Hodgson Group’s first audio recording. I’m still amazed at the talent of the three other musicians in that photo. From left, they are Jay Hodgson, Rohin Khemani, Billy Mohler and Ludvig Girdland.

Buried vocals

As with every convention, exceptions to the vocals-at-the-fore convention abound. Loud and aggressive musical genres—like, say, thrash, black metal, speedcore, and so on—usually bury lead vocals behind accompaniment tracks, while maintaining their coherence in the mix through deft horizontal imaging and signal processing, to reinforce the perceived loudness of the productions (check out Kylesa’s “Don’t Look Back” for a great example of this effect). Similarly, recordists will often convey the impression that a track has suddenly increased its volume by pumping accompaniment tracks to a level almost above the lead vocals in a mix; this pushes the vocals back along the proximity plane precisely as it pulls accompaniment tracks closer to the fore (compare the level of the electric guitar in the verse versus the chorus of the Strokes’ “Juicebox”). Again, though, it is extremely rare that lead-vocal tracks are ever buried to the point that they become inaudible. Recordists may also bury a lead-vocal track to buttress some broader aesthetic or thematic effect contained within a song. The Cure’s “Secrets,” for instance, the third cut on the band’s Seventeen Seconds LP, features an extremely quiet, if not whispered, lead-vocal track, albeit doubled an octave above. The distant placement for Robert Smith’s vocals deftly reinforces the song’s thematic content: the song is ostensibly about unexpectedly meeting a long-forgotten lover at a party, and Smith’s hushed delivery neatly encapsulates the emotional turmoil the singer struggles to keep secret. Album producer Mike Hedges would push this technique to an extreme limit only two tracks later on the same album, by opting to bury Smith’s lead vocals on “Three” so far back in the mix that they remain, for all intents and purposes, completely inaudible. In fact, when I first heard the track, I swapped out my headphones, believing that a wire had come loose. Some bands bury lead vocals as a matter of style.
Records by Led Zeppelin, the Stone Roses, the Jesus and Mary Chain, and the Strokes, among many others, provide clear illustrations of this. On numerous tracks by these bands, the lead vocals sit much further back along the proximity plane than is the norm in the rock genre. Of all of the records I might mention here, though, it is the Strokes’ Room on Fire which, once again, provides the most extreme example, though some tracks on the Stone Roses’ Second Coming give that record a run for its money in this respect. The proximal position which producer Gordon Raphael determines for Julian Casablancas’s vocals throughout Room on Fire can only be described as “buried.” This said, Raphael deftly applies equalization and distortion to augment the vocal tracks with enough high-frequency edge that they remain coherent throughout.

Playlist Title: Understanding Records, Chapter Two: Buried Vocals
A playlist featuring a number of tracks with “buried” vocals. Listeners are encouraged to consider the aesthetic effect produced by “burying” the vocals in these tracks. Does it make sections, or the track in general, sound “louder” or “more dynamic”? Does it cause them to “lean in,” as it were, that is, pay attention and try to hear what the vocalist is singing? Does it cause them to ignore the lyrics and concentrate more on delivery and tone? There are no accidents in mixing. Why might recordists choose to bury such a crucial part of these performances under its accompaniment?

Vertical plane

The vertical plane covers the top-to-bottom of a mix. Though the concept of a vertical plane may seem straightforward—sounds are located side-to-side and front-to-back in a mix, so why shouldn’t they also be situated top-to-bottom?—the notion that sound possesses any kind of vertical nature remains scientifically controversial. There is no definitive measure of height for sound, after all; and one sound is not objectively taller than another, even if it emanates, or seems to emanate, from a higher elevation and its waveform is, indeed, larger in mass. In fact, scientists who study the neurological basis of psychoacoustic phenomena maintain that, when it comes to hearing, vertical connotations are entirely relative and, thus, subjective. Daniel J. Levitin summarizes the scientific objection:

The terms [i.e., high and low] are culturally relative. The Greeks talked about sounds in the opposite way because the string instruments they built tended to be oriented vertically. Shorter strings or pipe organ tubes had their tops closer to the ground, so these were called the low notes (as in low to the ground) and the longer strings and tubes—reaching up toward Zeus and Apollo—were called the high notes. . . . Some writers have argued that high and low are intuitive labels, noting that what we call high-pitched sounds come from birds (who are up in trees or in the sky) and what we call low-pitched sounds often come from large, close-to-the-ground mammals such as bears or the low sounds of an earthquake. But this is not convincing. Low sounds also come from up high (think of thunder) and high sounds can come from down low (crickets and squirrels, leaves crushed under foot).8

Despite Levitin’s caveats, published accounts of the mixing process, especially those written by working recordists, universally posit a “vertical plane.” And practical experience of making, and listening to, records bears out the existence of a top-to-bottom continuum in every mix. I’ve always wondered, in fact, whether the high vertical placement of tweeters, as well as the low placement of woofers, on consumer speaker systems has played the key role in this conditioning. There is also the fact that low tones tend to disperse more widely than do higher tones, which means that the higher tones are more likely to seem to project from a stable sound source while the lower tones more readily reflect off room furnishings. In any event, whether it is a product of cultural or neurological conditioning, or a practical reality which simply confounds our current scientific understanding of hearing, the vertical plane plays a crucial role in modern mixing. To demonstrate the practical reality of the vertical plane, I asked my good friend Alex Chuck Krotz to provide some commentary on (i) how he conceives of the vertical nature of sound when he mixes; and (ii) what are some things he commonly does to shape the height of his mixes. I have already mentioned Alex’s impressive credits in an earlier section of this chapter subtitled “The soundbox.” Here is what he had to say: The element of height in a mix is a simple concept; however, it tends to be less than obvious. Since there are generally only two speakers and we hear through two ears, the illusion of vertical height within a mix needs to be created through the manipulation of frequencies. The upper frequencies (commonly referred to as “top end”) make an element sound like it is higher in the vertical spectrum of the mix. The lower frequencies (commonly referred to as “bottom end”) makes an element feel lower in the vertical spectrum of the sound field. 
The frequencies in the middle (the “mids”) will help keep the element in the middle, along the center horizon. Some elements or parts in a song naturally lend themselves to specific areas in the sonic space. Kick drums and bass guitars are generally more low-end, so they would be “the ground” of the mix landscape. There is only so much top end available in a kick drum to move it up in height, so it will naturally maintain that foundation. Then elements like guitars or keyboards in higher registers that naturally lend themselves to fill out the top of the mix, make up “the sky.” Then the main elements of the mix, such as lead vocals or main rhythm guitars, will have a good amount of top, bottom and mids, which makes them feel centered in the mix, occupying the horizon line of the landscape. Most elements will be starting somewhere around the horizon line of the mix, and can be manipulated vertically further up and down with EQ shaping to place them.


Frequency/height Sound achieves height in a mix through relational—what I have elsewhere called mirrored—processing. As is evident from even just Alex Chuck Krotz’s brief commentary above, when they discuss the vertical plane, or the height, of a mix, recordists actually discuss the practice of “frequency balancing.” Within the context of each particular mix, tracks with abundant high-frequency content simply seem to occupy a position along the vertical plane located over and above tracks with a duller equalization. An electric bass that lacks energy above, say, 2 kHz, for instance, sounds like it emanates from an elevation under a synth pad with a high-pass filter set to shelve everything below 2 kHz—even if both tracks seem to emanate from similar elevations when they sound in isolation. Again, the vertical plane is an entirely relational construct which accrues through tandem (relative) equalizations. However they situate sound along the vertical plane, the concept of “balance” guides recordists as they do so. Recordists may opt to stunt the vertical plane of a mix, filtering the high-frequency content from tracks, as is the norm in genres like trip hop and ambient dub (Massive Attack’s Mezzanine (1998) provides an obvious example of this approach). More often than not, though, recordists fashion a broad, and broadly expansive, vertical plane when they mix their records. That is, they shape a vertical plane which encompasses a broad array of frequencies across the audible spectrum. While almost any Top 40 production will demonstrate the sound of a broadly expansive vertical plane, Shania Twain’s “Up” and Warren Zevon’s “The Werewolves of London” have always struck me as particularly clear examples of expansively vertical mixes.

Signal processing Now that I have surveyed some of the more important spatial and aesthetic considerations recordists address when they mix their records, I can turn my attention to the most common signal-processing tools and techniques they use to do so. Signal processing is modifying an audio signal by inserting a “signal processor” (that is, a device designed to modify audio signals in particular ways) into the audio chain and manipulating its adjustable parameters. This practice plays a crucial role in modern Recording Practice. All manner of harmony, rhythm, and melody are conveyed to listeners in a processed state on record now, and recordists are expected to be as proficient at signal processing as they are at tracking. It should come as no surprise, then, that recordists have become extroverted in their use of signal processing. Whereas signal processing was once used to preserve the realism of a recorded performance, a majority of pop records now feature “non-veridic” applications, which is to say, they do not sound as though they were, or even like they could be, performed in a live setting.9 Of course, what sounds veridic and non-veridic varies from time to time, and from place to place. A veridic punk rock production, for instance, will sound totally different than, say, a veridic techno track. Moreover, what sounded non-veridic decades ago can epitomize veridic production values today. What ultimately matters, for the purposes of this field guide at least, is that recordists continue to make a fundamental distinction between what they do in the studio and on stage and that they deploy a variety of signal-processing techniques to do so.10 In the following section, I examine the most common signal-processing techniques that recordists routinely use to craft their recorded musical communications, mostly as they inhere in the mixing process. I examine many of the same processes in the next chapter, and a few more, but more specifically as they relate to the mastering process. I have organized this section into a series of discrete entries, each devoted to one of seven different signal-processing techniques, namely (i) equalization, (ii) dynamics, (iii) distortion, (iv) feedback, (v) delay, (vi) modulation, and (vii) reverb. What follows is only an introduction to signal processing in modern Recording Practice. Signal processing is an immense, and immensely complicated, musical craft which I can only very briefly survey within the confines of this field guide.
Readers who familiarize themselves with the signal-processing techniques surveyed here can nevertheless expect to hear signal processing at work everywhere in modern music.

Equalization Equalizers (EQs) adjust the amplitude of audio signals at particular frequencies. They are, in other words, frequency-specific volume knobs. Recordists originally used equalizers to compensate for the distorted frequency response of early microphones. If a microphone exaggerated, say, the upper-midrange (1 to 7 kHz) of a sound source, recordists inserted an equalizer into the audio chain and attenuated the signal in that range, adjusting the equalizer until the recorded sound matched, or was equalized with, the sound source.


Recordists use equalizers to manipulate component frequencies within the “input spectrum,” that is, recordists use equalizers to adjust the volume of those particular frequencies, which, combined, comprise the total “spectral content” (frequency content) of a recorded sound. “All musical instruments create sounds with multiple frequencies, called complex waveforms, even if only playing one note,” writes David Franz. “Depending on the particular note played, an instrument will produce the specific pitch for that note (known as the fundamental pitch) plus additional pitches with lower amplitudes (known as harmonics and overtones) that color the overall sound.”11 Ultimately then, through equalizing, recordists adjust the volume of the component frequencies in a sound source and, thus, its timbre. There are numerous EQs on the market. However, recordists chiefly use five basic kinds: (i) parametric, (ii) semi-parametric, (iii) graphic, (iv) peak, and (v) program. Though each of these equalizers has its fairer features, parametric and graphic EQs nevertheless remain most common. Parametric equalizers allow recordists to determine every parameter of equalization, including (i) “band,” that is, where in the frequency spectrum to equalize; (ii) “Q-value,” that is, how much of the frequency spectrum to equalize; and (iii) “amount,” that is, the amount of equalization, or the size of the boost or cut (attenuation) to apply. A graphic equalizer, on the other hand, features only a sequence of fixed bands, and the Q-value of each remains predetermined. Thus, users of graphic EQs choose only the amount of boost or cut to apply to each fixed band and only as the predetermined Q-value prescribes. For example, the width of each band on a 10-band graphic EQ is one octave, meaning that the input spectrum can be adjusted only in one-octave swaths, while the width of each band on a 31-band graphic EQ is one-third of an octave.
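The band arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the bands are centered on a 1 kHz reference, and the spacing uses exact powers of two. Hardware units print rounded nominal labels (31.5 Hz, 63 Hz, and so on), so the computed values will differ slightly from what appears on a faceplate.

```python
def band_centers(bands_per_octave, first_step, last_step, ref_hz=1000.0):
    """Center frequencies spaced 1/bands_per_octave of an octave apart.

    Steps are counted from the (assumed) 1 kHz reference band.
    """
    return [ref_hz * 2 ** (n / bands_per_octave)
            for n in range(first_step, last_step + 1)]


def band_edges(center_hz, bands_per_octave):
    """Lower and upper edge of a band that is 1/bands_per_octave octave wide."""
    half_width = 2 ** (1 / (2 * bands_per_octave))
    return center_hz / half_width, center_hz * half_width


# Ten one-octave bands: five below 1 kHz, 1 kHz itself, four above.
ten_band = band_centers(1, -5, 4)

# Thirty-one third-octave bands: roughly 20 Hz to 20 kHz.
thirty_one_band = band_centers(3, -17, 13)

if __name__ == "__main__":
    print([round(f, 1) for f in ten_band])
    print(len(thirty_one_band), "third-octave bands")
    print(band_edges(1000.0, 1))
```

Each octave band spans a 2:1 frequency ratio from edge to edge, while each third-octave band spans a ratio of roughly 1.26:1, which is why the 31-band unit offers finer control over the same spectrum.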

Depth equalization Aside from “hyping” tracks—that is, equalizing tracks to produce unorthodox musical timbres—recordists use a number of equalization techniques. For instance, recordists commonly use equalization to increase the “depth” of a mix. Human ears tend to interpret duller sounds, specifically, sound sources which lack upper-midrange and high-frequency content, as less proximate than sounds with a plethora of such components (sounds with more high-frequency content also tend to sound louder, which only further increases their perceived proximity).

Figure 2.6  Close-up of the pots on my API 550b EQs, which are in my 500 series API lunchbox. I used these EQs to master Graze, Graze (NK 42), which was nominated for a Juno Award in the category of Best Electronic Recording in Canada in 2014.

Recordists regularly use this quirk of human hearing to their advantage. Introducing equalization boosts and cuts at select frequencies across a mix, to emphasize the high-frequency content of some tracks and the bass and midrange of other tracks, recordists increase the general impression of depth which that mix creates. The long crescendoing sample that sounds from 0:08 to 0:37 of Eric Prydz’s “Call on Me” provides an instructive example. Comprised entirely of a well-known vocal phrase sampled from Steve Winwood’s “Valerie,” the sample initially sounds with all of its midrange and high-frequency content muted. Over the course of a twenty-nine second fade-in, however, Prydz broadens the spectral profile of the sample until its full frequency content finally sounds at 0:35, after which point Prydz triggers another sample from the Winwood original—“I’m the same boy I used to beeee”—and “Call on Me” begins in earnest. Prydz repeats this technique without any accompaniment just over one minute later, at 1:49 into “Call on Me,” and again with full accompaniment at 2:31. In all three cases, as Prydz broadens the input spectrum of the Winwood sample, he also increases its proximity, and thus the general impression of depth which the mix creates.

Mirrored equalization Another common equalization technique in mixing is the so-called “mirrored” equalization. This practice entails using EQs to give tracks with similar frequency components complementary “spectral contours” (frequency profiles). Recordists wedge equalization boosts and cuts into a mix at select frequency ranges where instruments would otherwise overlap and “mask” each other. For instance, bass guitar and kick drum usually fight for the same frequency space on rock records; both instruments present overlapping frequency ranges and thus compete for audible prominence in those ranges. Only making matters more difficult, the kick drum and bass guitar are usually paired in a pop production, meaning that they are typically arranged to sound at roughly the same times. Thus, recordists routinely apply equalization boosts on the kick drum in roughly the same regions they cut on the bass guitar, and vice versa, that is, they mirror their equalizations for both instruments. When applied to tracks “panned” to either side of the stereo plane in a mix, mirrored equalization can enhance the perceived width of a mix. For instance, recordists working in rock often equalize electric guitar parts differently, and, when appropriate, they pan the tracks to opposite sides of the stereo spectrum (if they work with an experienced band with two guitarists, both players should already mirror their equalizations to some degree). The distinct spectral contours that both electric guitars subsequently take prompt listeners to localize a different sound source for each, located in the extreme left and right periphery of the mix, respectively. Applying a very brief delay of, say, 8 to 45 ms to only one of the panned tracks can further enhance the impression of width this technique creates, but I’ll look more closely at this technique in the section headed “The Haas trick” below.

Figure 2.7  Channel EQs on my Toft ATB08 mixing board, the middle two dialled to roughly “mirrored” settings.

Mirrored equalization is characteristic of a number of rock productions. However, perhaps the most obvious examples are heard throughout the Strokes’ Is This It?. Every track on the album features mirrored equalization at least once. The most obvious example is “Last Night,” track seven on Is This It?, which features an abrupt transition from mono to stereo imaging on the electric guitars after exactly ten seconds. In fact, throughout the record, Albert Hammond Jr.’s and Nick Valensi’s guitar parts are mirrored and panned either moments before or precisely when Julian Casablancas’s vocals enter, a mixing move which wedges more space into the center of the mix for Casablancas’s vocals to fill. This technique is clearly audible at eleven seconds into track six, “Alone Together”; eleven seconds into “Is This It?”; fourteen seconds into “The Modern Age”; seven seconds into “Soma”; ten seconds into “Last Night”; and sixteen seconds into “New York City Cops.” Playlist Title: Understanding Records, Chapter Two: Mirrored EQ A playlist comprised of tracks noted in the section directly above, in the order they are noted, and others that feature clear examples of mirrored equalization applied to “hard panned” electric guitars.

Hi-pass filters To further combat “masking,” the first thing I do when I start a mix is copy and paste a hi-pass filter, that is, a single-band EQ which mutes everything below a selectable “corner” frequency, onto every track in my session file save the kick and bass elements. I then choose a corner frequency for each track low enough in the frequency spectrum that it doesn’t disfigure the track, extracting so much bass content that it sounds obviously filtered, but high enough that the kick and bass elements punch through the mix unmasked. This simple practice is the most important thing a recordist can do to ensure that their mixes retain their low-end clarity. Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 12. Hi-pass/Low-pass A brief demonstration of the hi-pass/low-pass principle. The same excerpt is played three times. The first time, the track sounds in an untouched state. The second time, a hi-pass filter is applied to the vocals, “cornered” at 1 kHz. The third excerpt features a low-pass filter applied to the vocals, “cornered” at the same frequency (1 kHz).
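To make the hi-pass principle concrete, here is a minimal Python sketch of a first-order filter. It is an assumption-laden illustration, not a description of any particular plug-in: the function names are hypothetical, the slope is a gentle 6 dB per octave (most channel filters are steeper), and the 44.1 kHz sample rate is simply a common default. Content below the corner is progressively attenuated rather than removed outright.

```python
import math


def high_pass(samples, corner_hz, sample_rate=44100):
    """First-order (6 dB/octave) high-pass filter, RC-style difference equation."""
    rc = 1.0 / (2.0 * math.pi * corner_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # The output follows changes in the input and slowly forgets steady levels.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def sine(freq_hz, n=44100, sample_rate=44100):
    """One second (by default) of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    # Same 1 kHz corner as the audio demonstration described above.
    print("50 Hz after filtering:", round(rms(high_pass(sine(50), 1000)), 3))
    print("10 kHz after filtering:", round(rms(high_pass(sine(10000), 1000)), 3))
```

Run with a 1 kHz corner, the 50 Hz tone comes out at a small fraction of its original level while the 10 kHz tone passes almost untouched, which is the behavior a recordist relies on when clearing low-end space for the kick and bass.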

Video Title: Alex Chuck Krotz Explains EQ. One of Canada’s top engineers, with credits on records by the likes of Drake, Three Days Grace, Big Wreck, Billy Talent, and Shawn Mendes, Alex Chuck Krotz talks about how he conceives of EQ as a creative tool, which EQs he tends to use for which particular applications, and some of the more common equalization moves he makes when mixing, in this video which he prepared exclusively for this book.

Dynamics While equalization reshapes the amplitude of an audio signal at select frequencies, “dynamics processors” react to, and reshape, the amplitude of an audio signal across the frequency spectrum. They are, in other words, automatic volume knobs.12 Most often, users select a threshold, expressed in decibels, which serves as an automatic trigger for the device. When the amplitude of the input signal registers either above or below that threshold, the dynamics processor either amplifies or attenuates the input signal by a select amount, usually called the “ratio” or “range,” depending on the device used and how it is set. Recordists can furthermore specify how quickly, and for how long, they want the processor to work after it has been triggered, by setting unique “attack” and “release” times, respectively. The most common dynamics-processing techniques used today are compression/limiting and gating. I will now examine these techniques in order.

Compressors/Limiters “A compressor and limiter both operate on the same principle,” according to Bill Gibson: The only difference between these dynamic processors is the degree to which they affect the signal that exceeds the threshold. Whereas a compressor functions as a gentle volume control—which, normally, should be transparent and seamless to the audio process—a limiter is more extreme, radically decreasing the level of the signal that passes above the threshold. Each of these tools has a distinct and important purpose for the creation of professional sounding audio. . . . When used [correctly] a high-quality compressor/limiter is transparent throughout the majority of the audio material, while performing important level control during peak amplitude sections.13

Only making matters more difficult for the inexperienced recordist, “there is nothing built in to human hearing that makes it particularly sensitive to the sonic signatures of compression,” as Alexander Case (2007: 161–62) notes: There is no important event in nature that requires any assessment of whether or not the amplitude of the signal heard has been slightly manipulated by a device. Identifying the audible traits of compression is a fully learned, intellectual process that audio engineers must master. It is more difficult than learning to ride a bicycle. It is possible that most other people, including even musicians and avid music fans, will never notice, nor need to notice, the sonic fingerprint of compression. [Recordists] have this challenge all to themselves.

The craft of compression is comprised of five basic adjustments: (i) threshold, (ii) ratio, (iii) attack, (iv) release, and (v) knee. As noted, compressors feature a threshold that users set to a particular decibel value; the compressor triggers (turns on) whenever the amplitude of the input signal exceeds that value. Once triggered, the compressor attenuates the input signal according to a selected attenuation ratio. If the input signal registers, say, 6 dB over the selected threshold and the attenuation ratio is set for 3:1, the resulting compressed signal will rise only 2 dB above the threshold. Set for a ratio of 3:1, the compressor allows an output of only 1 dB above the threshold per every 3 dB above the threshold the processor registers. A ratio setting of 10:1 or higher on a compressor results in “limiting,” that is, it limits the dynamic range of an audio signal such that it never exceeds a selected decibel value. This said, most recordists use a dedicated limiter, and not a compressor set for a ratio of 10:1, to achieve this effect nowadays (I discuss limiting, and specifically “brickwall” limiting, in greater detail in the next chapter of this field guide). In general, it is enough to note here that compressors compress the distance between a waveform’s peaks and valleys, which is to say, compressors compress the dynamic range of audio signal, while limiters limit the peak amplitude of audio signal to an absolute decibel value.14
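The threshold-and-ratio arithmetic above reduces to a single line of math. The following Python sketch assumes a “hard knee” (the ratio applies in full the moment the threshold is crossed) and ignores attack and release entirely; the function names and the -20 dB threshold are hypothetical choices made for illustration.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Output level of a hard-knee compressor for a steady input level."""
    if input_db <= threshold_db:
        return input_db  # below the threshold the signal passes unchanged
    overshoot = input_db - threshold_db
    return threshold_db + overshoot / ratio


def gain_reduction_db(input_db, threshold_db, ratio):
    """How many decibels the compressor takes away (0 when it is idle)."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)


if __name__ == "__main__":
    threshold = -20.0          # hypothetical threshold setting
    loud = threshold + 6.0     # a signal 6 dB over the threshold
    for ratio in (3.0, 10.0):
        out = compressed_level_db(loud, threshold, ratio)
        print(f"{ratio:g}:1 -> {out - threshold:.1f} dB over the threshold")
```

At 3:1, the 6 dB overshoot becomes 2 dB, matching the example in the text. At 10:1, the same overshoot shrinks to 0.6 dB, which shows why high ratios begin to behave like limiting.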

Pumping Compression is easiest to hear when it is creatively misused. The most common, and obvious, misuse for compression produces what recordists call “pumping,” that is, audible unnatural level changes associated primarily with the release of a compressor. While there is no single correct way to produce pumping, the effect usually accrues when recordists fail to assign a long enough release time to adequately handle a decaying signal. The resulting sound usually features a slight “pump” in volume, as the compressor stops attenuating and the signal returns to its pre-compressed level, even while the sound itself remains in its decay phase. In fact, pumping is now a common technique in rock, hip-hop, and electronic dance music. House, techno, hip-hop, and what was once called Intelligent Dance Music (IDM) all very often feature the technique, as do tracks by modern rock bands like Radiohead and, periodically, the Strokes. Flying Lotus’s 1983, Los Angeles, and Cosmogramma are replete with examples, as are most instrumental records by the late hip-hop producer J-Dilla, though readers may find it easier to hear the technique at work in a Top 40 rock production rather than in experimental IDM or hip-hop. Phil Selway’s drum track for Radiohead’s “Exit Music (For a Film)” provides a celebrated example. From the time they enter at 2:51, Selway’s drums are increasingly compressed to the point of pumping, and pumping is particularly audible on the cymbal-heavy portion of Selway’s performance from 3:20 to 3:36. In fact, the cymbal crash which sounds from 3:26 to 3:28 of “Exit Music (For a Film)” pumps so dramatically, and so obviously, that producer Nigel Godrich may as well have forgone the compressor and simply automated a volume swell on the cymbals. Pumping is also clearly audible on the electro-percussion loop which underpins Radiohead’s “Idioteque,” track eight from the band’s groundbreaking album Kid A, especially in the wild dynamic fluctuations that characterize the snare sample’s sustain-and-release profile after the entrance of the sampled pads at 0:11. Benny Benassi’s “Finger Food,” which opens Rock and Rave, offers another clear example, and pumping is likewise clear on the ride cymbals throughout Portishead’s “Pedestal,” track nine on Dummy.
Playlist Title: Understanding Records, Chapter Two: Pumping A playlist comprised of tracks noted in the section directly above, in the order they are noted, and others that feature clear examples of pumping.
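The role of the release time in pumping can be illustrated with a deliberately simplified model. The Python sketch below works on signal levels in decibels, one value per millisecond, rather than on audio samples, and it assumes an instant attack and an exponential release; every number in it (a 10 ms hit, a tail decaying at 10 dB per second, a -20 dB threshold, a 4:1 ratio) is a hypothetical setting chosen only to make the effect visible.

```python
import math


def output_levels_db(levels_db, threshold_db, ratio, release_s, step_s=0.001):
    """Output level per step for a compressor with instant attack, smoothed release."""
    coeff = math.exp(-step_s / release_s)
    gain_db = 0.0
    out = []
    for level in levels_db:
        over = max(0.0, level - threshold_db)
        target = -over * (1.0 - 1.0 / ratio)  # static gain reduction (0 dB or less)
        if target < gain_db:
            gain_db = target  # clamp down immediately
        else:
            gain_db = target + (gain_db - target) * coeff  # let go gradually
        out.append(level + gain_db)
    return out


def biggest_swell_db(levels_db):
    """Largest rise in level measured from any earlier low point."""
    lowest = levels_db[0]
    swell = 0.0
    for value in levels_db:
        lowest = min(lowest, value)
        swell = max(swell, value - lowest)
    return swell


# A loud 10 ms hit followed by a quieter, slowly decaying tail.
hit = [0.0] * 10
tail = [-25.0 - 10.0 * (i * 0.001) for i in range(1000)]
signal = hit + tail

fast = biggest_swell_db(output_levels_db(signal, -20.0, 4.0, release_s=0.05))
slow = biggest_swell_db(output_levels_db(signal, -20.0, 4.0, release_s=2.0))

if __name__ == "__main__":
    print(f"swell with 50 ms release: {fast:.1f} dB")
    print(f"swell with 2 s release:   {slow:.1f} dB")
```

With the short release, the tail rises by more than 10 dB while the sound itself is still decaying, which is the “pump.” With the long release, the output level only ever falls, so no swell is heard.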

Side-chain pumping A more advanced, if heavy-handed, application of dynamics processing entails use of a compressor’s “side-chain” feature. When it is side-chained, a compressor uses the amplitude envelope (dynamics profile) of another track as a trigger. This feature is not to be confused with a compressor’s side-chain filter (SCF), which allows recordists to high-pass or low-pass a compressor’s detection circuitry. Applied to, say, a synth pad, a compressor which is side-chained to a kick drum playing regular four-on-the-floor quarter notes will compress (attenuate) the synth pad when the amplitude of the kick drum surpasses the threshold setting of the compressor. This produces a sequence of quarter-note volume swells in the synth pad, offset from each iteration of the kick drum by a selected release time. The result is similar to pumping, but side-chain pumping fluctuates according to the amplitude of another track and regardless of the properties of the compressed track. Side-chain pumping has found a special place in the hearts of recordists working in “electronic” genres lately, likely due to how effective the process can be in generating musical excitement compared with how easy it is to produce in the modern workstation. Eric Prydz’s “Call on Me” is largely credited with having popularized the technique, even if Daft Punk’s “One More Time” could easily make a claim for that honor. “Call on Me” features obvious side-chain compression on the synth pads whenever the kick drum sounds (i.e., 0:00–0:25, 0:37–0:50, 0:52–1:03, 1:08–1:19, 1:30–1:45, and 2:16–2:42) and a conspicuous absence of side-chain compression whenever the kick drum is tacet (i.e., 0:26–0:37, 0:51–0:52, 1:03–1:08, 1:19–1:30, 1:46–2:16, and 2:43–3:00). The side-chain pumping effect is clearly audible during the first fifteen seconds of the track, when only the kick drum and a synth pad sound in tandem, but it is also abundantly clear in subsequent sections, especially from 0:39 to 1:50. Another clear example of side-chain pumping can be heard on Madonna’s “Get Together,” track two on Confessions on a Dance Floor. During the song’s thirty-eight second introduction, the amplitude envelope of the synth pads remains constant, making the side-chain pumping that sputters in on those pads with the entrance of the kick drum at 0:38 all the more conspicuous. The synth arpeggios which sound between 1:28 and 1:55 of Benny Benassi’s “My Body (featuring Mia J.)” are also instructive.
From 1:28 to 1:42, the kick drum sounds the obligatory “four-on-the-floor” pulses and the side-chained compressor on the synth follows dutifully along; between 1:42 and 1:55, however, the kick is tacet and the synth track repeats completely unattenuated. More recently, Loud Luxury made use of the device on their smash hit “Body,” side-chain pumping obviously sounding in each of the song’s main chorus/drop sections; the device is particularly obvious in the first drop (0:55 in), when the accompaniment fluxes rhythmically in time with the kick even as the main vocals go on untouched. And if readers are still struggling to hear the effect in any of these tracks, I recommend checking out Flying Lotus’s “Tea Leaf Dancers,” which features an entire mix side-chained under the kick.
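The logic of side-chaining can be sketched as follows: the compressor sits on the pad, but its gain is computed from the kick alone. This Python illustration is a simplification under stated assumptions. It works on levels in decibels at sixteenth-note steps, recovers linearly rather than exponentially, and uses hypothetical names and settings throughout.

```python
def sidechain_gains_db(key_db, threshold_db, ratio, recover_db_per_step):
    """Gain (in dB) applied to the compressed track, driven only by the key track."""
    gain = 0.0
    gains = []
    for level in key_db:
        over = max(0.0, level - threshold_db)
        target = -over * (1.0 - 1.0 / ratio)
        if target < gain:
            gain = target  # the key got loud: duck immediately
        else:
            gain = min(target, gain + recover_db_per_step)  # release toward target
        gains.append(gain)
    return gains


KICK, REST = -6.0, -90.0  # hypothetical kick level and near-silence, in dB

# Two beats of four-on-the-floor at sixteenth-note resolution.
four_on_the_floor = [KICK, REST, REST, REST] * 2
gains = sidechain_gains_db(four_on_the_floor, -20.0, 4.0, 3.5)

# A pad that would otherwise sit at a constant -12 dB.
pad_out_db = [-12.0 + g for g in gains]

# With the kick silent, the compressor never triggers and the pad is untouched.
silent_gains = sidechain_gains_db([REST] * 8, -20.0, 4.0, 3.5)

if __name__ == "__main__":
    print(pad_out_db)
```

The pad dips on every kick and swells back over the following steps, whatever the pad itself is doing, and it plays back unattenuated as soon as the kick drops out.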


Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 13. Side-chain Pumping A brief demonstration of side-chain pumping. Listeners will hear eight hits of a kick drum, followed by a pad. Then they will hear the kick drum and pad play together without any processing. Finally, the pad is side-chained under the kick.

Playlist Title: Understanding Records, Chapter Two: Side-chain Pumping A playlist comprised of tracks noted in the section directly above, in the order they are noted.

Side-chain as EQ: Automatic relationships Of course, side-chaining began as a simple technique in mixing to help combat masking. A series of automatic side-chaining relationships are usually established when mix sessions begin, in fact. For instance, the bass element in a track will most likely be side-chained quickly under the kick; the hi-hat will be side-chained under the snare; overhead drum tracks are usually side-chained under kick (and, often, also under the snare); the drum buss and padding elements are side-chained under vocals; backing vocals are side-chained under the lead vocals; and as we’ll see shortly, effect sends are often side-chained under the “dry” tracks that generate them (i.e., a thick reverb is often side-chained under the vocals it processes in modern pop productions). By using these automatic side-chain relationships, recordists use compression as a kind of equalization device, prioritizing tracks across the audible spectrum by altering their dynamics profiles.

Noise gates As compressors attenuate signal above a threshold, noise gates attenuate signal that registers below it. Unlike compressors, however, these gates attenuate signal by a fixed amount, called the range. Recordists chiefly use gates to reduce the input signal to silence at quiet intervals, which can require a range of more than 80 dB of immediate reduction, though the device has plenty of other established uses. Aside from “attack” and “release” settings, gates also typically feature a “hold” setting, which determines the length of time (usually anywhere from zero to three seconds) that the gate remains open once the signal which triggered it has subsided under the threshold. Most gates also feature a “look-ahead” function, which delays the input signal by millionths of a second, so the gate can examine its amplitude and determine a “soft,” that is, a gradual, response to any portions of the input signal which require attenuation.
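A bare-bones version of the gate described above might look like the following Python sketch. It models only the threshold, range, and hold settings, on levels expressed in decibels, one value per step. Attack, release, and look-ahead are deliberately left out, and the function name and the numbers in the example are hypothetical.

```python
def gate(levels_db, threshold_db, range_db, hold_steps=0):
    """Attenuate everything below the threshold by a fixed range, after any hold."""
    out = []
    held = 0
    for level in levels_db:
        if level >= threshold_db:
            held = hold_steps          # signal is loud enough: (re)start the hold
            out.append(level)
        elif held > 0:
            held -= 1                  # still holding: let the quiet signal through
            out.append(level)
        else:
            out.append(level - range_db)  # gate closed: apply the fixed reduction
    return out


if __name__ == "__main__":
    # A note at -10 dB, three steps of hiss at -50 dB, then another note.
    track = [-10.0, -50.0, -50.0, -50.0, -10.0]
    print(gate(track, threshold_db=-40.0, range_db=80.0, hold_steps=1))
```

With a one-step hold, the first step of hiss passes at its original level and the remaining two are pushed down by the full 80 dB range, effectively to silence.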

Figure 2.8  The noise gate from LogicX’s suite of native plug-ins.

Gated reverb Though gates are often used to remove self-noise from the quieter portions of noisy tracks, they also have their creative misuses. One of the best-known misuses for gating is the so-called “gated-reverb” effect, which producer Steve Lillywhite and engineer Hugh Padgham famously used to craft Phil Collins’s notoriously massive drum timbre on Peter Gabriel’s “Intruder.” Recorded in a cavernous barn with room mics distributed at strategic intervals to capture the kit’s ambient reflections in the room, the resulting drum track was compressed first and then gated. Lillywhite and Padgham used compression to ensure that Collins’s drum hits and their ambient reflections sounded dynamically equivalent, the usual amplitude envelope of a Top 40 drum kit in this case replaced by a cavernous, elongated decay and sustain profile, while the gate truncated the release profile of each hit, ensuring an almost instantaneous transition from sound to silence. Lillywhite and Padgham then applied different release settings on the gates for each component drum track, allowing about a quarter note of sustain on the kick drum and only an eighth note of sustain on the snare, for instance. Playlist Title: Understanding Records, Chapter Two: Gated Reverb A playlist comprised of tracks that feature clear examples of gated reverb on the drums.

Multi-latch gating Another celebrated misuse for gating, which I call “multi-latch gating,” can be heard on David Bowie’s “Heroes,” track three on the middle record from his and Brian Eno’s so-called “Berlin Trilogy” (i.e., Low, Heroes, and Lodger). “Heroes” tells the story of a doomed couple, living on either side of the Berlin Wall, whose clandestine meetings Bowie claimed to have observed from the windows of Hansa by the Wall Tonstudio while he, Brian Eno, and producer Tony Visconti recorded there. The track features a number of quirky production techniques which have long since become the stuff of legend for subsequent generations of recordists. Likely the most celebrated production quirk on “Heroes” is the treatment Visconti devised for Bowie’s lead vocal. Visconti tracked Bowie using three different microphones, distributed variously throughout Hansa’s cavernous live room, specifically, at a distance of 9 inches, 20 feet, and 50 feet, respectively. Applying a different gate to each microphone, Visconti ensured that signal made it through to tape only when the amplitude of Bowie’s vocals became sufficiently loud to surpass each gate’s threshold. Moreover, Visconti muted each microphone as the next one in the sequence triggered. Bowie’s performance thus grows in intensity precisely as ever more ambience infuses his delivery until, by the final verse, he has to shout just to be heard. This multi-latch system was, apparently, a common recording technique in the classical genre (apocryphal stories of the process being used to record Glenn Gould, for instance, abound). Visconti is nonetheless typically credited with having invented (or, at the very least, popularized) multi-latch gating in rock, perhaps indicating how little sharing of notes went on between rock and classical recordists in the mid- and late 1970s. In any event, the distanced effect which multi-latching generated on “Heroes” is obviously reinforced by Visconti’s treatment of Bowie’s backing vocals during the final verse and chorus.
Beginning at 4:28, it is the backing line which appears driest and loudest in the mix; the conventional dynamic relationship between lead and backing vocals is, in this case, reversed so Bowie’s lead vocals are buried far back in the mix while his backing vocals swell to the fore. The more Bowie shouts just to be heard, in fact, the further back in the mix Visconti’s multi-latch system pushes his vocal tracks, creating a stark metaphor for the situation of Bowie’s doomed lovers: shouting their love for one another over the Berlin Wall, the more passionate their (undoubtedly fictional) love grew the further apart they must have felt.
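
The latching logic described above can be sketched in a few lines of Python. This is only an illustration of the routing, not a reconstruction of Visconti’s actual setup: the account above gives the microphone distances but not the gate settings, so the threshold values and the names MICS and active_mic are hypothetical.

```python
# Hypothetical gate thresholds in dBFS. The microphone distances come from the
# text; the threshold values are assumptions made purely for illustration.
MICS = [
    ("close (9 in)", -40.0),
    ("mid (20 ft)", -20.0),
    ("far (50 ft)", -10.0),
]


def active_mic(level_db):
    """Return the name of the one microphone that reaches tape, or None.

    Each gate opens once the vocal level exceeds its threshold, and each
    microphone is muted as soon as the next one in the sequence triggers,
    so only the most distant open microphone is heard.
    """
    chosen = None
    for name, threshold_db in MICS:
        if level_db > threshold_db:
            chosen = name  # a more distant mic overrides the nearer one
    return chosen


for level in (-50.0, -30.0, -15.0, -5.0):
    print(level, "->", active_mic(level))
```

The louder the (hypothetical) vocal level, the further away the microphone that captures it, which is the inverse relationship between effort and proximity that the passage describes.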

Envelope following

Just as compressors can side-chain to a different track than the one they attenuate, so too can gates key to a different track than the one they attenuate. Keyed to a vocal track, for instance, a gate on an electric guitar part will only open to allow sound through when the amplitude of the vocal track exceeds its threshold. One of the more obvious and creative uses for keyed gating produces “envelope following,” that is, the deliberate refashioning of one amplitude profile either to match or, at the very least, to closely imitate another. This technique has become particularly popular in modern electronic productions, and it has come to define modern trance records as much as bpm and synth choice do. In fact, so prevalent is envelope following in modern trance that some commentators now refer to the process as “trance gating.” Trance DJs routinely key sustained synth pads to more rhythmically active tracks in order to produce the characteristic syncopated sound of trance gating; the technique can clearly be heard, for instance, on the synth pad which starts fading in at 0:45 of DJ Nexus’s “Journey into Trance.” Trance gates can be produced any number of ways, including via a dedicated trance gate plug-in which combines sequencing capacities with a keyed gate: the synth sequence is keyed, internally, to a rhythmic sequence. As Doug Eisengrein explains:

A still common yet more creative use for [keyed gating] involves a noise gate in conjunction with any form of rhythmic music. Dance-floor tracks such as techno or synth-pop can be enhanced by placing a gate on a synth line or pad and using a sharp repetitive sound such as a hi-hat, clap or conga loop for the key input. Now, each time the relevant percussive element triggers the gate, the synth line will be chopped up, resulting in a rhythmically locked arpeggio of sorts. I can imagine you trance addicts licking your chops now in anticipation of your next studio session. By adding this simple ingredient into the stew, your tracks can churn and bubble along just a little bit funkier.15

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 14. Envelope Following
A brief demonstration of the “trance gating” principle, namely, envelope following. A short percussion part repeats twice, and then it plays along with a sustained sample pad. The pad is then “keyed” to the percussion part, that is, the pad follows the envelope of the percussion track, while the percussion part is no longer routed to output, leaving only the pad to sound in its altered state.
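
For readers who find code easier to follow than signal-flow prose, keyed gating can be sketched as below. This is a minimal illustration under stated assumptions: the envelope follower (instant attack, exponential release), the release and threshold values, and the function names envelope and keyed_gate are all hypothetical choices, and the “tracks” are eight-sample lists rather than audio.

```python
def envelope(samples, release=0.5):
    """Crude peak envelope follower: instant attack, exponential release."""
    env, out = 0.0, []
    for s in samples:
        env = max(abs(s), env * release)
        out.append(env)
    return out


def keyed_gate(signal, key, threshold=0.3):
    """Pass `signal` only while the envelope of `key` exceeds `threshold`.

    The key input decides when the gate opens, but it is never mixed into
    the output, so only the gated signal is heard.
    """
    return [s if e > threshold else 0.0 for s, e in zip(signal, envelope(key))]


pad = [1.0] * 8                                   # sustained pad at constant level
hats = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   # percussive key input
print(keyed_gate(pad, hats))
# prints [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
```

The sustained pad comes out chopped into the rhythm of the percussion part, which is the “trance gate” in miniature.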

Keying for feel and groove

Envelope following is sometimes used to tighten a sloppy performance. In funk and disco, for instance, rhythmic precision was a key ingredient, but the unique way that each musician interpreted the global pulse of a song played an equally crucial role. Nile Rodgers famously used keyed gating on a number of Chic productions to superimpose his rhythmic feel onto tracks performed by other musicians. Rodgers himself recalls:

There was the gimmick of my rhythm playing—which was pretty accurate, pulsing on the money—being used as a trigger for other instruments that weren’t playing nearly as funky. On the very first record that we recorded, “Everybody Dance,” we did it with one of my jazz-musician friends playing clavinet. He was not funky at all. So, when you hear that really cool solo that he plays on the song, it’s actually him just playing whole notes while the rhythm is keyed by my guitar. That was our very first recording, and [producer] Bob Clearmountain taught us how to do that. He said, “Oh man, the keyboard player sucks! Why don’t you play the rhythm Nile, and just let this guy play whole notes. . . .” We used that trick quite successfully later on, the most successful being on the Diana Ross song “Upside Down,” where I keyed the funky rhythm of the strings.16

Another use for this process is to increase the “groove” of rhythm section tracks in a production. In this case, the gate only attenuates the signal by a few decibels, perhaps even just by half of a decibel. With the “range” set, the keyed tracks only pump by half a decibel when the gate opens. When keyed to, say, the snare drum for a song’s chorus, rhythm section tracks sway forward on the two and four, thus buttressing the groove.
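
It may help to see how small a half-decibel “range” is in linear terms. The sketch below assumes the standard amplitude convention (gain = 10^(dB/20)); the helper names db_to_gain and ranged_gate, and the one-sample-per-beat “track,” are hypothetical simplifications for illustration.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def ranged_gate(signal, gate_open, range_db=0.5):
    """Gate that attenuates by only `range_db` when closed (unity when open)."""
    closed_gain = db_to_gain(-range_db)
    return [s if is_open else s * closed_gain
            for s, is_open in zip(signal, gate_open)]


rhythm = [1.0, 1.0, 1.0, 1.0]             # one sample per beat, for illustration
snare_open = [False, True, False, True]   # keyed to a snare on beats two and four
print([round(s, 3) for s in ranged_gate(rhythm, snare_open)])
# prints [0.944, 1.0, 0.944, 1.0]
```

A half-decibel range thus amounts to a swing of roughly 5 or 6 percent in amplitude on the backbeats: enough to be felt as a sway, too little to be heard as gating.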

Keying the kick

Envelope following can also be used to EQ tracks. A common candidate for this treatment in modern productions is the kick drum. Recordists insert a noise gate into the signal chain for a low-frequency oscillator set to generate a signal somewhere in the second or third octave, between roughly 40 and 90 Hz. The gate is then keyed to the kick, and its threshold adjusted, such that the low frequency only sounds in unison with the kick, buttressing its every iteration with added bass components (albeit offset by the attack and hold settings on the keyed gate). Clear demonstrations of this “buttressing” technique can be heard throughout hip-hop in general, but especially on recent releases by FlyLo, Prefuse 73, J Dilla, Samiyam, and Hudson Mohawke.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 15. Keying the Kick
A brief demonstration of “keying the kick.” Eight hits of a kick drum sound. A brief 50 Hz sine wave then sounds, and then it is “keyed” to the kick.
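
The same demonstration can be approximated in code. The sketch below gates a 50 Hz sine wave with an idealized kick pattern; it is an illustration only, and several things in it are assumptions rather than anything specified above: the deliberately low sample rate, the single-sample “kick” impulses, the 40-sample hold time, and the helper names sine and gate_with_hold. It models hold but not attack, so the gate opens instantly.

```python
import math

SAMPLE_RATE = 1000  # Hz; deliberately low to keep the example small (assumption)


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Generate n_samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def gate_with_hold(signal, key, threshold, hold_samples):
    """Open when |key| exceeds threshold, then stay open for hold_samples."""
    out, remaining = [], 0
    for s, k in zip(signal, key):
        if abs(k) > threshold:
            remaining = hold_samples   # (re)start the hold timer on each hit
        if remaining > 0:
            out.append(s)
            remaining -= 1
        else:
            out.append(0.0)
    return out


kick = [0.0] * 200
kick[0] = kick[100] = 1.0              # two idealized kick hits, 100 ms apart
sub = sine(50.0, len(kick))            # the low-frequency oscillator
boosted = gate_with_hold(sub, kick, threshold=0.5, hold_samples=40)
```

Here the 50 Hz tone sounds for 40 ms after each hit and is silent otherwise; in practice it would be mixed underneath the kick track itself.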

Vocoding

Imogen Heap famously demonstrated another creative use for envelope following in her breakout hit from 2005, “Hide and Seek.” Heap’s production for the song is remarkably sparse, given how lushly orchestrated it sounds on first blush. Sung into a Digitech Vocalist Workstation harmonizer, Heap’s lead vocals (which are manually doubled) and a few overdubbed vocalizations provide the song’s only melodic content, while the harmonizer, set in “vocoder mode,” automatically fills in the song’s background harmonies, using the pitches that Heap input on a connected keyboard while she sang. Keyed to Heap’s vocal line, the gate on the harmonizer only opens when Heap sings, and it abruptly attenuates the signal to silence during each pause in her delivery. The effect is strangely unsettling, as though a chorus of synthesizers were singing along with Heap, leaving only bits of reverb and delay to trail off into the periodic silences.

Playlist Title: Understanding Records, Chapter Two: Vocoded Harmonies
The vocoding effect noted above has become standard on many modern pop productions, specifically to buttress lead vocals with vocoded harmonies. This playlist begins with Imogen Heap’s “Hide and Seek,” but is followed by subtler examples which feature vocoded harmonies, such as Zedd’s “The Middle” and tracks from Ben Howard’s most recent release (which often sound as though vocoding has been applied to the vocal reverb bus).

Ducked reverbs and delays

Another common use of keyed gating is to “duck” the amplitude of one signal under another, that is, to make one track quieter whenever another gets louder. To achieve this effect, recordists insert a modified gate (a gate with its “ducking” function engaged) or a dedicated “ducker” into the audio chain of whichever track they want to attenuate. They then key the ducking mechanism to the track they want to make more prominent in the mix. Whenever the amplitude of the keying signal registers above the threshold, the gate attenuates the volume of the ducked signal by a fixed range. Some analysts have noted that ducking closely resembles side-chain pumping. However, most agree that it is attenuation by a specified range which characterizes ducking, rather than the variable attenuation which a side-chained compressor creates.17 While ducking clearly resembles side-chain pumping in procedural terms, the techniques are aesthetically distinct. As I note above, side-chain pumping alters the dynamic contour of tracks and, in the process, transforms pads and ambience into rhythmic upstrokes. Ducking, on the other hand, increases the textural density of tracks and extends their temporal envelopes. Some of the most common candidates for this treatment, however it is achieved, are echoes and reverberations generated by lead-vocal lines. Using whichever ducking mechanism they prefer (and, to be clear, most recordists nowadays use a side-chained compressor), recordists routinely duck reverb and delay lines under the dry vocal tracks they process. Ducked echoes and reverberations thus emerge at an unobtrusive volume whenever vocals are present, but pump to the dynamic fore the second the vocals rest. A clear example of this technique can be heard on Céline Dion’s “The Power of Love.” Produced by Canadian David Foster, “The Power of Love” has all the hallmarks of a Dion track from the early 1990s. Every detail of Foster’s production seems geared to showcasing Dion’s trademark massive vocals, which range from a whisper to straightforward belting numerous times throughout the song. The ducking effect is most obvious during the first verse and chorus, as the lush reverb and delay on Dion’s vocals swell to the fore each time she pauses. The end of the first vocal line in the first chorus (“because I’m your lady”) provides the song’s clearest example, as the echoing delay and thick reverb on Dion’s vocals swell to audibility the very second she stops singing. Interestingly, the ducking effect becomes less pronounced as the song progresses and Foster’s layered arrangement fills in. By the second verse, at 1:23, it sounds like Foster has dispensed with the delay on Dion’s vocals completely, though the reverb remains constant and ducked throughout the song.
The effect is also exceedingly clear on Adele’s “Cold Shoulder.” The introduction to Danger Mouse’s “Lazer Beam,” track three on From Man to Mouse, provides another clear illustration. So, too, do Jay Da Flex’s dubstep remix of the Wu-Tang Clan’s “Deep Space,” MF Doom’s lead vocal on Dabrye’s “Air,” and almost any cut from Mos Def’s The Ecstatic. Ariana Grande’s recent “Raindrops (An Angel Cried)” offers a master class demonstration of the technique. That said, it is FlyLo who once again makes most overt, and experimental, use of the device. The celebrated producer routinely invokes ducked echoes on lead-vocal tracks, in conjunction with erratic stereo panning, to create the texturally dense, if not over-saturated, soundstage which characterizes his most recent productions.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 16. Ducked Reverb and Delays
A brief demonstration of the “ducked reverb/delay” principle. A vocal track sounds, with reverbs and delays left untreated to go “willy nilly” and mask the vocals. The same excerpt then repeats, with the reverbs and delays ducked under the main vocals.
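
The distinction drawn in this section between ducking and side-chain compression (fixed-range versus variable attenuation) can be made concrete with a short sketch. The threshold, range, and ratio values are hypothetical, as are the function names; the compressor is modeled as a simple hard-knee static curve, which is an assumption rather than a description of any particular unit.

```python
def ducker_gain_db(key_db, threshold_db=-30.0, range_db=12.0):
    """Ducker: a fixed attenuation whenever the key is above threshold."""
    return -range_db if key_db > threshold_db else 0.0


def compressor_gain_db(key_db, threshold_db=-30.0, ratio=4.0):
    """Side-chained compressor: attenuation grows with the key's overshoot."""
    over = key_db - threshold_db
    if over <= 0.0:
        return 0.0
    return -over * (1.0 - 1.0 / ratio)


# Gain applied to the reverb/delay return for three vocal (key) levels.
for vocal_db in (-40.0, -20.0, -10.0):
    print(vocal_db, ducker_gain_db(vocal_db), compressor_gain_db(vocal_db))
```

With these settings the ducker pulls the effects return down by the same 12 dB whether the vocal is moderately or very loud, whereas the compressor’s attenuation tracks the vocal level (7.5 dB, then 15 dB). In both cases the return springs back to full level once the vocal falls below threshold.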

Playlist Title: Understanding Records, Chapter Two: Ducked Reverbs/Delays
A playlist comprised of tracks noted in the section directly above, in the order they are noted, and others that feature clear examples of reverbs and delays ducked under lead-vocal tracks.

Video Title: kingmobb surveys lateral Dynamics Processing in His Beat Construction
Matt Shelvock, aka kingmobb, whom we met last chapter through his videos on tempo misalignment and sample curation, discusses here the role lateral dynamics processing plays in his compositional process, specifically with regard to “Lonesome Somedays.”

Distortion

Once it is input into a device, even the most minute change in the waveform of an audio signal qualifies as “distortion.” Though there are many different kinds of distortion, recordists and musicians are mostly concerned about “amplitude distortion,” that is, changes in the amplitude of a signal. When the volume of an input signal is too great for a device to handle, or when it is magnified beyond what the device is equipped to convey, the transient peaks of its waveform are often radically limited, a process recordists call “clipping.” Solid-state and digital technology tend to clip immediately, “hard clipping” the input into a roughly square shape, adding odd-order harmonics to the output, which the human ear tends to interpret as “harsh” or “dissonant.” Tube and analog gear, on the other hand, will respond to overabundances of amplitude gradually, generating a beveled “s-curve,” or soft-clipped shape, which adds “even-order harmonics” to the output that the human ear tends to interpret as “consonant” (see Figure 2.9). In either case, amplitude distortion almost completely transforms the input waveform, and in so doing, significantly alters its spectral content.


Figure 2.9  An undistorted, soft-clipped, and hard-clipped waveform.
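
The two waveform shapes in the figure can be approximated numerically. The sketch below is an illustration under stated assumptions: hard clipping is modeled as a flat ceiling, and soft clipping by a hyperbolic tangent curve, which is a common textbook stand-in chosen here for convenience and not a model of any particular tube circuit. It illustrates waveform shape only and makes no claim about the harmonic content of real gear.

```python
import math


def hard_clip(x, ceiling=1.0):
    """Flatten anything beyond +/- ceiling (abrupt, squared-off peaks)."""
    return max(-ceiling, min(ceiling, x))


def soft_clip(x, ceiling=1.0):
    """Ease toward the ceiling gradually using a tanh curve (rounded peaks)."""
    return ceiling * math.tanh(x / ceiling)


# One cycle of a sine wave driven to twice the level the "device" can pass.
overdriven = [2.0 * math.sin(2.0 * math.pi * n / 16) for n in range(16)]
hard = [hard_clip(s) for s in overdriven]
soft = [soft_clip(s) for s in overdriven]

for dry, h, s in zip(overdriven, hard, soft):
    print(f"{dry:6.2f} {h:6.2f} {s:6.2f}")
```

In the printed columns, the hard-clipped values sit pinned at exactly 1.00 across the top of the cycle, while the soft-clipped values approach the ceiling without ever reaching it, bending well before the point at which the hard clipper first engages.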

Perhaps no genre so enthusiastically embraced distortion as rock ‘n’ roll. The recordists and musicians who invented rock ‘n’ roll championed distortion as a staple timbre, incorporating it into productions with wild abandon during the mid- and late 1950s. It was arguably this predilection, in fact, which most obviously set rock ‘n’ roll apart from tamer competitors during its formative years, though the genre’s suggestive rhythms did little to endear it to conservative listeners. As Ikutaro Kakehashi, a key player in the development of fuzz distortion, remembers:

Nowadays there are various kinds of distortion. But back then, in the 1950s, well, clipping was all we could use. That’s why we [read BOSS] had to go ahead with development without an idea of what was right, without knowing what standard we should be aiming for. . . . The top players of the time sounded pristine. None of the top players used fuzz. . . . That was the cycle that we were caught up in. I wonder if it was Hendrix who pulled us out? When he became popular, fuzz suddenly took off.18

The slash and burn method

Rock recordists have developed an extremely broad palette for distortion. Timbres such as “overdrive,” “fuzz,” “distortion,” and “feedback,” among others, are now integral to numerous sub-genres of popular music. At first, though, rock musicians and recordists had very few tools for creating distortion. Early champions of the timbre like urban blues guitarist Muddy Waters and Brit-rock pioneer Dave Davies had little choice but to slash the speaker cones of their amplifiers to create distortion. Or, they intentionally patched into old amplifiers with malfunctioning circuitry and burnt-out tubes to generate the timbre of distortion, as did bebop trailblazer Charlie Christian and journeyman R&B star Ike Turner. The Kinks’ breakthrough hit from 1964, “You Really Got Me,” provides a clear illustration of this “slash and burn” technique for creating distortion. A crucial volley in the first wave of the British Invasion, “You Really Got Me” generates much of its ballyhooed brutal and anarchic excitement through the distortion which permeates Dave Davies’s power-chord guitar riff. This riff is heard most iconically during the first three seconds of the track. Davies created this distortion by taking a razor blade to the speaker cone of a small 8 or 10 W Elpico amplifier, and he combined the output from that amplifier with signal split to his standby Vox AC30 amplifier; meanwhile, older brother and singer Ray Davies quietly reinforced the riff on a Fender Telecaster while his brother played. As Dave Davies himself remembers:

I was getting really bored with this guitar sound—or lack of an interesting sound—and there was this radio spares shop up the road, and they had a little green amplifier in there next to the radios. It was an Elpico—I’ve got a picture of it on my web site—and I twiddled around with it and didn’t know what to do. I tried taking the wires going to the speaker and putting a jack plug on there and plugging it straight into my AC30. It kind of made a weird noise, but it wasn’t what I was looking for. I started to get really frustrated, and I said, “I know! I’ll fix you!” I got a single-sided Gillette razorblade and cut round the cone . . . from the centre to the edge, so it was all shredded but still on there, still intact. I played and I thought it was amazing, really freaky. I felt like an inventor! We just close-miked that in the studio, and also fed the same speaker output into the AC30, which was kind of noisy but sounded good.19

The overdrive method

Rather than slash and burn their amplifiers, early pioneers of distortion could also simply “overdrive” the gain (volume). This technique was arguably the first distortion to regularly permeate pop records, in fact. Electric guitarists produced overdrive by “driving” (boosting) the gain on their guitars and amplifiers into nonlinear regions, which is to say, well past the point of distortion. Charlie Christian, for instance, whose work with distortion during the late 1930s and early 1940s was at least as visionary as his guitar playing, had only a 15 W Gibson EH-150 amplifier, equipped with a 10-inch speaker cone and a single 6L6 RCA vacuum tube, to project his ES-150 hollow-body electric guitar past the Big Bands and Swing Orchestras he played with.20 Competing with, say, the Benny Goodman Orchestra, whose main goal was to rile listeners to dance, his natural tendency would have been to crank the gain as loud as possible. Unlike other kinds of distortion, overdrive developed within an almost entirely tube-based environment: solid-state technology would not come to market until almost two decades after the likes of Charlie Christian and Les Paul first cranked the volume on their amplifiers.21 Overdrive, then, is usually associated with the sound of tube-based gear cranked, or overdriven, into its nonlinear region. Tube amplifiers gradually squash and compress an input signal as it distorts, creating a soft-clipped waveform. The distortion this gradual compression generates does not usually reach the same amplitude, nor does it sound as dissonant, as the distortion that hard clipping generates. Daniel Thompson explains this nicely:

Clipping occurs when the amplitude of the input signal is too great for the device to handle. The upper portions . . . of the wave are either rounded off or drastically cut by the device, yielding a changed waveform. The exact result of this [is] called harmonic distortion. . . . When solid-state devices distort, this generates a waveform that resembles a square wave. In doing so, it also generates the harmonic components of a square wave, notably odd harmonics. By contrast, tube-based gear, which uses vacuum tubes instead of transistors for internal amplification, as well as analog tape, will start to gently distort before even reaching the so-called “point of distortion,” and subsequently eases into distortion gradually. . . . Soft clipping is heard to be more musical and more pleasant than hard clipping, and is often referred to as sounding “warm,” where hard clipping is said to sound “harsh” or “cold.”22

A celebrated landmark in the early development of distortion can be heard on Les Paul’s overdriven guitar part for “Bugle Call Rag,” which he recorded live with the Nat King Cole Trio at the Los Angeles Philharmonic Auditorium on July 2, 1944. The recording was the first in a series of critically celebrated live albums that Norman Granz produced from the Philharmonic stage throughout the 1940s and early 1950s. Recently released as track three on the compilation album Jazz at the Philharmonic, “Bugle Call Rag” documents a fairly conventional bebop performance from the mid-1940s, replete with the obligatory “head-and-solos” arrangement, orderly turn-taking for the improvisations (even if the sequence of solos seems to have been determined on the spot) and the requisite frantic pace. Entertaining as his performance on the electric guitar is, it is Les Paul’s subtly overdriven tone which proves most compelling. Equipped with one of the guitarist’s hand-wound “super-hot” pickups, the guitar Paul used at this time was already prone to producing amplitude distortion regardless of which particular amplifier it patched into. Thus, every note that Paul picks throughout “Bugle Call Rag” is overlain with a soft veneer of overdrive, though the distortion becomes notably opaque on the syncopated accents Paul adds beginning at 1:33. “Bugle Call Rag” is, admittedly, not the most obvious example of overdrive on record. However, it is one of the earliest examples and, as such, provides an invaluable document of one musician’s efforts to consciously incorporate overdrive into his performance practice. Over time, musicians would become much more extroverted in their use of the effect. One need only sample rock records produced in the mid- and late 1960s by, say, Jimi Hendrix (“Purple Haze”), Cream (“Sunshine of Your Love”), the Beatles (“Helter Skelter,” “Everybody’s Got Something to Hide Except for Me and My Monkey” and “Yer Blues”), Buffalo Springfield (“Mr. Soul”), The Rolling Stones (“Jumpin’ Jack Flash”), the Who (“My Generation”), the Kinks (“All Day and All of the Night”), Led Zeppelin (“Good Times Bad Times”), CSNY (“Ohio”) and Pink Floyd (“One of These Days”) to grasp how quickly the rock world became infatuated with the timbre.

In fact, it was not long until recordists thought to overtly apply the effect to other tracks on a record. One of the first lead-vocal tracks to feature overdrive, for instance, can be heard on the Beatles’ “I Am the Walrus,” arguably the band’s most notorious psychedelic offering from the acid rock era, second only to “Revolution 9.” Pathologically insecure about his singing, John Lennon routinely insisted that producer George Martin “pour ketchup” on any vocal tracks he recorded, that is, Lennon insisted that Martin should process his vocal tracks extensively. Though Martin apparently had deep reservations about doing this, the producer nevertheless seems to have complied with Lennon’s every request.23 Thus on the night of September 5, 1967, in Abbey Road Studio One, Martin and engineer Geoff Emerick felt free to overdrive Lennon’s vocal line for “I Am the Walrus” before it reached tape. The duo overdrove the volume on the REDD.47 preamplifier in the REDD.51 mixing console they used that night, and to increase the likelihood of distortion they even had Lennon sing through a rewired “talkback” microphone on the mixing console.24 The duo then doubled the overdriven track using Ken Townsend’s famous Automatic Double Tracking (ADT) machine (see “Doubling”).25 Though Lennon’s vocals are consequently overdriven throughout “I Am the Walrus,” the distortion is particularly opaque on the words “me” (0:25), “together” (0:25–0:29), “how” (0:30), and “come” (0:44), and throughout the final strophe (2:53–3:16) and extended outro, especially on the singer’s emphatically absurd “jooba-joobas” between 3:23 and 3:34.


Playlist Title: Understanding Records, Chapter Two: Slash, Burn & Overdrive
A playlist comprised of tracks noted in the sections directly above, in the order they are noted, and others that feature clear examples of overdriven guitar and vocal tracks.

The stomp-box method

Aside from arranging tacit sections, the slash and burn method of generating distortion obviously mandated use of distortion throughout the entirety of a production. Slashing the speaker cone on an amplifier, or boosting the volume of an amplifier with a loose connection, ensured that every tone it output bore the sonic imprint of distortion. However, distortion was not always appropriate throughout a song. Thus, recordists were forced to make difficult decisions about whether a production could include distortion at all. Beginning in the early 1960s, however, corporations such as Gibson Electronics brought stand-alone signal-processing units to market which afforded instrumentalists unprecedented control over when, and for how long, distortion should permeate their performances. Affectionately dubbed “stomp-boxes” by users, these processors were (and still are) designed for insertion into the audio chain as an intervening stage before final amplification. Each stomp-box lies flat on the floor at the foot of musicians who, at any point during a performance, can stomp the large button on top to activate its effect (some devices, such as wah-wah and expression pedals, use a “treadle,” that is, a large foot-activated potentiometer). Moreover, each stomp-box usually features a series of dials and switches so recordists can individually adjust each processing parameter to taste. When a stomp-box is disengaged, the input signal is routed directly to output through a so-called “bypass” circuit. Simply activating a stomp-box completely transforms the timbre of electric instruments. The timbral complexity of rock music thus blossomed once stomp-boxes reached market and, within less than a decade, stomp-boxes had reshaped signal processing into a core rock ‘n’ roll performance technique. By 1967, in fact, now-legendary rock guitarists like Jimi Hendrix and Pete Townshend could arrange entire improvisations around a core repertoire of stylistically unique signal-processing techniques, with only the odd bits of straightforward performance practice thrown in for good measure. Hendrix’s iconoclastic rendering of the American national anthem, “The Star Spangled Banner,” at the Woodstock Music & Art Fair in 1969, and his performance of “Wild Thing” at the Monterey Pop Festival two years before remain lasting and widely celebrated early monuments to this development in rock performance practice.

The Maestro Fuzz-Tone FZ-1

The first stomp-box to reach market was Gibson Electronics’ Maestro Fuzz-Tone FZ-1, which made its American debut in spring of 1962. This device provided a distinctive “fuzz” distortion to any guitarist who could afford the device. Though nothing definitive has been written about the origin of the FZ-1 “fuzzbox,” an apocryphal creation myth for the device nonetheless circulates in modern rock criticism and in collecting publications. According to this legend, a Nashville studio musician named Grady Martin accidentally stumbled onto the fuzz sound while he worked a session for Marty Robbins, specifically, to record “Don’t Worry.” While Martin set up to record a 6-string bass part, either a connection in the mixing board came loose or a tube somewhere along the audio chain burnt out; nobody seems to know for sure. Either way, the signal for Martin’s bass quickly degraded and, eventually, distorted completely. Though conventional wisdom said that the resulting fuzz should be erased, engineer Glen Snoddy showcased the timbre by way of a 6-string bass solo during the song’s bridge (1:25 to 1:43). Unorthodox and risky as it was, Snoddy’s gamble paid off: not only did “Don’t Worry” become the first hit record to feature what would soon be a staple timbre in modern country and rock music, but it topped the country charts and made it to third place on the Billboard pop charts in the process. The next year, in 1961, guitarist Billy Strange further popularized the fuzz tone by applying it to his lead-guitar work on Ann-Margret’s first Top 20 hit in the United States, namely, “I Just Don’t Understand.” And he used the timbre again on Bob B. Soxx and the Blue Jeans’ “Zip-a-Dee-Doo-Dah,” which made the American Top 10 in 1962 no doubt thanks in large part to the production work of Phil Spector. Guitarist Big Jim Sullivan used the fuzz timbre for his guitar solos on P. J. Proby’s “Hold Me” and “Together,” using fuzzboxes custom-made by famed electronics whiz Roger Mayer. Then came the Ventures, a prolific American surf rock band who counted themselves among the many listeners happily shocked by these records. So inspired was the band by the fuzz timbre, in fact, that they tasked their friend, session stalwart Red Rhodes, with devising a reliable means of reproducing fuzz for their records. A few months later, Rhodes presented the band with a custom-made fuzzbox which they used, in turn, on their Top 5 novelty number, “2000 Pound Bee.”


Meanwhile, Glen Snoddy continued to engineer country sessions. Many of the musicians Snoddy engineered requested fuzz for their productions, and on other sessions Snoddy himself suggested the timbre. Demand for fuzz among country recordists soon reached the tipping point as records by the likes of Merle Haggard, Buck Owens, Waylon Jennings, Wanda Jackson, and Kay Adams, among many others, showcased the timbre. Snoddy thus set to work devising a reliable means of producing fuzz for sessions and, late in 1961, he hit upon the solution of a stomp-box. Convinced that he had a potential moneymaker on his hands, Snoddy pitched the device to Gibson Electronics, which gladly shared Snoddy’s vision. The result was the production of the first Maestro Fuzz-Tone FZ-1 in the spring of 1962.26

Playlist Title: Understanding Records, Chapter Two: Snoddy Style Fuzz
A playlist comprised of tracks which feature “Snoddy style” fuzz distortion. The Rolling Stones’ “Satisfaction” concludes the playlist.

The fuzzbox goes mainstream

By the time Gibson manufactured the first FZ-1, a market for the fuzz timbre was clearly emerging in select rock and country music communities across North America and England. That said, it would still take a few more years for that market to reach anything like widespread proportions. Despite the initial optimism of Gibson’s marketing and sales personnel, who managed to convince retailers to purchase no fewer than 5,458 units of the FZ-1 in 1962 alone, sales of the FZ-1 fizzled. After its initial marketing push, Gibson managed to ship only three of the fuzzboxes throughout 1963 and 1964.27 Everything changed in 1965, however. That year Keith Richards used the Maestro Fuzz-Tone FZ-1 for the iconic chorus riff on the Rolling Stones’ “(I Can’t Get No) Satisfaction.” When the track topped the charts on both sides of the Atlantic Ocean in August of 1965, sales of the FZ-1 climbed into the thousands. Between August and December of 1965, a span of only four months, every last commercially available FZ-1 was sold; Gibson manufactured, and sold, an additional 3,454 units in January of 1966 alone. By December of 1966, just sixteen months after “(I Can’t Get No) Satisfaction” reached pole position on the charts, Gibson had shipped and sold a staggering 15,540 Maestro Fuzz-Tone FZ-1 fuzzboxes.28

An early artifact of stomp-box processing can be heard at 0:35 of the Rolling Stones’ “(I Can’t Get No) Satisfaction.” Having completed the first verse, for which Richards uses an unprocessed tone, the guitarist somewhat sloppily stomps on the FZ-1 for the return of the chorus riff and a button click clearly penetrates the mix (the sound is similar to the clicking noise made by a retractable ballpoint pen). Another click can be heard at 1:35. This time, however, Richards stomps on the device roughly one second too late and the first two pitches of the chorus riff sound without fuzz. Finally, at 2:33, Richards prematurely stomps the FZ-1, adding fuzz to the last pitch of the song’s ramp section. Recognizing his mistake, the guitarist immediately stomps the fuzzbox off and a button click once more penetrates the mix. Despite its comic quirks, Richards’s performance on “(I Can’t Get No) Satisfaction” clearly established fuzz as a staple timbre in rock. Other guitarists almost immediately followed his lead in adopting fuzzbox processing as a core technique. Jeff Beck, for instance, used a Sola Sound MK I fuzzbox on the opening riff for the Yardbirds’ “Heart Full of Soul,” as did Paul McCartney on his doubled electric bass track for the Beatles’ “Think for Yourself”; and Dave Davies reportedly did the same on the Kinks’ “All Day and All of the Night,” mercifully sparing his amplifiers any further injury. Meanwhile, Jimi Hendrix developed a tone almost wholly defined by fuzz, opting for chiefly American fuzzboxes like the Mosrite Fuzzrite and the Dallas Arbiter Fuzz Face, and countless other units custom-made by the likes of Roger Mayer. And, of course, Jimmy Page made fuzz a staple timbre of heavy metal and hard rock, incorporating distortion into the heavier portions of Led Zeppelin I and Led Zeppelin II by way of, most famously, Sola Sound’s MK II fuzzbox (a Gibson overdrive pedal custom-made once again by Roger Mayer), a Marshall 1959 SLP amplifier and a small 15 W Supro practice amplifier, both cranked to their limits.

By the early 1970s, the fuzzbox market had become crowded. Sola Sound introduced the British public to domestically manufactured fuzzboxes in 1965, marketing its MK I, MK II, and VOX Tone Bender fuzzboxes to great success. The next year, in 1966, Mosrite sold its first Fuzzrite fuzzbox in America and the Dallas Arbiter Group brought to market the notorious Fuzz Face, both of which would quickly become “go-to” devices for the likes of Jimi Hendrix, Eric Clapton, David Gilmour, Carlos Santana, Pete Townshend, and Jimmy Page. Finally, in 1971, Electro-Harmonix released its Big Muff, which defined the sound of distortion in British and North American progressive rock for the next two decades. In fact, the Electro-Harmonix Big Muff continues to figure in guitar rigs for some of the world’s most influential rock bands, including the Smashing Pumpkins, Mudhoney, Mogwai, Sonic Youth, and the White Stripes.



Table 2.1  Some celebrated stomp-box distortion units and their users

Manufacturer | Model | Users
Gibson Electronics | Maestro Fuzz-Tone FZ-1 | Keith Richards
Sola Sound | MK I | Pete Townshend, Jeff Beck, and Jimmy Page
Sola Sound | MK II | Jeff Beck, Jimmy Page, and Eric Clapton
Dallas Arbiter Group | Fuzz Face | Jimi Hendrix, Eric Clapton, Jimmy Page, and Eric Johnson
Shin-ei | Companion FY-2 | Jim Reid, William Reid, and Jonny Greenwood
Univox | Super Fuzz | Pete Townshend, Eric Clapton, and J Mascis
Fender | Fender Blender | Billy Corgan, Kevin Shields, and Bilinda Butcher
Electro-Harmonix | Big Muff | Carlos Santana, David Gilmour, Billy Corgan, James Iha, J Mascis, Jack White, Stuart Braithwaite, James “Munky” Shaffer, Thurston Moore, Lee Ranaldo, Kim Gordon, Steve Turner, Mark Arm, and Kurt Cobain

Re-amping Since the time of FZ-1s, recordists have developed a highly nuanced understanding of the various distortions their amplifiers and processors can be made to generate. This tacit knowledge now forms the basis of an increasingly common processing technique called “re-amping.” When musicians track parts with their processing effects activated—when they track with, say, distortion and chorus effects actively processing the signal—there is little they can do later if the effects conflict with or mask other tracks in the production, or need changing for any other reasons. Consequently, recordists have taken to tracking parts without processing, only to process the signal later through a practice called re-amping.29 This entails sending the dry signal through various amplifiers, stomp-boxes, and outboard gear, and re-tracking the newly processed signal through a combination of close-mic, distance-mic, room-mic, and DI techniques. Now that re-amping is common, recordists are expected to be tremendously knowledgeable about the various timbral modifications available to them through the gear they use. Each particular amplifier, microphone, preamplifier, outboard processor, stomp-box effects pedal, audio interface, and cable is but



another signal-processing device, so far as modern recordists are concerned. Riku Katainen, guitarist for the Finnish metal band Dauntless, clearly demonstrates this expertise in a blog entry for the weekend of November 28 to December 1, 2008, which describes the re-amping procedure he undertook to craft both the rhythm and lead-guitar tones on Dauntless’ Execute the Fact: I started out tone hunting with [a] very basic combination: Tubescreamer + Mesa/Boogie Dual Rectifier + Mesa/Boogie Rectifier 4x12”. .  .  . I removed the Tubescreamer and took ProCo Rat distortion and boosted the amp with it. OMG! It had some “Swedish death metal” flavour, but not as extreme as Boss HM2 would have been. . . . Then the problems started. . . . [A]fter two broken fuses and one faulty power tube, we could not use the Mesa. We had to start from the beginning and find the same sound from a Peavey 5150 II. To our surprise we managed to match it 98 %. .  .  . We were very happy and re-amped all the rhythm guitars through this combination: Little Labs Redeye, ProCo Rat Distortion, Peavey 5150II, Mesa[-]Boogie Rectifier 4x12” Shure SM57, MS Audiotron MultiMix, RME Fireface 800. I wanted to keep it simple, so I recorded the cab with only one microphone. . . . I used the basic on-axis “dustcap edge” placement. . . . After the rhythm guitars were done, I changed the boost pedal to Tubescreamer to make it slightly smoother for lead guitars . . . and I was done.30

Re-amping does not always require such a compartmentalized process, however. Live re-amping, that is, re-amping done during live tracking sessions, played a crucial role in shaping the notoriously lo-fi timbre of Julian Casablancas’s lead vocals on the Strokes’ Is This It? and Room on Fire, for instance. Each track on these records features the same “hyped” vocal timbre, which was the product of a re-amping system that Casablancas himself devised, in conjunction with the band’s producer, Gordon Raphael. When Casablancas recorded demonstration tracks for what would become Is This It?, the vocalist used only an Audio-Technica 4033A condenser microphone, patched through a small 8-inch Peavey practice amplifier, with its bass knob turned all the way down. When it came time to re-record tracks for Is This It?, Casablancas could think of no reason to change this signal chain. It had worked for demonstration recordings, the singer reasoned, so why should he change it just because a major label was now involved? Though he acknowledged the merit in Casablancas’s reasoning, Raphael nonetheless pushed the singer to consider more professional options, most of which included singing vocal overdubs into a Neumann TLM103 large-diaphragm condenser microphone. After what was apparently a difficult and protracted back and forth between the singer and his producer, Casablancas and Raphael eventually reached a happy compromise. As Raphael himself remembers: I would usually work with Julian for an hour just to get the voice tone. Until the final result was achieved he would be extremely suspicious and unhappy, and invariably the final result would have some kind of messiness or not-quite-rightness about it, at which point he would smile and say, “This is great.” So, that was one technique, and then the second technique was something that Julian had discovered on his own at home while making the demos. He liked to sing through his Peavey practice amp, which is about eight inches tall, and I’d mike that with a Neumann TLM103, so he’d still be singing into the Audio-Technica [4033a]—Julian found the Neumann distasteful!—but I’d still be “Neumanning” it in order to get the exact details of what this horrible little amp sounded like. He wanted it shitty, but not too shitty. He would always say things like, “This sound needs to have its tie loosened.”31

Video Title: Dylan Lauzon Talks About Guitar Tone Dylan Lauzon, a professional guitarist and songwriter with, most recently, Nikki’s Wives, explains how he uses distortion to shape his guitar tone, and the role that re-amping plays in the band’s live and recorded output, in a video prepared exclusively for this book.

Specific uses for distortion: Sectional distortion Three more common uses for distortion require attention, namely, “sectional” distortion, “lift” distortion, and “reinforcement” distortion. Of these, sectional distortion is easiest to hear: recordists simply use distortion for only one or two sections of a song to achieve its obvious effect. A definitive feature of grunge records made in the early 1990s, sectional distortion characterizes records by the likes of Nirvana, the Smashing Pumpkins, Alice in Chains, Soundgarden, and countless other bands. Each of these groups crafted a slew of hit productions almost universally divisible into (i) quiet and timbrally subdued verses and (ii) massively distorted, exploding choruses. Some of the best-known productions to feature this device include Nirvana’s “Smells Like Teen Spirit,” “In Bloom,” “Lithium,” and “Lounge Act,” from the band’s breakthrough album Nevermind and “Heart Shaped Box” from In Utero; Alice in Chains’ “Rooster” from Dirt; the Smashing Pumpkins’ “Today,” from Siamese Dream, and “Bullet with Butterfly



Wings” from Mellon Collie and the Infinite Sadness; and Soundgarden’s “Black Hole Sun” from Superunknown.32 Playlist Title: Understanding Records, Chapter Two: Sectional Distortion A playlist comprised of tracks noted in the section above, in the order they are noted.

Specific uses for distortion: Lift distortion “Lift” distortion differs from sectional distortion in a number of ways. Most obviously, lift distortion is typically applied to individual tracks in a multitrack production. Recordists often use lift distortion to emphasize the attack profile of electric or synthesized bass tracks. This affords recordists a rare opportunity to “lift” tracks which might otherwise be buried (overwhelmed) in a mix, without having to apply a straightforward volume boost to those tracks. As Alex Case puts it, “Any single track of a multitrack project fighting to be heard in a crowded mix can achieve distinct audibility through the harmonic lift that comes from at least a little well chosen distortion.”33 Far from a quick fix, however, lift distortion shifts the spectral location of the masking problem, forcing recordists to clear room elsewhere in the mix for the lift itself. Though lift distortion has shaped the Top 40 since at least 1965, when Paul McCartney patched into a Sola Sound MK I fuzzbox to overdub a distorted bass part for the Beatles’ “Think for Yourself,” the device has most recently captured the attention of techno and electronic dance music recordists. The Chemical Brothers’ “Loops of Fury” provides an early example, though the duo applies distortion to the synth-bass track only variably on the record, allowing the track to sound dry at 3:29, for instance, and lifted at 3:46. Benny Benassi’s “Satisfaction” from his electro-house debut, Hypnotica, and Cassius’s “The Sound of Violence” from Au Rêve, use lift distortion throughout; both productions provide ideal demonstrations of the device in that they feature distortion as an integral component of the synth-bass timbres (rather than an aggressively dissonant addition). Since he produced Hypnotica, Benny Benassi seems to have adopted lift distortion as a common technique. 
On his most recent record, Rock ‘n’ Rave, Benassi uses lift distortion on every cut, though it is clearest on “Finger Food,” “My Body (feat. Mia J),” “Who’s Your Daddy (Pump-kin Remix),” “Rock ‘n’ Rave,” “I Am Not Drunk,” “Free Your Mind - On the Floor (feat. Farenheit),” “Come Fly Away (feat. Channing),” and “Eclectic Strings.”
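The "harmonic lift" that Case describes can be illustrated numerically. The sketch below is only a rough model, not a description of any processor named above: it assumes a pure 50 Hz sine wave standing in for a bass track and a tanh soft-clipper standing in for the distortion unit. The function names, the drive amount, and the sample rate are all hypothetical choices made for illustration. It measures how much energy sits at the fundamental and at the third and fifth harmonics before and after distortion.

```python
import math

SAMPLE_RATE = 8000   # assumed sample rate, kept low so the example runs quickly
BASS_HZ = 50         # assumed fundamental of the stand-in bass tone


def tanh_drive(sample, drive):
    """Soft-clip one sample; a higher drive value means heavier distortion."""
    return math.tanh(drive * sample) / math.tanh(drive)


def level_at(signal, freq_hz, sample_rate):
    """Amplitude of a single frequency component (a one-bin Fourier sum)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


# One second of a clean sine wave, then the same wave pushed through the clipper.
bass = [math.sin(2 * math.pi * BASS_HZ * i / SAMPLE_RATE)
        for i in range(SAMPLE_RATE)]
lifted = [tanh_drive(s, 5.0) for s in bass]

for harmonic in (1, 3, 5):
    freq = BASS_HZ * harmonic
    print(freq, "Hz  dry:", round(level_at(bass, freq, SAMPLE_RATE), 3),
          " distorted:", round(level_at(lifted, freq, SAMPLE_RATE), 3))
```

The dry tone has energy only at 50 Hz, whereas the distorted copy also carries energy at 150 Hz and 250 Hz. Those added harmonics are what allow a distorted bass to be heard higher in the spectrum, and they are also why the masking problem moves rather than disappears.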



Specific uses for distortion: Reinforcement distortion According to Roey Izhaki: Just like the parallel compression technique, where a compressed version is layered underneath the original, signals are sometimes distorted and then layered below the original [see Chapter 3 for more on parallel compression techniques]. This gives added control over the amount of distortion being added. Consequently, this lets [recordists] drive the distortion harder to produce a stronger effect, but then layer it underneath at lower levels so it is not too obvious.34
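The layering Izhaki describes can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not a model of any particular processor: the tanh waveshaper, the drive amount, and the -12 dB level for the buried copy are hypothetical values chosen only to show the signal flow, in which the original stays untouched while a heavily driven copy is mixed in underneath it.

```python
import math


def db_to_gain(db):
    """Convert a level in decibels to a linear gain factor."""
    return 10 ** (db / 20)


def reinforce(dry, drive=8.0, under_db=-12.0):
    """Layer a hard-driven copy under the dry track.

    drive and under_db are illustrative assumptions, not recommended settings.
    """
    gain = db_to_gain(under_db)
    distorted = [math.tanh(drive * s) for s in dry]      # the heavily distorted double
    return [d + gain * w for d, w in zip(dry, distorted)]  # dry on top, double underneath


# A short 220 Hz test tone stands in for the original track.
dry = [0.5 * math.sin(2 * math.pi * 220 * i / 44100) for i in range(441)]
mixed = reinforce(dry)
```

Because the distorted copy is scaled separately from the original, the drive can be pushed hard while the under_db value keeps the result from becoming obvious, which is the control Izhaki points to.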

Though any track could theoretically benefit from the select application of this kind of “reinforcement” distortion, the technique is most often applied to rock vocals. The distinctive technique involves starting with an original track and then layering a heavily distorted double “under” (i.e., at a quieter volume than) the original track, which remains dry and undistorted. Given that the reinforcing (distorted) track is intentionally buried by recordists, reinforcement distortion can very easily go undetected by listeners, comprising a felt rather than heard element of productions. Reinforcement distortion does not necessarily require signal processing. Jimmy Miller, for instance, often reinforced Mick Jagger’s vocals on the more energetic numbers he produced for the Rolling Stones by having Jagger or Keith Richards shout a second take, which he then buried deep in the mix. “Sympathy for the Devil,” for instance, features a shouted double in the right channel throughout (especially obvious on “but what’s puzzling you is the nature of my game”), though the track is faded so that it only sporadically breaches the threshold of audibility; “Street Fighting Man” offers another obvious example. “Let It Bleed” provides another example of shouted (manual) reinforcement distortion, though Miller buried the shouted reinforcement track so far back in the mix that it takes headphones and an entirely unhealthy playback volume to clearly hear. By the time Miller produced the shambolic Exile on Main St., however, he had dispensed with such preciousness altogether: the producer regularly pumps Jagger’s and Richards’s shouted reinforcement (sometimes harmony) tracks to an equal level with the lead vocals on the album. The first cut on the album, “Rocks Off,” and the tenth and twelfth tracks, “Happy” (featuring Richards on lead vocal) and “Ventilator Blues,” provide superb demonstrations of the technique.
Other examples on the album include “Casino Boogie,” “Torn and Frayed,” “Sweet Black Angel,” “Turd on the Run,” “Loving Cup,” and “All Down the Line.”



More recently, in the early and mid-1990s, processed reinforcement distortion found a welcome home in Top 40 alternative rock productions. The Breeders’ “Cannonball” clearly showcases the technique, particularly during the introduction (0:00–0:15) and pre-chorus (i.e., 1:24–1:31), and throughout the second strophe (1:08–1:14). The Wallflowers’ “One Headlight” also makes effective, albeit subtler, use of the reinforcement technique, specifically, on Jakob Dylan’s overdubbed backing vocals; each successive chorus on the track features a slightly more distorted version of Dylan’s backing line until, by the third chorus (at 3:57), his voice sounds more like a dot-matrix printer than anything else. “Gratitude” and “Stand Together,” from the Beastie Boys’ Check Your Head, likewise feature reinforcement distortion on Mike D’s, Ad Rock’s, and MCA’s vocal tracks. And, perhaps most famously, Nirvana made copious use of the device during the chorus for “Smells Like Teen Spirit,” “Lithium,” and “In Bloom” from Nevermind; during the second refrain on “Sliver” (“gramma take me home!”) from the B-sides and rarities compilation Incesticide; on the chorus of “Heart Shaped Box” and in the vocal harmony for “Pennyroyal Tea” from In Utero; and on the chorus for the posthumous scorcher, “You Know You’re Right.” Playlist Title: Understanding Records, Chapter Two: Reinforcement Distortion A playlist beginning with the tracks noted in the section directly above, in the order they are mentioned, which have obvious examples of “reinforcement distortion” for the main vocal line.

Feedback Feedback is a “special case” kind of distortion. The often shrill timbre accrues when the input and output of an audio system—say, a microphone and speakers—combine to create a mutually reinforcing “feedback loop.” A feedback loop is created when a signal enters an input, usually a microphone or a guitar pickup, and exits through loudspeakers or an amplifier. The amplified signal is subsequently received again through the input and is sent back through the amplifier, and back through the input, and back through the amplifier, and so on, until something like the scream of a kettle boiling comes bursting through the line. Moreover, the screaming only intensifies until a fresh audio signal is input (by the same device) and the “loop materials,” that is, the physical materials which resonate at the feedback frequencies, are made to vibrate at different rates, thus breaking the loop.



Figure 2.10  Block diagram of the signal flow for a common feedback loop.
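A toy model can make the runaway character of the loop concrete. The sketch below assumes, for illustration only, that each trip from input to amplifier and back multiplies the signal level by a fixed "loop gain," and that the amplifier cannot exceed a fixed ceiling. Neither the function name nor the numbers come from the text above; they are hypothetical. When the loop gain is below 1 the signal dies away, and when it is above 1 the level climbs on every pass until the amplifier runs out of headroom.

```python
def feedback_levels(start_level, loop_gain, trips, ceiling=1.0):
    """Track the signal level over successive trips around the loop.

    loop_gain is the assumed net gain of one pass (input -> amplifier -> input);
    ceiling is the assumed maximum level the amplifier can put out.
    """
    levels = [start_level]
    for _ in range(trips):
        levels.append(min(levels[-1] * loop_gain, ceiling))
    return levels


quiet = feedback_levels(0.001, 0.8, 40)   # loop gain below 1: the sound decays
howl = feedback_levels(0.001, 1.5, 40)    # loop gain above 1: the sound runs away

print("after 40 trips, loop gain 0.8:", quiet[-1])
print("after 40 trips, loop gain 1.5:", howl[-1])
```

In this simplified picture, turning down the amplifier, moving the microphone or guitar, or otherwise changing the path lowers the loop gain, which is why those gestures break the loop.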

Pitched feedback and whammy slamming Rock musicians and recordists are often far more interested in provoking and manipulating feedback than they are in preventing it from ever arising. Since guitarists like Jimi Hendrix and Pete Townshend first destroyed their amplifiers to create it, a bona fide technique for provoking and manipulating feedback has emerged. Guitarists boost the volume on their amplifiers and turn to face the speaker cone(s) with their guitars, either sustaining a pitch or leaving the strings to resonate unhindered. The requisite feedback loop will then generally accrue. Once the loop is established, guitarists change the resonating feedback frequencies by changing the angle between the guitar and the amp. This physical movement re-dimensions the feedback loop and, in turn, alters its resonance, allowing guitarists to create melodies comprised entirely of “pitched” feedback. A clear example of this technique can be heard during the guitar solo for Tool’s “Jambi” on the album 10,000 Days. Robert Fripp’s lead-guitar work for David Bowie’s “Heroes” provides another clear example: the guitarist’s pitched-feedback lines sound prominently in the right channel from 0:06 to 0:16 and 0:19 to 0:26, and in the left channel from 0:35 to 0:41, of the extended cut of the song which appears on the album (i.e., as opposed to the curtailed single release). As producer Tony Visconti recalls:



Fripp [stood] in the right place with his volume up at the right level and getting feedback. .  .  . Fripp had a technique in those days where he measured the distance between the guitar and the speaker where each note would feedback. For instance, an “A” would feedback maybe at about four feet from the speaker, whereas a “G” would feedback maybe three and a half feet from it. He had a strip that they would place on the floor, and when he was playing the note “F” sharp he would stand on the strip’s “F” sharp point and “F” sharp would feedback better. He really worked this out to a fine science, and we were playing this at a terrific level in the studio, too. It was very, very loud, and all the while he was [creating] these notes—that beautiful overhead line—Eno was turning the dials [on the filter bank of his EMS VCS3 synthesizer, which Fripp patched into] and creating a new envelope. We did three takes of that, and although one take would sound very patchy, three takes had all of these filter changes and feedback blending into that very smooth, haunting, overlaying melody you hear.35

A celebrated live demonstration of pitched feedback took place during the first minute of the Jimi Hendrix Experience’s performance of “Wild Thing” at the Monterey Pop Festival on June 18, 1967. Captured for posterity by famed “rockumentarian” D. A. Pennebaker, as scene seventeen of his documentary film Monterey Pop, Hendrix begins the performance by provoking feedback from his Marshall “stack” amplifier and speaker cabinet. Warning the front row to guard their hearing—actually, he recommends they plug their ears—Hendrix impulsively grabs the whammy bar on his Fender Stratocaster and upends the instrument, stabbing the neck downward between his legs. A convulsive dance ensues: swooning, shaking, and swaying from side to side, Hendrix struggles to embody each unorthodox sound that he coaxes from his guitar and amplifier. More pitched feedback and whammy slamming, that is, manipulation of the whammy bar, ensues until, after about a minute, Hendrix subtly checks his tuning and cues the band to enter. Actually, by the time Hendrix played Monterey, he had already demonstrated some of the more radical possibilities of pitched feedback and whammy slamming. Track nine on the guitarist’s debut record Are You Experienced?, entitled “Third Stone From the Sun,” provides something like a master class demonstration of both techniques. Between 2:34 and 5:12 of the track, Hendrix never once picks a string on his guitar, relying almost totally on pitched feedback and whammy slamming instead. When the guitarist slams (presses) down on the whammy bar, as he does at 2:27 for instance, the string tension on his guitar slackens, producing what sounds like a slow-motion downward glissando, and when he pulls up on the bar, as at 3:43, the opposite effect is heard.



Feedback lead-ins Though Jimi Hendrix remains feedback’s Dionysian image and Robert Fripp its Apollonian avatar, neither guitarist was first to feature the distortion on record. In fact, feedback has been an integral component of the Top 40 soundscape since at least December of 1964, when the Beatles’ “I Feel Fine” reached pole position on both the British and North American pop charts, becoming the first chart topper to showcase feedback distortion in so doing. “I Feel Fine” begins with the sound of a single note plucked percussively by Paul McCartney on his electric bass. Producer George Martin then orchestrates a cross-fade to pure feedback, generated by John Lennon using a Gibson J-160E semi-acoustic guitar leaned against an unspecified amplifier. Thus does “I Feel Fine” present listeners with a six-second musique concrète sound sculpture before anything like a pop song can be said to emerge. According to Paul McCartney: John had a semi-acoustic Gibson guitar. It had a pick-up on it so it could be amplified. .  .  . We were just about to walk away to listen to a take [of “I Feel Fine” (1964)] when John leaned his guitar against the amp .  .  . and it went, “Nnnnnnwahhhhh!” And we went, “What’s that? Voodoo!” “No, it’s feedback.” “Wow, it’s a great sound!” George Martin was there so we said, “Can we have that on the record?” “Well, I suppose we could, we could edit it on the front.” It was a found object—an accident.36

Feedback lead-ins would soon become the industry standard in pop productions. Producer Chas Chandler, for instance, used the device on Jimi Hendrix’s “Foxey Lady,” the producer grafting feedback onto a hammered and pulled guitar trill during the song’s ten-second introduction. George Harrison would overdub a feedback lead-in, comprised solely of pitched feedback and whammy slamming, onto the first few seconds of the Beatles’ “It’s All Too Much.” And, a short while later, Hendrix himself would adopt the technique to introduce “Crosstown Traffic,” track three on the guitarist’s self-produced Electric Ladyland. Other more recent feedback lead-ins can be heard on the first eighteen seconds of the Strokes “New York City Cops,” track seven on the European release of Is This It?; the first minute of Ben Folds Five’s “Fair”; the opening forty-five seconds of Midnight Juggernauts’ “Road to Recovery”; Nirvana’s “Radio Friendly Unit Shifter”; the first few seconds of the Jesus and Mary Chain’s “Tumbledown” and “Catchfire,” and, briefly, in the introductions for the Stone Roses’ “Waterfall” and Hole’s “Mrs. Jones,” and on the Cure’s “Prayers For Rain.” The same thing happens on “New York City Cops”; feedback swells from 0:15 to 0:17, even as



feedback from another amplifier trills unabated, at which point Hammond Jr. plucks the downbeat of the main riff and the song begins in earnest. Playlist Title: Understanding Records, Chapter Two: Feedback Lead-Ins A playlist comprised of tracks noted above, featuring feedback lead-ins.

Transitional feedback swells Combined with a quick volume swell, feedback can also function as a transitional device. In this case, electric guitarists provoke feedback a few bars before the beginning of a heavier—or, at the very least, a more energetic—section of a song, only to break the loop by picking something on the downbeat of that section, the distortion mixed to an ever-louder volume all the while before. Though examples abound on rock records made since the time of the Beatles (guitaristic bands like Meshuggah and Tool tend to be exceedingly fond of this device), arguably the gold standard for transitional feedback is Weezer’s “My Name Is Jonas,” track one on the band’s eponymous debut CD. Recorded sometime between August and October 1993, at Electric Lady Studios in New York City, and produced by ex-Cars front man Ric Ocasek, “My Name Is Jonas” begins with only a fingerpicked acoustic guitar and a tambourine rattling in the background. A massive wall of feedback quickly swells to the fore, however, and the song transitions into the first of its many heavy sections (i.e., 0:06–0:12, 0:15–0:55, 1:03–1:06, and 1:10–3:08). The Strokes have made transitional feedback yet another signature mixing move, featuring the device at least once on each of their first three records. “Reptilia” provides likely the clearest example: feedback grows in intensity from 0:06 to 0:12, at which point Albert Hammond Jr. picks the downbeat of the first verse and the feedback loop immediately breaks. The device also clearly figures from 0:07 to 0:12 on “Juicebox,” track three from the band’s First Impressions of Earth, and from 1:52 to 2:00 of Rage against the Machine’s “Killing in the Name Of,” Tom Morello provides a characteristically unique twist on the technique, adding an endless glissando to the transition.

Feedback fade-outs Recordists will sometimes opt to feature a cacophony of feedback to end tracks. Most often this is done to generate, rather than relieve, emotional tension. In most cases, a track builds to some kind of release, thematically and musically, before it cross-fades immediately to feedback, and, in turn, the feedback is almost



always faded out to silence. This suggests that whatever emotional issues a song explores (and, more often than not, feedback fade-outs figure in the ending for songs which explore decidedly dark themes) remain unresolved at song’s end. Consider, for instance, the disorganized conclusion to tracks like Nine Inch Nails’ “Mr. Self Destruct” and “Corona Radiata,” and Modwheelmood’s remix of Nine Inch Nails’ “The Great Destroyer.” The Jesus and Mary Chain proved themselves extremely fond of the feedback fade-out in the mid-1980s and early 1990s. In fact, some might argue that, barring the band’s sophomore offering (Darklands), the Jesus and Mary Chain fashioned an entire career out of catchy pop hooks, cavernous reverb, and megalithic doses of feedback. The duo’s debut record, Psychocandy, prominently features earsplitting feedback on almost every track, for instance. But even on later tracks, like “Reverence,” feedback ebbs and swells unabated throughout the production; the Reid brothers even go so far as to feature feedback in the lead role during the bridge of “Reverence,” from 1:50 to 2:08. Other tracks on Honey’s Dead, the album on which “Reverence” appears, feature feedback in a subtler role, the distortion faded in and faded out as a tactile texture at various points (i.e., “Teenage Lust,” “Sugar Ray,” “Sundown,” and “Catchfire”). And still other tracks on the same album feature straightforward feedback fade-outs, including “Teenage Lust,” “Tumbledown,” “Catchfire,” “Sundown,” and “Frequency.” Playlist Title: Understanding Records, Chapter Two: Feedback Swells and Fade-outs A playlist comprised of tracks featuring feedback swells and feedback fade-outs.

Delay The “simple delay line” is the building block of all delay-processing techniques. Recordists feed an audio signal into a delay line. That delay line then shunts the input signal directly to output, precisely as it splits a copy of the signal, stores it somewhere for a certain period of time and then sends it to output as well. Recordists then fine-tune the delay line by adjusting its (i) “delay-time” setting, that is, the amount of time which elapses between the arrival of the shunted input signal and its delayed copy at output; (ii) “mix” setting, that is, the amount of input signal and delayed signal which the delay line outputs; and, finally, (iii) “feedback,” that is, the amount of the output signal which gets routed back to input for another round, and hence the length of time that the delayed signal will remain active.
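The three parameters just described map directly onto a few lines of code. The following is a minimal sketch of a simple delay line, assuming a mono signal held as a list of samples. The function name, the tail_s parameter (extra time allowed for echoes to ring out), and the decision to route the delayed copy, rather than the full output, back into the store are simplifying assumptions of this illustration and not features of any particular delay unit.

```python
def delay_line(signal, sample_rate, delay_ms, mix, feedback, tail_s=1.0):
    """Simple delay line with delay-time, mix, and feedback settings.

    delay_ms : time between the direct signal and its delayed copy
    mix      : 0.0 = direct signal only, 1.0 = delayed signal only
    feedback : fraction of the delayed signal routed back for another round
    tail_s   : assumed extra time (seconds) rendered after the input ends
    """
    delay_samples = max(1, int(sample_rate * delay_ms / 1000))
    store = [0.0] * delay_samples            # circular store holding the copy
    total = len(signal) + int(sample_rate * tail_s)
    out = []
    pos = 0
    for n in range(total):
        dry = signal[n] if n < len(signal) else 0.0
        delayed = store[pos]                 # the copy stored delay_samples ago
        store[pos] = dry + feedback * delayed  # store input plus fed-back echo
        pos = (pos + 1) % delay_samples
        out.append((1 - mix) * dry + mix * delayed)
    return out


# A single click, delayed by 100 ms, equal mix, with half of each echo fed back.
echoes = delay_line([1.0], sample_rate=1000, delay_ms=100, mix=0.5, feedback=0.5)
print(echoes[0], echoes[100], echoes[200], echoes[300])
```

Feeding in a single click makes the behavior easy to read: the direct click arrives first, and each later echo arrives one delay time after the last, scaled down by the feedback setting.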



Recordists have crafted a staggeringly diverse array of musical techniques just by adjusting the three parameters I listed above. For instance, to create reverberations, an effect I examine in greater detail in subsection “Reverb processing” of this chapter, recordists need only set the delay time on a delay line to any value less than about 40 ms. To emphasize the comb filtering which delay times of under 40 ms inevitably induce, recordists may furthermore increase the mix setting on a delay line so the amount of direct and delayed signal sent to output is roughly equivalent. Of course, multitrack productions are continuously evolving; should an echo sound wind up being more appropriate than reverberations, recordists can simply readjust the delay-time setting on the delay line to a value of more than about 50 ms. And, finally, should they want those echoes to continue repeating well after the original signal has faded to silence, recordists can increase the feedback setting on the delay line until the desired number of echoes accrue.
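The comb filtering mentioned above follows from simple arithmetic. When a signal is summed with an equally loud copy of itself delayed by a time t, any frequency whose half-period equals t (or an odd multiple of that frequency) arrives exactly out of phase with its copy and cancels. The sketch below works through that arithmetic for an equal mix of direct and delayed signal, with no feedback. The function names and the 20 kHz upper limit are assumptions made for illustration.

```python
import math


def comb_notches(delay_ms, max_hz=20000.0):
    """Frequencies cancelled when a signal is summed with an equal-level copy
    delayed by delay_ms: odd multiples of 1 / (2 * delay)."""
    notches = []
    k = 0
    while (2 * k + 1) * 1000.0 / (2 * delay_ms) <= max_hz:
        notches.append((2 * k + 1) * 1000.0 / (2 * delay_ms))
        k += 1
    return notches


def comb_gain(freq_hz, delay_ms):
    """Gain at freq_hz for direct plus equal-level delayed signal (0 to 2)."""
    return abs(2 * math.cos(math.pi * freq_hz * delay_ms / 1000.0))


print(comb_notches(1.0)[:4])      # first few notches for a 1 ms delay
print(comb_notches(20.0)[:4])     # first few notches for a 20 ms delay
print(comb_gain(500.0, 1.0), comb_gain(1000.0, 1.0))
```

With a 1 ms delay the notches begin at 500 Hz and repeat every 1,000 Hz; with a 20 ms delay they begin at 25 Hz and repeat every 50 Hz. Longer delay times therefore pack the notches more tightly together across the spectrum.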

Tape delay Recordists initially had no choice but to use analog tape machines to add delay to an audio signal.37 Decades have passed since the heyday of tape, however. Recordists now, as a rule, chiefly use digital (algorithmic) processors to create their (digital) delay lines. That said, the demand for the distinctive sound of tape delay remains sufficiently high among recordists to warrant inclusion of at least one digital tape delay emulator in the suite of processing plug-ins which come bundled with most DAWs. These tape delays, as they are known, emulate the distinctive “warble” and “flutter” of old tape machines, that is, they simulate the subtle (and not so subtle) variances in pitch which accrue given imprecise and irregular tape speeds, just as they emulate the low-frequency bias of tape in general (digital tape delays thus filter progressively more high-frequency content from each subsequent echo in a simple delay line, as did the analog machines on which they are modeled).38
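The two behaviors just described, progressively duller echoes and a slowly drifting pitch, can be approximated in a short sketch. This is an illustration under stated assumptions rather than a model of any commercial tape delay emulator: it assumes a one-pole low-pass filter placed inside the feedback path (so every repeat is filtered once more than the last) and a slow sine-wave drift in delay time standing in for irregular tape speed. All names and parameters (damping, wow_hz, wow_depth_ms, tail_s) are hypothetical.

```python
import math


def tape_delay(signal, sample_rate, delay_ms, feedback, damping=0.5,
               wow_hz=0.5, wow_depth_ms=0.0, tail_s=1.0):
    """Delay line whose echoes lose highs on every repeat and can drift in time.

    damping      : 0.0 = no filtering; values nearer 1.0 remove more highs
    wow_depth_ms : assumed peak drift in delay time (0.0 disables the drift)
    """
    size = int(sample_rate * (delay_ms + wow_depth_ms) / 1000) + 2
    store = [0.0] * size
    write = 0
    lowpassed = 0.0
    out = []
    for n in range(len(signal) + int(sample_rate * tail_s)):
        dry = signal[n] if n < len(signal) else 0.0
        # Slow drift in delay time, standing in for uneven tape speed.
        drift = wow_depth_ms * math.sin(2 * math.pi * wow_hz * n / sample_rate)
        delay = sample_rate * (delay_ms + drift) / 1000
        read = (write - delay) % size
        i = int(read) % size
        frac = read - int(read)
        # Linear interpolation between neighbouring stored samples.
        delayed = (1 - frac) * store[i] + frac * store[(i + 1) % size]
        # One-pole low-pass inside the loop: each repeat is filtered again.
        lowpassed += (1 - damping) * (delayed - lowpassed)
        store[write] = dry + feedback * lowpassed
        write = (write + 1) % size
        out.append(dry + lowpassed)
    return out


# A single click with the drift disabled, to isolate the filtering.
taped = tape_delay([1.0], sample_rate=1000, delay_ms=100, feedback=0.5)
print(taped[100], taped[200])
```

With these settings the second echo peaks at a quarter of the level of the first, even though the feedback setting is one half, because the filter has smeared its energy across several samples. That smearing is the loss of high-frequency content that gives each successive repeat its darker sound.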

Slapback echo “Slapback echo” remains one of the most iconic tape delay sounds in modern pop and rock. Initially comprised of only a single distinct echo, delayed by anywhere from about 75 to 200 ms, the sound of slapback achieved early prominence on records produced by Sam Phillips for his Sun Records imprint in the mid-1950s. Celebrated examples of slapback can be heard on the vast majority (though, contrary to popular belief, not on all) of Elvis Presley’s so-called “Sun Sides.” Slapback figures, for instance, on Presley’s lead vocal, and Scotty Moore’s lead guitar, tracks for “Mystery Train,” “Baby, Let’s Play House,” “Blue Moon of Kentucky,” “Tomorrow Night,” “Trying to Get to You,” and “I’ll Never Let You Go (Little Darlin’).” However, though Presley’s vocals provide the most iconic demonstrations of slapback, the effect is, to my ears, much clearer on Jerry Lee Lewis’s lead vocals for many of his own “Sun Sides,” especially “Whole Lotta’ Shakin’ Goin’ On” and “Great Balls of Fire.” The role that tape machines themselves played in shaping the sonic character of slapback was crucial. When Phillips sold his contract with Elvis Presley to RCA in 1955, for instance, Presley’s new producer, Chet Atkins, famously struggled, and ultimately failed, to reproduce the distinctive slapback echo Phillips devised for Presley at Sun Records. This was in large part because the tape machines at RCA ran at a different speed than the ones at Sun. As Atkins himself recalls: In order to recreate the slapback echo sound that had characterized all of Elvis’ Sun records, a speaker was placed under a stairwell out in the hall. We were recording on those RCA machines and they ran at a different [tape] speed than the Ampex machine we used at Sun. So we were careful about trying to capture the Sun sound. And, of course, we didn’t, but it was enough to fool people. But not enough to fool all of the RCA executives in New York. Displeased with the new records, they initially wanted to have Sholes go back to Nashville and have Elvis re-record the materials. The idea soon got vetoed, and following the release of “Heartbreak Hotel,” the fears were quickly forgotten.39

The difference between the vocal treatment on, say, Presley’s “Mystery Train” and “Heartbreak Hotel” clearly demonstrates the crucial role that the mechanical idiosyncrasies of tape machines themselves once played in determining the sound of slapback. In fact, once upon a time, expert listeners could hear the brand of tape used to make a record. As Daniel J. Levitin remembers:

When [my] band broke up . . . I found work as a producer of other bands. I learned to hear things I had never heard before: the difference between one microphone and another, even between one brand of recording tape and another (Ampex 456 tape had a characteristic “bump” in the low-frequency range, Scotch 250 had a characteristic crispness in the high frequencies, and Agfa 467 a luster in the midrange). Once I knew what to listen for, I could tell Ampex from Scotch or Agfa tape as easily as I could tell an apple from a pear or an orange.40



The sort of particularity which Levitin describes has long since vanished, however. Again, most recordists now use digital plug-ins to create slapback echoes, which means that idiosyncratic differences in the mechanical operations of a processor seldom shape the sound of delay nowadays. Recordists simply shut the feedback setting on their delay processor completely off, so the delay line produces only a single echo, or they set it to a very low value so that relatively few echoes accrue. Or they dial in a “slapback” preset on the plug-in and make minor modifications to its settings. Though the specific delay-time settings that recordists use to create slapback may vary in accordance with the demands of each particular production, delay times generally range between about 70 and 200 ms. Moreover, though recordists usually opt to include more than one slapback echo, they still tend to restrict the feedback setting so it generates no more than three or four echoes.
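The relationship between the feedback setting and the number of echoes can be made concrete with a minimal sketch of a digital delay line. This is an illustration only, not the algorithm of any particular plug-in: the function name `feedback_delay`, the `wet` level, and the deliberately low 1 kHz sample rate are all assumptions made for readability.

```python
def feedback_delay(signal, sample_rate, delay_ms, feedback=0.0, wet=0.5):
    """Mix a signal with delayed copies of itself.

    feedback=0.0 yields exactly one echo; higher values recirculate
    each echo so that further, quieter repeats follow.
    """
    d = int(round(sample_rate * delay_ms / 1000.0))  # delay in samples
    echoes = [0.0] * len(signal)
    for n in range(d, len(signal)):
        # Each echo is the input from d samples ago plus a scaled copy
        # of whatever the delay line itself produced d samples ago.
        echoes[n] = signal[n - d] + feedback * echoes[n - d]
    return [dry + wet * e for dry, e in zip(signal, echoes)]


# A single click, 0.6 s long at a (hypothetical, very low) 1 kHz sample rate.
click = [1.0] + [0.0] * 599
single = feedback_delay(click, 1000, delay_ms=120, feedback=0.0)
several = feedback_delay(click, 1000, delay_ms=120, feedback=0.3)

# Sample positions at which something sounds.
single_hits = [n for n, v in enumerate(single) if v != 0.0]
several_hits = [n for n, v in enumerate(several) if v != 0.0]
```

With the feedback shut off, `single_hits` contains only the click and one echo 120 ms later. With a low feedback value, `several_hits` shows a short run of repeats at 120 ms intervals, each quieter than the last.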

Slapback as referent

Slapback echo remains the most referentially fixed of all delay-processing techniques. When combined with particular instruments, performance styles, and compositional techniques, slapback can clearly reference the sound of rockabilly and rock ‘n’ roll records produced during the mid- and late 1950s. It is no surprise, for instance, that Robert Plant’s lead vocals on Led Zeppelin’s “Rock and Roll,” the band’s homage to early rock ‘n’ roll, feature a slapback echo throughout. Indeed, the distinctively “retro” sound of slapback is everywhere apparent on John Lennon’s Rock ‘n’ Roll. An homage to 1950s rock ‘n’ roll in general, Rock ‘n’ Roll features eighteen cover versions of the rock ‘n’ roll genre’s most canonic tracks, including “Be-Bop-A-Lula,” “Ain’t That a Shame,” “Sweet Little Sixteen,” “Peggy Sue,” and “Stand By Me.” Every track on Rock ‘n’ Roll, which was co-produced by Phil Spector and Lennon himself, features some kind of slapback echo on the lead-vocal track(s), even if the comparatively quick delay-time settings on the vocals for “Stand By Me,” “Peggy Sue,” and “Just Because,” for instance, produce something much more like doubling than anything else. That said, on tracks like “Be-Bop-A-Lula,” “You Can’t Catch Me,” “Do You Want to Dance,” “Sweet Little Sixteen,” “Slippin’ and Slidin’,” “Bony Moronie,” and “Angel Baby,” the slapback echo is as prominent as Lennon’s lead vocal itself.

More recently, slapback echo came to occupy a prominent position in the rockabilly, jump-blues, and psychobilly revivals that emerged during the early and mid-1990s. These revivals were spearheaded by the likes of Brian Setzer, Colin James, and the Reverend Horton Heat. Lead vocals, electric guitars, and acoustic bass all regularly feature slapback on their albums, including the Brian Setzer Orchestra’s Guitar Slinger, The Dirty Boogie, and Vavoom!; the first three volumes of Colin James’s Colin James and the Little Big Band; and the Reverend Horton Heat’s Smoke ‘Em If You Got ‘Em, It’s Martini Time, Space Heater, and Spend a Night in the Box. So focused was the Reverend Horton Heat on reproducing the sound of classic 1950s rock ‘n’ roll records, in fact, that the psychobilly rocker re-recorded Smoke ‘Em If You Got ‘Em in its entirety on two-track tape when the initial multitrack masters failed to yield the requisite retro sound and feel. Precursors of the revival, such as the Stray Cats, likewise prominently feature slapback on their records. Setzer’s lead vocals for, among other tracks, the Stray Cats’ “Rock This Town,” “Rumble in Brighton,” “Runaway Boys,” and “Everybody Needs Rock and Roll” all provide clear examples.

Playlist Title: Understanding Records, Chapter Two: Slapback All of the tracks mentioned in the sections on slapback above can be heard in this playlist, in the order they are mentioned. Every track on this playlist features slapback on vocals, and sometimes on other instruments as well.

Doubling

Singers and musicians have doubled their parts for decades. Some singers even adopt the technique as a definitive characteristic of their vocal sound. One would be hard-pressed to find Elliott Smith, for instance, singing anything on record which is not doubled, and the likes of John Lennon and Kurt Cobain relied heavily on the technique as well. Until the mid-1960s, recordists had no choice but to overdub doubled tracks. This was a grueling, labor-intensive process which most musicians came to detest intensely. It was this distaste for manual double tracking, in fact, which prompted Ken Townsend, an engineer who worked at Abbey Road Studios in the mid- and late 1960s, to invent the first signal processor designed to “automatically” double tracks. Townsend dubbed the machine, appropriately enough, the “Automatic Double Tracking” (ADT) machine. In Wayne Wadhams’s words:

John [Lennon] liked to double and triple his vocals for a thicker, fuller sound, but he was also impatient, and dreaded the tedious process required to get both the pitch and timing of each syllable exactly right on a second or third pass. This led to the discovery of ADT (Automatic Double Tracking). Thenceforth, when a good lead vocal track was complete, it was played back into a second recorder equipped with a variable speed control. By manually varying the record and playback speed of the second machine, a “copy” vocal came back perhaps twenty milliseconds (twenty thousandths of a second) after the original vocal. This signal was then mixed with the actual lead vocal and recorded onto an open track of the original tape. The result sounded like a perfectly doubled lead vocal. ADT was subsequently used on almost every lead and background vocal track of Revolver.41

The Beatles eagerly adopted ADT as both an efficiency aid and a powerful aesthetic tool. Vocal tracks on Beatles records released after Revolver regularly feature ADT; in fact, the band took to using ADT on a number of instrumental tracks as well. To produce an effected sound using ADT, producer George Martin would pan both the input signal and its ADT-delayed copy to the same horizontal position in a mix. To emulate manual doubling, however, Martin panned the tracks to opposite sides of the stereo spectrum. Ryan and Kehew explain:

When the original signal and delayed signal were panned to the same spot in the stereo picture, a distinctive sound emerged. The presence of two separate sounds could be discerned but the signal sounded affected: it did not sound entirely like natural double tracking. This, of course, was one of the qualities that greatly attracted The Beatles to the effect, aside from its time-saving benefits. However, when the two signals were panned to different parts of the stereo image, the double-tracking effect could be quite convincing indeed. A listen to the brass in “Savoy Truffle” illustrates this nicely; with the original signal panned to one side and the delayed signal panned to the other, the illusion of two separately overdubbed parts was remarkable. The vocals on “Ob-La-Di, Ob-La-Da” and “Birthday” were handled similarly. Quite often on later Beatles records, this effect was used to create a lush stereo image, an especially useful effect when only four tracks were available.42

Playlist Title: Understanding Records, Chapter Two: ADT A playlist comprised of some Beatles tracks with ADT on the lead vocals.

Since the heyday of ADT, recordists have had the benefit of digital processors for doubling. Recordists simply insert digital processors into the audio chain for a track, and they set the delay time for a period of anywhere from 20 to 50 ms. As I explain later, this delay-time setting can be modulated (modified) using a low-frequency oscillator (LFO) to emulate so-called “participatory discrepancies,” that is, subtle differences in delivery which accrue given more than one performance of a musical line, but this usually requires panning the input and doubled tracks to different positions along the stereo spectrum.



When recordists do not require participatory discrepancies, however, they simply set the delay time for a period of anywhere from 20 to 50 ms, and they shut the feedback setting completely off (or, sometimes, set it extremely low). Clear examples of this technique at work in the modern pop soundscape can be heard in the lead-vocal tracks for Boston’s “More Than a Feeling” (sporadically); Led Zeppelin’s “Immigrant Song,” “The Song Remains the Same,” “Trampled under Foot,” “Kashmir,” and “Nobody’s Fault but Mine”; and, more recently, during the refrain of the Jesus and Mary Chain’s “Reverence” and throughout Porno for Pyros’ “Orgasm.” This all said, recordists still often opt to manually double vocal tracks, especially when the goal is to thicken (add prominence to) those tracks. Further emphasis can be applied by panning the two tracks to opposite sides of the stereo spectrum. Elliott Smith, for one, uses this technique extensively on his self-produced Either/Or. “Speed Trials,” “Between the Bars,” “Rose Parade,” and “Angeles,” for instance, all feature manually doubled vocal tracks panned to opposite sides of the stereo spectrum. Tracks like “Alameda” and “Pictures of Me,” on the other hand, feature doubled tracks panned to the same horizontal location, at the front-and-center of Smith’s mixes for those cuts. Smith also combines centered and panned double tracks at various points on certain tracks. “Cupid’s Trick,” for instance, features centered vocals until the song’s explosive chorus, at which point a call-and-response between centered (“shooting me up”) and panned (“it’s my life”) double tracks emerges. The track “2:45 AM” features centered vocals until 1:58, when Smith introduces double-tracked vocals on either side of the stereo spectrum, and the album’s concluding cut, “Say Yes,” features centered vocals for only its first seventeen seconds, after which point Smith doubles the vocals and pans the two tracks to opposite stereo extremes.
In the hip-hop genre, manual doubling is often introduced to emphasize only certain words in a lyric, usually (though by no means always) the terms that end phrases. This can work to reinforce the thematic content of a lyric, even if its primary function is simply to buttress the flow (delivery) of rapping MCs. Readers can select basically any hip-hop album at random and be assured that at least one or two tracks on the album will prominently feature this device. Clear demonstrations can be heard on Jay-Z’s “99 Problems”; the Wu-Tang Clan’s “Wu-Tang Clan Ain’t Nothin’ Ta F’ Wit”; House of Pain’s “Jump Around” and “Shamrocks and Shenanigans”; the Beastie Boys’ “Intergalactic” and “So What’cha Want”; the Pharcyde’s “Oh Shit” and “Drop”; and Fatlip’s “What’s Up Fatlip?”



Playlist Title: Understanding Records, Chapter Two: Manual Doubling A playlist comprised of all the tracks mentioned in the subsection above. Readers are encouraged to consider this playlist in conjunction with the playlist for “reinforcement distortion,” which I directed them to earlier in this chapter.

Echoes (unsynced)

Delay-time settings of more than 50 ms produce echoes. Common sense would seem to suggest that, when recordists opt to introduce echoes, those echoes should always be in sync with the tempo of tracks, so each echo metrically subdivides the global pulse. If echoes propagate at a rate which somehow clashes with the underlying meter of a song, they can easily induce timing errors, and confusion, especially when delay times are set for anything longer than an eighth-note value. This said, recordists continue to feature unsynced echoes on their productions. In fact, there are numerous reasons why recordists might opt for unsynced rather than synced echoes. Most often, though, unsynced echoes simply have an added emphasis in a mix, insofar as their position outside of the groove continuum sets them apart from other mix elements. The fact that unsynced echoes naturally avoid masking each other, while synced echoes are prone to masking, only makes the unsynced option all the more attractive.

While the Edge’s guitar playing in general demonstrates basically every delay-processing technique in the modern rock recordist’s toolbox, his guitar work for “Stuck in a Moment You Can’t Get Out Of,” track two on U2’s All That You Can’t Leave Behind, provides a particularly clear demonstration of the unsynced principle at work in modern rock. Just before the song’s pre-chorus at 0:29, the Edge’s dry guitar is suddenly delayed. The dry track remains on the left side of the stereo plane while the delay line is bussed (sent) to an open channel on the opposite side. Because the Edge subdivides the basic tempo of the track with a sequence of straight quarter- and eighth-note arpeggios at this point, producer Daniel Lanois had little choice but to unsync the delay line. Synchronized echoes would have overlapped the dry line, potentially inducing masking and, perhaps most egregiously in the soft rock world that U2 dominates, inducing inappropriate dissonances each time the harmony changes. Lanois further buttresses the dry track against masking through spatial separation, panning the two tracks to opposite sides of the stereo spectrum.

Beyond solving masking issues, panning also works to produce an effect like mirrored equalization in the pre-chorus for “Stuck in a Moment You Can’t Get Out Of.” Sent to opposite extremes of the stereo spectrum, the dry guitar track and its echoes prompt listeners to widen the perceptual width of the mix. Practical experience teaches the human ear to associate echoes with large spaces. Somebody shouts into a cavernous space and, after a certain period of time, the length of which is wholly dependent on the size of the space itself (longer delay times denote larger spaces), their voice reflects back as an echo. Knowing this, when listeners hear the Edge’s dry guitar track emanate on the extreme left and, shortly after, bounce back as an echo on the extreme right, practical experience suggests a large horizontal space between the sound source (the electric guitar) and its reflections (echoes), which perceptually wedges even more room into the center of the mix for Bono’s vocals.
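Two pieces of arithmetic sit behind the discussion above: the delay time that would sync an echo to a given tempo, and the size of the space that a given echo time implies. The following sketch works through both. The function names, the short list of note values, the 1 ms tolerance, and the figure of 343 m/s for the speed of sound are assumptions chosen for illustration, and the tempo used in the examples is hypothetical rather than measured from any track named here.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 °C

# Note values expressed as fractions of a quarter-note beat.
NOTE_VALUES = {
    "quarter": 1.0,
    "eighth": 0.5,
    "eighth triplet": 1.0 / 3.0,
    "sixteenth": 0.25,
}


def synced_delay_ms(bpm, note="quarter"):
    """Delay time (ms) that makes each echo land on the given note value."""
    return 60000.0 / bpm * NOTE_VALUES[note]


def is_synced(delay_ms, bpm, tolerance_ms=1.0):
    """True if delay_ms matches any listed note value at this tempo."""
    return any(
        abs(delay_ms - synced_delay_ms(bpm, note)) <= tolerance_ms
        for note in NOTE_VALUES
    )


def reflector_distance_m(echo_ms):
    """Distance to a surface whose echo returns after echo_ms (round trip)."""
    return SPEED_OF_SOUND_M_PER_S * (echo_ms / 1000.0) / 2.0
```

At a hypothetical 120 beats per minute, `synced_delay_ms(120, "eighth")` gives 250 ms, so a 250 ms delay would count as synced while a 330 ms delay would not. `reflector_distance_m(100.0)` gives roughly 17 m, which illustrates why longer delay times suggest larger spaces.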

Multitap echoes (synced)

Despite its widespread use in pop, the feedback mechanism in a traditional delay line can be seriously limiting for recordists. “Echoes are spaced at regular intervals, their level drops in a predictable way and their frequency content is correlated,” notes Izhaki.43 And this regularity is not always appropriate. Rather than a run of uniformly diminishing quarter-note echoes, each panned to the left side of the stereo plane, for instance, a mix may require a run of six quarter-note echoes on the right side of the stereo plane, three sixteenth-note echoes on the left side, and then a sequence of crescendoing triplet eighth-note echoes panned back and forth across the stereo plane. While such a sequence would undoubtedly produce a totally bizarre effect, the control and flexibility required to create it are nonetheless necessary at times, given the unique demands of certain productions.

To create a run of independent echoes, each with its own timing, volume, and stereo location, recordists usually turn to so-called multitap delay processors. These processors allow recordists to set distinct delay times, volumes, pan positions, and feedback settings for each echo the processor produces. Obviously, this sort of control is most easily achieved via digital means nowadays, but it is possible to achieve using non-digital means. Converting a tape delay into a multitap processor only requires fitting a device with multiple moveable playback heads, each with its own amplifier. And multi-head delay devices, like the Roland Space Echo and the WEM Copicat, remain coveted additions to any recordist’s and performer’s signal-processing arsenal.

Since multitap plug-ins have become a standard in the suite of digital processors which are now bundled with most DAWs, the multitap delay line has achieved a newfound prominence in modern electronic dance music (EDM) genres. In fact, these productions feature multitap processing in a primarily rhythmic role now, with recordists multitapping irregular sequences of echoes to generate rhythmic propulsion and momentum for tracks. Though examples of multitapping abound in the genre, Tosca’s “Suzuki” still provides an exceptionally clear demonstration. The track begins with a single harmonic, percussively plucked on an electric bass, followed by a sequence of regularly diminishing echoes. When the harmonic repeats, however, it is followed by a multitapped sequence of echoes that randomly ping-pong back and forth across the stereo spectrum: right-right-left-right-left-right-left-left-right-left-left-left-left and so forth. Compared with the regularized echo delay heard on, say, the electric bass line which introduces Porno for Pyros’ “Pets,” the multitapped echoes on “Suzuki” clearly belong to an entirely different genus.
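The idea of giving each echo its own timing, volume, and stereo location can be sketched in a few lines. What follows is a simplified illustration, not a model of any commercial multitap processor: the `Tap` class, the `multitap` function, and the linear pan law are all assumptions, and per-tap feedback is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Tap:
    delay_ms: float  # when this echo sounds, measured from the dry signal
    gain: float      # linear level of this echo
    pan: float       # -1.0 = hard left, 0.0 = center, +1.0 = hard right


def multitap(signal, sample_rate, taps):
    """Return (left, right) with the dry signal centered plus one echo per tap."""
    offsets = [int(round(sample_rate * t.delay_ms / 1000.0)) for t in taps]
    length = len(signal) + max(offsets, default=0)
    left = [0.0] * length
    right = [0.0] * length
    for n, x in enumerate(signal):
        left[n] += 0.5 * x   # dry signal, split equally between channels
        right[n] += 0.5 * x
        for tap, off in zip(taps, offsets):
            # Simple linear pan law (an assumption; real processors vary).
            left[n + off] += x * tap.gain * (1.0 - tap.pan) / 2.0
            right[n + off] += x * tap.gain * (1.0 + tap.pan) / 2.0
    return left, right


# Three irregular echoes: right, then left, then a louder one in the center.
taps = [Tap(100, 0.8, 1.0), Tap(150, 0.6, -1.0), Tap(400, 0.9, 0.0)]
left, right = multitap([1.0], 1000, taps)
```

Because every tap is specified independently, nothing forces the echoes to be evenly spaced or steadily diminishing. Here the last echo is louder than the first two, which a single feedback loop could not produce.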

The Haas trick

To increase stereo separation, that is, to widen the horizontal plane of a mix, recordists sometimes use a kind of delay processing, and panning, in tandem. Some recordists call this tandem technique “the Haas trick,” because it builds on theoretical work on the psychoacoustic basis of reverberation and delay done by the famous German theorist Helmut Haas. As Izhaki tells it:

The Haas trick is, essentially, a demonstration of the Haas effect. Haas concluded . . . that the direction of sound is determined solely by the initial sound, providing that (a) the successive sounds arrive within 1–35 ms of the initial sound [and] (b) the successive sounds are less than 10 dB louder than the initial sound. . . . The Haas trick is usually achieved in one of two ways. The first involves panning a mono track hard to one channel, duplicating it, panning the duplicate hard to the opposite channel and nudging the duplicate [ahead or behind] by a few milliseconds. The second way involves loading a stereo delay on a mono track, setting one channel to have no delay and the other to have short delays between 1 and 35 ms.44

Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 17. The Haas trick A brief demonstration of the so-called “Haas trick,” which I explain above. The track begins with a full iteration of a “chopped” vocal sample with bass. Then a synth pad is added, panned hard left and hard right. When the part repeats, the Haas effect is applied to the synth pad, pushing the track slightly “out of speakers” on the right side of the stereo spectrum. The process then repeats with drums, so listeners can hear the effect in a fully produced context.
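The second method Izhaki describes (one channel undelayed, the other delayed by 1 to 35 ms) reduces to a very small amount of code. The sketch below is a bare illustration: the function name `haas_widen` and the 15 ms default are assumptions, and the range check simply enforces the 1 to 35 ms window quoted above.

```python
def haas_widen(mono, sample_rate, delay_ms=15.0):
    """Return (left, right): an undelayed copy and a slightly delayed copy."""
    if not 1.0 <= delay_ms <= 35.0:
        raise ValueError("delay_ms should fall between 1 and 35 ms")
    d = int(round(sample_rate * delay_ms / 1000.0))  # delay in samples
    left = list(mono) + [0.0] * d    # undelayed copy, panned hard left
    right = [0.0] * d + list(mono)   # delayed copy, panned hard right
    return left, right


# At 44.1 kHz, a 10 ms nudge corresponds to 441 samples.
left, right = haas_widen([1.0, 0.5, 0.25], 44100, delay_ms=10.0)
```

Both channels carry identical material, and the only difference between them is the short offset, which is what the quoted passage credits with the widening effect.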



Modulation

Modulation processors are a more complicated class of delay processor, all of which use an LFO to modulate a delay line. This seemingly simple modification enables a surprisingly vast array of effects.45 LFOs oscillate (produce) “infrasonic” frequencies, which is to say, frequencies that fall below the threshold of audibility (i.e., under 20 Hz). The human ear thus typically interprets the infrasonic frequencies which an LFO produces as pulsating rhythms rather than discrete pitches. If recordists modulate the amplitude envelope (dynamics profile) of a synth pad using an LFO set to oscillate at 6 Hz, so that the amplitude of the pad rises to peak amplitude, and falls to base amplitude, in the shape of an undulating 6 Hz sine wave, the volume of the synth pad would rise and fall 6 times per second, that is, at a rate of 6 Hz. On the other hand, if the LFO were set to oscillate twice as fast, the volume of the synth pad would rise and fall at a rate of 12 Hz. In either case, the modulation produces the dynamic quivering which musicians call tremolo.
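The tremolo example above can be sketched directly: a sine-shaped LFO scales the level of a signal up and down at the chosen rate. This is a minimal illustration under stated assumptions. The names `tremolo` and `count_swells`, the `depth` parameter, and the 1.2 kHz sample rate are hypothetical conveniences rather than settings drawn from any real processor.

```python
import math


def tremolo(signal, sample_rate, rate_hz=6.0, depth=1.0):
    """Scale a signal's level with a sine LFO running at rate_hz."""
    out = []
    for n, x in enumerate(signal):
        # LFO rescaled to run from 0.0 (trough) to 1.0 (peak).
        lfo = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate_hz * n / sample_rate))
        gain = (1.0 - depth) + depth * lfo
        out.append(x * gain)
    return out


def count_swells(samples):
    """Count local peaks, i.e. how many times the level rises and falls."""
    return sum(
        1
        for n in range(1, len(samples) - 1)
        if samples[n - 1] < samples[n] >= samples[n + 1]
    )


# One second of a steady, pad-like level at a low sample rate.
pad = [1.0] * 1200
slow = tremolo(pad, 1200, rate_hz=6.0)
fast = tremolo(pad, 1200, rate_hz=12.0)
```

Counting the swells in `slow` and `fast` gives 6 and 12 respectively, matching the 6 Hz and 12 Hz cases described in the text.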

Figure 2.11  Simplified block diagram of the signal flow for a simple delay line, with modulating LFO. The feedback line is represented by a dotted line to indicate that it is a variable component.



“Modulation rate” determines how many times per second a particular modulation occurs. Not surprisingly then, the modulation rate of a modulation processor is usually expressed in Hertz. Most often, recordists apply the LFO on a modulation processor to its delay-time setting. If an LFO is applied to a delay-time setting of 10 ms, then, a common modulation (i) raises the delay time to 15 ms; (ii) lowers it to 5 ms; and, then, (iii) raises it back up to 10 ms again. If recordists set the LFO to oscillate at 6 Hz, the modulation described above occurs 6 times per second, that is, the delay time modulates from 15 to 5 ms six times per second. If, on the other hand, the LFO were set to modulate at a rate of, say, 2 Hz, the modulation would occur only twice per second.

“Modulation depth” describes the amplitude of the modulating waveform and, thus, the strength (amount) of the modulation it produces. In most cases, modulation depth is expressed as a percentage value of whichever setting the processor modulates. If a delay line is set for a delay time of 20 ms, for instance, and recordists set the modulation depth for +/-50 percent, the delay time rises and falls by exactly 10 ms (i.e., it swings between 10 and 30 ms), however often per second the modulation rate setting dictates. If the depth is set for only +/-25 percent, the delay time only rises and falls by 5 ms; and, of course, if it is set for +/-75 percent, it rises and falls by 15 ms.

“Modulation shape” describes the (sometimes variable) shape of the modulating waveform. If recordists choose a sinusoidal shape, for instance, the modulation sweeps smoothly from base to peak setting at a rate and strength determined by the modulation rate and the modulation depth settings, respectively. Should recordists choose a square waveform, however, the modulation snaps immediately between those settings, as the modulating waveform snaps immediately from crest to trough. Though different waveforms produce wildly divergent modulations, variable modulation shape settings are exceedingly rare on “prosumer” modulation processors. Some flangers, for instance, still allow recordists to shape the modulating frequency into either a sinusoidal or a triangular form (not a terribly dramatic variance, in any event), but most modulation techniques require only a sinusoidal shape.
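The three settings described above (rate, depth, and shape) combine into a single expression for the delay time at any given moment. The sketch below is one way to write that expression out. The function name, its parameter names, and the particular formulas used for the triangle and square shapes are assumptions made for illustration.

```python
import math


def modulated_delay_ms(t, base_ms, depth_pct, rate_hz, shape="sine"):
    """Delay time (ms) at time t seconds for a given LFO rate, depth, and shape."""
    swing_ms = base_ms * depth_pct / 100.0   # e.g. 50% of 10 ms = 5 ms
    phase = 2.0 * math.pi * rate_hz * t
    if shape == "sine":
        lfo = math.sin(phase)                # smooth sweep
    elif shape == "triangle":
        lfo = (2.0 / math.pi) * math.asin(math.sin(phase))  # straight ramps
    elif shape == "square":
        lfo = 1.0 if math.sin(phase) >= 0.0 else -1.0       # instant snaps
    else:
        raise ValueError("unknown shape: " + shape)
    return base_ms + swing_ms * lfo


# A 10 ms delay modulated by +/-50 percent at 6 Hz.
quarter_cycle = 1.0 / (4.0 * 6.0)
crest = modulated_delay_ms(quarter_cycle, 10.0, 50.0, 6.0)       # about 15 ms
trough = modulated_delay_ms(3 * quarter_cycle, 10.0, 50.0, 6.0)  # about 5 ms
```

The same function reproduces the depth examples in the text: with a 20 ms base delay, depths of 25, 50, and 75 percent give swings of 5, 10, and 15 ms.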

Flanging

Flanging is one of the oldest modulation-processing techniques in the recordist’s toolbox. Pioneered by recordists like Les Paul while they worked with rudimentary tape delay systems, flanging was initially an entirely tactile (manual) technique. To produce the effect, recordists pressed their fingers against the metal flanges that held the tape reels on a tape machine in place. In doing this, recordists impeded the capstan from cycling as usual, thereby varying the delay-time setting according to the amount of pressure they applied. Delay times were usually short (recordists could expect to vary the delay-time setting up to 85 ms without breaking the capstan), so flanging regularly provoked comb filtering. Moreover, because delay times varied constantly according to the manual pressure which recordists applied with their fingers and thumbs, the resulting comb filter swept back and forth across the frequency spectrum, creating the whooshing sound which recordists still, to this day, associate with flanging. Longer delay times shifted the notches in the comb filter ever lower in the frequency spectrum, while shorter delay times raised them to ever higher regions.

Nowadays, of course, everyone but purists uses digital plug-ins or stompboxes for flanging. These plug-ins do exactly what tactile flangers did, only they use algorithms rather than manual force to vary the delay-time setting. Accordingly, digital plug-ins tend to produce a more obviously regularized cyclic tonal variance (i.e., the whooshing sound they create moves up and down the audible spectrum in a more mechanically precise way). Moreover, modern flangers often route a portion of their output back to input for another round of flanging, which only makes the comb filtering all the more severe. Whatever technology they use, though, recordists have made flanging an extremely common term in the pop lexicon. Flanging is most obvious when it is applied globally, that is, to whole productions.
One of the earliest known examples of this can be heard during the psychedelic interludes between chorus and verse on the Small Faces’ “Itchycoo Park” (i.e., 0:50–1:07, 1:40–2:05, and 2:20–2:46); but any track from the playlist below will suffice to demonstrate its sound. Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 18. Flanging A brief demonstration of flanging. A string quartet sample sounds twice without effect, then twice with flanging applied, then once more without effect.

Playlist Title: Understanding Records, Chapter Two: Sectional Flanging A playlist comprised of tracks that feature “sectional flanging” (flanging at mix level).
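The claim in the flanging discussion that longer delay times push the comb filter's notches lower can be checked with a short calculation. The sketch assumes the simplest case, a signal mixed at equal level with one copy of itself delayed by a fixed time and no feedback, in which cancellation occurs at odd multiples of half the reciprocal of the delay time. The function name and the 20 kHz ceiling are assumptions.

```python
def comb_notch_frequencies(delay_ms, max_hz=20000.0):
    """Notch frequencies (Hz) when a signal is mixed equally with a delayed copy."""
    if delay_ms <= 0.0:
        raise ValueError("delay_ms must be positive")
    notches = []
    k = 0
    while True:
        # Cancellation occurs where the copy lags by an odd number of
        # half cycles: f = (2k + 1) / (2 * delay), with delay in seconds.
        f = (2 * k + 1) * 1000.0 / (2.0 * delay_ms)
        if f > max_hz:
            return notches
        notches.append(f)
        k += 1


short_delay = comb_notch_frequencies(1.0)  # first notches: 500, 1500, 2500 Hz
long_delay = comb_notch_frequencies(5.0)   # first notch: 100 Hz
```

A 1 ms delay places its lowest notch at 500 Hz, while a 5 ms delay places it at 100 Hz and packs the notches more closely together. Sweeping the delay time therefore sweeps the whole comb, which corresponds to the whooshing movement described above.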



Chorusing

A distant cousin of flangers, “chorus” processors send audio signals through one or more simple delay lines (some processors use more than fifty delay lines), using an LFO to modulate the delay-time setting of each. Though this obviously resembles the signal flow for flanging, chorusing does not usually require a feedback line to route its output back to input. As the input signal shunts directly to output, the first modulated delay line outputs a copy which is delayed by however long the LFO’s modulation rate, depth, and shape settings dictate; a second modulated delay line then delays and modulates the first modulated delay line; a third modulated delay line delays and modulates the second line; a fourth line delays and modulates the third; and so on. When all of these modulated delay lines combine at output, a sequence of so-called “pitch modulations” accrues, that is, cyclic variances in pitch.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 19. Chorus The same process is heard in this track as the one before, only with chorus applied to the third iteration of the quartet sample.

Chorusing is easiest to hear when recordists set the LFO to a particularly fast rate and a particularly deep depth. These fast and deep settings produce a characteristic warbling sound, which is the product of pitch modulation. Such deep chorus effects are typically applied to electric guitar and piano tracks in a pop production, though vocals and background vocals are common targets, too. The playlist noted directly below contains a number of obvious examples of chorused electric guitar tracks on hit records released during the late 1970s and early 1980s, which many historians and critics consider to have been a golden age for the chorus effect.

Playlist Title: Understanding Records, Chapter Two: Chorus A playlist comprised of obviously chorused guitars and/or bass tracks.
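Why fast and deep LFO settings warble more audibly can be estimated with a little arithmetic. The sketch below rests on a standard signal-processing relationship that is not stated in the text and is offered here as an assumption: while a delay time is changing, the delayed copy plays back at a ratio of one minus the rate of change of the delay, so a sine-modulated delay reaches a peak pitch ratio of 1 + 2π × rate × swing. The function name and the example settings are hypothetical.

```python
import math


def peak_pitch_shift_cents(rate_hz, swing_ms):
    """Largest upward pitch shift (in cents) of a sine-modulated delay line.

    swing_ms is how far the delay time moves above and below its base value.
    """
    # The delay time changes fastest as the LFO crosses its midpoint;
    # that slope (seconds of delay per second) sets the pitch ratio.
    max_slope = 2.0 * math.pi * rate_hz * (swing_ms / 1000.0)
    return 1200.0 * math.log2(1.0 + max_slope)


gentle = peak_pitch_shift_cents(rate_hz=1.0, swing_ms=2.0)
extreme = peak_pitch_shift_cents(rate_hz=6.0, swing_ms=5.0)
```

Under these assumptions, the gentle setting shifts pitch by only a fraction of a semitone (a semitone is 100 cents), whereas the fast, deep setting shifts it many times further. This is consistent with the observation that fast and deep settings produce the most obvious warble.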

Phasing

Phasers are a “special case” in modulation processing. Rather than a simple delay line, phasers use “all-pass filters” to reshape the frequency content of an input signal. Among the many things they do, all-pass filters shift the phase of the input signal (for our purposes, they delay the input signal) at a rate determined by wavelength. Low frequencies, which have longer wavelengths, thus shift (delay) at a slower rate than do higher frequencies.46 Notches, that is, muted frequency ranges, subsequently accrue at frequencies where the phase shift induced by the all-pass filters is precisely 180 degrees. This emulates the effect of comb filtering and, when a modulating LFO is applied to the phase rate of those filters, the notches subsequently sweep back and forth across the frequency spectrum.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 20. Phasing The same process is heard in this track as the one before, only with phasing applied to the third iteration of the quartet sample.

Playlist Title: Understanding Records, Chapter Two: Phasing A playlist comprised of some obviously phased tracks: drums throughout “Kashmir”; on the Fender Rhodes during the introduction to “Just the Way You Are”; guitar and vocals on “Pictures of Matchstick Men”; choral vocals on “Sheer Heart Attack”; electric guitar on “Bring on the Night,” “Cuckoo Cocoon,” “Have a Cigar,” “Strange World,” “Paranoid Android,” and “The Rover.”

The playlist noted below features a number of progressive rock records released during the mid-1970s which feature clearly phased electric guitar and electric bass tracks. While it may be patently obvious to readers that the tracks noted in this playlist feature some kind of comb filtering, many will nevertheless find it difficult to determine whether that filter is the product of flanging, chorusing, or phasing. And yet, this is a crucial distinction for many recordists.
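The statement in the phasing section that notches accrue wherever the all-pass filters shift the phase by precisely 180 degrees can be turned into a small calculation. The sketch assumes a textbook idealization that the text does not specify: a chain of identical first-order all-pass stages, each shifting phase by 2·arctan(f / center frequency), mixed at equal level with the dry signal. The function name and the example frequencies are hypothetical.

```python
import math


def phaser_notches_hz(center_hz, stages=4):
    """Notch frequencies for identical first-order all-pass stages plus dry signal.

    The combined shift, 2 * stages * atan(f / center_hz), must equal an odd
    multiple of 180 degrees for the wet and dry signals to cancel.
    """
    notches = []
    for k in range(stages // 2):  # each pair of stages contributes one notch
        angle = (2 * k + 1) * math.pi / (2 * stages)
        notches.append(center_hz * math.tan(angle))
    return notches


# Sweeping the center frequency (as an LFO would) moves every notch with it.
low_sweep = phaser_notches_hz(500.0, stages=4)
high_sweep = phaser_notches_hz(1000.0, stages=4)
```

Under these assumptions a four-stage phaser yields only two notches, far fewer than the dense comb produced by a short delay line, and the notches are not evenly spaced. That difference offers one way to think about why phasing tends to sound subtler than flanging.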

Chorus, Flange, and Phase: Telling them apart

To say with confidence whether a track was flanged, chorused, or phased can be extremely difficult. All three techniques are, after all, variations on the same basic process (i.e., modulating the delay-time setting on one or more delay lines). Nonetheless, flanging, chorusing, and phasing each have unique audible traits. Chorusing is likely the easiest to distinguish because it is the only modulation-processing technique that produces pitch modulations. Flanging and phasing are more difficult to distinguish. Both techniques create a comb filter in the input signal, and both sweep that comb filter up and down the audible spectrum. However, flanging produces a more severe comb filter than phasing creates: the tonal distortion which flanging produces more obviously transforms the input signal, especially when recordists opt to engage the feedback option. Flanging also tends to create harmonic notches much higher in the audible spectrum, which produces a whooshing sound like a low-flying airplane passing by overhead. The audible effects of phasing are usually restricted to a kind of midrange variance, without harmonic relation to the input signal.

The Fender Rhodes track which introduces Led Zeppelin’s “No Quarter” provides an ideal case study. The track is obviously the product of modulation processing of some sort; moreover, the modulating LFO obviously oscillates at a fast and deep setting. This produces a sound that, at first blush, might resemble the warbling that chorusing creates. However, closer inspection reveals a conspicuous absence of pitch modulations, which rules out chorusing as a possibility. The warbling might be the product of flanging, then, but the ascending and descending high-frequency whooshing that flanging creates is also absent. This leaves phasing as the most likely option and, indeed, close inspection of the track reveals a cyclically varied comb filter, focused primarily in the midrange of the input spectrum, which characterizes that process. So-inclined readers may even create the effect themselves, using the following settings: mix at 50 percent, feedback set for 75 percent, the depth set to something very strong, and the LFO set to oscillate at a rate of about 6.5 Hz.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 21. Telling Modulation Processes Apart The unprocessed quartet sample from tracks 18 to 20 sounds, followed by a flanged, chorused, and phased version, in turn. The process then reverses itself (we hear the phased sample twice, that is, followed by a chorused, flanged, and then unprocessed version).

Reverb processing

The final processing technique I will consider here is reverberation processing. Recordists use a variety of tools to situate tracks along the proximity plane of a mix. However, reverb processing remains the most common tool that mixers use to shape the proximity of sounds. In fact, reverb processing has become a core technique in modern mixing. One of the foundational tenets of multitrack production mandates that component tracks be recorded in as dry (non-reverberant) a manner as possible; recordists later shape and refine reverberation profiles for tracks once the spatial and spectral needs of the production at large are fully evolved. Multitrack mixes seldom develop in a straightforward manner, after all. As Brian Eno (1979) so famously put it, multitrack recordists usually engage in

in-studio composition, where you no longer come to the studio with a conception of the finished piece. Instead you come with actually a rather bare skeleton of the piece, or perhaps with nothing at all. I often start working with no starting point. Once you become familiar with studio facilities, or even if you’re not, actually, you can begin to compose in relation to those facilities. You can begin to think in terms of putting something on, putting something else on, trying this on top of it, and so on, and then taking some of the original things off, or taking a mixture of things off, and seeing what you’re left with—actually constructing a piece in the studio. In a compositional sense, this takes the music away from any traditional way that composers worked, as far as I’m concerned, and one becomes empirical in a way that the classical composer never was.

Even if they pursue a complete and clear conception of the final mix from the very moment they enter a studio, recordists nonetheless refrain from making final decisions about reverb processing until that mix nears completion. Because it extends spectral timbres in time and in space, reverb processing is prone to inducing unintended masking errors. The time it takes to correct these errors can prove costly, both financially and creatively. At the very least, errors of this sort impede the forward momentum of the record-making process, stalling work at what amounts to a technical footnote. Once they are ready, though, recordists have a number of tools available for shaping a sound’s reverberation profile. They can adjust, among other parameters, (i) pre-delay timings; (ii) early-reflection and late-reflection levels; and (iii) decay and diffusion rates. Adjusting even just one of these parameters refines the reverberation profile of tracks, specifying particular acoustic spaces for those tracks to occupy. In other words, recordists use reverb processing to position tracks along the proximity plane and, in so doing, to conjure an idealized space for each track to inhabit in a mix. I conclude this chapter by briefly considering each parameter they use.

Pre-delay

Every reverberation begins with a brief interlude of non-activity, known colloquially as the “pre-delay phase.” This phase intervenes between the arrival of the “direct sound,” that is, the soundwave which provokes the reverberation, and its first “reflection.” In creating this buffer, the pre-delay phase provides listeners with important information about the size and dimensions of the physical space in which the direct sound reverberates. Specifically, pre-delay delineates the distance which direct soundwaves travel before reaching reflective surfaces in a room. Longer pre-delay times—say, more than 85 ms—demarcate larger spaces while shorter pre-delay times denote smaller spaces. In fact, pre-delay times can be measured in feet. We know that sound travels at a rate of roughly 1,000 feet per second, “so a 50 millisecond pre-delay gives the effect of placing a sound source 50 feet away from the opposite wall of a room,” explains Bruce Swedien:

Pre-delay determines how far the sound source is from the walls in a room. This has the subjective effect of creating depth [in a mix], and long pre-delays of 50 to 65 ms are often used to wash vocals and make them fit better in a mix. . . . This sounds pretty huge but it is not unusual for concerts to take place in large concert halls or auditoriums that are considerably larger than 50 feet in length.47

Figure 2.12 The pre-delay phase of a typical reverberation profile, separating the arrival of direct sound from its first reflection.
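The rule of thumb quoted above, that pre-delay times can be read as distances, amounts to a single multiplication. The following Python sketch works it through using the text's round figure of 1,000 feet per second (the speed of sound in room-temperature air is closer to 1,125 feet per second, so treat the results as approximations). The function names are hypothetical and chosen only for illustration.

```python
# The text's round figure; roughly 1,125 ft/s is closer for air at 20 degrees C.
SPEED_OF_SOUND_FT_PER_S = 1000.0


def pre_delay_to_feet(pre_delay_ms, speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Distance (feet) that sound travels during a given pre-delay."""
    return speed_ft_per_s * pre_delay_ms / 1000.0


def feet_to_pre_delay_ms(distance_ft, speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Pre-delay (ms) that corresponds to a given travel distance."""
    return 1000.0 * distance_ft / speed_ft_per_s


if __name__ == "__main__":
    for ms in (20, 50, 65, 85):
        print(ms, "ms is roughly", pre_delay_to_feet(ms), "feet")
```

At 1,000 feet per second the conversion is one foot per millisecond, which is why a 50 ms pre-delay reads as roughly 50 feet in the quotation above.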


Early reflections and late reflections

Following the pre-delay phase of a reverberation come the so-called “early reflections.” These “early reflections” arrive within 35 to 80 ms of the direct sound, and their onset delineates the definitive conclusion of the pre-delay phase of a reverberation profile. Early reflections are followed closely by another round of reflections called “late reflections,” which arrive roughly 35 ms after early reflections arrive. Both early reflections and late reflections provide important information about where the sound source—and, in turn, the listener—is situated in a room.48 Early reflections reflect off only one or two near surfaces in a room; they don’t travel all the way to room boundaries before they bounce back to listeners, as late reflections must. Because they do not travel as far as late reflections, early reflections are usually significantly brighter, that is, they contain more high-frequency components, and their attack transients thus remain relatively intact. A preponderance of these bright early reflections thus encourages listeners to localize tracks closer along the proximity plane of a mix, especially when they are combined with tracks that feature a preponderance of late reflections instead. To achieve this effect, however, purists like Bruce Swedien insist that the early reflections must be recorded live (amateur recordists also tend to be particularly interested in what purists have to say on this matter: purists’ “rules” offer the comforting illusion of objective standards which can be used to gauge work in the absence of professional accolades or success):

Early reflections are something that I have always considered “the forgotten factor” of acoustical support, when it comes to high quality recording. . . .
The thing that is always apparent to my ear is that the quality of early reflections, when generated in a room, is quite different (and vastly superior) to the so-called early reflections that would be generated artificially. So, if we have well-recorded sound sources, with good early reflections, what you want to do is open up the pre-delay, or make the pre-delay larger in number, to accommodate the early reflections. If you have done a good job of recording your sound source, if you don’t have pre-delay in the reverb, you’ll mask those beautiful early reflections. And those early reflections are a very important component of sound.49

Decay rate

The “decay phase” of a reverberation profile spans the onset of late reflections to the onset of silence. Decay rates are often expressed in terms of “RT60,” in reference to the 60 dB (SPL) of attenuation which is usually required for a sound to diminish to silence (RT refers to “reverb time”). A small acoustically tuned room may exhibit an RT60 of roughly 200 ms, say, while some cathedrals exhibit RT60 rates of more than 5,000 ms, that is, 5+ seconds. Further complicating matters, decay rates vary according to wavelength and frequency: smaller wavelengths, which correspond to higher frequencies, tend to be absorbed and diffused much faster than larger wavelengths (lower frequencies). Accordingly, slower decay rates create louder reverb tails, which increases the likelihood of masking. Of course, if the slower decay rate is desired nonetheless, recordists can use the side-chain of a compressor to duck the reverb tail under the dry vocal track, thus allowing the dry line to sound while also providing a lush reverberant tail, during its tacit moments, which prompts listeners to psychoacoustically situate the dry track in a lush reverberant context. In fact, one of the most common vocal treatments in pop runs as follows: a lead-vocal track is sent to a reverb bus panned hard left, and another panned hard right, and both are detuned in opposite directions by +/−6 cents and ducked under the lead track. Sometimes, the Haas trick is applied to widen the reverb. This fairly simple treatment serves to thicken the main vocal, even as it avoids masking through ducking.

Figure 2.13 A reverberation profile, including pre-delay, early, and late reflections.

Decay rates can also be used to shape the acoustic environment in which a track sounds. Indeed, the decay rate of a reverberation profile offers important information about room size and furnishings. Reflective surfaces extend decay rates. More late reflections bounce back to the listener, and at greater amplitudes, from, say, concrete than from carpet. Accordingly, when the late-reflection phase is dynamically equivalent to the early-reflection phase of a reverberation profile, listeners tend to infer a space furnished with highly reflective materials. The size of the room also plays a role, however. Larger rooms extend the decay phase of a reverberation profile while smaller rooms curtail it.

Related to decay rates are “diffusion rates,” which measure the amount of time that intervenes between the arrival of each subsequent reflection in a reverberation. As a reverberation profile reaches its decay phase, ever more time intervenes between the arrival of each late reflection; and then, as they decay to silence, reverberations become even more diffuse. Thus, recordists can use this psychoacoustic principle to their advantage, pushing tracks back along the proximity plane of a mix simply by increasing the decay and diffusion of their reverberation profiles.
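Two of the figures in this section lend themselves to quick arithmetic: RT60 values and the +/−6 cent detune used in the vocal treatment described above. The Python sketch below is a hedged illustration only. It assumes an idealized reverb tail that falls at a constant rate in decibels, which real rooms only approximate, and it uses the standard equal-temperament definition of a cent (1,200 cents per octave). The function names are hypothetical.

```python
def reverb_level_db(t_ms, rt60_ms):
    """Level of an idealized reverb tail, in dB relative to its onset.

    Assumption: the tail decays linearly in dB, reaching -60 dB at RT60.
    """
    return -60.0 * t_ms / rt60_ms


def time_to_decay_ms(drop_db, rt60_ms):
    """Time (ms) the idealized tail takes to fall by drop_db decibels."""
    return rt60_ms * drop_db / 60.0


def cents_to_ratio(cents):
    """Frequency ratio for a detune in cents (1,200 cents per octave)."""
    return 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    # The small tuned room (200 ms) and the cathedral (5,000 ms) from the text.
    for rt60 in (200.0, 5000.0):
        print(rt60, "ms RT60: 20 dB down after", time_to_decay_ms(20.0, rt60), "ms")
    # The two detuned reverb returns, relative to a 440 Hz reference.
    print(440.0 * cents_to_ratio(6.0), 440.0 * cents_to_ratio(-6.0))
```

The longer the RT60, the longer the tail stays within a few decibels of the dry signal, which is one way to see why slow decays invite masking.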

Reverberation and tone shaping

To conclude this subsection on reverberation processing, I asked Al Sims to explain how he approaches reverberation processing in his own work as a platinum-certified and gold-record-holding producer and engineer. Interestingly, while he does acknowledge reverberation’s role in spatialization as an important one, he also emphasizes the role it plays in shaping tone:

To me, reverb use falls into two categories, first; putting something into a space. This would be the standard way of thinking of reverb, you record a source with a mic a few inches back from it and add reverb to give a sense of size and length to the performance. This can be applied to any instrument or recorded source and usually works best with a send and return workflow. Second, using reverb as an effect. To me this is the fun and exciting part of reverb. Using a super short, non-diffuse reverb as a makeshift stereo widener. Using an abrupt gated reverb on a vocal to make it sound like a chopped sample. Adding a long, lush reverb to a simple melody then distorting the whole thing for infinite sustain. All of those are tools I use in productions to add excitement and character to an element but don’t necessarily put it in a space as the first category would. Too often these exciting parts of reverb are forgotten because of rules that are taught, like always use reverb in a send and return method instead of inserting a plug-in or piece of outboard directly into the signal path. Or that distortion or dynamic processors should never come after ambiance effects. Learning the basics of reverb is essential, but using it non-traditionally in my opinion is just as important.



Video Title: Alex Chuck Krotz Explains Reverb
In this video, Alex Chuck Krotz considers the place of reverb processing in his mixing process. Alex has mixed tracks featuring some of pop music’s modern elite, including the likes of Drake, Big Wreck, Shawn Mendes, and Three Days Grace, and he has graciously agreed to discuss some of the reverb moves he regularly makes when he works.

Video Title: Understanding Records, Putting It All Together: Lonesome Somedays, pt. 3—Mix Level Signal Processing
In this video, Matt explains some of the more obvious “mix level” signal processing moves he made on “Lonesome Somedays.”



Mastering (The Final Say)

When they are done mixing, recordists are ready to master their productions. During mastering, engineers use a variety of specialized tools and processes to polish mixes so they sound optimal in a variety of playback settings and formats. Before considering the most important of these, though, I’d like to first take a brief moment to review how mastering came to be.1 Of all the various procedures that make up record production, I find the history and development of mastering the most fascinating.

Mastering: A brief history

Given how stodgy and dry the mastering process can seem, it will likely come as a surprise for readers to hear that mastering can be said to have emerged directly from German counterintelligence measures adopted before and during the Second World War. Throughout the 1930s, and well into the Second World War, counterintelligence officers capitalized on German advancements in magnetic tape recording to broadcast speeches by Nazi officials simultaneously from multiple time zones. Allied analysts thought the speeches might be vinyl transcriptions, but the sound quality of those speeches was indistinguishable from a live broadcast, and their duration far exceeded the short time constraints of 78 rpm discs. It was difficult to know, then, how they achieved this spatiotemporal sleight of hand. And, in fact, analysts stayed stumped until Allied forces captured German magnetophone tape recorders in raids on Radio Luxembourg.2

Three years after the cessation of hostilities in Europe, in 1948, Ampex introduced the first commercially available magnetic tape recorder in the United States, which drew much of its technical impetus from the machines that Allied forces brought home with them from the war. It was then, and only then, that the need for a mastering engineer arose. Before the widespread adoption of magnetic tape, all recording was done direct to a “mastering lathe,” usually housed in an adjacent room, whether the means were mechanical (before the introduction of electric microphones) or electromechanical (using microphones and electronic amplification).3 When magnetic tape was used to make a record, however, someone had to “transfer” the electromagnetic information stored on tape into grooves on a 10 inch 78 rpm disc.

Transfer engineers

Engineers who oversaw transfer of signal from tape to disc were called “transfer engineers.” These “transfer engineers” acted mostly as quality control agents. They worked primarily to ensure that the excess low- and upper-midrange frequencies which magnetic tape could tolerate didn’t produce distortions when transferred to the less forgiving medium of vinyl. Even more dangerous, excesses of low frequencies could create defective discs if left unchecked by transfer engineers, rogue bass frequencies creating exaggerated laneways on the disc which could make the stylus “jump the groove,” as it were.

Indeed, it is worth noting that level was already a locus of controversy in mastering from the start. If engineers set too low a level when transferring electromagnetic information onto vinyl, the volume of the resulting discs would be too low to defeat the self-noise of the listening apparatus, producing “noisy” discs. If engineers set the level too high, on the other hand, they risked either creating defective discs which “jumped the groove” or, far worse, destroying not only the master disc but the mastering lathe as well. A delicate balance had to be struck when determining levels for transferring, and finding this balance may well be the very first musical consideration which belonged solely to the mastering engineer.

By 1955, magnetic tape had become standard in the record industry. It was in this year that Ampex introduced its next advancement in magnetic tape recording, namely, “Selective Synchronous” (SEL-SYNC) recording, which allowed engineers to overdub musical material. From this point on, once overdubbing became standard procedure in the production process, different phases of record production necessitated different recording competencies.
When tracking, audio engineers began to think in terms of overdubbing, adopting what might be called a “multitrack” mind-set, albeit in a very rudimentary form in 1955.4 Tracks could be overdubbed at different times, and their dynamic and spectral contours could be shaped individually and in isolation. Transfer engineers, on the other hand, received completed multitrack recordings. It was their job to ensure that those projects transferred optimally as unified aesthetic statements.

In other words, their listening perspective was “holistic”; transfer engineers heard and considered the minutiae of each multitrack recording as a consolidated musical statement.

Cutters

Within a few years of the divergence I note in the section above, many transfer engineers had earned a new job title. To some, they were “cutters.” And their job wasn’t simply to “cut” vinyl from tape but, moreover, to optimize its sound quality.5 By 1960, mastering engineers routinely engaged directly with the music they worked on, adjusting its dynamic range and spectral contour using compressors and equalizers and other tools as a matter of course. Consequently, another divergence in the mastering engineer’s recording competence emerged. Whereas audio engineers and transfer engineers began to develop distinct tasks and techniques related to their unique positions in the production process, aesthetic decisions made by mix engineers were now routinely reconsidered—and even dismissed sometimes—at the mastering stage. Mixing slowly became subject to mastering, in other words. After this point, it was assumed that mastering engineers would finalize whatever mix engineers submitted to them, and thus was the ear of the mix engineer first ultimately subordinated to the ear of the mastering engineer. In fact, “cutters” were routinely tasked with making mixes sound as loud as possible. Brian Holland, for one, remembers just how central a concern loudness was for labels like Motown throughout the 1960s:

Loudness was a big part of the Motown sound. We used ten, even twenty equalizers on a tune—sometimes two on one instrument, to give it just the right treble sound . . . a higher intensity. We used equalization to make records clear and clean. We also used a lot of compressors and limiters, so we could pack the songs full and make them jump out of the radio. We were interested in keeping the levels “hot” on a record—so that our records were louder than everyone else’s. It helped established the Motown sound . . . [our] records really jumped out.6

Once again, average level (“loudness”) emerged as a peculiar problematic of the mastering process. Labels knew that mastering engineers had tools and techniques for making their mixes sound louder, on average, and that louder mixes almost always sounded better and more exciting when heard after quieter tracks broadcast on the radio or played on a jukebox. However, they risked distorting their masters if they applied too much gain. So mastering engineers, especially those we would call “cutters,” set about developing technologies and techniques specifically designed to increase the average volume of a master without compromising its sound quality. While transfer engineers worried about the level of the input, cutters worried about the level of the output; whereas transfer engineers worried about setting transfer levels so a mix could best be heard by listeners, cutters considered at what level a mix sounded best.

A cultural difference

It would be a mistake to consider the “transfer” engineer merely a primordial way station in the evolution of mastering, a loser in the craft’s natural selection process now relegated to the dustbin of history. In fact, even while the “cutter” tradition emerged, the “transfer” tradition remained healthy. Both “cutter” and “transfer” mastering styles simply found different welcomes on different shores of the Atlantic. By the early 1970s, American engineers seem to have felt entirely within their rights to reshape the dynamic and spectral characteristics of just about any mix they worked on. In other words, Americans took a typically “interventionist” role when mastering, reshaping a mix as much for aesthetic reasons as for anything even remotely related to duplication. British engineers, on the other hand, followed a more “transparent” path, compressing and equalizing only insofar as doing so ensured an optimal transfer.

It would be silly, of course, to reduce this transatlantic divide to broad cultural differences between “cowboy” America and “conservative” Britain, even if it was a cultural difference that birthed it. In America, mastering was (and still is) seen as the final stage of record production, thus inviting engineers to consider what they master a work still in progress. Meanwhile, in Britain, mastering was conceived as the first stage of the manufacturing process. As such, British engineers worked to transfer what they saw as already finished productions for duplication.

Mastering “Houses”

By the late 1960s, the “cutter” tradition had given birth to America’s earliest mastering “houses,” as mastering studios are typically called, perhaps in deference to the craft’s generally boutique nature. Doug Sax opened the first
dedicated mastering house in Los Angeles, in 1967, calling it “The Mastering Lab.” As Sax remembers: If you go back to the late ’60s and before, everything was done in-house. You were signed to a label, you were given an A&R man, and you stayed within the label. If you recorded at Capitol, then you went down to Capitol’s mastering to get your product cut to lacquer. You went to Capitol’s art department and they gave you the artist that designed your cover, and that’s the way it was. It was really at the end of the ’60s that certain top producers would say, “I love the security, but I would like to work with an artist that’s not on this label. I would like to work with Streisand, but she’s on Columbia.” So they started to break off from the label and really started the process where nobody is tied to one anymore. The cry became, “If you sign me, I’ll use the engineer I want and I’ll record and master where I want.” That’s 40 years of hard fought independence, so from the standpoint of an independent that is not aligned with a label, just a specialty room that handles mastering, the answer is yes . . . I was one of the pioneers when there was no independent business. We opened up our doors in December 27 of 1967 and by ’71 or ’72, you couldn’t get into the place because we were so busy. By ’72 we were doing 20 percent of the top 100 chart and there weren’t a lot of competitors. There was Artisan in LA, and Sterling and maybe Master Disk just starting in New York, and that was it. Now there seems to be a thousand because the reality is that it’s very easy for someone to go into this business now, or for the artist or engineer to “do it yourself.” You can get a workstation with all the bells and whistles for a song and a dance. A Neumann lathe setup in 1972 was $75,000, and that was just the cutting system; you still needed a room and a console, so you had to have a big budget, and there was only a few people doing it as a result.7

Despite the costs, other engineers soon followed Sax’s lead. Denny Purcell set up a shop in Nashville, Bernie Grundman in L.A., and Bob Ludwig in New York City, to name only a few celebrated examples. Artists quickly came to appreciate the specialized ear of these mastering engineers, and the work they produced in their “houses.” Indeed, however loudly some artists claim ignorance about the mastering process, they care deeply about the sonic difference mastering makes. When working on Moondog Matinee (1973), for instance, the Band famously fired Capitol’s in-house mastering engineer in favor of Bob Ludwig, and they expressed their clear appreciation for his work in their credits for the album, which concluded, “Mastered (as always) by Bob Ludwig at Sterling Sound.”



In fact, the move to freelance was client-driven for many mastering engineers. Bernie “Big Bass” Grundman, for one, remembers his move to Fantasy Studios in Berkeley, California: After a few years of mastering Credence hits, they plucked me out of RCA, and I went up to Fantasy [Studios] for four years. I was the first employee there. They wanted mastering in there first, before anybody else moved in. It was ’69 or ’70 . . . . And that led to the next phase, because after my four years there, I went to Allen Zentz Mastering, where we ruled in the disco department. We did all the Donna Summer records, all the Casablanca catalog. That was really something.8

Mastering engineers

No matter the success enjoyed by well-known freelance engineers like Doug Sax, Bernie Grundman, and Bob Ludwig, it was not until the introduction, and widespread adoption, of compact discs in the early 1980s that the title “mastering engineer” first saw regular use in the record industry. Pushed into digital terrain, though still working on mostly analog gear, mastering engineers working in the 1980s transferred analog mixes onto bulky modified video tapes. To do this, they used the SONY PCM1610 and later PCM1630, two-channel digital-audio processors capable of 16-bit linear quantization at a rate of 44.1 or 44.056 kHz. The 1630s were composed according to the so-called “Red Book” standard, allegedly named for the red color of the binder SONY used to collate their specifications for formatting compact discs. Mastering engineers now needed to know how to not only convert electromagnetic data stored on tape into digital bits, and compress and equalize it so it sounded best given such a transfer, but also program PQ and ISRC codes, and other important data, into the resulting binary stream of ones and zeros.

While the Red Book standard would prevail throughout the compact disc era, and while it continues to set the 16-bit/44.1 kHz formatting standard for compact discs and most untethered downloads and streaming even today, the 1630 was only a relatively short-lived delivery vehicle. Exabyte tapes containing image files, DAT (Digital Audio Tape), and Doug Carson’s DDPi (Disc Description Protocol image) format quickly replaced 1630s as the “industry standard” for delivery to duplication, which engineers burned using the Sonic Solutions Digital-Audio Workstation or some equivalent. Thus did mastering enter the realm of personal computing and slowly morph into a kind of musical programming and coding.
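The 16-bit/44.1 kHz figures mentioned above translate directly into a data rate, a count of quantization steps, and a theoretical dynamic range. The following Python sketch works through that arithmetic for two-channel linear PCM. It is an illustration of the numbers only, not of how any SONY processor or disc-image format stores its data, and the function names are hypothetical.

```python
import math


def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Raw bit rate (bits per second) of uncompressed linear PCM."""
    return sample_rate_hz * bit_depth * channels


def quantization_levels(bit_depth):
    """Number of distinct amplitude steps available at a given bit depth."""
    return 2 ** bit_depth


def theoretical_dynamic_range_db(bit_depth):
    """Ratio of the largest to the smallest step, in dB (about 6.02 dB per bit)."""
    return 20.0 * math.log10(2 ** bit_depth)


if __name__ == "__main__":
    rate = pcm_bit_rate(44100, 16, 2)
    print("bits per second:", rate)
    print("bytes per minute:", rate // 8 * 60)
    print("levels:", quantization_levels(16))
    print("dynamic range (dB):", theoretical_dynamic_range_db(16))
```

The same functions can be called with 48,000 Hz and 24 bits to compare higher-resolution delivery formats against the compact-disc figures.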



Figure 3.1 A DDPi being prepared by the author for commercial release, using WaveLab 9. In this case, a physical master for San Holo’s album 1 was being prepared.

And there would be no turning back from computers, of course. Computing is by now a core component of the mastering engineer’s basic competence. In fact, only a decade after the release of the Sonic Solutions DAW, almost all mastering, if not music production in general, was done using computers. By the late 2000s, mastering engineers routinely produced digital masters of digital mixes, which themselves were created using digital-audio workstations like Pro Tools and Logic, for distribution onto a seemingly never-ending array of digital formats, both physical (e.g., CD) and virtual (e.g., mp3, wav, aiff). And though this seems easy enough to achieve in our current user-centered computer age of the late 2010s, where ease of use seems a programmer’s primary concern, in the early 2000s such a feat required a great deal of digital know-how. In short, mastering engineers once had to be as good with computers as they were with music.

At this point, the problematic of loudness reached what many historians consider to be its historical apotheosis, namely, the so-called “loudness wars” (for more on the so-called “loudness wars” see “The Art of Mastering: Dynamics”). As noted, throughout the late 1990s and into the 2000s, mastering embraced personal computing, and the formats for music delivery became almost entirely digital. In response, mastering had to adopt a “full-scale” decibel weighting system. In this system, an absolute numeric value (0.0 dBFS) represents maximum volume before clipping, while in the analog domain headroom permits levels to be more dynamic. This seemingly minor change would have profound consequences for how not just mastering engineers, but the industry in general, conceived of loudness.

The maximum amplitude of a digital-audio recording is always “full scale” or 0.0 dBFS. To increase the loudness of a digital master, then, which should always peak at a value no greater than 0.0 dBFS, engineers must increase its average amplitude. To do this, they reduce the “crest factor” of mixes through compression, and apply brickwall limiting, which allows them to apply generous amounts of make-up gain in turn. This means that mastering engineers working on digital material must decrease dynamic range to increase average level. Loudness comes directly at the expense of dynamic range given full-scale weighting, in other words. It should come as no surprise, then, that as records have gained some 17 dB of average amplitude since 1980, their dynamic range has decreased by roughly the same amount.9 Only making matters worse, the compression required to make full-scale recordings sound louder has the added effect of emphasizing upper-midrange frequencies in a mix, consequently making overly compressed mixes sound even harsher and more distorted than they already are.

Playlist Title: Understanding Records, Chapter Three: The Loudness Wars, part 1
This playlist is composed of tracks released in the mid-1960s and the mid-2010s, sequenced back to back. That is, the first track is a release from 1966, the next is a release from 2017, the next was released in 1967, the next came out in 2010, and so on. Readers are encouraged not to adjust their volumes while they listen, to get a sense of how “loudness” levels vary from track to track, and also from era to era, depending on technology and aesthetic values.
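The trade-off described above, in which average level can rise only as crest factor falls once peaks sit at full scale, can be illustrated with a few lines of Python. This is a deliberately crude sketch: it assumes samples normalized so that full scale is ±1.0, it assumes the input is not silent, and it uses plain hard clipping followed by make-up gain as a stand-in for compression and brickwall limiting, which real mastering limiters handle far more gracefully. All names are hypothetical.

```python
import math


def peak_dbfs(samples):
    """Peak level in dB relative to full scale (full scale assumed to be 1.0)."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def rms_dbfs(samples):
    """Average (RMS) level in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 20.0 * math.log10(math.sqrt(mean_square))


def crest_factor_db(samples):
    """Difference between peak and average level, in dB."""
    return peak_dbfs(samples) - rms_dbfs(samples)


def clip_and_make_up(samples, ceiling):
    """Hard-clip at +/-ceiling, then apply make-up gain back up to full scale.

    Assumption: hard clipping is only a rough proxy for limiting.
    """
    return [max(-ceiling, min(ceiling, s)) / ceiling for s in samples]


if __name__ == "__main__":
    # Ten cycles of a full-scale sine wave, 100 samples per cycle.
    sine = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1000)]
    louder = clip_and_make_up(sine, 0.5)
    for name, sig in (("original", sine), ("clipped + gain", louder)):
        print(name, peak_dbfs(sig), rms_dbfs(sig), crest_factor_db(sig))
```

Both versions peak at the same full-scale ceiling, but the processed one carries a higher average level and a smaller crest factor, which is the exchange of dynamic range for loudness that the passage describes.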

Loudness normalization

Though the so-called “loudness war” tends to receive the lion’s share of critical attention in commentary on mastering, a trend which obviously continues the extroversion of the “cutter” tradition, it is worth noting that the more reserved “transfer” tradition persists. In fact, the “transfer” tradition may be more important than ever in modern mastering, a notion I consider in greater detail below.



The record industry’s embrace of digital formats has led to a proliferation of outputs for mastering, each of which is optimized to its own dynamic and spectral standards. Engineers must now deliver masters which sound balanced in a variety of digital contexts, including YouTube, Vimeo, and SoundCloud; Spotify and Apple Music; iTunes and Bandcamp; and so on. Only making matters more complicated, many of the streaming services that now dominate the global marketplace, like Spotify and Apple Music, have recently adopted different “loudness-normalization” standards to combat the deleterious effects of the “loudness wars” on the personal listening experience (see “The Art of Mastering: Dynamics” for more on “loudness normalization”). Services now standardize their outputs to different “loudnesses,” measured in LUFS (Loudness Units relative to Full Scale). Apple Music, for instance, recently adopted −16 LUFS as its target volume, while Spotify and Tidal both use −14 LUFS, and YouTube uses −13 LUFS. Though a difference of 3 LU may seem negligible to some readers, in mastering such a change is dramatic. As such, and solely to accommodate this diffusion, engineers (myself included) now deliver multiple masters to clients, each optimized for particular destinations.

I asked Russ Hepworth-Sawyer, founder of MOTTOsound (mottosound.co.uk) and mastering engineer of merit in his own right (his credits include records by the likes of Steve Earle, Billy Ray Cyrus, George Strait, and John Hiatt, among others), how new loudness-normalization standards had impacted his personal practice. He had this to say:

I think I may have been a little late to the party on this one. I’ve been aware of R-128 since early on, but honestly was very sceptical as to whether this would ever positively move over to the music world. Yet it has. I’m still doing masters for CD (yes believe it or not) and clients are still wanting a modicum of Loudness Wars on this still. It does, when done well, sound good.
That could be through decades of conditioning for us all. The positive world that loudness normalization offers us is tremendous. It will take a long time for it to become widespread and to be fully appreciated by all who listen. However, as Spotify is asking for a loudness target of nearly 20dB less than the loudness wars, dynamic range has been restored! Given this, we’re offering clients two masters. One is the CD master (whether they go to CD or not) which is loudness wars essentially (it’s palatable by the client compared with their current music collection), but also a “loudness normalized” master which is optimum for the likes of Spotify etc. The future is bright with streaming when PCM become the standard (imagine 48/24 as standard!)
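The arithmetic behind loudness normalization can be sketched in a few lines. The example below is only an illustration, assuming that a service applies a single static gain offset equal to the difference between its target and a master's measured integrated loudness; real services differ in their details (some only turn loud material down, for instance). The target values are the ones quoted above, while the function names and the −9 LUFS "loud" master are hypothetical.

```python
# Loudness targets as quoted in the text (integrated loudness, in LUFS).
TARGETS_LUFS = {
    "Apple Music": -16.0,
    "Spotify": -14.0,
    "Tidal": -14.0,
    "YouTube": -13.0,
}


def normalization_gain_db(measured_lufs, target_lufs):
    """Static gain, in dB, that moves a master from its measured loudness to the target.

    Assumption: a gain change of 1 dB shifts integrated loudness by 1 LU.
    """
    return target_lufs - measured_lufs


def gain_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# Hypothetical "loudness wars" master measured at -9 LUFS.
measured = -9.0
for service, target in TARGETS_LUFS.items():
    gain = normalization_gain_db(measured, target)
    print(f"{service}: {gain:+.1f} dB (amplitude x{gain_to_linear(gain):.2f})")
```

Under this simplified model, the same loud master is turned down by different amounts on each service, which is one way of seeing why engineers may prepare separate masters for separate destinations.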


Back to the future: The Wild Wild West

Advances in personal computing, the proliferation of affordable mastering DAWs, the rise of the internet, and streaming and “loudness-normalization,” among other developments, have conspired to create the current mastering situation, where anything seems to go anywhere and at any time. Mastering engineers now collectively occupy a kind of occupational Wild Wild West, as it were. On one extreme, engineers continue the “mastering house” tradition begun by the likes of Doug Sax in 1967, spending small fortunes on top-of-the-line acoustic treatments and gear, to outfit spaces dedicated in their entirety to the mastering process. On the other extreme are engineers who use a more economical (if not downright Spartan) setup, requiring only a laptop and headphones they trust. And, of course, the vast majority of engineers work somewhere in between these two extremes.

I hope it is clear by now in this book that I firmly believe no gear choice is more correct than another. If my decades of experience making records have taught me anything, it’s this: expensive gear is simply not needed to render professionally viable masters. Everyone works with whatever tools are available, to create whichever sounds are achievable, and which best suit their personal aesthetic priorities and values as artists. This said, musical results are certainly easier to achieve when working in optimized environments outfitted with specialized tools. One doesn’t hammer nails using the butt of a drill, unless they have no other option! Nevertheless, so long as engineers produce masters happily approved by clients, they do a “good job” of mastering. It matters little where one works, and on what gear, nor even what one thinks of the results, if the masters they submit sound good to the people responsible for approving them. And, in any event, the most crucial bit of gear in any mastering setup is the engineer’s ear.
As Phil Ramone so aptly put it:

Engineers use many of the same tools and yet they create [records] that are quite different from each other. . . . Technology is constantly changing; equipment is constantly evolving; and every new generation must either follow or break the rules of the past. Engineers like Al Schmitt, Leslie Ann Jones, and Chuck Ainlay all learned in different kinds of formal settings, and yet they carry forward a certain tradition. In listening to their work, one understands that the great engineers know how to make technology work for them. No matter how many tracks are recorded, at some point an engineer must use his or her skill to marry a combination of elements, from great live recordings to sampled drum sounds, while still ensuring that the record grooves.10

In many respects, the internet and affordable software have made the need for a dedicated mastering space irrelevant for many engineers. I receive the lion’s share of the premasters I work on via online file transfer services like wetransfer.com or Dropbox, and I deliver them with my notes in an email to clients using the same. I work in a variety of spaces, though I prefer to work in the one I know best, and which has the acoustics I know best. When pressed, though, I have delivered many successful masters created in conditions well outside my comfort zone, when traveling for recording or performance purposes. And the results have earned recognition from my peers in the industry in Canada, one record produced under such circumstances earning a nomination for a Juno Award in the category of Best Electronic Recording in 2014. Because of the way I work, however, I can’t offer some of the services dedicated mastering houses can, which many clients appreciate. Most obviously, I can’t offer the option of cutting vinyl, as I have no access to a lathe, and I don’t have the resources to facilitate attended sessions. To compensate for this, I usually allow for a few rounds of unpaid revisions, even if clients rarely require them.

Wherever one works, and whatever tools one uses to ply their trade, mastering engineers now operate in a totally changed, completely volatile musical marketplace. The vast majority of recordists who contract mastering engineers work primarily on desktop computers and laptops, and in less-than-ideal environments. Money once spent on time in the studio is now regularly earmarked for acquiring as much gear as recordists can gather, which means that sound quality is most often subordinated as a concern to the affordability of gear. These recordists tend to rely on mastering engineers for quality assurance more than anything else, assuming that mastering engineers will adjust and change their mixes to meet market standards.

Thus, mastering engineers presently straddle the transfer, cutter, and programmer traditions. Now, engineers have added to this an element of curation, clients depending on their knowledge of the dynamic and spectral expectations of various genres and markets to complete their mixes. Indeed, from its first wobbly steps as a transfer service, mastering has grown into a musical technique and competency all its own. At no point in time before has the mastering engineer enjoyed such freedom, or suffered such uncertainty. As the record industry continues its dive-bomb into perceived bankruptcy, service jobs and agencies within that industry are pulled into the red along with it. Whereas entering mastering once seemed “steady” work (every record needs mastering, after all!), the field now seems as desperate as any other professional subventure in music. So long as records are made, though, they will need to be mastered. How that is to be done remains still to be seen.

Playlist Title: Understanding Records, Chapter Three: Hodgson Mastering Samples
This playlist is comprised of recently released tracks that I mastered. It is offered so readers can get a sense of the material I tend to work on, and to provide more advanced readers an opportunity to see if they can hear something like a “sonic signature” recurring throughout.

Finalization: Mixing/mastering

It is worth noting here, before delving into the mastering process, that most mastering engineers with any sort of experience would agree that the audible products of mastering have as much to do with mixing as they do mastering. Whether engineers tend stylistically toward the “transfer” or “cutter” traditions in their work, their choices are entirely limited by the spectral and dynamic properties of the mixes they work on. A master can only ever be as good as a submitted mix permits, of course. Just as a king’s ransom in microphones, preamplifiers, compressors, and room acoustics simply cannot make a donkey braying sound like Aretha Franklin in her prime, neither, too, can any amount of mastering make a bad mix sound good.

Indeed, though they are typically treated by analysts and recordists alike as totally separate procedures, and though each procedure has its own uniquely specialized tools and core techniques, mixing and mastering actually function as sequential phases in a single overarching musical procedure. I’ll call that procedure, for the purposes of this book, “finalization.” Mixing comprises the “early stage” of finalization, when individual tracks are related to one another and, in the process, configured into broader musical arrangements. Mastering comprises the “late stage” of finalization, when engineers shape and refine the larger dynamic and spectral contours those configurations assemble. Understanding how these two musical procedures relate, that is, understanding how mixing and mastering work as an aesthetic and technical tandem, is crucial for understanding not just record production in general but also the crafts of mixing and mastering in particular.

Mastering as arm’s-length peer review

The notion that mixing and mastering comprise discrete phases of the same broader finalization process runs counter to a long-standing consensus on mastering, a consensus found in almost every technical and theoretical text written about record production. This consensus reduces mastering to a kind of arm’s-length peer review, or “quality control,” before duplication. Engineers review the basic contours of mixes, and they look for dynamic and spectral irregularities which will prompt aesthetic or technical defects after transfer, or so the argument runs. They attenuate obfuscating dynamic excesses, de-noise and notch tonal distortions, and take any number of further corrective or “forensic” measures to optimize the mix for its intended formats. Engineers then determine an optimal level for the mix at transfer, even if only “at transfer” from the stereo bus of their DAW to, say, their desktops or the “bounces” subfolder of their session file. Little more is said or done, unless the output format requires coding metadata or a DDPi. Thus, the mastering engineer emerges as a kind of aesthetic mediator, taking a mix from the rarefied world of the control room and relating it to the stereo systems, computers, iPhones, and earbuds of the broader listening populace.

Actually, in many respects this process is precisely what a mastering engineer does. Even a band like the Beatles, whose every artistic whim readers might expect their label to have indulged, still ultimately deferred to the good judgment of a mastering engineer before their records could make market. In fact, this circumstance proved increasingly difficult for the band to abide, especially as they achieved greater and greater chart success. Paul McCartney was especially annoyed by this mediation, given that it was his bass tracks which mastering engineers routinely attenuated. The “low end” of their records was a constant bone of contention during mixing and mastering. “The main problem from EMI’s perspective was the possibility of too much bass causing the stylus on the average record player to ‘jump the groove,’” explain Ryan and Kehew. “Being the conservative organization that they were, they compensated perhaps a bit too much, although it was still a valid concern.” As engineer Norman Smith remembers:

We were sort of restricted. The material had to be transferred from tape onto acetate, and therefore certain frequencies were very difficult for the cutter to get onto disc. I mean, if we did, for instance, slam on a lot of bass, it would only be a problem when it got up to the cutting room. . . . Paul [McCartney] used to have a go at me for not getting enough bass on a record. During mixes he’d always say, “Norman, a bit more bass?” And I’d say, I can’t give you more bass, Paul. You know why: I’ll put it on there, but as soon as it gets upstairs into the cutting room, they [read: EMI’s salaried mastering engineers] will slash it . . . because they thought the needle would jump.11

This is just one view of mastering, of course, a view which aligns neatly (probably too neatly) with the so-called “transfer” tradition. As noted, this “transfer” style of mastering emerged from the British culture of record production, where mastering was seen as the first step of manufacture as opposed to the last stage of mixing. Accordingly, the mastering process was conceived in terms of quality assurance. Engineers working in the “cutter” tradition, however, have always been encouraged to take an alternate view of the process. As noted, the “cutter” tradition sees mastering as the final stage of record production, and not the beginning of duplication. This cultural viewpoint invites direct aesthetic intervention by mastering engineers, who consider the mixes they master as works still in progress. Engineers working in this tradition are invited to scrutinize the aesthetic virtues of the mixes they work on, to reconsider their artistic merits and tweak accordingly. Brian “Big Bass” Gardner explains:

I was at Century Records, my first job. It was a custom kind of place, and they cut records for schools and the armed forces. I don’t remember the first session I did. I understood the concept (the transfer of mechanical energy to electric energy through the cutting head), so it wasn’t any big surprise for me. I learned from watching it and reading about it. . . . Of course, at that time, I wasn’t allowed to EQ. Back then, you were forbidden to touch what engineers had done. Your job was just to put it on disc. It was: “Do not touch this tape; it’s perfect!” And then Creedence Clearwater came along. We kept cutting refs for them over and over, thinking there might be something off on the frequency response. And it still wasn’t right. So I broke out the Fairchild limiters and the Pultecs, and I tweaked it, and that kind of changed the whole thing: being able to doctor up tapes.12

Just how far mastering engineers can go in their tweaking is another matter entirely, of course. A host of concerns constrains mastering engineers as they work. The very first such concern is the client. Who communicates with mastering engineers goes a long way in determining how emboldened they feel to tweak a mix. Engineers are, after all, contracted by a host of agencies, including amateur home recordists; producers; mix engineers; labels facing distribution deadlines; and so on. Each contracting agency will invite a different approach. An overly egotistical mix engineer, for instance, will likely refuse to adjust their mixes in any way whatsoever, rigidly preferring a distorted master over admitting error and learning from their mistakes. Though this may seem likely to be rare, and completely unprofessional, as a working attitude in mixing, I have encountered it far too often to count in my work as a mastering engineer, and at every echelon of professional activity. People simply don’t like to be told they’re wrong, let alone on aesthetic decisions in which they have become emotionally invested. The relationship between mixing and mastering engineers is indeed a fraught one!

On the other side of the coin are amateur recordists. People who go through mastering for the first time often grant engineers an aesthetic carte blanche, trusting that they know best how to finalize their mixes for general distribution. Or they make (sometimes hilariously) outrageous demands of mastering engineers, having little idea what mastering is actually meant to achieve (I was once asked to strip the drum kit from a stereo .wav, for instance). Meanwhile, labels who work exclusively with the same mastering engineer, or mastering house, tend to trust the opinion of their mastering engineer above anyone else’s, and will insist that mix engineers rebalance and resubmit when the mastering engineer so demands, sometimes despite the mix engineer’s bitter protestations.

Qualities of a good premaster

No matter who contracts them, and no matter with whom specifically they communicate, mastering engineers almost universally agree that submitted mixes which share a few key features fare best in mastering. Mixes should be phase coherent and mono compatible (though in my experience this becomes less of a concern with each passing year), and they should leave headroom, peaking at roughly −8 to −3 dBFS. Clients shouldn’t worry excessively over whether or not every mix they submit peaks at the same amplitude. Mastering engineers likewise routinely caution against allowing excesses or resonant spikes in the spectral contours of mixes. Resonant spikes often emerge when one or more instruments fail to dock, or “sit right,” within a broader multitrack production. This sort of imbalance can be extremely difficult to fix in mastering, especially when the lopsided contour makes a mix brighter. And the same goes for excessive lows. Exaggerated bass frequencies will get in the way of a mix being brought to market level, as bass buildups impede dynamic inflation.
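Two of the premaster qualities listed above, peak headroom and mono compatibility, lend themselves to a quick numerical check. The following is a rough sketch only, assuming floating-point samples scaled so that full scale is ±1.0; the function names are hypothetical, and the correlation measure shown is one common proxy for mono compatibility rather than a method the text prescribes.

```python
import math


def peak_dbfs(samples):
    """Sample peak level in dBFS, assuming full scale is +/-1.0."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)


def within_premaster_window(samples, low=-8.0, high=-3.0):
    """True if the peak falls inside the rough -8 to -3 dBFS window cited in the text."""
    return low <= peak_dbfs(samples) <= high


def stereo_correlation(left, right):
    """Correlation between channels: near +1 sums well to mono, near -1 cancels."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return 0.0 if den == 0 else num / den


# Hypothetical test signal: one cycle of a sine wave at half of full scale.
left = [0.5 * math.sin(2 * math.pi * k / 100) for k in range(100)]
right_inverted = [-s for s in left]  # polarity-flipped copy, worst case for mono

print(f"peak: {peak_dbfs(left):.2f} dBFS")
print("inside window:", within_premaster_window(left))
print(f"correlation (identical channels): {stereo_correlation(left, left):.2f}")
print(f"correlation (inverted channel): {stereo_correlation(left, right_inverted):.2f}")
```

A sample-peak reading like this ignores inter-sample peaks, so it should be treated as a first-pass check and not as a substitute for proper metering.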


Mastering engineers also routinely ask mix engineers to bypass any stereo-bus processing they may have used while mixing. This can be tricky, however. Many mix engineers “mix into” a stereo compressor as a matter of technique, and to create a particular general sound for their mixes. This works best when the engineer knows how to set the compressor so it adds a layer of dynamic management while remaining transparent for the mastering engineer, that is, such that the mastering engineer can inflate the mix to market level without it sounding overly compressed. So long as the mix continues to feature a healthy dynamic range at output, most mastering engineers agree, stereo-bus compression is fine. Once they become more familiar with the mastering process, though, many mix engineers abandon this practice completely. As Will Haas explains:

A common mistake made by inexperienced mixers is adding a stereo compressor or limiter to the mix after the mix is complete, while bouncing files, for the sole purpose of making their mix sound louder. This action is usually done out of fear because they are concerned their clients will not understand why their mix is so low in volume when compared to other commercial recordings. This is fine for a “client mix,” but doing this to the final mix pass and sending it to the mastering engineer creates a problem, because it can’t be undone. Mastering engineers will most likely add more compression during mastering (if need be), but keep in mind that this is their forte. Mix-bus compression may or may not be for you, depending on how you work with dynamics when mixing. The only way to find out is to experiment and see which option produces a better mix. Whether or not you choose mix-bus compression as a staple of your mixing arsenal, the mastering engineer should not have to go to forceful extremes to make your music sound stellar!13

The art of mastering: Making a start

Ironically, engineers usually start the mastering process focused on its conclusion. Intended formats (e.g., digital download, loudness-normalized streaming, compact disc, vinyl, cassette), distribution outputs (e.g., Spotify, Bandcamp, SoundCloud, YouTube), and general aesthetic priorities (e.g., loud, dynamic, bass heavy) all influence the way engineers approach mastering, and should be worked out before a single premaster is heard. Experienced engineers will also negotiate payment upfront, informing clients when in the process they expect to be paid, how they would like to receive payment (in the era of e-transfers, PayPal, and bitcoin, payment can be complicated), and how many revisions clients can expect before additional charges apply. They will also likely discuss how material should be delivered, how much headroom is ideal for a premaster, whether masters will be “pegged” using LUFS or some other standard, and what turnaround time clients can likely expect.

Setting up “the line”: In-the-box (ITB) and Out-of-the-box (OTB)

Once the broad administrative logistics of the project have been sorted, mastering engineers are ready to set up their “lines,” that is, to establish the signal chain(s) they will use for the project. This can take a number of different forms, but depends in the first instance on what gear engineers have available. If they intend to master in-the-box, that is, entirely using their computer, engineers will nevertheless usually require top-of-the-line digital-to-analog conversion, to hear the subtle aesthetic details of the music they master. And they will use the analog signal to feed a set of monitors optimized for mastering. Unlike in tracking and mixing, though, where near-field reference monitors tend to rule the roost, mastering engineers with the resources to acquire them most commonly prefer to work on mid-field or far-field “three-way” monitors. There is good reason for this preference. The distance of mid- and far-field monitors from the point of audition allows for a more accurate hearing of the subsonic information in a mix, just as the three-driver system avoids the interdriver comb filtering and crossover obfuscations (at about 1–3 kHz) to which so many near-field monitors are prone.

When they work out-of-the-box (OTB), that is, using analog hardware, engineers will usually insert a mastering transfer console (the Maselec MTC-1X is a good example) into the chain after initial conversion. Because they are optimized for mastering, these consoles resemble signal splitters much more than mixing consoles at first glance. Along with an onboard elliptical filter, LPF and HPF knobs, and a knob for determining stereo width, most mastering transfer consoles allow engineers to route out to a number of analog units without necessarily committing to their position in the chain. Inserts 1 and 2, and inserts 4 and 5, on the MTC-1X can be flipped with the press of a button, for instance, allowing engineers to audition the effect of putting, say, equalization before or after compression. More importantly, each unit can be taken in and out of the line at whim, any bit of gear bypassed, and inserted, and bypassed again, with only the press of a button. While all this may sound entirely mundane (or, at the very least, nothing very special) to engineers working ITB, for those of us still aesthetically attached to analog gear, this functionality can be a godsend for expediting the creative process of mastering and for tightening turnaround times.

Despite their aesthetic and ergonomic appeal, analog workflows are expensive. Aside from the cost of the analog units themselves, engineers working with analog gear have the added expense of high-resolution analog-to-digital conversion, which they need to get processed signal back into their computers. Furthermore, when engineers route to analog gear they commit to its effect on a mix, its audible modifications inextricably printed at conversion. This can cause issues if engineers later change their mind about the effect a particular unit has on a mix; the entire line will need to be rerun, even if only to adjust the attack time setting on a compressor by 4 ms.

Those who work ITB, on the other hand, can leave every process open to rejigging before final render. Processing can be increased or decreased by minuscule amounts, at no extra cost to the engineer. Should engineers wonder whether, say, 0.1 percent saturation might sound better than 0.2 percent with a particular compressor in the line, they can dial in the settings and make an empirical decision in a matter of seconds. They can just as easily change their minds later, too. They might even send clients different versions of the same master, to get their opinions on the matter, without any additional labor. One version they send might have a compressor in the line, while another might be precisely the same but without any compression, and a simple bypass before bouncing is all that this requires. In fact, many clients appreciate having a say in the final sound of their masters (some don’t, however, and wonder why they are paying for a mastering engineer to work on their projects if not to have them exercise their expertise in making these sorts of determinations).
In any event, once they determine a viable “line” for a project, engineers begin to examine the submitted premasters in closer detail. If they determine that the premasters require adjustment, they then typically initiate a sometimes tricky dialog with their clients to request a remix. If the premasters pass muster, they begin to tweak. Each component of the line they established at the outset will obviously need adjusting to accommodate the unique balances and aesthetic details of each individual mix, but certain points (usually “destinations” or end points in the “line”) will usually remain fixed. Individual tracks may have different volumes, for instance, and some may require de-essing, different compressors, equalizers, and any number of different settings, but they all wind up at the same “destination,” that is, run through, say, the same M/S unit, saturator, limiter, and meter.

The specific routines each engineer follows diverge after this point. Indeed, once they establish a viable signal chain for working, each engineer then “makes a start” of mastering in a number of different ways. Some prefer to “top and tail” tracks at the outset, going so far as to determine rough fades (and even fade curves) for every track in a project before they do anything else. Others will listen to each track in sequence, if a sequence is provided, and balance in turn. If no sequence is provided, many engineers will prefer to determine a sequence before they even begin to think of balancing, let alone tweaking. And still others prefer to leave these administrative details open, adjusting spectral and dynamic contours and balances before they worry about anything like gaps, fades, and cross-fades, for instance. Whichever routines and techniques they follow from the start, though, we shall shortly see that mastering engineers tend to listen for the same broad details as they do so. When they make a start of mastering, that is, most engineers listen for the same things, even if they do different things while they listen.

Figure 3.2  Some “analog” gear (my API lunchbox) I used to master hundreds of “underground techno” releases throughout the 2010s, including two Juno-nominated releases, a pop Top 20 release in Canada, and a few tracks which charted on the once crucial beatport.com.


Sequencing

As noted, the first stage in mastering is usually “sequencing.” During “sequencing,” recordists order the songs on an album according to either their own or the client’s specifications. As inconsequential as this process may seem on first blush, sequencing can be an extremely difficult and complex procedure. Everything in mastering proceeds from sequencing. Once they decide on a sequence, engineers don a “bi-directional” micro/macro-analytic lens, simultaneously evaluating tracks as singular entities and components in a broader sequence. In other words, once a sequence is made, every adjustment becomes an adjustment to both a specific track and to the broader project overall. When they boost, say, the midrange of even just a few seconds of a synth solo somewhere on track two, for instance, mastering engineers simultaneously boost the midrange of the sequence as a whole. “You have to think about everything at the same time,” notes Chris Gehringer:

I don’t really treat different musical styles differently. . . . I decide if there is enough top, if the bass is big enough, are the vocals loud enough, and just do what I think is needed. The challenge is to make the whole thing sound like a homogenous record. . . . Generally what I do is put the whole album together and, as I’m working, I compare each song to each other. When I get to the end I listen right through the whole thing, and then hearing it as a whole I might decide that song five, say, needs a new approach. So I might go back and redo that track with the Sontec EQ instead of the Avalon, for example.14

Though the particulars of sequencing vary from genre to genre (and, even, from sub-genre to sub-genre, record to record, and EP to EP), a few large-scale patterns tend to dominate sequencing on modern records. In the classical domain, for instance, the order of movements in large-scale compositions tends to prescribe sequencing. It makes little sense to start a record with the final movement of Beethoven’s Ninth Symphony, for instance, only to end with its first movement. In most genres of popular music, however, “set lists” change on a nightly basis and trial-and-error rules research and design. When they sequence a pop record then (or, as occurs more often now in our streaming era, when they sequence an EP), mastering engineers struggle to finalize something which most often defies finalization.

The notion of a “concept album” (or a “concept EP,” I suppose) complicates this dynamic somewhat. Concept albums tend to be conceived with a final sequence already in mind, or they are sequenced to elucidate some external program, whether that program be: (i) narrative (e.g., telling a story); (ii) thematic (e.g., elucidating a philosophical concept); or (iii) impressionistic (e.g., exploring a particular feeling).

Some celebrated examples of narrative sequences include The Beatles’ Sgt. Pepper’s Lonely Hearts Club Band, which, though it was sequenced long after each component track was completed, ostensibly documents a fantasy concert performance by the fictional Edwardian band “Sgt. Pepper’s Lonely Hearts Club Band”; Pink Floyd’s Dark Side of the Moon and The Wall, which both recount, in sometimes harrowing detail, the slow descent into madness which awaits any feeling human born into Western advanced-industrial capitalism, or so the albums’ fantastically wealthy lyricist, Roger Waters, would have us mere listeners believe; Frank Zappa and the Mothers of Invention’s Freak Out!, which recounts a “bad trip,” presumably inspired by LSD, suffered by GTO-groupie Suzy Creamcheese; and The Streets’ A Grand Don’t Come For Free, which tells the story of a soccer hooligan trying to locate a large sum of lost money (one-thousand pounds, to be exact).

Examples of thematic sequences include Iron Maiden’s Seventh Son of a Seventh Son, which explores a series of world mythologies surrounding the mystical powers of seventh-born males who, themselves, are children of seventh-born males; Marvin Gaye’s What’s Going On, which explores American race relations and the stark decay of American urban-industrial society during the early 1970s; Jethro Tull’s Thick as a Brick, which parodies the high-art pretensions of English progressive rock bands working in the early and mid-1970s, like ELP, Yes, Genesis and King Crimson; and Pink Floyd’s Animals, which bleakly, and without direct reference to George Orwell, categorizes humans into archetypal animals, specifically, dogs, flying pigs and sheep.


Finally, examples of impressionistic sequences include The Beach Boys’ Pet Sounds, which elucidates the onset of adolescent fears and misgivings about life via strange musical timbres (and, in case listeners miss the point, the album begins with the words: “wouldn’t it be nice if we were older”); Joni Mitchell’s Blue, which explores, in confessional detail, feelings of loss and homesickness arising in the wake of a romantic breakup at Christmastime (the record reportedly chronicles the demise of Mitchell’s relationship with Graham Nash); The Cure’s Faith, which offers a number of different, often gothically bleak, perspectives on the vagaries of religious faith; and Radiohead’s OK Computer, which examines the paranoia and anxiety many in the post-industrial West felt during the mid- and late 1990s, while the so-called digital revolution provoked a seemingly never-ending wave of cultural and industrial upheavals.

Non-programmatic sequences usually follow a different logic than the programmatic sequences we’ve just considered. Most often, the logic of the non-programmatic sequence is steeped in the concert paradigm of musical exchange. The concert paradigm figures prominently in published accounts, in fact. Bob Katz, for one, likes to divide his sequences into discrete “concert sets,” each with its own emotional design:

Before ordering the album, it’s important to have its gestalt in mind: its sound, its feel, its ups and downs. I like to think of an album in terms of a concert. Concerts are usually organized into sets, with pauses between the sets when the artist can catch her breath, talk briefly to the audience, and prepare the audience for the mood of the next set. On an album, a set typically consists of three or four songs, but can be as short as one. . . . The opening track is the most important; it sets the tone for the whole album and must favourably prejudice the listener. . . . If the first song was exciting, we usually try to extend the mood, keep things moving like a concert, with an up-tempo or mid-tempo follow-up. Then it’s a matter of deciding when to take the audience down for a breather. Shall it be a three or four-song set? . . . Then I pick candidates for the second set, usually starting with another up-tempo in a similar “concert” pattern. This can be reversed; some sets may begin with a ballad and end with a rip-roaring number, largely depending on the ending mood from the previous set.15

The Beatles’ Sgt. Pepper’s Lonely Hearts Club Band provides an invaluable opportunity to gauge the effect sequencing can have on record reception. The record also offers an interesting hybrid of programmatic and non-programmatic sequencing: though some critics still debate whether the album even has a concept at all, more agree that the record follows a programmatic sequence, specifically, a fantasy variety show, complete with audience laughter and applause, hosted by the fictional “Sgt. Pepper’s Lonely Hearts Club Band.” Sequenced at a time when long-play records (LPs) were almost uniformly packaged as an A-side/B-side technology, competing sequences for the A-side of Sgt. Pepper’s emerged during mono mixing sessions for the album. Initially, the album’s A-side sequence ran as follows:

1. “Sgt. Pepper’s Lonely Hearts Club Band”
2. “With a Little Help from My Friends”
3. “Being for the Benefit of Mr. Kite”
4. “Fixing a Hole”
5. “Lucy in the Sky with Diamonds”
6. “Getting Better”
7. “She’s Leaving Home”

At the very last moment, though, George Martin re-sequenced the album as follows:

1. “Sgt. Pepper’s Lonely Hearts Club Band”
2. “With a Little Help from My Friends”
3. “Lucy in the Sky with Diamonds”
4. “Getting Better”
5. “Fixing a Hole”
6. “She’s Leaving Home”
7. “Being for the Benefit of Mr. Kite”

Playlist Title: Understanding Records, Chapter Three: Alternate Sgt. Pepper’s

This playlist comprises the two sequences mentioned directly above, that is, the “original” sequence and the “last-minute” sequence. Readers are encouraged to consider what, if any, effect the sequence has on how they feel each track relates.

Gapping

Though these may seem like simple administrative details, mastering engineers use the silent “spacing” between tracks, and the “fades” at the beginning and ending of tracks, to reinforce the basic emotional design of a sequence, much as authors, myself included, use punctuation to reinforce the intended meaning(s) of sentences! Spacing determines the length of gaps, or silences, between tracks on a record, while fading determines how tracks move into, and out of, those gaps. To promote a sense of conceptual cohesion, for instance, recordists often cross-fade seamlessly from one track to the next for concept albums, such that one track fades out as another fades in. This cross-fading characterizes the spacing on a number of celebrated concept albums, including the Beatles’ Sgt. Pepper’s Lonely Hearts Club Band; Pink Floyd’s Dark Side of the Moon, Wish You Were Here, and Animals; and, more recently, Prefuse 73’s Everything She Touched Turned to Ampexian and Nine Inch Nails’ The Fragile. Of course, the opposite is also true: longer gaps between tracks can easily suggest that, as an entity, a sequence should be received as a collection of distinct songs, rather than as a musical program of any sort. Compare, for instance, the sense of connectedness that arises from the cross-fade between Lemon Jelly’s “Elements” and “Space Walk” (tracks one and two, respectively, on the duo’s Lost Horizons) with the sense of separateness arising from the extended fade-out, brief gap, and slow fade-in between tracks five and six on the same album, namely, “Nice Weather for Ducks” and “Experiment Number Six.” Extended gaps that follow a series of cross-fades, on the other hand, like the minute-long fade-out and gap which separates Pink Floyd’s “On The Run” and “Time” on Dark Side of the Moon, tend to invoke a sense of internal closure within a broader sequence. That is, extended gaps, especially when they occur after a series of cross-fades, usually suggest that some sort of internal cadence has been reached, which in turn prompts listeners to conceptually combine what precedes the gap into an aggregate musical statement.
Rarely are the gaps in a sequence standardized. They are certainly never random, though. In fact, spacing is a surprisingly complex and emotionally wrought procedure. “It’s funny how gaps seem to vary depending on the mood you’re in,” notes Ray Staff:

Normally, I prefer people to attend the session and listen to the gaps for themselves, because opinions can differ. I’ve had people send me an unmastered CD saying that they’ve worked out all the gaps, so I’ve read their CD’s Table of Contents to get their timings, and matched mine up exactly. After hearing the master they’ve then decided that the gaps didn’t sound right. That’s because they’re hearing different detail, or because my fades are smoother, making [the album] sound different. . . . It’s not unusual for people to say “Start the next one straight after the previous song has finished,” and that’s because they’re not relaxed and are sitting there in anticipation. They often end up calling back later asking if we can add another second to every gap. If you have two energetic songs together you might place them straight after one another, but you might need a longer gap between a big lively song and a moody one. You have to listen on a musical and emotional level, and do what feels right.16

After sequencing and gapping, recordists can turn their attention to other kinds of processing. I will begin examining these processes by looking at equalization. It is during equalization that the bidirectional micro/macro-analytic lens that engineers don during mastering becomes most apparent, in fact. As such, I would argue that it warrants my attention before anything else.

The art of mastering: EQ

Tread lightly, carry a big stick. Perhaps no other adage so neatly encapsulates the approach mastering engineers take to equalization. Unlike in tracking and mixing, where large Q-values and boosts and cuts of more than +/-2 or +/-3 dB are common, equalization in mastering is a much subtler craft. Most mastering engineers follow the same simple rule, in fact: boosts and cuts of more than +/-3 dB or +/-4 dB indicate that adjustments need to be made at mix level before mastering can proceed any further. I asked Russ Hepworth-Sawyer to explain, as best as he could, how he approaches EQ as a mastering engineer:

Oh my word, EQ’s all over it. To be honest, as much as I’ve trained myself to become a master of the dynamics processing I have on offer to achieve loud stuff, without EQ at certain points in my chain, I’d be stuffed. It’s so so important (yes two “so”s). I still find it incredible that good mixes that sound great in the recording/mixing room can [still] be enhanced so much with the right lifts and tucks in places. Speaking generically, for me EQ gets used very specifically. It’s usually taming low mids to the bass end. This is where a lot of errors are made, whether through the acoustics of the rooms people are working in, or through a misunderstanding of what the lows do in music. This “misunderstanding” can be from poor speaker reproduction, acoustics, or interpretation of what’s coming out of them. It’s also so transient due to genre, decade etc. As you know Jay, we’ve spoken about this for years!



I use EQ to tame the bass as above. I use EQ a lot to sort out the overall scope. I also use the EQ in Mid Side probably more than most. I remember asking our friend Barry Grint about this and he said he rarely uses it. I only put it down to the quality of mixes he gets (amazing—look at his back catalogue). We get all sorts, from the best mixed Nashville things (honestly amazing) to local bands with ropey home-made demos. The latter can be realistically enhanced by Mid Side processing with EQ on both paths. Mid Side is dangerous in bad hands! But subtle work can truly enhance and almost repair. I love a touch of EQ on the output of my chain, as this is really where I can sculpt and interact with the dynamics processing I have there. Without this I’d be lost.

Feathering: Boosts and notches

Equalization in mastering generally occurs in smaller increments (often 3 dB or less) than it does elsewhere in the production process. Likewise, narrower Q-values figure in mastering than elsewhere in production. When larger boosts or cuts are required, however, to achieve the same effect with less of a disfiguring footprint on mix balances, mastering engineers will often use a technique called “feathering.” Rather than boost a single frequency by +3 dB, for instance, mastering engineers “feather” it by adding only, say, +1 dB of energy but also pumping its related neighbor frequencies by +0.25 dB. Instead of a single boost of +3 dB at 100 Hz, then, mastering engineers may “feather” 100 Hz up by +1.5 dB, and then add +0.5 dB at 80 Hz and +0.25 dB at 120 Hz (these numbers are entirely made up, as the specific amount of boosting or cutting depends entirely on the nature of the mix being mastered). When larger boosts or cuts do occur in mastering, it is most often to “notch” an overly resonant frequency or two either out of a mix completely or down to a more tolerable level. To do this, engineers focus the equalizer’s Q-value to as narrow a range as the EQ allows (or as the notch requires). It is not entirely uncommon, in fact, to narrow the bandwidth down to only a few hertz. Then, engineers “notch” or attenuate the trouble region so it sits better in the mix, even if that means attenuating it almost completely. Often this requires following the offending frequency up the harmonic series, as also happens with “feathering,” with engineers creating a sequence of ever-smaller cuts at each related harmonic partial above the resonant frequency.
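The feathering example above can be sketched numerically. What follows is only an illustration, not a description of any particular mastering equalizer: it assumes each band behaves like a textbook peaking biquad of the widely used "Audio EQ Cookbook" form, that all bands share a hypothetical Q of 2.0, and that the bands sit in series so their gains add in decibels. The function names are my own.

```python
import cmath
import math


def peaking_gain_db(freq_hz, center_hz, gain_db, q, sample_rate=44100.0):
    """Gain in dB at freq_hz of one peaking biquad (cookbook-style coefficients)."""
    a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    num = (1.0 + alpha * a, -2.0 * math.cos(w0), 1.0 - alpha * a)
    den = (1.0 + alpha / a, -2.0 * math.cos(w0), 1.0 - alpha / a)
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate)
    z2 = z1 * z1
    h = (num[0] + num[1] * z1 + num[2] * z2) / (den[0] + den[1] * z1 + den[2] * z2)
    return 20.0 * math.log10(abs(h))


def chain_gain_db(freq_hz, bands, q=2.0):
    """Total gain in dB of several peaking bands in series (dB gains add)."""
    return sum(peaking_gain_db(freq_hz, fc, g, q) for fc, g in bands)


# The two settings from the text: one +3 dB boost versus a feathered boost.
SINGLE = [(100.0, 3.0)]
FEATHERED = [(80.0, 0.5), (100.0, 1.5), (120.0, 0.25)]

for f in (60.0, 80.0, 100.0, 120.0, 150.0):
    single = chain_gain_db(f, SINGLE)
    feathered = chain_gain_db(f, FEATHERED)
    print(f"{f:6.0f} Hz  single {single:5.2f} dB  feathered {feathered:5.2f} dB")
```

Under these assumptions the feathered setting never lifts 100 Hz by the full +3 dB; it spreads a smaller lift across the neighboring region instead. Whether that spread sounds "the same" as the single boost is, of course, a listening judgment that no table of numbers can settle.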



“Dynamic notching”: De-essers, expanders, multiband compressors for EQ

Sometimes engineers cannot notch out overly resonant frequencies without completely disfiguring a mix in the process. Mixes usually feature important musical information at trouble regions, and their dynamic profiles must be preserved even while resonant spikes in those same regions are brought in line. In this case, mastering engineers will invoke a dynamic equalizer of some sort. When the offending frequency is somewhere in the midrange of human hearing (i.e., 1–7 kHz), as it tends to be, mastering engineers will often reach for a dedicated de-esser of some sort. Focusing the de-esser’s detection circuitry onto the problem region, engineers then set its threshold so only the offending spikes trigger it to action, and then they adjust the remaining dials and knobs to musical taste. Ideally, only excessive frequencies are thus attenuated. De-essers are usually only required when mix engineers have failed to dock a particular instrument as well as they should. Indeed, spectral imbalances usually manifest less abstractly than the concept of a “spectral contour” might suggest. Engineers don’t typically hear an excess at, say, 2 kHz so much as they hear problematic tones in a mix, that is, overly “picky” guitars or basses; “clicky” kick drums with too much attack; snare drums with too much “snap”; harsh “ice pick” hi-hats; overly sibilant vocals; too bright “trashy” cymbals; and so on. And I should quickly note that, when they can be adequately focused on particular bands, expanders can be used to achieve the opposite effect. Expansion at around 80 Hz can bring out the low “thump” of a kick drum, for instance, while expansion between 100 Hz and 120 Hz can bring out the bottom of a bass, but only when those instruments play (and barring negative crossover which might contribute to masking), that is, only when amplitude in those regions is sufficient to trigger the expander.
Of course, one might just as well consider de-essers a single band of a multiband compressor. Indeed, what is a multiband compressor if not a few de-essers spread out across the audible spectrum, in selectable bandwidths? When they do use multiband compressors (many engineers consider multiband compressors too disfiguring for their taste, and rebalance using less obtrusive measures), engineers most often do so either to bring excessive bass energy into line or to somehow alter the relationship of vocal tracks to the broader mix. In fact, multiband compressors are considered extremely dangerous by experienced mastering engineers, given their tendency to affect the timbral properties of a mix even while they alter its dynamic contour. Most instruments have important frequencies in more than just a single area of the audible spectrum, after all. Altering the dynamic contour of a mix above 12 kHz may thus have the intended result of controlling the “highs” but, in so doing, it might also alter the way, say, the cymbals are heard at the macro level of a mix. Likewise, using multiband compression to emphasize a vocal performance at roughly 3 kHz can easily have the unintended consequence of altering the timbre of rhythm guitars in a mix, such that their picking sounds become exaggerated and the mix, in turn, becomes too “mid-forward” and harsh. Or it can negatively affect the way reverb is heard in your mix. In the words of Mike Senior:

Any multi-band gain-riding will inevitably affect the decay characteristics of your reverb. However, multi-band compression should be a very subtle process, so unless you’re deliberately using it for its own effect, it ought not to make your reverb sound unnatural. Where you would most notice reverb being unnatural would be in recordings with a wide dynamic range, and these are just the sort of thing that you wouldn’t want to go near with heavy-handed multi-band dynamics processing at all. In short, if you keep your multi-band mastering “tasteful” then it’s unlikely that your reverb is going to suffer unduly.17

Nonetheless, and despite such caveats, multiband compression can be a powerful tool for dynamic equalization in the hands of an experienced mastering engineer.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 22. De-Essing at Master Level

An unprocessed premaster sounds once. When it repeats, a de-esser is applied to “tamp” down on the hi-hats and cymbals. The premaster sounds once again, without de-esser, to end the track. If readers struggle to hear the process at master level, it is likely clearest in terms of the relative emphasis given to the sound of the hi-hat opening and shutting. To be clear, the de-esser is not used here in any “corrective” or musical way. Rather, it is heavy-handed and disfiguring, to demonstrate the process. Uses for de-essing at master level are usually much subtler and, thus, difficult to hear without many years of experience, let alone access to the processed premaster.

Mid-side processing

Another powerful EQ tool used for mastering is so-called “mid-side” or “M/S” processing. These processors allow engineers to tweak center and side information in a stereo mix separately. Sometimes this can be done for corrective reasons. A mix with an overly present vocal track can be fixed with a simple adjustment to the mid channel of an M/S processor. Likewise, a dull mix can be brought to life, without necessarily changing its dynamic envelope, by pulling treble frequencies to the sides of a mix and compressing them there, making them more readily audible and, in the process, increasing their transient detail. M/S processors also typically feature an elliptical EQ, which allows engineers to “mono the lows,” that is, to isolate bass frequencies to only the center or “mid” channel of a mix (anywhere below 100 Hz, usually). This is sometimes misunderstood as an unnecessary historical residue from the vinyl era, when “monoing the lows” was an important countermeasure against “skipping the groove.” In fact, M/S processing remains a core technique in mastering for vinyl. However, monoing the bass frequencies in a mix can have powerful spectral benefits as well, especially for engineers working in electronic genres. With the lows set to mono, and the highs pulled to the sides and compressed, mixes otherwise obfuscated by a muddying overabundance of bass frequencies tend to clarify to transparency. Details like the return on a reverb and the sparkle of cymbals suddenly emerge from the spectral mud, even while the bass region becomes even more robust, punchy, and focused. Because the human ear is incapable of hearing directionally below about 150 Hz, monoing the lows rarely has spatial ramifications at mix level. But any work on the sides of a mix will indeed have profound consequences for stereo width. As Ray Staff explains:

As a very coarse guide, reducing the spread of frequencies below 100 Hz won’t adversely affect the song’s spatial characteristics or the directionality of an instrument, because the human head can’t detect stereo below 100 Hz anyway. Above that it becomes a trade-off between the stereo spread and quality of the bottom end.18
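For readers who want to see what separating "center" from "side" information means in practice, here is a minimal sketch of the sum-and-difference arithmetic that M/S processing rests on. It assumes one common scaling convention (mid as half the sum, side as half the difference); other processors scale differently, and the function names are my own. An elliptical EQ would additionally high-pass the side channel, which this sketch leaves out.

```python
def ms_encode(left, right):
    """Split a stereo pair into mid (half the sum) and side (half the difference)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Rebuild left and right from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def adjust_width(left, right, side_gain):
    """Scale only the side channel: 0.0 collapses to mono, 1.0 leaves the mix as is."""
    mid, side = ms_encode(left, right)
    return ms_decode(mid, [s * side_gain for s in side])


# A few hypothetical stereo samples.
left = [0.5, -0.25, 1.0]
right = [0.25, 0.75, -1.0]

mono_left, mono_right = adjust_width(left, right, 0.0)
print(mono_left, mono_right)  # both channels now carry only the mid signal
```

Because decoding exactly undoes encoding, any equalization applied to only one of the two channels in between is what changes the stereo picture.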

According to Russ Hepworth-Sawyer:

Well the main use for M/S processing is EQ. . . . I rarely dynamically process the MS channels differently. I don’t like that idea. The image (dynamically) must stay consistent. EQ enough is dangerous! The main advantages of MS for me are using the mid channel to enhance the bass drum specifically, the vocals intelligibility range too. These are clear winners when you’ve got poor mixes. Then the sides on bad mixes can benefit from a lot of low frequencies being lowered. It’s mud in the sides when you don’t need it. I might enhance the high mids or even the highs when the sides need it, but be careful I always say. It’s rare I play with the levels between the mid and the side, but this is, to be fair, where some mixes are too wide, or too mono, I’ll do what I can here. Mid Side is in my chain ready to go on bypass whenever I need it.



EQ: Summary

Regardless of which particular tools mastering engineers use to equalize, or the basic rationale they follow in so doing, and regardless of the fact that dynamics processing tends to monopolize the critical conversation about mastering nowadays, most mastering engineers agree that EQ is the most powerful technology for transforming a mix in their arsenal. And, again, they all agree that the most powerful results when equalizing in mastering come from moderate moves. Bob Ludwig put it best:

It is a matter of context. . . . Mix engineers will often really crank on an equalizer to get it where they want. 10 dB or more EQ is not rare. Mastering is the opposite, totally dealing with minutiae. Any particular frequency boosted or dipped 3 dB would be a lot of correction. I think many mix engineers have struggled so much to get the mix to where they are happy, to look at it again, under a microscope, is simply too difficult. By that point they are lacking perspective. Mastering is dealing with the trees in the forest of the mix.19

Indeed, equalization may well be the most crucial way that mastering engineers “deal with the trees in the forest of the mix.” This said, dynamics and dynamic balance also play crucial roles in how a mix is perceived. And mastering engineers work equally hard at ensuring they are optimized to best suit the aesthetic needs of each mix they work on. In fact, as we shall see as I now turn my attention to dynamics processing in mastering, “loudness” can just as easily be achieved through equalization as through dynamics processes like compression and limiting.

The art of mastering: Dynamics

Though they are often confused, loudness and volume are two very different things. Volume refers to the peak amplitude of a signal or an acoustic phenomenon (a sound), while loudness is a subjective auditory interpretation of sound, based largely on its frequency content and average level. In short, volume is an empirical property of acoustic phenomena while loudness is a subjective auditory response to such phenomena. As noted in the first chapter of this book, the amount of time an acoustic phenomenon spends at or near its peak amplitude has more to do with determining its loudness than does its actual peak volume. A sine wave which, for instance, peaks at 0.0 dBfs makes a quieter auditory impression than does a square wave peaking at the very same level. This is so because the square wave spends all its time at peak amplitude and, thus, features a higher average amplitude than the sine wave. The human auditory apparatus uses this information, specifically, the average amplitude of both signals, to conclude that the square wave sounds “louder” than the sine wave, even though both indeed have the same volume. Mastering engineers are, of course, keenly aware of this difference. We have already explained that when they embraced the “full-scale” decibel weighting system, a by-product of the record industry’s broader embrace of digital technology and personal computing, mastering engineers also embraced an approach to dynamics management whereby loudness necessarily comes at the expense of dynamic range. Because every digital signal peaks at “full scale” (0.0 dBfs) before clipping, to squeeze, say, +3 dB more average amplitude from a mix, engineers must reduce its peaks by the same amount, and then make up the difference using some (aptly named) “makeup” gain. Put simply, the louder the digital master, the smaller its dynamic range can be. And there is no question that records are getting louder, even if current commentary on the so-called “loudness wars” suggests that hostilities have abated. As noted, pop records have increased in average amplitude since the mid-1980s by roughly +18 dB, even as their average dynamic range has decreased by precisely the same.20 To compete at market level, then, records released today must feature elevated average amplitudes, which means they must also feature a concurrently squashed dynamic range. While gold-standard records like, say, Led Zeppelin’s “Stairway to Heaven” (1971) once featured +/-35 dB of dynamic range, modern releases in the same genre (like Muse’s “Supermassive Black Hole” (2006), to name an arbitrary example) can and often do feature dynamic ranges of less than +/-4 dB. As Ian Shepherd recently put it:

The Loudness War . . . is based on the misguided idea that louder is always better. If you play people two versions of the same thing, chances are they’ll pick the louder one. But that only works if you can keep turning it up forever, and that’s not true in any real world situation, especially not in digital audio where there’s this brickwall ceiling you can’t go past. Beyond a certain point, it sounds flat, lifeless, has less of an emotional impact, and can even sound crushed and distorted.21
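The sine-versus-square comparison made earlier in this section is easy to check with a little arithmetic. The sketch below is only an illustration: it takes "volume" to mean sample peak and "average amplitude" to mean RMS level, which is a simplification of how loudness is actually perceived, and the sample count and cycle count are arbitrary choices of mine.

```python
import math


def peak_and_rms_dbfs(samples):
    """Return (peak level, RMS level) in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak), 20.0 * math.log10(rms)


N = 48000      # number of samples (arbitrary)
CYCLES = 100   # whole number of cycles, so the averages are exact

sine = [math.sin(2.0 * math.pi * CYCLES * n / N) for n in range(N)]
square = [1.0 if s >= 0.0 else -1.0 for s in sine]

sine_peak, sine_rms = peak_and_rms_dbfs(sine)
square_peak, square_rms = peak_and_rms_dbfs(square)

print(f"sine:   peak {sine_peak:6.2f} dBfs, RMS {sine_rms:6.2f} dBfs")
print(f"square: peak {square_peak:6.2f} dBfs, RMS {square_rms:6.2f} dBfs")
```

Both waves peak at full scale, but the sine wave's RMS level sits about 3 dB lower than the square wave's. That gap between peak and average level is the headroom that limiting and makeup gain trade away in exchange for loudness.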

Of course, different genres tolerate different dynamic ranges. Pop, rock, and hip-hop records, for instance, routinely feature dynamic ranges of less than +/-8 dB. Jazz, folk, country, and classical records, on the other hand, feature ranges of roughly +/-12 to +/-16 dB. Meanwhile, EDM records typically feature a range of less than +/-4 dB. And it is the mastering engineer’s job now to know all this. They must curate the dynamic range of the mixes they master, knowing what DR (dynamic range) is suitable for a record given not just its intended formats but also the markets it will be released to. Mastering an EDM record with a DR more appropriate for a classical record (say, +/-18 dB) is simply bad practice, no matter your personal opinion of digital loudness and dynamic range. This is why, in my opinion, holding steadfast to hard-and-fast rules of any sort sounds so terribly amateurish to my ears: when you have been working in record production long enough, you will find that nothing is set in stone, and every project demands its own approach and aesthetic rulebook.

Loudness and spectral content: Equal loudness contours

Only making matters more complicated for mastering engineers, loudness is not simply the product of a record’s DR. Spectral content also plays a crucial role. This is so thanks largely to inherent biases in the human hearing mechanism. Indeed, when it comes to hearing equally well across the audible spectrum, the human ear is flawed at best. As the likes of Harvey Fletcher and Wilden A. Munson first demonstrated in the early 1930s, when they worked to create their famous “equal loudness contours,” human ears grossly exaggerate midrange frequencies (i.e., roughly 1–7 kHz). In fact, it takes roughly 90 dB (SPL) of energy at 20 Hz to achieve the same loudness as about 12 dB (SPL) at 2 kHz. In short, brighter sounds seem louder next to duller sounds. Thus, mastering engineers can use equalization, especially in the midrange and upper-midrange of the frequency spectrum, to convey or discourage the audible impression of loudness. This said, boosting the midrange of a mix will only get the mastering engineer so far in what can seem like a never-ending quest for loudness. There is a law of diminishing returns at work here. The louder a mix gets, on average, the flatter our frequency perception becomes when we listen to it. As amplitude elevates, we hear more clearly at the extremes of the audible spectrum. Engineers seeking to convey the impression of loudness, even while they maintain a healthy dynamic range, must mimic this auditory response when they equalize. To do so, they often “hype” a mix’s spectral extremes, emphasizing its bass and treble frequencies over the midrange. This spectral contour is often called a “Smiling EQ Curve,” because it looks like a grin when viewed horizontally from 20 Hz to 20 kHz on a graphic EQ.
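The ear's midrange bias can be roughly illustrated with the standard A-weighting curve used in sound-level meters. To be clear about the assumption: A-weighting is a single fixed curve, loosely modelled on hearing sensitivity at low listening levels, and it is not the Fletcher-Munson contours themselves, which change shape as playback level rises. The sketch below simply evaluates the published A-weighting formula at a few frequencies.

```python
import math


def a_weighting_db(freq_hz):
    """A-weighting in dB at freq_hz (0 dB at 1 kHz by construction)."""
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00


for f in (20.0, 100.0, 1000.0, 2000.0, 10000.0):
    print(f"{f:7.0f} Hz  {a_weighting_db(f):7.2f} dB")
```

On this curve 20 Hz is discounted by roughly 50 dB relative to 1 kHz, while the region around 2 kHz is weighted slightly above it. That is the same general shape, inverted, as the low-level equal loudness contours discussed above, and it is why a small midrange lift can read as "louder" while a far larger change in the deep bass barely registers at quiet playback levels.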



Playlist Title: Understanding Records: Audio Concepts & Demonstrations
Track: 23. Smile Curve

A brief demonstration of the “smiling EQ” mentioned above. The same sample as was heard in the last track sounds once without a “smile EQ” applied, then with, and then without again. Notice that while the objective level of the track does not change, the second iteration sounds “louder” to our ears, thanks to the “smile EQ.”

Clipping

Braver mastering engineers, especially those who master for electronic and hip-hop markets, will sometimes intentionally introduce brief bursts of distortion to boost the perceived loudness of tracks. Though the conventional wisdom on distortion says that all manner of “clipping” (which is what recordists call the peculiar distortion that digital-audio tracks make when pushed beyond full scale) should be eliminated before duplication, some recordists have taken to thumbing their nose at that particular convention. In fact, select application of “good old-fashioned clipping” is now essential for mastering EDM tracks, at least according to Paul White:

The final weapon in the guerrilla mastering arsenal is that of good old-fashioned clipping. . . . As a very general rule of even more general thumb, periods of clipping should be kept to under one millisecond (around 44 samples maximum at a 44.1 kHz sample rate) and if two or more periods of clipping follow each other in quick succession then the maximum period of clipping needs to be made shorter to prevent audible side effects. . . . If you are recording acoustic music then using clipping as a means of squeezing a decibel or two of extra gain out of it may not be a good idea, but when it comes to high-energy dance music, clipping is frequently employed either at the mixing or mastering stage (possibly both!) and, as with most things in audio, if you can’t hear it working (or if it makes something sound subjectively better), it’s fair game.22
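White's rule of thumb lends itself to a simple check. The sketch below is only one way of reading it: it assumes samples are floating-point values normalized so that full scale is 1.0, treats any sample at or beyond that ceiling as clipped, and flags runs longer than one millisecond. The function names and the test signal are hypothetical.

```python
def clipped_runs(samples, ceiling=1.0):
    """Return (start_index, length) for each run of consecutive clipped samples."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) >= ceiling:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(samples) - start))
    return runs


def max_run_samples(sample_rate_hz, max_ms=1.0):
    """Whole samples that fit within max_ms (44 at 44.1 kHz for 1 ms)."""
    return int(sample_rate_hz * max_ms / 1000.0)


def overlong_runs(samples, sample_rate_hz=44100, ceiling=1.0):
    """Clipped runs that exceed the one-millisecond rule of thumb."""
    limit = max_run_samples(sample_rate_hz)
    return [run for run in clipped_runs(samples, ceiling) if run[1] > limit]


# Hypothetical signal: a 3-sample clip (acceptable) and a 50-sample clip (too long).
signal = [0.2] * 10 + [1.0] * 3 + [0.1] * 5 + [-1.0] * 50 + [0.0]
print(clipped_runs(signal))
print(overlong_runs(signal))
```

A check like this says nothing about whether the clipping is audible, and it ignores White's further caution about runs that follow one another in quick succession; it only counts samples.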

Mastering engineers must proceed with extreme caution should they decide to follow White’s advice, though. Recordists who push “the clipping barrier,” as Greg Milner calls it, risk deforming their productions beyond all musicality. By 1999, in fact, Milner continues,

everybody had a favorite example of loudness gone wild. Was it Ricky Martin’s “Livin’ La Vida Loca”? Cher’s “Believe”? Rage Against The Machine’s “Battle of Los Angeles”? Santana’s “Supernatural”? . . . After 1994, there was no turning back. With each passing year, CDs got more compressed. More waveforms slammed up against that 0 dBfs barrier. In 1999, the Loudness War reached a crisis point. They called it “the year of the square wave,” in tribute to the flattopped waves that had donated their transients to the cause of loudness.23

It was during the “year of the square wave”—1999—that the most notoriously clipped record to reach market was released, specifically, the Red Hot Chili Peppers’ Californication. Partially inspired by the fact that audience opinion prompted Rush’s record label, Warner Brothers, to fund remastering of the band’s obviously clipped and hyper-compressed Vapor Trails, Simon Howson, an Australian film studies student, published an online petition to have Californication remastered. According to the petition: The main objective is to release new versions of the album that do not contain excessive compression, and are free of digital distortion (clipping) that is sadly prevalent on the current CD, and even LP, copies. The music should not be mastered simply to make all of the songs as loud as possible when broadcast on the radio.24

Though Californication has yet to receive a “quieter” remaster, Howson’s online petition generated more than seven thousand signatures within its first few months of publication. And, according to Greg Milner, many audio-engineering professors still find it useful to have their students “rip . . . the CD digitally, and count the number of places they find clipping.”25

Playlist Title: Understanding Records, Chapter Three: The Loudness Wars, part 2

A playlist comprised of numerous tracks which are often cited as examples of excessive “loudness” by critics and industry pundits. To be frank, I am of the mind that “loud is beautiful,” so these tracks do not bother me. I find the whole “loudness wars” conversation boring and pointless, insofar as commentators often seek to superimpose a kind of rational objectivity onto what amounts to a completely subjective practice, namely, making music. There is no better or worse in art, as far as I am concerned. And sometimes (in fact, to my mind, often) distortion sounds fantastic at master level. But then I have never been accused of being overly conservative in artistic matters before (just check out the previous playlist, with examples of my work), and I have avoided all traces of the conservatory since I escaped to Berklee, in Boston, in 1995.

Parallel compression

While peak limiting and digital clipping are effective tools for boosting the loudness of masters, they are hardly transparent. Engineers usually caution against their use, in fact, given how little of each is required before the dynamic and tonal balance of tracks becomes completely deformed. As an alternative, many engineers deploy the so-called “parallel compression” technique. This dynamics-processing technique capitalizes on a very simple psychoacoustic principle: the human ear is much more forgiving of upward boosts to quiet musical passages than it is of downward cuts to loud passages. Unlike peak limiting, which normalizes and compresses the dynamic range of tracks in order to boost average amplitude, and unlike digital clipping, which augments tracks with rapid-fire bursts of digital distortion to create the spectral impression of loudness, parallel compression leaves the dynamic range of tracks more or less intact, though a moderate boost to quieter passages indeed occurs as part of the process. Engineers simply compress a “split” copy of the input signal, and mix it in with the original at output. Thus recordists boost the volume during quieter passages, even while louder passages remain untouched. Bob Katz calls this particular approach to parallel compression “transparent parallel compression,” and to achieve its desired effect he suggests very specific settings: (i) threshold: −50 dBfs; (ii) attack: 1 ms or less; (iii) ratio: 2 or 2.5:1; (iv) release: 250–350 ms, “though with a capella music as much as 500 ms may be needed to avoid overemphasis of reverb tails”; (v) detection: peak; and, finally, (vi) make-up gain: to taste.26 When transparent parallel compression proves insufficient, recordists may opt for so-called “attitude parallel compression.” “This second approach to parallel compression is a way of achieving attitude or punch without damaging the loud peaks,” Katz explains,

or to warm or clarify the low to mid-levels of the music. The attitude parallel compressor effectively brings up the mid-levels, where the meat of the music lives, which can help achieve that desirable epic quality in rock and roll. This parallel technique can often fatten up sound better than a normal compressor because it concentrates on the mid-levels without harming the highest levels.27

To achieve the distinctive sound of “attitude parallel compression,” Katz recommends a slightly looser approach to settings: (i) threshold: set to produce anywhere between 1–3 and 5–7 dB of gain reduction; (ii) attack: 125+ ms; (iii) ratio: to taste; (iv) release: to taste, but “set to work in tandem with the attack time to obtain maximum rhythm and punch”; (v) detection: RMS; and, finally, (vi) make-up gain: to taste, though “since the gain reduction is much less than in the [transparent parallel compression] technique,” Katz recommends that recordists should “expect to mix the compressor at a higher level but rarely past −6 dBFS.”
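The basic signal flow of parallel compression (compress a split copy, then sum it with the untouched original) can be sketched in a few lines of Python. This is only a minimal illustration, not Katz's processor or any commercial unit: it uses a static gain computer with no attack, release, or peak/RMS detection, and the function names (`compress`, `parallel_compress`, `boost_db`) are hypothetical. The threshold and ratio defaults simply borrow the “transparent” settings quoted above.

```python
import math

def db_to_lin(db):
    """Convert a level in dBFS to linear amplitude (1.0 = full scale)."""
    return 10.0 ** (db / 20.0)

def lin_to_db(amplitude):
    """Convert linear amplitude to dBFS, guarding against log(0)."""
    return 20.0 * math.log10(max(abs(amplitude), 1e-12))

def compress(sample, threshold_db=-50.0, ratio=2.5):
    """Static downward compression of a single sample.

    Assumption: no attack/release smoothing, which a real unit would apply.
    """
    level_db = lin_to_db(sample)
    if level_db <= threshold_db:
        return sample  # below threshold: left alone
    compressed_db = threshold_db + (level_db - threshold_db) / ratio
    return math.copysign(db_to_lin(compressed_db), sample)

def parallel_compress(sample, makeup_db=0.0, threshold_db=-50.0, ratio=2.5):
    """Sum the untouched (dry) sample with a compressed (wet) copy."""
    wet = compress(sample, threshold_db, ratio) * db_to_lin(makeup_db)
    return sample + wet

def boost_db(level_db, **settings):
    """How many dB the parallel chain adds to a sample at level_db."""
    dry = db_to_lin(level_db)
    return lin_to_db(parallel_compress(dry, **settings)) - level_db

quiet_boost = boost_db(-40.0)  # a quiet passage
loud_boost = boost_db(-6.0)    # a loud peak
print(f"quiet: +{quiet_boost:.2f} dB, loud: +{loud_boost:.2f} dB")
```

Under these assumptions the quiet sample rises by a few decibels while the loud peak moves by well under one, which is the behavior the technique trades on.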



The loudness-normalization era In a sense, the recent adoption by Spotify, Apple Music, Tidal, and most other streaming services of “loudness-normalization” measures, as explained at the outset of this chapter, has put an abrupt end to the so-called “loudness wars.” In adopting “loudness-normalization” measures, services now restrict the average level of what listeners hear based on a kind of loudness reading called “LUFS” (Loudness Units relative to Full Scale). And, amazingly, most streaming services have chosen comparatively conservative levels! If a track registers above whichever LUFS reading a streaming service has chosen, the service will simply turn down that track by however many loudness units it registers over the limit. As a direct result of this, and of streaming’s recent total victory over the record industry, the mastering process has transformed completely. In fact, in only a matter of months, each of the loudness techniques I just reviewed has become simply one more tool in the mastering engineer’s toolbox, rather than a core technique required for every session. We may even have entered an era when engineers can stop obsessing about loudness completely. Engineers are no longer burdened with the problem of having to inflate mixes to sometimes offensively loud and often distorted levels, and many of the once routine tweaks and adjustments that were made along the way in that process have now become entirely unnecessary. As noted, if Spotify normalizes playback using a measure of −14 LUFS, then any tracks registering louder on that scale are automatically turned down by the service. Ironically, then, tracks which register “louder,” on average, before they are streamed, only sound quieter on the service than do tracks mastered to the quieter but specified LUFS standard. So what is the good in submitting tracks pegged to higher-than-specified levels? Within only a few weeks of “loudness normalization” becoming standard in streaming, my mastering process changed completely. 
The mixes I received, when done right, easily hit a −13 LUFS or a −16 LUFS reading, with little to no tweaking required on my end. While I was once forced to work monomaniacally focused on optimizing balances for the loudest possible levels, I was suddenly free to concentrate on musical aesthetics that would best serve the mixes I’d been sent. All of a sudden, I could—and, more importantly, I felt free to—achieve balances, establish dynamic ranges, and pursue a general punchiness, which I’d only dreamed possible just weeks before. Mastering had changed focus, loudness now suddenly only a peripheral concern. I understand that most readers are not mastering engineers. Accordingly, it may remain unclear what “loudness normalization” means for the music they



hear if they have never heard a track prior to mastering. I have thus devised a track (see Track 24 below) to clarify this. On this track, the same sample as was heard in the past few audio examples sounds four times in succession, with each successive iteration being “pegged” to a notably higher LUFS reading. The final two iterations surpass Spotify’s loudness-normalization target, which means that, when the service turns the track down, the volume of the preceding iterations is significantly reduced as well. Readers should hear the same sample growing in volume, that is, in average amplitude, with each iteration. However, more experienced readers will note that, as the track increases in volume, it loses its dynamic range, and becomes increasingly mid-forward and distorted.

Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 24. Loudness-normalization Experiment The “experiment” track described in the paragraph directly above.
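The turn-down arithmetic described in this section is simple enough to sketch. The snippet below is a hedged illustration only: the function names are hypothetical, the −14 LUFS default is the Spotify figure cited above, and it assumes the service only ever turns tracks down (whether and how quieter tracks are raised is not covered here). The loudness and peak values are invented for the example.

```python
def playback_gain_db(track_lufs, target_lufs=-14.0):
    """Gain in dB applied at playback: loud tracks are turned down,
    tracks at or below the target are left alone (assumption)."""
    return min(0.0, target_lufs - track_lufs)

def normalized_playback(track_lufs, peak_dbfs, target_lufs=-14.0):
    """Return the loudness and peak level a listener would actually hear."""
    gain = playback_gain_db(track_lufs, target_lufs)
    return {
        "gain_db": gain,
        "played_lufs": track_lufs + gain,
        "played_peak_dbfs": peak_dbfs + gain,
    }

# Two hypothetical masters of the same song, both peaking at -1 dBFS.
crushed = normalized_playback(track_lufs=-8.0, peak_dbfs=-1.0)
dynamic = normalized_playback(track_lufs=-14.0, peak_dbfs=-1.0)

print(crushed)
print(dynamic)
```

Both hypothetical masters play back at the same average loudness, but the one pushed to −8 LUFS has its peaks pulled down by 6 dB, so it gains nothing in exchange for the dynamic range it gave up.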

Figure 3.3  A meter used by the author for a number of recent releases.



The art of mastering: Other tools, other approaches It goes without saying that equalization and loudness management are the marquee players in modern mastering. They get all the press, even if engineers use much more than just equalizers, compressors, and limiters when they work. There is far more to mastering than simply signal processing, after all, let alone just equalization and compression. Also included in the mastering engineer’s modern toolbox are inflators, saturators, exciters, bass maximizers, stereo widening tools, de-noisers, and a few other bits of technology worth considering here.

Inflators and saturators Inflators, saturators, exciters, and bass maximizers are routinely deployed by mastering engineers as alternatives to compression and brickwall limiting, and straightforward equalization, for managing loudness. Sony-Oxford’s Inflator plug-in, reportedly the first commercially available plug-in designed specifically to provide engineers with an alternative to “brickwalling” dynamic ranges for creating “loudness,” provides an instructive case study. The Sonnox Inflator adds harmonic cues to an input signal which the human hearing apparatus tends to associate with elevated sound-pressure levels. While the peak level of the Inflator at output stays fixed, unless the unit is clumsily shoved into non-linearity by its users, the harmonic content (i.e., the added distortion cues) it adds to the input increases its average amplitude. In so doing, the Inflator creates a subjectively “louder” output without having to overly inflate peak values in the process. If engineers wind up reducing the dynamic range of an inflated mix to achieve a particular desired spectral balance, they may then work with the Inflator’s “curve” feature, which acts as a kind of expander on its minimum settings. Taken together, these features provide engineers with a powerful tool for increasing the loudness of a mix without completely crushing its dynamic range. Like inflators, saturators add harmonic content to signal. However, they do so specifically to emulate the sound of “tape saturation,” which is a kind of soft clipping, or gradual compression, of audio peaks that occurs when a tape is pushed into its “red zone,” that is, into non-linearity (the “red zone” refers to the red zone of VU meters, which caution against exceeding the linear threshold of gear). This “soft clipping” produces a harmonically consonant distortion related to the input signal, which the human ear tends to find



pleasing. As noted in the previous chapter, in the digital domain, when signal exceeds its linear threshold, the onset of distortion is immediate. Digital peaks are thus flattened the very instant they exceed the threshold of distortion, creating an audio effect engineers call “hard clipping.” This distortion isn’t necessarily harmonically related to the input signal, and listeners typically report finding it harsh or, at the very least, less aesthetically pleasing than “soft clipping.” Engineers ultimately turn to saturators not just to add harmonic cues associated with distortion to signal, as they do when they inflate, but also to create a “warmer” and more aurally pleasing harmonic relationship between those cues and the material they modify. Some might wonder why engineers would turn to inflators or saturators at all, when they have the analog means to achieve inflation and saturation already at their disposal. This is a fair question. Indeed, the vast majority of professional mastering engineers now work in hybrid systems, using digital gear for grooming and capture, and analog gear to tweak and (re)balance. Engineers often invest in fantastically expensive digital-to-analog and analog-to-digital converters, in fact, just so they can route signal from their computers through various equally expensive bits of analog gear. At any point after digital-to-analog conversion, engineers can invoke the saturation principle noted above to induce “soft clipping,” specifically, by hitting a unit in the chain with slightly more gain than it is designed to handle in a linear manner. However, when done this way, the “soft clipping” is “printed” on its return to the computer, meaning that it is recorded as an inextricable modification of the input signal. Even if they want to, engineers cannot adjust the “soft clipping” itself after this point. Using plug-ins, however, gives engineers a means to do just that. 
In other words, engineers working with plug-ins can use an inflator or saturator to create the harmonic effect of “soft clipping” without necessarily committing to it. This allows them the option to later change their minds about inflation or saturation itself, even if only to adjust some setting by a negligible increment like 0.2 percent. And it has the added bonus of making the revision process much smoother, and more efficient, than in the analog realm, as masters can be adjusted without the additional time and labor of analog recalls.
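The difference between “hard” and “soft” clipping discussed in this section can be made concrete with a small sketch. The hard clipper below simply flattens any sample past the ceiling; the soft clipper uses a hyperbolic tangent curve, which is one common textbook choice for a gradual transfer curve and is an assumption here, not a model of tape or of any particular plug-in. All names are hypothetical.

```python
import math

def hard_clip(sample, ceiling=1.0):
    """Digital-style clipping: flatten anything past the ceiling at once."""
    return max(-ceiling, min(ceiling, sample))

def soft_clip(sample, ceiling=1.0):
    """One possible soft-clipping curve (tanh): gain eases off gradually
    as the signal approaches the ceiling. An assumption, not a tape model."""
    return ceiling * math.tanh(sample / ceiling)

# Drive one cycle of a sine wave roughly 6 dB past full scale (peak 2.0).
overdriven = [2.0 * math.sin(2.0 * math.pi * n / 32) for n in range(32)]
hard = [hard_clip(s) for s in overdriven]
soft = [soft_clip(s) for s in overdriven]

flattened = sum(1 for s in hard if abs(s) == 1.0)
print(f"samples flattened by hard clipping: {flattened}")
print(f"largest soft-clipped sample: {max(abs(s) for s in soft):.3f}")
```

The hard-clipped wave acquires flat tops the instant it crosses the ceiling, whereas the soft-clipped wave is rounded progressively and never quite reaches it.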

Exciters Another way to increase perceived loudness without necessarily maxing the peak amplitude of a signal is aural exciting. Initially, exciters were designed to



return high-end frequencies lost to the low-frequency bias of tape (tape acts as a kind of LPF with a shelving frequency that plummets ever lower with each successive dub). To return “air,” or “sparkle,” to tired and worn master tapes lacking in content above about 10 kHz, Aphex introduced its Aural Exciter in the mid-1970s. The unit was once so popular that it was rented to engineers on a “cost per minute of recorded music” scale. Aphex’s exciter was simple in design, mirroring inflation and saturation in many interesting ways. The exciter places a HPF in the chain, minimally distorts the filtered signal and then blends it back with the input signal. Like the saturator and inflator, then, the exciter adds additional harmonic components to signal, but it does so in a much more spectrally selective manner. As such, its effect is to alter the way the spectral balance of signal is perceived by listeners, much more than altering the overall perceived loudness of a mix (though both can be achieved at the same time, of course). And many engineers now use exciters to shape transient detail rather than just to “top signal.”
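The three-stage design described above (high-pass filter, mild distortion of the filtered signal, blend with the input) can be sketched as follows. This is a rough illustration under stated assumptions, not Aphex's circuit: the filter is a basic first-order high-pass, the distortion is a tanh curve, and the cutoff, drive, and mix values, like the function names, are hypothetical.

```python
import math

def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order RC-style high-pass filter (a simple stand-in HPF)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

def excite(samples, cutoff_hz=5000.0, sample_rate=48000, drive=4.0, mix=0.2):
    """Filter the highs, distort them slightly, blend them back in."""
    highs = one_pole_highpass(samples, cutoff_hz, sample_rate)
    distorted = [math.tanh(drive * h) for h in highs]
    return [dry + mix * wet for dry, wet in zip(samples, distorted)]

def sine(freq_hz, amplitude=0.5, sample_rate=48000, count=2400):
    """Generate a test tone."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

def added_level(dry, wet):
    """Largest per-sample change the exciter made to the signal."""
    return max(abs(w - d) for d, w in zip(dry, wet))

low, high = sine(50.0), sine(10000.0)
low_added = added_level(low, excite(low))
high_added = added_level(high, excite(high))
print(f"added to 50 Hz tone: {low_added:.4f}, to 10 kHz tone: {high_added:.4f}")
```

Because only the filtered band is distorted, the low tone passes almost unchanged while the high tone picks up new content, which is the spectral selectivity that separates exciting from broadband saturation.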

Figure 3.4  The “native” exciter in LogicX.



Playlist Title: Understanding Records: Audio Concepts & Demonstrations Track: 25. Exciters at Master Level The unprocessed premaster we have been using for this chapter sounds twice, then once more with an exciter applied in a very exaggerated way, and then once more without any processing.

Bass maximizers Thus far we have considered tools which tend to emphasize frequency content in the midrange and above. Yet, many mastering engineers deploy similar principles to modify the bass profile of a mix as well. I will consider just one of these here, namely, the so-called “bass maximizer.” Bass maximizers follow the same principle as do inflators, saturators, and exciters; specifically, they emphasize and add content which is harmonically related to the input signal—only bass maximizers do this to tilt the spectral balance of a signal in favor of the bass somehow. The way they do this provides engineers with a powerful alternative to equalization. While simply boosting the low-frequency content of a mix may, indeed, have the desired effect of maxing the bass, it may well be—in fact, it often turns out—that boosting the lowest regions of a mix simply doesn’t work. Boosts of this sort can easily muddy a mix and induce masking, impede dynamic inflation regardless of deft side-chain filtering, or simply induce distortion. Bass maximizers thus emphasize second, third, and fourth harmonic partials of the fundamental frequencies contained in the subsonic region of a mix, following a principle similar to “feathering.” This allows engineers to attenuate the bass region of a mix, should they so desire, even while adding harmonic emphasis to it (it also allows them to articulate the subsonic regions of a mix on modern speakers and earbuds, the vast majority of which jettison an extended bass response in favor of other marketing concerns). In this way, a mix’s lowest components can be attenuated to allow for the dynamic inflation most modern markets demand, even while mastering engineers nevertheless produce a master rich in punchy and robust bass detail.
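The harmonic arithmetic behind this approach is easy to work through. The sketch below lists the second, third, and fourth partials of a bass fundamental and checks which of them a small speaker could reproduce. The 40 Hz fundamental and the 100 Hz speaker roll-off are invented figures for illustration, and the function names are hypothetical; no actual bass-maximizer algorithm is implied.

```python
def harmonic_partials(fundamental_hz, orders=(2, 3, 4)):
    """Frequencies of the given harmonic partials of a fundamental."""
    return [fundamental_hz * order for order in orders]

def reproducible_partials(fundamental_hz, speaker_cutoff_hz, orders=(2, 3, 4)):
    """Partials that sit at or above a speaker's low-frequency limit."""
    return [f for f in harmonic_partials(fundamental_hz, orders)
            if f >= speaker_cutoff_hz]

# Hypothetical case: a 40 Hz bass note played on earbuds that roll off
# below 100 Hz. The fundamental is lost, but some partials survive.
partials = harmonic_partials(40.0)
survivors = reproducible_partials(40.0, speaker_cutoff_hz=100.0)
print(partials)
print(survivors)
```

In this hypothetical case the fundamental and its second partial fall below the speaker's range, but the third and fourth partials remain audible and continue to imply the bass note.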

Mid-side processing: Determining width and depth Aside from loudness and spectral profile, mastering engineers also often make final determinations about the width and depth of a mix. I’ve already looked at



mid-side (M/S) processing, and the role it plays in equalization and, consequently, the perceived width of a mix, in the section on equalization at the outset of this chapter. However, it is crucial that readers understand the central role M/S processing plays in modern mastering beyond simply spectral balancing. In fact, some mastering engineers, myself included, turn to M/S units when they work far more often than they do EQs and, depending on the genre, even more often than they use compressors. And this is because of the great powers M/S units grant mastering engineers to deconstruct and reassemble mixes from various spectral and dynamic points of view. M/S units typically extract mid (the sum of left and right, or mono) and side (the difference between left and right, or stereo) channel information from a stereo mix and allow engineers to adjust them separately. For instance, the simplest M/S units (Waves’ Center plug-in is a good place to start for the novice engineer) allow engineers to (i) pull the lows to the center, and the highs to the sides, of a mix; (ii) vary the dynamic relationship between the mid and side channels through straightforward boosts and attenuations; and (iii) apply an inflation curve selectively to either the mid or side channels. More advanced M/S units, such as Brainworx’s bx_XL, provide even more layers of control. On top of divvying the mix into mid and side channels, this unit further dissects the mid channel into “low” and “high” bands. Engineers can separately saturate, attenuate, and compress all three channels, side-chain any one channel under another, sum to mono any frequencies below a selected corner value using an adjustable slider that spans the entire audible spectrum (i.e., from 20 Hz to 20 kHz), and then sum the three channels back into a summing M/S matrix, which allows for even further modifications. Engineers can then boost or attenuate the volume of the summed mid and side channels, for instance, and even pan the mid channel anywhere they like along the horizontal plane of the mix.
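The sum-and-difference arithmetic that underlies M/S processing can be sketched directly. The version below uses one common convention, mid = (L + R) / 2 and side = (L − R) / 2; scaling conventions vary between tools, so treat the factors, like the function names, as assumptions. Nothing here reproduces the Waves or Brainworx units.

```python
def ms_encode(left, right):
    """Split a stereo pair into mid (sum) and side (difference) channels."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Rebuild left and right from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def adjust_width(left, right, mid_gain=1.0, side_gain=1.0):
    """Scale mid and side independently, then return to left/right."""
    mid, side = ms_encode(left, right)
    mid = [m * mid_gain for m in mid]
    side = [s * side_gain for s in side]
    return ms_decode(mid, side)

# A tiny made-up stereo excerpt (values chosen to be exact in binary).
left = [0.5, 0.25, -0.5]
right = [0.5, -0.25, 0.0]

round_trip = ms_decode(*ms_encode(left, right))
wider = adjust_width(left, right, side_gain=2.0)
mono = adjust_width(left, right, side_gain=0.0)
print(round_trip, wider, mono)
```

Encoding and decoding with unity gains returns the original channels, boosting the side channel widens the image, and muting it collapses the mix to mono, which is why a single pair of gain controls is enough to change perceived width.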

Mid-side processing: Corrective and creative M/S processing can be useful both as a corrective and creative measure. It is incredibly useful for fixing errors in a mix, to be sure. However, it is also useful for adjusting a mix in any number of additive and creative ways. On the corrective end, engineers turn to M/S processors when their correlation, goniometer, or



Figure 3.5  The Waves Center plug-in, with bass set for center, sides boosted, and the high and punch settings set all the way for “side.”

vectorscope meters point to phase incoherences in a mix. Such incoherences will generally result in destructive cancellation when the mix is collapsed to mono, which occurs far more often than many novice recordists seem to understand. More creatively speaking, advanced M/S units provide mastering engineers an astonishing amount of control over aesthetic details in a mix. If a mix features overly sibilant vocals, for instance, even while the cymbals sit perfectly in the same frequency band, mastering engineers can use M/S processing to tame the sibilant “ice-picks” in the center channel while leaving



the cymbals completely untouched at the sides. Likewise, if the instruments surrounding a vocal track threaten to overwhelm it, engineers can easily bring it to the fore with a quick attenuation of the side channel or a boost in the middle, or by compressing the mid channel without adjusting the sides at all. M/S processing also provides engineers a means of modifying the perceived depth of a mix. Indeed, every time engineers adjust levels on an M/S processor they adjust not only the width but also the depth of a mix. Adjusting the dynamic contour of mid channel information has the added effect of moving centered tracks closer or further along the fore-aft of a mix. Boosting a vocal track in the center of a mix, for instance, moves it slightly closer in proximity to the listener, while attenuating it moves the vocal track in the opposite direction. When combined with bandpassed reverberation, engineers can even move vocal tracks slightly back along the fore-aft of the mix, even as they lend them just the right amount of ambience to make them stand out with greater presence. In fact, engineers can apply broadband reverberation to a mix in its entirety, but this presents a major danger, as reverberation is consequently applied across the audible spectrum, including to the muddying lows. That said, broadband reverberation of this sort is commonly used to tailor fades and to decrease the overall proximity of tracks during cross-fades rather than to alter the profile of any particular mix component (EDM DJs use this tool to great effect when cross-fading between tracks in their live sets, for instance).
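The corrective case raised in this section, phase incoherence that only reveals itself when a mix is collapsed to mono, can be illustrated numerically. The sketch below computes a simple normalized correlation between the left and right channels, roughly the quantity a correlation meter displays, and then sums the channels to mono. The formula is the standard normalized cross-correlation at zero lag; the function names and test tone are hypothetical, and real meters add time-windowing that is omitted here.

```python
import math

def correlation(left, right):
    """Normalized correlation: +1 fully in phase, -1 fully out of phase."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy else 0.0

def mono_sum(left, right):
    """Collapse a stereo pair to mono."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]

# A test tone, and the same tone with its polarity flipped.
tone = [math.sin(2.0 * math.pi * n / 32) for n in range(64)]
inverted = [-s for s in tone]

in_phase = correlation(tone, tone)
out_of_phase = correlation(tone, inverted)
collapsed_peak = max(abs(s) for s in mono_sum(tone, inverted))
print(in_phase, out_of_phase, collapsed_peak)
```

A reading near −1 is the warning sign: in this extreme example the polarity-flipped pair sounds perfectly loud in stereo yet cancels to silence the moment it is summed to mono.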

De-noising: Single-ended and otherwise Though there are other mastering tools I could consider here, I’ll conclude this section by briefly considering so-called “de-noising.” It can, and often does, fall to the mastering engineer to remove broadband and sustained noises in a mix (e.g., 60 cycle hum, tape hiss, air hiss from overly compressed tracks, etc.) as well as impulsive and tonal varieties of noise (e.g., resonances, crackle and pops, clicking sounds, etc.). These noises emerge from any number of sources, but the most common culprits include: dirty power in the tracking environment; acoustic deficiencies in the tracking and mix environments; poor or noisy tracking circumstances; cheap gear prone to distortion; poorly maintained instruments; and so on. Regardless of who is responsible, and no matter how loudly mastering engineers insist that the best results in de-noising come from addressing noises before rendering the premaster, nowadays it almost inevitably



falls to mastering engineers to remove noise from mixes. And de-noising has become an aesthetic art in and of itself as a consequence of this cultural dictum. De-noising is not as straightforward as it sounds. The “forensic” tools used to de-noise a track are often prone to artifacting when pushed to any substantive degree, an issue only further compounded by the tendency of modern mix engineers to submit overly loud premasters. Common de-noising artifacts include (i) a loss of ambience and stereo separation; (ii) comb filtering and other phasing noises (so-called “gremlins”); and even (iii) low thumping and popping. Moreover, as immediately apparent noises are removed from a mix, a process known as “single-ended” noise reduction, other suddenly “unmasked” noises leap emphatically to the fore. Remove hiss, for instance, and all of a sudden you may hear the low-level thumps of the lead singer kicking the mic stand. Remove crackle and, suddenly, click track leakage in the drum overheads can be exposed. It’s a lot like peeling the layers of an ontologically ambiguous onion; ever more intricate patterns of distortion and noise emerge the deeper engineers dig. All the same, de-noising is quickly becoming one of the most important tools in the modern mastering engineer’s toolbox. This is largely because tracking and mixing no longer occur in the professionally treated environments they once most often did, nor is the technology used for tracking and mixing nearly as professionally optimized as it was before the early 2000s. In short, the more the “project” paradigm of record production becomes ensconced as a standard approach in tracking and mixing, the more regularly mastering engineers are called on to de-noise. Bob Katz perhaps puts it best: With modern mixes, a lot of project studio mixing rooms are not as quiet as professional studios; air conditioner rumble, airflow noise, and fans in computers cover up noises in the mix. 
Regardless, the mix engineer should be concentrating on other things besides whether the singer produced a mouth tic. Consequently, when the mix arrives at the quiet mastering suite, we notice problems that escaped the mix engineer—click track leakage, electrical noises, guitar amp buzz, preamp hiss or noises made by the musicians during the decay of the song. We use our experience to decide if these are tolerable problems or if they need to be fixed. Hiss which can be traced to a single instrument track is more transparently fixable at mix time; I ask the mix engineer to send me the offending track for cleanup, then return it for a remix. Or I may suggest a remix,



bringing his attention to vocal noises between syllables, which he can mute. But clients don’t always have the luxury or time to remix, and so mastering houses have the most advanced noise reduction tools which affect the surrounding material as little as possible.28

A quick note on the peculiar art of remastering Remastering is the process of creating new masters for records which were already mastered at some point before. Though skeptics often reduce the process to a cheap marketing ploy (and, indeed, labels have released remastered and repackaged records in a cynical effort to sell the same product to consumers twice), records are actually remastered for a number of good reasons, each of which has artistic merit in its own right. Technical improvements will sometimes come along and provide engineers with a means of achieving greater fidelity in their transfers. Remasters in this case offer a higher signal-to-noise ratio, that is, a lower noise floor, than previous masters. Advancements in de-noising may likewise give reason to revisit original masters. Similarly, the emergence of new formats will often prompt a remaster, and engineers are then tasked with rebalancing old masters to suit the sonic biases of new technology. On a more straightforwardly aesthetic level, remasters are sometimes done to modernize outdated balances; engineers rejig and realign older dynamic and spectral contours, and horizontal imaging, to suit present taste. Remasters are also made to correct lopsided balances, which were set when a medium was new and not well understood by the industry (many digital masters released during the early years of the CD era, for instance, were remastered a decade or so later, when engineers had more experience working in the medium). And, sometimes, remasters are done simply to offer a “new spin” on an old classic, often to celebrate some historical milestone in the legacy of a band or record (John Davis’s recent remaster of Led Zeppelin’s Physical Graffiti (1975), overseen by Jimmy Page in honor of the record’s fortieth anniversary, springs immediately to mind). Whatever the impetus, and despite these varying degrees of artistic legitimacy, remastering remains controversial in the industry. 
Some listeners and critics talk as though tinkering with classic records constitutes a kind of Orwellian molestation, with remasters interfering with the historical record to suit the caprice of fleeting faddish fashion. We’ll call this the “Han shot first” perspective.



In the original Star Wars (1977) movie, when Han Solo is famously confronted by Greedo in the Mos Eisley cantina, he shoots the bounty hunter and quickly escapes to the Millennium Falcon. In the twentieth-anniversary special edition of the film, however, director George Lucas inserted a first shot from Greedo, feeling that the original script cast Han Solo as a cold-blooded killer. While some viewed the change with actor Harrison Ford’s indifference—“Who shot first? I don’t know and I don’t care,” Ford famously quipped—others were so appalled by the rewrite that a petition was created and circulated online. If Greedo shot first, the complaint ran, Han Solo seems less morally ambiguous at the start of the trilogy and, thus, his later transformation from antihero to hero loses its power and meaning. Responding to this criticism, Lucas actually altered the scene once more for the 2004 DVD release, changing the timing so shots fire from Solo and Greedo at almost exactly the same time. While there may be some merit to the “Han shot first” argument, many critics simply adopt this point of view on remastering as a matter of principle, believing that labels ought not to tinker with cultural milestones. Classic records, the argument runs, should withstand the test of time without rejigging, retaining their merit just fine when left the way the world first heard them. Any attempt at adjusting the spectral balance and dynamic contour of older albums represents an Orwellian rewriting of history, from this rather skeptical perspective—remastering in this case initiating a kind of musical “butterfly effect” whereby listeners eventually lose any indication as to how—or, even, whether—a remaster sonically relates to what artists first approved. 
Don Was, discussing his work remastering Rudy Van Gelder’s Blue Note records, succinctly articulates this point of view: Rudy [Van Gelder] mastered for vinyl: he added some EQ, just so the phonograph needle wouldn’t skip and certain sonic peaks wouldn’t mess with the needle. In doing that, he altered the sound and that’s the sound everybody knows and loves. Who are we to editorialize this stuff, 50 years after the fact? What is the standard you remaster to? . . . The original vinyl was the standard. That’s what people bought and that’s what people loved.29

“Han shot first” caveats aside, many listeners appreciate the work of remastering engineers regardless of the remastered record’s historicity. Each remaster deserves to be evaluated on its own terms, after all, based on its unique musical attributes and individual sonic merits. In fact, despite the cynicism Apple’s new “Mastered for iTunes” and other high-resolution formats have confronted



of late—“is high-resolution music a gimmick?,” wonders a recent article from the Pulitzer Prize–winning The Guardian Magazine—many listeners have nonetheless embraced high-resolution formats, and seem willing to pay for remasters made specifically for them. Even if record labels actually commission remasters just to wring more cash out of an old product, mastering engineers still care deeply about the aesthetic outcomes of remastering. And they will work to develop unique transfer techniques designed specifically to optimize records for each particular format, for however long they are commissioned to work. In short, high-resolution formats can represent a shameless marketing ploy and a vast improvement in sound quality at the very same time. It says something about the way we currently value music, as a culture, that attempts to improve sound quality are greeted with such skepticism, the vast artistry of remastering reduced in this case to redundant marketing gimmickry. In reality, the work of remastering for high-resolution formats is no different than what the first transfer engineers did when transferring tape to vinyl. As most professional mastering engineers will readily admit, in fact, often the best remaster emerges when engineers stop thinking in terms of rebalancing and rejigging at mix level for aesthetic reasons and instead adopt a straightforward transfer mindset. When he works toward the MFiT format, for instance, Bernie Grundman argues that remastering is not an equalization or compression thing at all. It’s a process of lowering the level into the encoder for the AAC file, and by lowering the level into the encoder, Apple has shown us that it creates much less distortion and can make a much more accurate encode to the point where a well-made Mastered for iTunes file sounds closer to the 24-bit master than the 16-bit compact disc does.30

Regardless of their perspective on the ethics of remastering, most engineers agree that the process has evolved in the last few decades into a legitimate subtechnique in its own right, with its own peculiar aesthetic priorities, values, and techniques. As Nick Krewen explains: In the early days of the digital music era in the ’80s, remastering recordings for release on CD became common and a new sticker adorning the jewel case proclaiming “remastered” was often considered a selling point. But despite a promise to deliver improved sonics, digital technology was still in its relative infancy, and many audio experts with “golden ears” found the resulting sounds of the remastered CD recordings harsh, cold and generally not as good as the original analog recordings. Today’s remastering process is something entirely



different. Now, with a variety of formats available to the listener, dedicated record company historians and production engineers, along with talented mastering engineers, listen attentively to original masters and incorporate the latest advancements in digital technology with the intention of duplicating—or, in some cases, improving—the sound that artists intended for their works.31

The transfer process: Balancing for output(s) When they are done tweaking mixes, engineers turn their attentions to the very activity which has defined mastering from its start: the transfer process. In this case, though, they do not necessarily transfer a mix from one medium to another so much as they transfer it between various digital states (between, say, a LogicX session file and a stereo-interleaved mp3). As such, the crucial role that the transfer process plays in modern mastering may not be immediately apparent at first blush, and readers might well think the transfer tradition is dead. But they would be dead wrong! The music industry’s recent move to primarily “virtual” distributions, such as via streaming services like Spotify, Apple Music, and YouTube, to name a few, has made the transfer tradition more vital than ever. Engineers make records using any of a number of DAWs, and those records find release in a multitude of digital formats. Engineers must balance the mixes they master to suit the biases of each of these outputs, as best they can, and this sort of tailoring has always been the goal of the transfer process. In my own practice, I am often asked to render unique masters for untethered downloading on iTunes and Bandcamp, video streaming, streaming via apps like Spotify and Apple Music, playback via public address in clubs, and so on. When a club master is requested, I am much more aggressive in monoing the lows, as it were, than I would be when rendering masters intended for streaming apps like Spotify or Tidal, for instance. Similarly, when a master is intended for untethered downloading via sites like bandcamp.com, I usually push its average loudness further than I would that of masters sent for distribution to Apple Music, which I have already noted uses the remarkably conservative loudness-normalization target of −16 LUFS. 
Regardless of which specific tweaks I make, though, when I render a master specifically for a distinct digital format, that is, when I modify a mix so it sounds optimal in whichever format it is intended for, I engage in the “transfer” process.
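
The arithmetic behind a loudness-normalization target is simple enough to sketch. What follows is a minimal illustration, assuming that a service measures a master's integrated loudness in LUFS and applies one static gain offset to reach its target. The −16 LUFS default is the Apple Music figure cited above; the function names and the `allow_boost` switch (whether quiet masters get turned up) are hypothetical and do not describe any particular platform's documented behavior.

```python
def normalization_gain_db(measured_lufs, target_lufs=-16.0, allow_boost=False):
    """Return the static gain (in dB) a normalizer would apply to hit the target."""
    gain_db = target_lufs - measured_lufs
    # Assumption: some playback systems only turn loud masters down and
    # leave quiet ones alone; allow_boost=True models the other policy.
    if gain_db > 0 and not allow_boost:
        return 0.0
    return gain_db


def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# A hypothetical "loud" master measured at -9 LUFS, played against a -16 LUFS target.
offset = normalization_gain_db(-9.0)
print(f"{offset:+.1f} dB, x{db_to_linear(offset):.3f} amplitude")
```

Under these assumptions, any average loudness pushed past the target is simply turned back down on playback, which is one reason a master bound for a normalized service can be left more dynamic than one bound for a download store that applies no normalization.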


In fact, engineers now have a number of specialized tools at their disposal which are specifically designed to help them navigate the increasing complexity of the modern transfer tradition. Though untethered downloads are quickly going the way of the dinosaur, as it were, Apple’s Mastered for iTunes (MFiT) droplet is a good example of such a specialized tool. The MFiT droplet renders high-resolution .m4a files from .wav files, which are meant to reflect audible modifications that will occur when the 96 kHz masters that mastering engineers ultimately submit to iTunes for distribution are finally rendered as Mastered for iTunes (MFiT) files, available for download and playback on consumer devices. Sony-Oxford’s new Codec Toolbox does similar things to tracks, but for a multitude of formats beyond MFiT, and it is designed for use in session files themselves, which is to say, it figures in the mastering process before transfer actually occurs. Once activated on the stereo bus of a session file, the Codec plug-in allows engineers to hear how any number of adjustments will fare when the track is rendered to different formats (the plug-in’s “clip safe” mode also provides a good countermeasure against intersample peaks, which may accrue in one format but not another).32
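
The intersample peaks mentioned above can be illustrated with a small sketch. It assumes a simple Hann-windowed sinc interpolator as a stand-in for the reconstruction filter in a converter or codec; the function names, the 4x oversampling factor, and the tap count are illustrative choices, not the method any particular plug-in uses. The test signal is a sine at one quarter of the sample rate whose samples all land at about 0.707 of full scale even though the underlying waveform reaches 1.0.

```python
import math


def sample_peak(samples):
    """Largest absolute sample value (what a plain peak meter reads)."""
    return max(abs(s) for s in samples)


def true_peak_estimate(samples, oversample=4, taps=24):
    """Estimate the peak of the reconstructed waveform between samples.

    Uses Hann-windowed sinc interpolation; this is an illustrative
    approximation, not a standards-compliant true-peak meter.
    """
    n = len(samples)
    peak = 0.0
    for i in range(n):
        for k in range(oversample):
            t = i + k / oversample  # fractional sample position
            acc = 0.0
            for j in range(max(0, i - taps), min(n, i + taps + 1)):
                d = t - j
                if d == 0:
                    weight = 1.0
                else:
                    sinc = math.sin(math.pi * d) / (math.pi * d)
                    hann = 0.5 * (1 + math.cos(math.pi * d / (taps + 1)))
                    weight = sinc * hann
                acc += samples[j] * weight
            peak = max(peak, abs(acc))
    return peak


# Sine at fs/4 with a 45-degree phase offset: every sample sits near 0.707.
signal = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(64)]
print(sample_peak(signal), true_peak_estimate(signal))
```

A sample-peak meter would report roughly 3 dB of headroom on this signal, while the interpolated estimate shows the waveform itself reaching approximately full scale. A gap of this kind is what can cause clipping once a master is converted to another format.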

Metering

Meters are another tool that has always helped engineers gauge the viability of their masters for transfer to different formats. That said, metering remains one of the most controversial subjects in modern mastering. As already noted, many mastering engineers blame the rejection of VU for digital meters, and the latter’s dependence on “full-scale” decibel weighting, for the general pulverizing that dynamic ranges have taken since the heyday of the loudness wars in the early and mid-2000s. Still other engineers consider our modern over-dependence on metering in mastering yet another symptom of the overt ocularcentrism that has overtaken record production. Producers now work on computers and spend most of their time staring at computer screens, critics complain. Over time, they have learned to trust visual over audible information, and they work accordingly. Their fades follow a visually symmetrical curve rather than any sort of aural or aesthetic paradigm, or so the argument runs, and many recordists judge the quality of masters now by the readings they produce on whichever meters they like best rather than by the way they strike the ear (I have personally experienced this on a number of occasions and have even dropped clients completely when they became overly focused on tracks hitting certain target numbers on their meters).

Not surprisingly, given how extreme the record industry can be, some mastering engineers have responded to the ocularcentric bias of digital audio by disavowing metering completely. In fact, I once heard from a very successful mastering engineer that he believes his best work was done only after he had painted over the VU meters in his studio so he could no longer see them. This is going too far, in my opinion, and throws the baby out with the bathwater. And it neglects the reality that, with the new loudness-normalization standards in streaming, there are specific LUFS targets to which mastering engineers should indeed “peg” their masters.

Pegging meters

Visual information is only ever meant to confirm what engineers hear, not to hamper their hearing; most mastering engineers are indeed capable of using meters without becoming their slaves. Multimeters and spectral analyzers, like Spectrafoo, provide loudness information parsed across the audible spectrum, which engineers can use to accurately seek and destroy rogue frequencies and resonances not immediately apparent on their monitoring setup. Goniometers and correlation meters likewise provide diagnostic data for gauging the phase coherence of a mix. In fact, meters can, and usually do, provide invaluable information to engineers while they work. It is only engineers who single-mindedly insist on clearing masters pegged to particular target values, whether LUFS or RMS, who risk becoming enslaved by their meters. Nonetheless, it is prudent to do whatever one can to mitigate any undue influence metering might have on the mastering process (this is a real risk, though perhaps only a mastering engineer can ever understand it completely). Bob Katz probably has done more than any other engineer to ensure that his meters work for him, rather than the other way around. He has even gone so far as to introduce a metering system all his own, the so-called “K-meter,” which was ostensibly designed to rehabilitate dynamic ranges in pop.33 In general, though, engineers typically use meters as simple diagnostic tools. A glance at their loudness meter helps engineers confirm the general levels they think they are hearing, both in terms of peak and average amplitudes (and LUFS, if appropriate). And many such meters have switchable weighting systems to help them gauge appropriate levels and dynamic ranges for different outputs (engineers mastering audio for film, for instance, have a very different set of loudness specifications than do engineers tasked with rendering a club master).
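
The readings discussed above (peak level, average level, and phase correlation) reduce to a few lines of arithmetic. The following is a minimal sketch, assuming floating-point samples scaled so that ±1.0 is full scale; the function names are illustrative. Note that the RMS figure here is unweighted and is not a LUFS measurement, which adds frequency weighting and gating on top of this kind of averaging.

```python
import math


def peak_dbfs(samples):
    """Peak level in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Average (RMS) level in dB relative to full scale, unweighted."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def correlation(left, right):
    """Phase correlation between two channels, from -1 to +1."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy > 0 else 0.0


# Hypothetical test signal: ten cycles of a full-scale 1 kHz sine at 48 kHz.
tone = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
inverted = [-s for s in tone]
print(peak_dbfs(tone), rms_dbfs(tone))
print(correlation(tone, tone), correlation(tone, inverted))
```

A correlation reading near +1 indicates channels that will sum to mono without loss, while a reading near −1 indicates channels that will largely cancel when summed.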


A quick note on vinyl and audiocassette

Before concluding this study of the structure of modern mastering, I should briefly address the resurgence of once-obsolete formats like vinyl and audiocassette. Indeed, projects intended for release on vinyl and audiocassette grace the desks of mastering engineers with increasing frequency nowadays, even if both formats have yet to reach even 5 percent of the market saturation they once enjoyed during their heydays. This only complicates the modern transfer process further. Engineers are now regularly tasked with rendering digital and vinyl and/or cassette masters, meaning that many engineers must now learn the vagaries and biases of once-obsolete media, media which younger engineers might never have even encountered in their personal lives, so they can optimize mixes for transfer onto them. With audiocassette, the process is easy, at least in my experience. I usually deliver .wav files optimized for the format, meaning that they are slightly “topped” to counteract the low-frequency bias of tape, and the (mostly boutique) labels who will release the cassettes dub and oversee the transfer of those masters to tape themselves. Mastering for vinyl, on the other hand, is a vastly more complicated affair. Elliptical EQs are deployed to mono bass frequencies, the midrange is balanced in unique ways, masters are separated into discrete left and right channels, and so on. Interestingly, though, this skillset is increasingly becoming the purview of the “cutter” engineers responsible for cutting the vinyl itself. I am often hired to create final masters for vinyl, and I render them as I would for streaming (though knowing that the masters will wind up on vinyl, I usually allow for a larger-than-usual dynamic range, mono the lows aggressively, and attenuate slightly more than usual in the upper-midrange), and labels simply deliver my masters to the cutter, who then transfers them to vinyl.
Interestingly, I am usually the only credited mastering engineer on the album. In this sense, mastering for vinyl, as a skillset, is slowly becoming conceptually absorbed into the vinyl manufacturing process, which suggests a fascinating bifurcation of mastering competencies in general, but one that can’t emerge in any definitive sense unless vinyl assumes a more prominent market share and such mastering becomes routinized.
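
The "mono the lows" move that an elliptical EQ performs can be sketched in a few lines: encode left and right as mid and side, high-pass the side signal so that low frequencies survive only in the mid (mono) channel, then decode. This is a minimal sketch under stated assumptions: the first-order filter, the 300 Hz cutoff, and every function name are illustrative choices, and hardware elliptical filters may use different slopes and corner frequencies.

```python
import math


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order high-pass filter (a discrete RC approximation)."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out, prev_in, prev_out = [], 0.0, 0.0
    for s in samples:
        prev_out = alpha * (prev_out + s - prev_in)
        prev_in = s
        out.append(prev_out)
    return out


def mono_the_lows(left, right, cutoff_hz=300.0, sample_rate=48000):
    """Remove low frequencies from the side channel only."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    side = one_pole_highpass(side, cutoff_hz, sample_rate)
    new_left = [m + s for m, s in zip(mid, side)]
    new_right = [m - s for m, s in zip(mid, side)]
    return new_left, new_right


def sine(freq_hz, count, sample_rate=48000):
    """Generate a full-scale sine wave for testing."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]


# Out-of-phase 50 Hz bass (pure side content) is heavily attenuated.
bass = sine(50, 9600)
out_left, out_right = mono_the_lows(bass, [-s for s in bass])
print(max(abs(v) for v in out_left[4800:]))
```

Content that is identical in both channels has no side component and passes through untouched, while out-of-phase bass, which is troublesome for a cutting stylus, is reduced. Lowering or raising `cutoff_hz` is one way to model being more or less aggressive for a given output.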

Conclusion: Mastering as a value added for other services

The mastering industry is in a state of perpetual transformation, if not outright decline. Old business models for mastering are quickly dying, crushed under
the weight of the record industry’s broader tailspin into disintermediation and, relatedly, the ever-diminishing fiduciary value the vast majority of listeners now place on records themselves. As platinum records become more of a historical rarity with each passing year, and as the record industry itself contracts to the point of atrophying, thanks largely to the ever-diminishing returns, the figure of a specialized mastering engineer working in a dedicated mastering house seems destined for the dustbin of history—an unsustainable ancient phantasmagoria carried over from a period in history when record making could be a lucrative endeavor. This is not to say that mastering is obsolete, or that people won’t require the professional services of an audio mastering engineer anytime soon. Records will be mastered for as long as records are made. Indeed, the evidence points to a much simpler historical destination for mastering: mastering will continue to become a value addition to other services—part of the fee for, say, mixing or record production in general—until only very few people in the world claim to be audio mastering engineers per se. This is not to say that nobody will call themselves a mastering engineer in the future. People still make buggy whips, after all, and I imagine that some even make a great deal of money in so doing. However, a few people turning a profit for making or doing something doesn’t make an industry. Mastering is a service industry, meaning mastering engineers work in the record industry. Thus, the fortunes of the mastering engineer rise and fall with the fortunes of the record industry at large, and, lately, neither industry has done much of anything but plummet. Cheaper technology, the emergence of online distribution and marketing, the triumph of “virtual” formats, and the subsequent rise of the DIY online record label, have made mastering, and the quality control it represents, largely outdated, if not historically quaint. 
Unlike its “indie” precedents, the DIY online record label does not require pressing and distribution to get its records into the same points of purchase as the global superstars it competes with. Artists themselves can (and, in my professional experience, often do) simply slap an L3 Ultramaximizer on their stereo bus, render a high-resolution bounce, and upload the track to online aggregators like CDBaby and Tunecore for distribution on iTunes, amazon.com, and other web retailers. Or they use online automated services like LANDR. Traditionalists and purists may argue that the tracks they deliver are not actually “masters” until they are blue in the face (indeed, many grumps still absurdly insist that the only proper use of the term “mastering” is to describe transfer of audio from tape to vinyl), but they will
never convince clients happy with the results of DIY mastering that their tracks are in any manner diminished. This is especially true because tracks often need only to sell a few copies now to provide a viable return on investment for artists, thanks largely to savings earned through a production process pared of as many unnecessary expenses as possible. What artists deliver to online aggregators are indeed “masters,” and in the classic sense of the term, even if they are distorted beyond recognition or feature other audible traces of amateur work. A “master” is simply the prototype from which consumer copies of a record are made. Whether or not that prototype sounds any good, and whether or not it might have been significantly improved, should it have been sent to a professional mastering house, is another matter entirely, one that is entirely irrelevant from a business point of view and is, furthermore, largely subjective. In the present production era, as so many old Gods fall, no signal flow, no bit of gear, and no technique in particular, is more or less correct for mastering. I have even encountered mastering engineers who work on laptops using only earbuds, though most seem (wisely) hesitant to advertise the fact. Nonetheless, those engineers deliver viable masters to happy clients, and they accept payments happily given. Often they do this as part of some other service, whether it be production, mixing, or any other “musical services” clients are willing to pay for now; the cost of mastering is simply folded into that other service. This nonetheless makes them a professional mastering engineer like any other. They simply approach their craft from a different point of view, and they use different tools to achieve whichever aesthetic ends they and their clients deem appropriate. And there is absolutely nothing wrong with this! It is simply the way mastering is done now for many projects. 
Of course, mastering is still done in the traditional way by engineers working in traditional mastering environments as well, for however much the DIY model seems destined to become standard in the record industry, the mastering industry, such as it is, remains a large and variegated professional domain. As I noted at the outset, it is currently in a state of perpetual transformation, and I would argue that it has ever been in such a state. It would thus be dishonest to present the craft of mastering as any one particular thing right now, let alone as a settled sequence of professional and aesthetic transactions. Some of the industry is subsumed in other fields now, to be sure, as mix engineers and producers routinely invoice for mastering without sending out to dedicated mastering services. But much of it still resides in the acoustically ideal and dedicated
mastering environments in which so many of the world’s top mastering engineers tend to work. No matter where they work, what gear they use, or how they choose to use it, all engineers are caught in the same tumult as is the record industry at large. How they navigate that tumult remains to be seen.

Video Title: Understanding Records, Putting It All Together: Chapter Three—Mastering Lonesome Somedays. In this video, I provide an overview of the “quick and dirty” streaming master I prepared for “Lonesome Somedays.”


Coda (Fade-Out)

This field guide is by no means exhaustive. In many respects, I have only been able to scratch the surface of Recording Practice—and just barely at that! As I noted at the very outset of this study, Recording Practice is an immense, and immensely complicated, topic which requires volumes of encyclopedic exposition to comprehensively elucidate. That said, this field guide has provided a comprehensive survey of the most fundamental terms that recordists now use to communicate, in a way that practitioners and analysts alike should find useful. Clearly this field guide represents an experimental form of academic research. My ultimate goal in writing this field guide was to marry practice and analysis in the service of creating an improved pedagogy for popular music studies. As I worked, I continuously sought new ways to allow practice to inform my research and exposition. As a researcher who consciously maintains both analytic and creative engagements with the musics that I study, make and teach, I can think of nothing to substitute for actual practice—that is, actual music making—when it comes to generating analytic insights. Too many “researchers” who engage themselves in conversations about record production specifically, and about popular music practice in general, do so from a place of practical ignorance, knowing only what the books they read on their academic armchairs tell them about the creativity they purport to understand. I would implore future researchers not to let academics pass off second hand stories about Recording Practice as though they were authoritative or definitive statements. In fact, I would recommend that those of us who consider ourselves to be scholars of popular music continue to find a central position for creative practice in our research and pedagogy. 
Many of us will need to continue to insist, though, as we face administrators and educators hell-bent on rolling back the endless tide of progress (I once had a dean tell me that the term “production” was a typographical error on a letter of appointment we’d both signed, and that it should have read “publishing”; other colleagues have noted, in print, that “popular music doesn’t belong in a university”). Despite the prevailing view in many university music departments, theory and practice aren’t inherently exclusive. To recognize this, we simply need to rethink those armchair academic paradigms that put a taboo on studying and making popular music in the first place.


Though this field guide is by no means exhaustive, the techniques I consider throughout recur constantly in recorded musical communications. Yet they remain conspicuously absent from the lion’s share of research on record production per se and, for that matter, remain unnoticed in research on popular music at large. Lacunae are to be expected in a field as young and diffuse as popular music studies. However, the present paucity of research on specific record production techniques signals the field’s institutional basis much more than its nascent state, in my opinion. Insightful and challenging studies of record production have indeed emerged in the last two decades, but these studies usually address the analytic priorities and concerns of disciplines that remain completely uninterested in musical technique itself, like cultural studies, sociology, media studies, cultural anthropology, and political economics. If it is mentioned at all in these fields of study, record production itself usually only registers as a site for social, cultural, and industrial struggle; only a select few researchers offer much in the way of musical and technical details about the production process. Not surprisingly, then, what engineers do to make recorded musical communications has failed to register as a fundamentally musical concern in the vast majority of published research on the subject. Though one might reasonably expect technical manuals, trade journals, and audio-engineering textbooks to “fill in” the missing musical information I note, most do not address straightforwardly musical topics. Texts in this category usually only sketch the technical capacities of modern recording technologies and provide a basic psychoacoustic rationale for a few core uses. Or they rehearse broad procedural chronologies for certain historically significant recording sessions, without considering the larger aesthetic paradigms that recordists deploy their musical practice to service. 
Even as musicologists and music theorists turn their analytic attentions to pop records, they remain largely fixated on musical details that can be notated (i.e., pitch relations, formal contour, metered rhythms, harmonic design, and so on). This is especially ironic given that so many analysts ostentatiously reject notation as an analytic tool now. In other words, no matter how avant-garde their methods, most musicologists and music theorists remain fixated on the very same musical details they have always analyzed. They simply disagree over how best to interpret those details now. Recording Practice itself, that is, musical practice of recording technology, continues to register in a very small, albeit growing, collection of articles and books.

The need for a research program designed to illuminate Recording Practice as musical practice is urgent. As Albin Zak put it, almost two decades ago now, “record making is a recent art form, and many of its artistic roles belong to no prior tradition . . . we know what songwriters do, but what about sound engineers?”1 Scholars have tried to answer this very question using what seems like every disciplinary and interdisciplinary tool available, and yet Recording Practice in and of itself remains stubbornly absent from the lion’s share of published research. In my opinion, it will remain absent until a unified “disciplinary” approach to analyzing record making (and not just records) finally emerges, an approach that conceives, and explains, musical practice of recording technology—say, tweaking the release time setting on a side-chained compressor or a keyed gate—as musical communication per se. Records present listeners with music; this much we know. Until analysts can connect that music to a concrete corpus of embodied procedures, though, Recording Practice itself is likely to remain the geeky “gearslut” of musical study—a crucial species of modern musical communications which analysts routinely devalue as nothing more than a technical support, something like scaffold building, for the “true arts” of performance and composition. It is my ultimate hope that readers of this field guide feel empowered, and inspired, to change this circumstance.


Notes

Introduction
1 Maureen Droney, Mix Masters: Platinum Engineers Reveal Their Secrets for Success. Boston: Berklee Press, 2003, p. 145.
2 Virgil Moorefield, The Producer as Composer. Cambridge, MA: MIT Press, 2005, p. 29.
3 Alexander Case, Sound FX: Unlocking the Creative Potential of Recording Studio Effects. Boston: Focal Press, 2007, p. xix.

Chapter 1
1 Case, Sound FX, p. 232.
2 Cited in Greg Pedersen, “Bob Ezrin: I Was a Teenage Record Producer,” Electronic Musicians Magazine (April 2007). Available online at: emusician.com/em_spotlight/bob_ezrin_interview/index.html
3 Bruce Swedien, In the Studio with Michael Jackson. New York: Hal Leonard, 2009, pp. 120–21.
4 Kevin Ryan and Brian Kehew, Recording the Beatles: The Studio Equipment and Techniques Used to Create Their Classic Albums. Houston: Curvebender, 2008, p. 164.
5 Swedien, In the Studio with Michael Jackson, pp. 22–23.
6 Cited in Pedersen, “Bob Ezrin.”
7 Ryan and Kehew, Recording the Beatles, p. 376.
8 Ibid., p. 377.
9 Cited in Pedersen, “Bob Ezrin.”
10 Swedien, In the Studio with Michael Jackson, p. 122.
11 Ibid., p. 19.
12 Steve Rosen, “Interview with Jimmy Page,” Modern Guitars Magazine (1977). Available online: modernguitars.com/archives/03340
13 Jake Brown, Rick Rubin: In the Studio. Toronto: ECW Press, 2009, p. 192.
14 “Impedance,” or “Z,” measures the amount of resistance to the flow of electrons presented by physical media in an audio chain, such as electric guitar cables and preamplifier circuits.

15 Ryan and Kehew, Recording the Beatles, p. 156.
16 For any gear nerds reading right now, the Synthi-A was a compact version of the EMS VCS3 synthesizer, but with a prototype 250-step digital sequencer attached.
17 Adapted from David Franz, Recording and Producing in the Home Studio. Boston: Berklee Press, 2004, pp. 38–39.
18 As Thomas Rudolph and Vincent Leonard explain, “It would appear that the standard . . . compact disc format (16-bit sampling and 44.1 kHz sample rate) is ideal. However, the hardware that controls these signals is not perfect and can affect the overall sound quality. For example, when an analog signal is converted to a digital signal, and in the conversion process is slightly altered, a distortion called ‘quantization noise’ occurs. ‘Aliasing,’ another form of signal corruption, happens when the analog-to-digital converter misreads the signal and produces a lower frequency. To reduce signal distortion, some systems record at sampling rates of 44.1 kHz but with 20- or 24-bit resolution. This allows for greater dynamic range and reduced distortion or noise during production. However, before the CD is recorded, the signals must be ‘dithered’ down to 16-bit. ‘Dithering’ is the name given to the process of reducing the number of bits in each digital word or sample. This process involves adding some white noise to the signal. Using a process called ‘noise shaping,’ this noise is usually switched to areas of the audio spectrum to which our ears are less sensitive, above 10 kHz.” In Thomas Rudolph and Vincent Leonard, Recording in the Digital World: Complete Guide to Studio Gear and Software (Berklee Guide). Boston: Berklee Press, 2001, p. 4.
19 This is an excerpt from a longer citation I included in Larry Starr, Christopher Waterman, and Jay Hodgson, Rock: A Canadian Perspective. Toronto: Oxford University Press, p. 438.
20 Cited in Luna Benni, “Steve Albini: An Interview with a Wizard of Sound,” Luna Kafé Interviews (2002). Available online: lunakafe.com/moon73/usil73.php
21 Greg Milner, Perfecting Sound Forever: An Aural History of Recorded Music. New York: Farrar, Straus & Giroux, 2009.

Chapter 2
1 Moorefield, The Producer as Composer, p. 1.
2 Bobby Owsinski, The Mixing Engineer’s Handbook, 2nd edition. New York: Cengage, 2006, p. 3.
3 John McDermott, with Eddie Kramer and Billy Cox, Ultimate Hendrix: An Illustrated Encyclopedia of Live Concerts and Sessions. New York: Backbeat Books, 2009, p. 44.
4 Ryan and Kehew, Recording the Beatles, p. 466.

5 Ruth Dockwray and Alan Moore, “Configuring the Sound-Box 1965-1972,” Popular Music 29 (2), 2010, pp. 181–97.
6 Rooey Izhaki, Mixing Audio: Concepts, Practices and Tools. Boston: Focal Press, 2008, p. 71.
7 For more on Ludvig’s credits in particular, see his IMDB page: https://www.imdb.com/name/nm2014977/
8 Daniel J. Levitin, This Is Your Brain on Music: The Science of a Human Obsession. New York: Plume, 2006, p. 21.
9 John Andrew Fisher, “Rock and Recording: The Ontological Complexity of Rock Music,” in Musical Worlds: New Directions in the Philosophy of Music, ed. P. Alperson. Pennsylvania: University of Pennsylvania Press, pp. 109–23.
10 I have always liked what Alexander Case has to say about this: “The most important music of our time is recorded music. The recording studio is its principle musical instrument. The recording engineers and music producers who create the music we love know how to use signal-processing equipment to capture the work of artists, preserving realism or altering things wildly, as appropriate. . . . Equalization is likely the most frequently used effect of all, reverb the most apparent, delay the most diverse, distortion the most seductive, volume the most underappreciated, expansion the most under-utilized, pitch shifting the most abused, and compression the most misunderstood. All effects, in the hands of a talented, informed and experienced engineer, are rich with production possibilities.” In Case, Sound FX, pp. xix–xx.
11 Franz, Recording and Producing in the Home Studio, p. 190.
12 Bill Gibson, Compressors, Limiters, Expanders and Gates. Michigan: ProAudio, 2005, p. 5.
13 Ibid., p. 9.
14 I should also note that both compressors and limiters have gain functions which influence how much attenuation of an audio signal ultimately obtains through each process.
15 Doug Eisengrein, “Do the Dip: Sidechains,” Remix Magazine (2007). Available online at: remixmag.com/production/tips_techniques/columns/sidechaining_signal_swapping
16 Cited in Richard Buskin, “Classic Tracks: Chic ‘Le Freak,’” Sound on Sound (April 2005). Available online at: www.soundonsound.com
17 Again, Alexander Case has a great passage on this subject: “The alert reader may note that a compressor with a side-chain input should achieve nearly the same thing [as a ducker]. The ducking process . . . is attenuating a signal when the threshold is exceeded—the very goal of compression. Background music is attenuated when the voice goes above the threshold. The trouble with using compression for this effect, and the reason some . . . gates provide this feature, is the presence of that critical parameter: range. Not available on most compressors, range sets a maximum amount of attenuation, useful in many . . . gating effects. In the case of ducking, it is likely that the music should be turned down by a specified, fixed amount in the presence of the speaking voice. No matter the level of the voice, the music should simply be attenuated by a certain finite amount based on what sounds appropriate to the engineer, perhaps 10–15 dB. Compression would adjust the level of the music constantly based on the level of the voice. The amplitude of the music would modulate constantly in complex reaction to the level of the voice. Compression does not hit a hard stop because compressors do not typically process the range parameter. Therefore, look to noise gates for the ducking feature.” In Case, Sound FX, pp. 183–84.
18 Cited in Masaki Fukuda, The Boss Book: The Ultimate Guide to the World’s Most Popular Compact Effects for Guitar. New York: Hal Leonard, 2001, p. 69.
19 Cited in Dave Hunter, “Voxes, Vees and Razorblades: The Kinks Guitar Sound,” The Guitar Magazine. Available online at: www.davedavies.com/articles/tgm_019901.htm
20 Dave Hunter, “Effects Explained: Overdrive,” Gibson Online Magazine (2008). Available online at: gibson.com/en-us/Lifestyle/effects-explained-overdrive
21 Teemu Kyttala, Solid State Guitar Amplifiers (2009). Available online at: thatraymond.com/downloads/solidstate_guitar_amplifiers_teemu_kyttala_v1.0.pdf
22 Daniel Thompson, Understanding Audio: Getting the Most Out of Your Project or Professional Recording Studio. Boston: Berklee Press, 2005, pp. 245–46.
23 George Martin and Jeremy Hornsby, All You Need Is Ears: The Inside Personal Story of the Genius Who Created the Beatles. London: St Martin’s Press, 1994.
24 Geoff Emerick and Jeremy Hornsby, Here, There and Everywhere: My Life Recording the Beatles. New York: Gotham, 2006.
25 Ryan and Kehew, Recording the Beatles, p. 466.
26 Larry Meiners, “Gibson Gets Satisfaktion With Maestro Fuzz-Tone,” Flying Vintage Machine (2001). Available online at: flyingvintage.com/gcmag/fuzztone.html
27 Ibid.
28 Ibid.
29 Mastering engineers also do something like reamping these days. Tasked with re-mastering dated records, engineers routinely send the original masters through modern pre-amps to help update the sound. Of course, they do not capture the result using a mic, but the principle is similar.
30 Riku Katainen, Dauntless Studio Diary 2008 (2008). Available online at: dauntless.blogspot.com
31 Richard Buskin, “Different Strokes: Interview with Gordon Raphael on Producing the Strokes,” Sound on Sound (April 2002). Available online at: www.soundonsound.com

32 Of course, fans will know that Soundgarden’s “Black Hole Sun” is an exception to the band’s usual output. As Kyle Anderson in Accidental Revolution: The Story of Grunge puts it: “‘Black Hole Sun,’ which would win a Grammy for Best Hard Rock Performance, was a massive watershed moment for grunge, sadly occurring at the end of its reign of supremacy in the post-Cobain era. The song was a massive molten slab of melodic rock, a mid[-]period [Led] Zeppelin riff wrapped in a wave of druggy psychedelia. The lyrics seem to be about Armageddon (or some sort of cataclysmic event that seems pretty biblical), and the words match up perfectly with the music underneath, as the tension and discomfort in the verses give way to the bombast and destruction in the chorus. Curiously ‘Black Hole Sun’ is one of the few Soundgarden songs that follow the traditional quiet verse/loud chorus dynamic that grunge made famous—the band always seemed to go from being loud to being extremely loud.” In Kyle Anderson, Accidental Revolution: The Story of Grunge. New York: St Martin’s Press, p. 123.
33 Case, Sound FX, p. 98.
34 Izhaki, Mixing Audio, p. 453.
35 Richard Buskin, “Classic Tracks: ‘Heroes,’” Sound on Sound (October 2004). Available online at: www.soundonsound.com
36 Emerick and Hornsby, Here, There and Everywhere, pp. 94–95.
37 As Case explains: “Before the days of digital audio, a common approach to creating delays was to use a spare analog tape machine as a generator. . . . During mixdown, the machine is constantly rolling, in record mode. The signal is sent from the console to the input of the tape machine in exactly the same way one would send a signal to any other effects unit—using a spare track bus. That signal is recorded at the tape machine and, milliseconds later, is played back. That is, though the tape machine is recording, it remains in repro mode so the output of the tape machine is what it sees at the playback head. . . . The signal goes in, gets printed onto tape, the tape makes its way from the record head to the playback head (taking time to do so), and finally the signal is played back off tape and returned to the console. The result is tape delay.” In Case, Sound FX, p. 224.
38 “Tape can be looked at as having a LPF [low-pass filter] with its cut-off frequency dependent on the tape speed and quality,” notes Izhaki, in Mixing Audio, p. 384.
39 Wayne Wadhams, Inside the Hits: The Seduction of a Rock and Roll Generation. Boston: Berklee Press, 2001, pp. 48–49.
40 Levitin, This Is Your Brain on Music, p. 3.
41 Wadhams, Inside the Hits, p. 123.
42 Ryan and Kehew, Recording the Beatles, p. 297.
43 Izhaki, Mixing Audio, p. 388.
44 Ibid., pp. 168–69.

45 Case adds: “The modulation section of a delay unit relies on a simple LFO. Instead of modulating the amplitude of a signal, as might be done in an AM (amplitude modulation) synthesizer, this LFO modulates the delay-time parameter within the signal processor. Rate is the frequency of the LFO. Depth is the amplitude of the LFO. Shape, of course, is the type of LFO signal. . . . These three parameters give the recording engineer much needed control over the delay, enabling them to play it like a musical instrument. They set how fast the delay moves (rate). They set the limits on the range of delay times allowed (depth), and they determine how the delay moves from its shortest to its longest time (shape).” In Case, Sound FX, p. 224.
46 According to Neil Morley, “The phase of a low frequency [and] . . . a higher frequency will be phase shifted by different amount[s]. In other words, various frequencies in the input signal are delayed by different amounts, causing peaks and troughs in the output signal which are not necessarily harmonically related.” In Neil Morley, “Phasing,” Harmony Central (2009). Available online at: harmony-central.com/Effects/Articles/Phase-Shifting
47 Bruce Swedien, In the Studio with Michael Jackson. New York: Hal Leonard, 2009, p. 141.
48 According to William Moylan, “Early reflections arrive at the listener within 50[-80] ms of the direct sound. These early reflections comprise the early sound field. The early sound field is composed of the first few reflections that reach the listener before the beginnings of the diffused, reverberant sound. Many of the characteristics of a host environment are disclosed during this initial portion of the sound. The early sound field contains information that provides clues as to the size of the environment, the type and angles of the reflective surfaces, even the construction materials and coverings of the space.” In William Moylan, Understanding and Crafting the Mix: The Art of Recording, 2nd edition. Boston: Focal Press, 2007, p. 30.
49 Swedien, In the Studio with Michael Jackson, p. 125.
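The three modulation parameters Case describes in note 45 can be sketched in a few lines of Python. This is an illustration only: the function name `modulated_delay_ms`, the 20 ms base delay, and the default rate and depth values are assumptions, not settings taken from Case or from any particular delay unit. Rate sets how fast the delay time moves, depth sets how far it strays from the base value, and shape sets the path it takes between its shortest and longest times.

```python
import math

def modulated_delay_ms(t, base_ms=20.0, rate_hz=0.5, depth_ms=5.0, shape="sine"):
    """Return the delay time (ms) at time t (seconds) for an LFO-modulated delay.

    All default values are hypothetical and chosen only for illustration.
    """
    phase = (t * rate_hz) % 1.0  # position within one LFO cycle, from 0 to 1
    if shape == "sine":
        lfo = math.sin(2 * math.pi * phase)
    elif shape == "triangle":
        lfo = 4 * abs(phase - 0.5) - 1  # linear ramp from +1 down to -1 and back
    elif shape == "square":
        lfo = 1.0 if phase < 0.5 else -1.0
    else:
        raise ValueError(f"unknown LFO shape: {shape}")
    # Depth scales the LFO (which stays within -1..+1) around the base delay.
    return base_ms + depth_ms * lfo
```

With these assumed defaults, the delay time sweeps between 15 ms and 25 ms once every two seconds.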

Chapter 3
1 I should note that much of what I recount here is taken from a wonderful series of videos made by Bobby Owsinski for lynda.com, specifically, Bobby Owsinski, “Mastering Tutorial: The History of Mastering.” Available online at: https://www.youtube.com/watch?v=mcK4cShDhIE.
2 Owsinski, “Mastering Tutorial.”
3 For an interesting look at this early era see: Jonathan Sterne, The Audible Past: Cultural Origins of Sound Reproduction. Durham: Duke University Press, 2003, pp. 31–86.
4 For more, see Owsinski, “Mastering Tutorial.”
5 Owsinski, “Mastering Tutorial.”

6 Wadhams, Inside the Hits.
7 This is an excerpt from a fantastic interview between Bobby Owsinski and Doug Sax. Available online at: https://www.bobbyowsinski.com/mastering-engineershandbook-doug-sax-excerpt.html
8 In Droney, Mix Masters, p. 154.
9 Bob Katz, Mastering Audio: The Art and the Science. Boston: Focal Press, 2007, p. 168.
10 In Droney, Mix Masters, p. 4.
11 Cited in Ryan and Kehew, Recording the Beatles, p. 399.
12 Cited in Droney, Mix Masters, p. 153.
13 Will Haas, “The SOS Guide to Mix Compression,” Sound on Sound. Available online at: https://www.soundonsound.com/techniques/sos-guide-mix-compression
14 Hugh Robjohns, “Chris Gehringer of Sterling Sound: Mastering Engineer,” Sound on Sound (2003). Available online at: www.soundonsound.com
15 Katz, Mastering Audio, pp. 94–95.
16 Cited in David Flint, “Preparing Your Music for Mastering,” Sound on Sound (September 2005). Available online at: www.soundonsound.com
17 Mike Senior, Mixing Secrets for the Small Home Studio. Boston: Focal Press, 2011, p. 158.
18 Cited in Flint, “Preparing Your Music for Mastering.”
19 Ryan Dembinsky, “Interview with Bob Ludwig,” Glide Magazine (September 2009). Available online at: https://glidemagazine.com/142887/interview-bob-ludwigmaster/
20 Katz, Mastering Audio, p. 168.
21 Cited in Dan Keen, “The Loudness War is Nearly Over,” Noisey (November 2013). Available online at: https://noisey.vice.com/da/article/69w4a5/the-loudness-war-isnearly-over
22 Paul White, “More of Everything! Maximizing the Loudness of Your Masters,” Sound on Sound (March 2003). Available online at: www.soundonsound.com
23 Milner, Perfecting Sound Forever, pp. 280–81.
24 Cited in Milner, Perfecting Sound Forever, p. 282.
25 Milner, Perfecting Sound Forever, p. 283.
26 Katz, Mastering Audio, pp. 134–35.
27 Ibid., p. 135.
28 Ibid., p. 140.
29 Nick Krewen, with Bryan Reesman, “Mastering the Remaster,” The 57th Annual GRAMMY Awards (2015). Available online at: https://lurssenmastering.com/files/Mastering%20The%20Remaster.pdf
30 Cited in Krewen, with Reesman, “Mastering the Remaster.”
31 Ibid.

32 MusicTech magazine offers a great “beginner’s” tutorial on intersample peaks for the interested reader: https://www.musictech.net/2012/09/10mm-no211-inter-samplepeaks/
33 For more on K-meters and K-metering, you can see Bob Katz’s website: https://www.meterplugs.com/kmeter

Coda
1 Albin Zak, The Poetics of Rock: Cutting Tracks, Making Music. Berkeley: University of California Press, 2001, p. 26.

Index
Abbey Road Studios  111, 130 Ableton DAWs  51 Accidental Revolution: The Story of Grunge (Anderson)  213 n.32 acoustic bass  38, 129 acoustic character  41, 43 acoustic energy  9 acoustic guitars  38, 41, 47–9, 67, 84, 125 acoustic phenomenon  2–3, 178 acoustic pianos  38 acoustic recording technology  69 Adams, Kay  114 ADAT machine  85 Adele  48, 106 “Cold Shoulder”  106 “Daydreamer”  48 19  48 Ad Rock  121 Agfa 467  128 Ainley, Chuck  158 air absorption  38, 40 air pressure  52 AKG 414c large-diaphragm condenser microphone  29, 41 Albini, Steve  54 algorithmic signal processors  56 Alice in Chains  118 Dirt  118 “Rooster”  118 Allen, Lily  39 “The Fear”  39 Allen Zentz Mastering  154 all-pass filters  139–40 amateur recordists  163 amazon.com  201 ambient dub  90 American engineers  152 American studios  44 Ampex  150 Ampex 456  128 amplification  46, 110, 150

amplitude  11–12, 91, 103, 105, 107, 156, 179 analog gear  107, 165–6 analog processors  54 analog tape machines  55, 127 analog technology  55 analog-to-digital converter (ADC)  46, 51–2, 166, 187, 210 n.18 anchor points  77–8 anchor tracks  77 Anderson, Kyle  213 n.32 Animals, the  63 “House of the Rising Sun”  63 Ann-Margret  113 “I Just Don’t Understand”  113 anti-ProTools statement  54 Aphex  188 Apple  195, 198 Apple Music  157, 184, 197 ARP  49 “arrange window”  55 ART AR5 ribbon microphone  29 Artisan  153 Atkins, Chet  128 attenuation  22, 80–1, 91–2, 97–100, 102, 105, 161, 174–5, 189–90, 192 attitude parallel compression  183 audible spectrum  16, 18, 23–4, 50, 57, 90, 140–1, 175–6, 180 audiocassette  164, 200 audio editing  51, 63–5, 66 audio engineers  150–1 audio signal  3, 7, 9, 19, 20, 27, 79, 90, 91, 96–7, 121, 126–7, 139 Audio-Technica 4033A condenser microphone  117–18 auditory horizon  82 Aural Exciter  188 Automatic Double Tracking (ADT) machine  111, 130–1 average amplitude  12, 15, 156, 179 Awolnation  85

backing vocals  102, 121 balancing depth  82–4 Band, the  153 Moondog Matinee  153 Bandcamp  157, 164, 197 bass guitar  47, 89, 94 bass maximizers  186, 189 bass roll-off  20, 22 Beach Boys, the  170 Pet Sounds  170 Beam, Samuel  39 Beastie Boys  121, 132 Check Your Head  121 “Gratitude”  121 “Intergalactic”  132 “So What’cha Want”  132 “Stand Together”  121 Beatles, the  4, 35–6, 47–8, 70, 72–3, 78–9, 115, 119, 124–5, 131, 161, 169–70, 172 Abbey Road  48, 79 “All My Loving”  37 “Baby You’re a Rich Man”  48 With the Beatles  37 The Beatles (“The White Album”)  48, 78 “Being for the Benefit of Mr. Kite”  171 “Birthday”  131 “Blue Jay Way”  48 “A Day in the Life”  47, 79 “Dear Prudence”  78 “Devil in Her Heart”  37 “Everybody’s Got Something to Hide Except for Me and My Monkey”  111 “Fixing a Hole”  171 “Getting Better”  171 “Hello, Goodbye”  48 “Helter Skelter”  111 “I Am the Walrus”  72–3, 111 “I Feel Fine”  124 “I Me Mine”  47 “It’s All Too Much”  124 “I Wanna Be Your Man”  36–7 Let It Be  48 “Little Child”  37 “Lovely Rita”  47

“Lucy in the Sky with Diamonds”  47, 171 “Money”  35 “Ob-La-Di, Ob-La-Da”  131 “Only a Northern Song”  47 “Revolution”  47 “Revolution 9”  111 Revolver  4, 35, 131 “Savoy Truffle”  131 Sgt. Pepper’s Lonely Hearts Club Band  47–8, 70, 78–9, 169–72 “She’s Leaving Home”  171 “Think for Yourself ”  115, 119 “When I’m Sixty Four”  47 “With a Little Help from My Friends”  171 “Yer Blues”  111 Beck, Jeff  115 Beethoven  168 Belden, Bob  66 Bell, Alexander Graham  11 Benassi, Benny  98–9, 119 “Come Fly Away (feat. Channing)”  119 “Eclectic Strings”  119 “Finger Food”  98, 119 “Free Your Mind-On the Floor (feat. Farenheit)”  119 Hypnotica  98 “I Am Not Drunk”  119 “My Body”  98, 119 Rock and Rave  98, 119 “Rock ‘n’ Rave”  119 “Satisfaction”  98 “Who’s Your Daddy (Pump-kin Remix)”  119 Ben Folds Five  124 “Fair”  124 Benny Goodman Orchestra  110 “Bugle Call Rag”  110–11 Berklee College of Music  63–4, 75, 85 bidirectional microphones  34 bidirectional response  37–8 Bieber, Justin  82 Big Bands  109–10 Big Muff  115 Big Wreck  75, 96, 147

Billboard Top 40 record chart  66, 82, 98 Billy Talent  33, 75 binary code  9 binary digits (bits)  53 binary values  53–4 bitcoin  165 Blue Note  195 blues harp  73 Bob B. Soxx and the Blue Jeans  113 “Zip-a-Dee-Doo-Dah”  113 Bonham, John  20 boosts and notches  174 Boston  132 “More Than a Feeling”  132 bouncing  70–3 Bowie, David  101, 122 “Heroes”  101–2, 122 Brainworx’s bx_XL  190 Bream, Julian  73 Breeders  121 Brian Setzer Orchestra  130 The Dirty Boogie  130 Guitar Slinger  130 Vavoom!  130 “Bring on the Night”  140 British engineers  152 British Invasion  109 Brown, Peter  47 Buchla  49 Buffalo Springfield  111 “Mr. Soul”  111 buried vocals  87–8 “buttressing” technique  104 bypass circuit  112 Calling  85 Cameron, Matt  54 capacitor microphones. See condenser microphones Capitol  153 “cardioid” microphones. See unidirectional microphones cardioid response  38 Carey, Danny  20 Casablancas, Julian  87, 95, 117–18, 154 Case, Alexander  8, 17, 97, 119, 211 n.10, 17, 213 n.37, 214 n.45


Cassius  119 Au Rêve  119 “The Sound of Violence”  119 CDBaby  201 Celemony Melodyne  65 Century Records  162 Chandler, Chas  124 Chemical Brothers, the  119 “Loops of Fury”  119 Cher  181 “Believe”  181 Chkiantz, George  71 choral vocals  41 chorusing  139, 140–1 Christian, Charlie  109–10 claps  1, 103 Clapton, Eric  115 classical guitars  25 classical record  179, 195 Clearmountain, Bob  104 clipping  107, 181–3 close-mic placements  38–40, 45 Cobain, Kurt  130 “Cockoo Cocoon”  140 Codec Toolbox  198 Collins, Phil  101 Columbia Studio B,  66 comb filtering  17–19, 36, 127, 138, 140 compact discs (CDs)  54, 154, 164, 210 n.18 comping  64–5 complex waveforms  92 compression  3, 30, 100, 156, 212 n.17 compressor and limiter  96–7 compressors and equalizers (EQs)  10, 151, 190 computer-based sequencing  57 “concept album”  168, 172 concerts  4, 170 condenser microphones  20, 22–6, 29–30 control room evaluations  65 Cooper, Alice  29–30 “I’m Eighteen”  29–30 Corgan, Billy  43 “Bullet with Butterfly Wings”  43 “Cherub Rock”  43

“Fuck You (An Ode to No One)”  43 Siamese Dream and Mellon Collie & the Infinite Sadness  43 “Today”  43 “Zero”  43 correlation meters  199 country record/recordists  114, 179 Cream  111 “Sunshine of Your Love”  111 Credence  154 critical distance  41–2 cross-fading  172 CSNY  111 “Ohio”  111 cultural difference  152 Cure, the  87, 124, 170 Faith  170 “Prayers For Rain”  124 “Secrets”  87 Seventeen Seconds LP  87 cutter engineers  200 cutters  151–2 cutter style  152 cutter tradition  152, 156, 160, 162 cymbals  25, 30, 98, 175–6, 177 Cyrus, Billy Ray  157 Dabrye  106 “Air”  106 Daffy Duck  74 Daft Punk  99 “One More Time”  99 Dallas Arbiter Fuzz Face  115 dance-floor tracks  103 Danger Mouse  106 “Lazer Beam”  106 From Man to Mouse  106 data processing  52 Dauntless  117 Execute the Fact  117 Dave Matthews Band  47–8 Crash  47 “So Much to Say”  47, 48 “Too Much”  47 “Tripping Billies”  47 “Two Step”  47 Davies, Dave  108–9, 115 Davies, Ray  109

Davis, Miles  66–7 Bitches Brew  66 “Pharaoh’s Dance”  66 decaying signal  98 decay phase  144–6 Deepchild  73 “Neukoln Burning”  73 de-essers  175–6 delay  126–35 doubling  130–3 lines  105–7, 139 slapback as referent  129–30 slapback echo  127–9 synced echoes  134–3 tape  127 “the Haas trick”  135 times  138 unsynced echoes  133–4 de-noising  192–4 depth dimension  84 depth equalization  92–3 destructive phase  17 Devo  49 “Whip It”  49 diaphragm  20, 23, 38, 42 diffusion rates  146 digital-audio device  53–4 digital-audio processors  154 digital-audio recording  9, 52, 156 Digital Audio Tape (DAT)  154 digital-audio technology  52–4 digital-audio workstations (DAWs)  51–2, 54–5, 62–3, 65, 127, 134, 158, 161, 197 digital formats  155, 157, 197 digital plug-ins  56–7, 62, 65, 129, 138 digital processors  131, 134 digital revolution  170 digital technology  52, 55, 179, 196–7 digital-to-analog converter (DAC)  51, 52, 187 Digitech Vocalist Workstation harmonizer  105 Dion, Céline  106 “The Power of Love”  106 Diplo  64 direct-injection (DI)  6, 9, 46–9, 55 transduction tandems  48–9

directional microphones  38 directional response  34–7, 37–8, 41 Dire Straits  49 “Money for Nothing”  49 Disc Description Protocol image (DDPi)  154, 161 displacement power  11 distance-mic placements  40 distortion  88, 107–26, 181, 183, 187, 210 n.18, 211 n.10 feedback  121 feedback fade-outs  125–6 feedback lead-ins  124–5 fuzzbox goes mainstream  114–16 lift  119 Maestro Fuzz-Tone FZ-1  113–14 overdrive method  109–12 pitched feedback and whammy slamming  122–3 re-amping  116–18 reinforcement  120–1 sectional  118–19 slash and burn method  108–9 stomp-box method  112–13 transitional feedback  125 dithering  54, 210 n.18 DIY online record label  201–2 DJ Nexus  103 “Journey into Trance”  103 Dockwray, Ruth  75 double-tracked electric guitar  74 double-tracking effect  131–2 Doug Carson  154 downloading  54, 154, 164 Drake  75, 96, 147 Dropbox  159 drum  71, 75 buss  100 kick  1–2, 20, 30 kits  25, 30, 38, 83 snare  1, 30 sticks  20 timbre  101 ducked echoes  106 ducking mechanism  105–7, 211–12 n.17 Duran Duran  49 “The Reflex”  49 Dylan, Bob  84


“Can’t Wait”  84 “Cold Irons Bound”  84 “Dirt Road Blues”  84 “Love Sick”  84 “Make You Feel My Love”  84 “Not Dark Yet”  84 “Standing in the Doorway”  84 Time Out of Mind  84 Dylan, Jakob  121 dynamic microphones  20–2, 24, 27 dynamics-processing techniques  96–107, 183 compressor and limiter  96–7 ducked reverbs and delays  105–7 envelope following  102–3 gated reverb  101 keying for feel and groove  103–4 keying kick  104 multi-latch gating  101–2 noise gates  100 pumping  97–8 side-chain pumping  98–100 vocoding  105 Earle, Steve  157 early and late reflections  144 edge  38, 40, 43, 46, 48, 88 Edge, the  133–4 Edison cylinder phonograph  9 Eels  85 Eisengrein, Doug  103 electrical current  10, 20, 22, 27, 52 electric bass  1–2, 16, 38, 46–7, 48, 50, 71, 83, 90, 124, 140 electric guitars  38, 43, 46, 49, 71, 73, 94, 95, 103, 110, 129, 139, 140 Electric Lady Studios  125 Electro-Harmonix  115 electromagnetism  27, 150 electronic dance music (EDM)  83, 98, 134, 179–81 electronic genres  99, 177 “Elephant Stone (re-mix)”  80 ELO  85 ELP  82 Elpico amplifier  109 Emerick, Geoff  79, 111

EMS VCS3 synthesizer  123 Eno, Brian  102, 123, 142 “Berlin Trilogy”  102 envelope following  102–3 equalization  3, 30, 88, 91–6, 178, 180, 190, 211 n.10 depth  92–3 hi-pass filters  95–6 mirrored  93–5 equalizers (EQs)  89, 91, 104 graphic  92 multiband compressors for  175–6 parametric  92 peak  92 program  92 semi-parametric  92 ES-150 hollow-body electric guitar  109 e-transfers  164 exciters  187–9 Ezrin, Bob  18, 29–30, 41 Fat Lip  132 “What’s Up Fatlip?”  132 feathering  174 feedback  121 lead-ins  124–5 mechanism  134 feel and groove  103–4 female vocals  1 Fender Rhodes  140 Fender Stratocaster  123 Fergie  39, 82 “Glamorous”  39 figurative recording  4 “first electric period”  67 Flack, Roberta  19 I’m the One  19 flanging  137–8, 140–1 Fletcher, Harvey  180 Flex, Jay Da  106 Flying Lotus  59, 60, 98–9 Cosmogramma  59, 98 L.A. EP 1x3  60 Los Angeles  59, 98 1983  59, 98 Reset  59 “Tea Leaf Dancers”  99 FlyLo  104, 106

folk  179 folk-rock  84 Ford, Harrison  195 Foster, David  106 four-track technology  35, 72–3 frames  52 Franklin, Aretha  160 Frank Zappa and the Mothers of Invention  169 Freak Out!  169 Suzy Creamcheese  169 Franz, David  92 Freeman, Morgan  85 frequency  15–16, 24, 89, 214 n.46 balancing  90 response  10, 19, 20, 22–3, 25, 29, 30, 37, 38, 41 spectrum  138, 140, 180 Fripp, Robert  122–4 Frusciante, John  43 “full-scale” decibel weighting system  179 funk  103–4 fuzzbox  113–15 fuzz distortion  108, 113 Fuzz Face  115 fuzz timbre  113–14 Gabriel, Peter  101 “Intruder”  101 Gaisberg, Fred  69 Gallagher, Noel  43 gapping  171–3 Gardner, Brian “Big Bass”  162 “gated-reverb” effect  101 Gates, Bill  4, 7 Gaye, Marvin  169 What’s Goin’ On?  169 Gehringer, Chris  168 General MIDI (GM)  55 Gibson, Bill  96 Gibson EH-150 amplifier  109 Gibson Electronics  112–14, 113 Gibson J-160E semi-acoustic guitar  124 gigs  85–6 Gilmour, David  41, 115 Girdland, Ludvig  85 Godrich, Nigel  98

goniometers  199 Gould, Glenn  4, 102 gramophone  9, 69 Grande, Ariana  106 “Raindrops (An Angel Cried)”  106 Granz, Norman  110 Jazz at the Philharmonic  110 graphic EQs  92 Gray, Macy  85 Grint, Barry  174 Grundman, Bernie  153–4, 196 The Guardian Magazine  196 guitars  28–9, 75, 77, 109, 112, 133–4, 175 acoustic  25, 38, 41, 47–9, 67, 84, 125 bass  47, 89, 94 classical  25 electric  38, 43, 46, 49, 71, 73–74, 94, 95, 103, 110, 129, 139, 140 rhythm  42, 81, 83, 89, 176 Haas, Helmut  135 Haas, Will  164 “the Haas trick”  94, 135, 145 Haggard, Merle  114 Hammond, Albert, Jr.  95, 125 Hansa  102 hard clipping  107, 110, 187 harmonic distortion  110 Harrison, George  124 “Have a Cigar”  140 “head-and-solos”  110 headphones  2, 9, 18, 35, 73, 74, 75, 87 Heap, Imogen  39, 105 “Hide and Seek”  39, 105 heavy metal and hard rock  115 Hedges, Mike  87 Hendrix, Jimi  71, 112, 115, 122, 124 Are You Experienced?  70–1, 123 “Crosstown Traffic”  124 Electric Ladyland  124 “Foxey Lady”  124 “Purple Haze”  111 “The Star Spangled Banner”  112 “Third Stone From the Sun”  123 “Wild Thing”  112, 123 Hepworth-Sawyer, Russ  157, 173, 177 Hewitt, Ryan  43


Hiatt, John  157 high-resolution formats  195–6 hi-hats  25, 80–1, 100, 103, 176 hi-pass filters  90, 95–6 hip-hop  83, 98, 104, 132, 179, 181 Hives, the  82 Hodgson, Jay  63, 160 “Lonesome Somedays”  67–8, 107, 147, 203 Hole  124 “Mrs. Jones”  124 Holland, Brian  151 Hollies, the  47 horizontal plane  77–81 panning and masking  79–81 stereo-switching  78–9 house  98 House Of Pain  132 “Jump Around”  132 “Shamrocks and Shenanigans”  132 Howard, Ben  105 Howson, Simon  182 Hudson Mohawke  104 hyper-cardioid microphones  34 “hyping” tracks  92 Ikutaro Kakehashi  108 image files  154 impedance  209 n.14 inflators and saturators  186–7 input signal  97, 100, 139, 140–1, 186–9 Intelligent Dance Music (IDM)  98 internet  158–9 Iron Maiden  169 Seventh Son of a Seventh Son  169 Iron & Wine. See Beam, Samuel iTunes  157, 197, 201 Izhaki, Roey  81, 120, 134–5 Jackson, Michael  19, 29, 42 Bad  19, 42 Dangerous  19 “Don’t Stop ‘Til You Get Enough”  29 “Man in the Mirror”  42 Off the Wall  19, 42 “Rock with You”  42 Thriller  19 Jackson, Wanda  114

Jagger, Mick  40, 120 James, Colin  129 Colin James and the Little Big Band  130 “Jay Hodgson: Understanding Records Videos”  60 Jay-Z  132 “99 Problems”  132 jazz  66–7, 179 J-Dilla  60, 98, 104 Jennings, Waylon  114 Jesus and Mary Chain  40, 87, 124, 126, 132 “Catchfire”  124, 126 “Frequency”  126 Psychocandy  40, 126 “Reverence”  126, 132 “Sugar Ray”  126 “Sundown”  126 “Teenage Lust”  126 “Tumbledown”  124, 126 Jethro Tull  169 Thick as a Brick  169 Jimi Hendrix Experience  123 Johnny Rotten  74 Johns, Andy  70–1 Jones, Leslie Ann  158 Juno Award  159 “Just the Way You Are”  140 Katainen, Riku  117 Katz, Bob  170, 183, 193, 199 Keaton, Diane  85 Kehew, Brian  36, 131, 161 keyboards  50, 89 keyed gating  103, 105 keyed tracks  104 Khemani, Rohin  85 kick drums  1–2, 20, 30, 73, 74, 77, 81, 83, 89, 94, 98–9, 100, 104, 175 Kill Rock Stars  39 kingmobb. See Shelvock, Matt Kinks, the  109, 111, 115 “All Day and All of the Night”  111, 115 “You Really Got Me”  109 “K-meter”  199 Kramer, Eddie  71

Krewen, Nick  196 Krotz, Alex Chuck  75–6, 89–90, 96, 147 Kylesa  87 “Don’t Look Back”  87 Lady Gaga  85 LANDR  201 Lanois, Daniel  84, 133 large-diaphragm condenser  24–7 large-diaphragm condenser microphone  30, 41 large-diaphragm dynamic mic  30 Lauzon, Dylan  118 lead vocals  25, 29, 38–9, 40, 77, 79, 102, 105, 106, 107, 111, 129, 145 Le Chic  104 Led Zeppelin  20, 40, 42, 73, 84, 87, 111, 129, 141, 179, 194, 213 n.32 “Boogie with Stu”  40 “Communication Breakdown”  42 “Custard Pie”  40 “Good Times Bad Times”  42 “Immigrant Song”  132 “Kashmir”  132, 140 Led Zeppelin I  115 Led Zeppelin II  115 “Nobody’s Fault but Mine”  132 “No Quarter”  141 Physical Graffiti  40, 194 “Rock n Roll”  129 “Sick Again”  40 “The Song Remains the Same”  132 “Stairway to Heaven”  179 “Trampled under Foot”  40, 132 “The Wanton Song”  40 “When the Levee Breaks”  73 Lemon Jelly  172 “Elements”  172 “Experiment Number Six”  172 Lost Horizons  172 “Nice Weather for Ducks”  172 “Space Walk”  172 Lennon, John  35, 37, 79, 111, 124, 129, 130 “Ain’t that a Shame”  129 “Angel Baby”  129 “Be-Bop-A-Lula”  129 “Bony Moronie”  129

“Do You Want to Dance”  129 “Just Because”  129 “Peggy Sue”  129 Rock ‘n’ Roll  129 “Slippin’ and Slidin’”  129 “Stand By Me”  129 “Sweet Little Sixteen”  129 “You Can’t Catch Me”  129 Leonard, Vincent  210 n.18 Levitin, Daniel J.  88–9, 128–9 Lewis, Jerry Lee  128 “Great Balls of Fire”  128 “Whole Lotta’ Shakin’ Goin’ On”  128 lift distortion  119 Lillywhite, Steve  47, 101 Limp Bizkit  85 “line level”  46 Little Labs Redeye  117 live albums  110 live performance  3–4, 6, 66, 82 lo-fi timbre  117 Logic DAWs  51, 155 LogicX  63, 65, 197 long-play records (LPs)  171 Lopez, Jennifer  19 Rebirth  19 Los Angeles Philharmonic Auditorium  110 Loud Luxury  99 loudness  12, 151, 156, 178–9, 180–3, 187 normalization  156–7, 158, 184–5, 199 war  155–7, 179, 182, 184 Loudness Units Full Scale (LUFS)  157, 165, 184–5, 197, 199 loudspeakers  9, 75, 78 low-frequency oscillator (LFO)  131, 136–7, 139–40 low-pass filter  50 low-pass principle  95 Lucas, George  195 Ludwig, Bob  153–4, 178 L3 Ultramaximizer  201 McCartney, Paul  35, 37, 47–8, 79, 115, 119, 124, 161, 162 Macero, Teo  66–7


Madonna  39, 99 Confessions on a Dance Floor  99 “Erotica”  39 “Get Together”  99 Maestro Fuzz-Tone FZ-1  113–14 magnetic fields  20, 27 magnetic tape  149–50 The Making of the Dark Side of the Moon  50 manual doubling  130–3 Marshall 1959 SLP amplifier  115 Martin, George  4, 35, 47, 72, 78, 111, 124, 131, 171 Martin, Grady  113 Martin, Ricky  181 “Livin’ La Vida Loca”  181 Maselec MTC-1X  165 Massive Attack  90 Mezzanine  90 master cut  9 Master Disk  153 “Mastered for iTunes” (MFiT)  195–6, 197 mastering  6–7, 9, 16, 149–202 as arm’s-length peer review  161–4 bass maximizers  189 clipping  181–2 de-noising  192–4 dynamic notching  175–6 exciters  187–9 feathering  174 finalization  160 gapping  171–3 history  149–60 inflators and saturators  186–7 in-the-box (ITB) and out-of-the-box (OTB)  165–7 loudness-normalization era  184–5 loudness and spectral content  180–1 mid-side (M/S) processing  176–7, 189–92 parallel compression  182–3 peculiar art  194–7 sequencing  168–71 transfer process  197–200 mastering engineers  151–2, 154–6, 159, 161–7, 171, 173, 175–6, 178–81, 184, 186–7, 189–90, 193, 197, 199, 200–2, 212 n.29

mastering “houses”  152–4 “The Mastering Lab”  153 mastering lathe  149–50 matrix editor  57, 61 Mayer, Roger  113, 115 MCA  121 Mendes, Shawn  33, 45, 75, 96, 147 Mesa/Boogie Dual Rectifier  117 Meshuggah  125 metering  198–9 MF Doom  106 microphones  9–10, 17–19, 33–4, 91, 102 condenser  22–6 directional response  34–7 dynamic  20–2 operations principles  20 placement  37–41, 45, 78 ribbon  27–30 shootout  28–9 MIDI controller  58 MIDI grid. See matrix editor Midnight Juggernauts  124 “Road to Recovery”  124 mid-side (M/S) processing  176–7 corrective and creative  190–2 determining width and depth  189–90 Mike D  121 Miller, Jimmy  120 Milner, Greg  67, 181–2 Minogue, Kylie  64 mirrored equalization  93–5 Mitchell, Joni  170 Blue  170 Mitchell, Mitch  71 mix-bus compression  164 mix engineers  151, 163–4, 175, 193, 202 mixing  6–7, 9, 16, 36, 47, 51, 64, 69–147, 151, 201 consoles  10, 47, 55, 69, 79, 111 and multitrack paradigm  70–90 signal processing  90–147 “mix level” signal processing  147 “2018 mix” version  78 MK I fuzzbox  115 MK II fuzzbox  115 modern pop  5, 42, 60, 82–3, 105, 132

modern rock  98, 113, 133 modern trance records  103 modulation  136–41 chorusing  139, 140–1 depth  137 flanging  137–8, 140–1 phasing  139–41 rate  137 shape  137 Modwheelmood  126 “The Great Destroyer”  126 Mogwai  115 Mohler, Billy  85 mono/monaural reproduction  78 “mono the lows”  177 Monterey Pop Festival  113, 123 Monterey Pop  123 Moog  49 Moon, Keith  20 Moore, Alan  75 Moore, Scotty  128 Moorfield, Virgil  4 Morello, Tom  125 Morley, Neil  214 n.46 Mos Def  106 The Ecstatic  106 Mosrite Fuzzrite  115 Motown  151 MOTTOsound  157 moving coil microphones. See dynamic microphones Moylan, William  214 n.48 MS Audiotron MultiMix  117 Muddy Waters  108 Mudhoney  115 multiband compressors  175–6 multi-latch gating  101–2 multitap delay processors  134–5 multitrack mixing console  55, 69–70 multitrack production  7, 49, 58, 73, 84, 127, 141, 163 multitrack recording  150–1 multitrack technology  69–70 Munson, Wilden A.  180 Muse  179 “Supermassive Black Hole”  179 musical communication  2–4, 63, 65, 74, 76–7, 78, 91, 206–7

musical programming and coding  154 musical techniques and procedures  5, 7, 14, 65, 127, 159, 206 Najera, Johnny “Natural”  63–4 Nashville studio  113 Nat King Cole Trio  110 Neumann lathe  153 Neumann M149s  41 Neumann TLM103 large-diaphragm condenser microphone  117–18 Neumann U47 and U48 condenser microphones  23, 35 New York room sounds  45 Nicholson, Jack  85 Nikki’s Wives  1, 118 “Hunting Season”  1 Nine Inch Nails  126, 172 “Corona Radiata”  126 The Fragile  172 “Mr. Self Destruct”  126 Ninth Symphony  168 Nirvana  118, 121, 124 “Heart Shaped Box”  118, 121 “In Bloom”  118, 121 Incesticide  121 “Lithium”  118, 121 “Lounge Act”  118 Nevermind  118, 121 “Pennyroyal Tea”  121 “Radio Friendly Unit Shifter”  121 “Sliver”  121 “Smells Like Teen Spirit”  118, 121 In Utero  118, 121 “You Know You’re Right”  121 noise gate  100, 103 noise shaping  210 n.18 noisy tracks  101 non-programmatic sequences  170–1 Nyquist, Harry  52 Oasis  43 “Champagne Supernova”  43 Ocasek, Ric  125 O’Leary, Kevin  33, 45, 49 omnidirectional microphones  34 omnidirectional response  37–8 on-axis and off-axis placement  38–9


online aggregators  201–2 online distribution and marketing  201 Orwell, George  169 out of phase signals  17 overdrive method  109–12 overdubbing  35, 37, 41, 70, 105, 121, 150 Owens, Buck  114 Owsinski, Bobby  214 n.1 Padgham, Hugh  101 Page, Jimmy  42, 115, 194 panning and masking  76, 79–81, 95, 131, 133–5 pan pots  79–80 parallel compression  182–3 parametric EQs  92 “Paranoid Android”  140 participatory discrepancies  131 past-tense auditory narratives  73–4 Paul, Les  55, 110–11, 137 PayPal  165 peak EQs  92 Pearl Jam  54 “Can’t Keep”  54 Riot Act  54 Peavey 5150II  117 Peavey practice amplifier  117 pedagogy  205 pegging meters  199 Pennebaker, D. A.  123 Pensado, Dave “Hard Drive”  4, 7 percussion  12, 29, 38, 83, 103 periodic soundwave  17 Perry, Katy  82 Perry, Lee “Scratch”  85 personal computing  52, 154–5, 158, 179 Pharcyde  132 “Drop”  132 “Oh Shit”  132 Pharrell  64 phase, concept of  17–18 phase coherence and interference  16–18, 35 phasing  139–40, 139–41 Phillips, Sam  127–8 Phil Spector  81, 113, 129 “Wall of Sound”  81 phonograph  9, 69

piano  45, 71, 139 acoustic  38 keyboard  57 Picasso, Pablo  4, 7 “Pictures of Matchstick Men”  140 Pink Floyd  41, 50, 79, 82, 111, 169, 172 Animals  169, 172 Dark Side of the Moon  50, 79, 169, 172 “One of These Days”  41 “On the Run”  50, 172 “Time”  50, 172 The Wall  41, 169 Wish You Were Here  172 pitched feedback and whammy slamming  122–3 pitch modulations  139, 140–1 Plant, Robert  40, 129 point of distortion  110 pop  25, 134, 147, 199 postwar  4 production  39, 81, 83, 94, 124, 139 records/recordists  55, 70, 85, 91, 109, 179 popular music  5, 38, 168, 205–6 Porno for Pyros  132, 135 “Orgasm”  132 “Pets”  135 Portishead  98 Dummy  98 “Pedestal”  98 Postal Service  39 “From a Great Height”  39 post-production process  64–7 pre-delay timings  142–3 Prefuse 73  60, 104, 172 Everything She Touched Turned to Ampexian  172 Preparations and Everything She Touched Turned to Ampexian  60 premaster qualities  163–4, 166 Presley, Elvis  127–8 “Baby, Let’s Play House”  128 “Blue Moon of Kentucky”  128 “Heartbreak Hotel”  128 “I’ll Never Let You Go (Little Darlin’)”  128 “Mystery Train”  128

“Tomorrow Night”  128 “Trying to Get to You”  128 Proby, P. J.  113 “Hold Me”  113 “Together”  113 ProCo Rat distortion  117 program EQs  92 programmatic sequences  170–1 ProTools  51, 55, 155 “the Protools world”  67 proximity effect  22, 37–8 proximity plane  81–8 auditory horizon  82 balancing depth  82–4 buried vocals  87–8 vocal priority  84–6 Prydz, Eric  93, 99 “Call on Me”  93, 99 pumping  97–8 Purcell, Denny  153 pushing-and-pulling energy  11 quantization noise  210 n.18 Radiohead  98, 170 “Exit Music (For a Film)”  98 “Idioteque”  98 Kid A  98 OK Computer  170 Radio Luxembourg  149 Rage Against The Machine  125, 181 “Battle of Los Angeles”  181 “Killing in the Name Of”  125 Ramone, Phil  158 Raphael, Gordon  87–8, 117–18 raw audio  7, 9, 51 RCA  128 RCA 44BXs  29 RCA DX77s  29 real inputting  52 realism  4 re-amping  116–18 record industry  48, 150, 154, 157, 159, 179, 184, 199, 201–3 record-making process  3, 6, 142 record production  161–2, 198, 201, 206 Red Baraat  85 “Red Book”  154 Redding, Noel  71

REDD.51 mixing console  111 REDD.47 preamplifier  111 Redford, Robert  85 Red Hot Chili Peppers  43, 182 BloodSugarSexMagik  43 Californication  182 reel-to-reel tape technology  54 Reeves, Keanu  85 Reid brothers  40 reinforcement distortion  120–1 remastering engineers  195 Reni  81 resonant spikes  163, 175 reverberation  1, 2, 24, 25, 30, 40, 41, 42, 76, 106–7, 127, 141–7, 177 decay phase  144–6 early and late reflections  144 pre-delay  142–3 and tone shaping  146–7 Reverend Horton Heat  129 Smoke ‘Em If You Got ‘Em, It’s Martini Time, Space Heater  129 Spend a Night in the Box  129 Rhodes, Red  113 rhythm guitar  42, 81, 83, 89, 176 rhythm track leakage  36–7 ribbon microphones  20, 27–30 Richards, Keith  114–15, 120 RLS10 “White Elephant” speaker  35–7 RME Fireface 800  117 Robbins, Marty  113 “Don’t Worry”  113 rock ‘n’ roll  25, 87, 95, 98, 108, 112–13, 115, 120–1, 129, 179, 183 Rodgers, Nile  104 Roland  49 Roland SpaceEcho  134 Rolling Stones, the  40, 111, 114, 120 “All Down the Line”  120 “Casino Boogie”  120 Exile on Main Street  40, 120 “Happy”  120 “(I Can’t Get No) Satisfaction”  114–15 “I Just Want to See His Face”  40 “Jumpin Jack Flash”  111 “Let It Bleed”  120 “Loving Cup”  40, 120

“Rip This Joint”  40 “Rocks Off”  120 “Soul Survivor”  40 “Stop Breaking Down”  40 “Street Fighting Man”  120 “Sweet Black Angel”  40, 120 “Sweet Virginia”  40 “Sympathy for the Devil”  120 “Torn and Frayed”  120 “Turd on the Run”  120 “Ventilator Blues”  40, 120 “We Love You”  70 Ronettes, the  81 “Be My Baby”  81 room acoustics  41 room ambience  40, 42 room mics  42–3, 45 room modes  16 room sound  43–5 Ross, Diana  104 “Upside Down”  104 “The Rover”  140 Royer R-121 ribbon microphone  43 Royer R-122V  28 Rubin, Rick  43 Rudolph, Thomas  210 n.18 Rush  65, 182 Vapor Trails  182 Ryan, Kevin  36, 131, 161 Samiyam  104 sample-curation practice  60 samples  52 Santana  181 “Supernatural”  181 Santana, Carlos  115 Sax, Doug  152, 154, 158 Schmitt, Al  158 Schoeps small-diaphragm condenser microphone  29 Scotch 250  128 sectional distortion  118–19 “Selective Synchronous” (SEL-SYNC) recording  150 self-noise  28, 46, 101, 150 Selway, Phil  98 semi-parametric EQs  92 Senior, Mike  176


sequencing  6, 9, 49–68, 168–71 audio editing  63–4 comping  64–5 digital-audio  52–4, 62–3 real inputting  58–61 step inputting  57–8 timing and tuning  64–5 workstation capacities  54–7 workstation paradigm  51–2 Setzer, Brian  129 Sex Pistols  74, 79 “Anarchy in the UK”  74 Never Mind the Bollocks  79 Shadows, the  47 “Sheer Heart Attack”  140 Shelvock, Matthew  60, 67, 107, 147 Shepherd, Ian  179 Shorter, Wayne  85 Shure SM57 dynamic microphone  20, 22, 29–30, 38, 43, 45, 117 side-chained compressor  99, 105–6, 207 side-chain filter (SCF)  98 side-chain pumping  98–100, 105–6 signal processing  6–7, 9, 45, 51, 55, 57, 69–70, 87 delay  126–35 devices  46 distortion  107–26 dynamics  96–107 equalization  91–6 equipment  211 n.10 modulation  136–41 reverb processing  141–7 signal processors  56, 62, 79, 90, 130 Silvertone Records  80 Sims, Alastair  65, 83, 146 sine waveform  12, 14, 178 single-ended noise reduction  193 Sky  82 slapback echo  127–9 slash and burn method  108–9, 112 small-diaphragm condenser  24–7 Small Faces  138 “Itchycoo Park”  138 Smashing Pumpkins  115, 118 “Bullet with Butterfly Wings”  118–19 Mellon Collie and the Infinite Sadness  118–19

Siamese Dream  118 “Today”  118 “Smiling EQ Curve”  180–1 Smith, Elliott  39, 130, 132 “Alameda”  132 “Angeles”  132 “Between the Bars”  132 “Cupid’s Trick”  132 Either/Or  39, 132 “Pictures of Me”  132 “Rose Parade”  132 “Speed Trials”  132 Smith, Norman  35, 161 Smith, Robert  87 snare drum  1, 30, 77–8, 83, 85, 100, 101, 104 Snoddy, Glen  113–14 Snoop Dogg  64 soft clipping  110, 186–7 software synthesis  51, 55 Sola Sound MK I fuzzbox  115, 119 Sola Sounds MK II fuzzbox  115 solid-state technology  110 sonic signature  44, 97, 160 Sonic Solutions Digital-Audio Workstation  154–5 Sonic Youth  115 Sonnox Inflator  186 Sony-Oxford  186, 198 SONY PCM1610 and PCM1630  154 sound percussive  12 quality  55, 70, 149, 152, 159, 196 recordings  8 source and waves  6, 10–12, 15–17, 19–20, 22, 24–5, 27, 30, 33, 37, 39–41, 53, 91–2, 94, 143–4 systems  78 transient  12 soundbox  75–90 horizontal plane  77–81 proximity plane  81–8 vertical plane  88–90 SoundCloud  157, 164 Sound FX: Unlocking the Creative Potential of Recording Studio Effects (Case)  8

Soundgarden  118–19, 213 n.32 “Black Hole Sun”  119, 213 n.32 Superunknown  119 sound-pressure levels (SPLs)  22, 24–5, 28, 30 Spectrafoo  199 spectral content  92, 107, 180–1 spectral contour  93, 175, 180–1 Spotify  1, 4, 7–8, 12–14, 45, 60, 67, 157, 164, 184–5, 197 square waveform  12, 14, 110, 178–9 Staff, Ray  172, 177 standing waves  16 staple timbre  108, 113, 115 Starr, Ringo  35, 37 Star Wars (1977)  195 steel-string acoustic guitars  25 step inputting  52 step sequencer  57 stereo arrays  25–6 stereo-bus processing  164 stereo-mic applications  25 stereo mixing  36, 78–9, 176 stereophonic technology  78 stereo spectrum  47, 50, 78, 94, 131–4 stereo-switching  78–9 Sterling Sound  153 stomp-box method  112–13, 138 Stone Roses  49, 80, 87, 124 “Fool’s Gold”  49 Second Coming  87 “Waterfall”  49 Stooges  82 stop-motion animation  52 Strait, George  157 Strange, Billy  113 “Strange World”  140 Stratocasters  55 Stray Cats  130 “Everybody Needs Rock and Roll”  130 “Rock This Town”  130 “Rumble in Brighton”  130 “Runaway Boys”  130 streaming  4, 9, 51, 54, 154, 157, 158, 164, 184, 197 Streets, the  169 A Grand Don’t Come For Free  169

“String Quartet No. 1: Gutei’s Finger”  1 Strokes, the  82, 87, 98, 117, 124–5 “Alone Together”  95 First Impressions of Earth  125 Is This It?  87, 117, 124 “Juicebox”  125 “Last Night”  87, 95 “The Modern Age”  95 “New York City Cops”  95, 124 “Reptilia”  125 Room on Fire  87, 117 “Soma”  95 studio acoustics  44–5 subtle phasing effect  36 Sullivan, Big Jim  113 Summer, Donna  19, 154 Donna Summer  19 Sun Records  127–8 “Sun Sides”  127–8 Super Bowl  64 super-cardioid microphones  34 Supro practice amplifier  42, 115 Swedien, Bruce  19, 29, 41–2, 143–4 Swing Orchestras  110 synced echoes  134–3 synth bass  73–4, 119 Synthi-A  50 synth pads  83, 98–9, 103 tambourine  125 tandem technique  42–3, 48 tape delay  127, 137 tape machine  9, 10, 71 tape splicing  66–7 techno  98, 103 tempo misalignments  60 tempo of tracks  58 Thomas, Chris  79 Thompson, Daniel  110 Three Days Grace  33, 45, 75, 96, 147 Tidal  157, 184, 197 timbres  92, 108–9, 112, 113, 142, 176 timing and tuning  64–5 tonal distortions  17, 36, 140, 161 tone  89, 115 DI  48 fuzz  113


guitar  42–3, 47–9, 117–18 shaping  45, 146–7 tandem  48–9 Tool  20, 54, 122, 125 “Jambi”  122 10,000 Days   54, 122 Tosca  135 “Suzuki”  135 Townsend, Ken  47, 111, 130 Townshend, Pete  112, 115, 122 track automation  50 tracking  6–7, 78 direct-injection (DI)  46–9 sequencing  49–68 transduction  9–45 Traffic  70 “Hole In My Shoe”  70 Tragically Hip, the  33, 65 trance DJs  103 trance gating  103 transduction  6, 9–45, 55, 78 comb filtering  17–19 critical distance  41–2 demonstration tracks and playlists  12–14 frequency  15 microphone placement  37–40 phase coherence and interference  16–17 room mics  42–3 room sound  43–5 sound source  10–12 soundwave and waveform  10–12 wavelength  16 transfer engineers  150–2, 196 transfer mastering style  152 transfer tradition  156, 160, 162, 197, 198 transitional feedback  125 tremolo  50 trial-and-error rules  168 tube amplifiers  110 tube gear  107, 110 Tubescreamer  117 Tunecore  201 tuning  64–5 Turner  109 Twain, Shania  90 “Up”  90

U2  133 All That You Can’t Leave Behind  133 “Stuck in a Moment You Can’t Get Out Of”  133–4 unidirectional microphones  34 unidirectional response  37 unmasked noises  193 unsynced echoes  133–4 Usher  64 Valensi, Nick  95 Van Gelder, Rudy  195 Ventures  113 “2000 Pound Bee”  113 veridic punk rock production  91 vertical plane  88–90 vibrational energy  19, 22, 27 Villa-Lobos, Heitor  73 “Prelude No. 3”  74 Vimeo  157 Vince, Peter  47 vintage ribbon microphone  20, 22 vinyl  150, 164, 200 violins  1–2, 85 virtual drum machines  57–8 virtual formats  201 virtual instrument  61, 83 Visconti, Tony  102, 122 vocal priority  84–6 as marketing  85–6 vocoding  105 voice coil  20 volume  178 Vox AC30 amplifier  109 VOX Tone Bender fuzzbox  115 Wadhams, Wayne  130 Waits, Tom  40 The Black Rider  40 Bone Machine  40 Frank’s Wild Years  40 Rain Dogs  40 Swordfishtrombones  40 Walk off the Earth  33, 45 Wallflowers, the  121 “One Headlight”  121 Wall Tonstudio  102 warbling  139, 141

Warner Brothers  182 Was, Don  195 Waters, Roger  50, 169 waveform  10–12 wavelength  16 Waves  190 Center  190 Weezer  125 “My Name Is Jonas”  125 WEM Copy Cat  134 Western University  86 wetransfer.com  159 White, Paul  81, 181 white noise  18, 210 n.18 White Stripes, the  82, 115 Who, the  20, 111 “My Generation”  111 Winwood, Steve  93 “I’m the same boy I used to beeee”  93


“Valerie”  93 Wolfmother  82 Woodstock Music & Art Fair (1969)  112 workstation capacities  54–7 workstation paradigm  51–2 Wu Tang Clan  106, 132 “Deep Space”  106 “Wu-Tang Clan Ain’t Nothin’ Ta F’ Wit”  132 Yardbirds  115 “Heart Full of Soul”  115 YouTube  14, 60, 157, 164, 197 Zak, Albin  207 Zedd  105 “The Middle”  105 Zevon, Warren  90 “The Werewolves of London”  90