5 FFTs and spectrograms
Frequency domain graphs
Frequency-domain graphs, also called spectrum plots or fast Fourier transform graphs (FFT graphs for short), show which frequencies are present in a vibration during a certain period of time. Spectrum plots are particularly useful for representing sounds, because sound mostly consists of vibrations that are quasi-periodic and complex. A frequency domain graph is essentially a list of ingredients, showing which frequencies are in the vibration and how much of each one is present.
Each frequency domain graph is made by transforming a short section of a time domain graph into a new graph (using math that’s far beyond the scope of this book). The resulting graph shows which frequencies are present in the vibration during that short time period, called the sample.
Frequency domain graphs have frequency along the x-axis and amplitude along the y-axis. They usually look like a series of mountain peaks. The location of each peak along the horizontal axis indicates the frequency of the peak. Low frequency sounds appear at the left end of the FFT; higher frequency sounds are at the right. The height of the peak shows “how much” of that frequency is “in the sample.”
Keep in mind that frequency graphs are not the same as time graphs. Frequency graphs do not show when vibrations occur; they show which frequencies are present (and in what abundance) in a particular vibration.
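If you'd like to see the “list of ingredients” idea in action, here is a minimal sketch in Python. Everything in it is made up for illustration: the sample rate, the 10 ms sample length, the three ingredient frequencies and amplitudes, and the helper name dft_amplitudes. The helper uses a plain (slow) discrete Fourier transform rather than a true FFT routine; an FFT is a faster way of getting the same numbers.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
N_SAMPLES = 80      # a 10 ms sample, so frequency bins land every 100 Hz

# (frequency in Hz, amplitude) of each ingredient in the made-up vibration
ingredients = [(1000, 1.0), (2000, 0.5), (3000, 0.25)]

# Time-domain data: add the three sine waves together at each sample time.
samples = [
    sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f, a in ingredients)
    for i in range(N_SAMPLES)
]


def dft_amplitudes(samples):
    """Amplitude at each frequency bin from 0 Hz up to half the sample rate."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))
        # Scaling by 2/n makes a sine of amplitude A appear as a peak of height A
        # (not exact for the 0 Hz and top bins, which this sketch does not use).
        amplitudes.append(2 * abs(total) / n)
    return amplitudes


amplitudes = dft_amplitudes(samples)
bin_spacing = SAMPLE_RATE / N_SAMPLES  # Hz between neighboring points on the x-axis

# Keep only the bins that stand out: (frequency in Hz, height of the peak).
peaks = [(k * bin_spacing, round(a, 3)) for k, a in enumerate(amplitudes) if a > 0.01]

for frequency, amplitude in peaks:
    print(f"{frequency:6.0f} Hz   amplitude {amplitude}")
```

The printed list contains the same three frequencies and amplitudes that went into the vibration, which is what the peaks on a frequency domain graph report.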
Reading frequency domain graphs
The simple harmonic oscillator has a very simple frequency domain graph: a single vertical “spike” or “peak.” The location of the spike along the x-axis (the frequency axis) indicates the frequency of the oscillator. The height of the peak gives the amplitude.
Frequency domain graphs of complex vibrations have multiple peaks, one for each frequency present in the vibration. Peaks appear in order of frequency: peaks at the left end of the graph have the lowest frequencies. Since the fundamental is always the lowest frequency present in a vibration, the fundamental frequency of the complex vibration can be found by reading the frequency of the leftmost spike. This is exactly the same fundamental frequency that can be found by measuring the period from the time-domain graph.
The height of each peak indicates how much of each frequency is present in the vibration. Frequency components with large amplitudes have tall peaks. Frequencies with shorter peaks have lower amplitudes. While the fundamental (the leftmost peak) is often the tallest peak, it does not have to be. Some sounds are made up of a weak fundamental and prominent overtones.
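The sketch below checks two of these claims on a made-up vibration with a weak 1000 Hz fundamental and a strong 2000 Hz overtone. The sample rate, the amplitudes, and the helper names (dft_amplitudes, period_in_samples) are assumptions chosen for illustration, and the spectrum is computed with a plain discrete Fourier transform instead of a true FFT routine.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
N_SAMPLES = 80      # a 10 ms sample, so frequency bins land every 100 Hz

# Made-up vibration: weak fundamental (1000 Hz) plus a strong overtone (2000 Hz).
samples = [
    0.3 * math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
    + 1.0 * math.sin(2 * math.pi * 2000 * i / SAMPLE_RATE)
    for i in range(N_SAMPLES)
]


def dft_amplitudes(samples):
    """Amplitude at each frequency bin from 0 Hz up to half the sample rate."""
    n = len(samples)
    return [
        2 * abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


def period_in_samples(samples, tolerance=1e-9):
    """Smallest shift after which the time-domain data repeats itself."""
    for shift in range(1, len(samples) // 2):
        if all(abs(samples[i + shift] - samples[i]) < tolerance
               for i in range(len(samples) - shift)):
            return shift
    return None


amplitudes = dft_amplitudes(samples)
bin_spacing = SAMPLE_RATE / N_SAMPLES
peak_bins = [k for k, a in enumerate(amplitudes) if a > 0.01]

# Read the fundamental two ways: leftmost peak, and one divided by the period.
leftmost_peak_hz = peak_bins[0] * bin_spacing
fundamental_from_period_hz = SAMPLE_RATE / period_in_samples(samples)

# The tallest peak is a different question from the leftmost peak.
tallest_peak_hz = max(peak_bins, key=lambda k: amplitudes[k]) * bin_spacing

print("Leftmost peak:", leftmost_peak_hz, "Hz")
print("Fundamental from period:", fundamental_from_period_hz, "Hz")
print("Tallest peak:", tallest_peak_hz, "Hz")
```

Both methods give a 1000 Hz fundamental, even though the tallest peak sits at 2000 Hz.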
Frequency plots for non-periodic vibrations don’t have distinct peaks. Here’s an example:
Spectrum graphs “erase history”
A spectrum plot takes a short sample vibration (or sound) and distills it into a single diagram that shows which frequencies are present and how much of each frequency is present. The math that creates frequency domain graphs essentially “erases” all the history information, making it impossible to tell what order sounds occurred in. (If you want to find out what order sounds occur in, look at a time-domain graph.)
FFTs are most often used to analyze short samples of an unchanging vibration, like a few milliseconds of a single note from a recording. The duration of the sample is deliberately kept short so that the spectrum graph shows only the “ingredients” present in that single, uniform sample of a vibration (or sound). More often than not, a spectrum graph with multiple peaks represents a single complex vibration. For instance, the spectrum graph made from a short sample of an oboe note will have many peaks because that single oboe note is made up of a fundamental as well as many overtones, all happening simultaneously.
Stop to think
One of the figures above (“Frequency domain plot of a complex vibration with a fundamental frequency of 1 kHz”) shows peaks at many different frequencies. Which of the frequencies on the graph occurred first? Explain.
Time domain vs. frequency domain
Time domain graphs are useful for showing how vibrations change with time, but not very useful for showing spectral content. Frequency domain graphs show spectral content, but contain no information about how vibrations change over time. A complete description of a changing vibration requires both time-domain and frequency-domain graphs. Later in the book, you will see spectrograms, a type of graph that is a mixture of time- and frequency-domain graphs.
You will probably never have to do the math that turns time domain graphs into frequency domain graphs, but you may have to select settings on a computer program that generates FFTs, so it’s important to recognize some of the limitations of the math.
First, choose a sample of the time-domain graph where the vibration is not changing much. Make the sample as long as possible. (There’s obviously a trade-off here: the longer the sample you choose, the more likely it is that the vibration will change during your selection.) Second, choose the best time window. Time window is a parameter used in the FFT calculation. Most computer programs have a default setting for time window that the user can change. Longer time windows generally give more detailed FFT graphs. The time window cannot be longer than the sample. Time window is also limited by the sampling rate of the recording.
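To see why longer time windows give more detail, here is a rough sketch that analyzes the same made-up sound with a short window and a long one. It assumes that “time window” means the number of samples fed into one FFT calculation. The sample rate, the two tones (1000 Hz and 1100 Hz), the peak threshold, and the helper names are all invented for illustration, and a plain discrete Fourier transform stands in for a true FFT routine.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def two_tones(n_samples):
    """Made-up recording: equal-strength tones at 1000 Hz and 1100 Hz."""
    return [
        math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
        + math.sin(2 * math.pi * 1100 * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def dft_amplitudes(samples):
    """Amplitude at each frequency bin from 0 Hz up to half the sample rate."""
    n = len(samples)
    return [
        2 * abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


def peak_frequencies(samples, threshold=0.3):
    """Frequencies of bins taller than both neighbors and taller than the threshold."""
    amplitudes = dft_amplitudes(samples)
    bin_spacing = SAMPLE_RATE / len(samples)  # Hz between points on the x-axis
    return [
        k * bin_spacing
        for k in range(1, len(amplitudes) - 1)
        if amplitudes[k] > threshold
        and amplitudes[k] > amplitudes[k - 1]
        and amplitudes[k] > amplitudes[k + 1]
    ]


# 40 samples = 5 ms window, so the graph has a point only every 200 Hz.
short_window_peaks = peak_frequencies(two_tones(40))
# 160 samples = 20 ms window, so the graph has a point every 50 Hz.
long_window_peaks = peak_frequencies(two_tones(160))

print("Short window peaks:", short_window_peaks)
print("Long window peaks: ", long_window_peaks)
```

With these numbers the short window reports a single peak at 1000 Hz, because its points are spaced farther apart (200 Hz) than the two tones (100 Hz). The long window shows both tones as separate peaks.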
Stop to think answer
You can’t say for sure. This FFT could be from a single complex tone, the result of many frequencies happening at the same time. But there are many other sounds that could create this same FFT. For instance, this FFT could be from a series of notes played one at a time in ascending order, or from a series of notes played in descending order. You just can’t tell. FFTs only show which frequencies are present in a sound sample; they do not show when those sounds occur.
A spectrogram is a hybrid between an FFT and a time domain graph: it shows how the spectral content of the vibration changes over time. Spectrograms are especially useful for analyzing quasi-periodic vibrations (like those in music and human speech).
A spectrogram is usually drawn in two dimensions, with time along the horizontal axis and frequency on the vertical axis. Amplitude is also included, using color or grayscale. If you think of FFTs as snapshots, a spectrogram is a movie: a series of FFTs displayed in the order they occurred. Each narrow strip of a spectrogram is essentially an FFT turned sideways, with color bands instead of peaks.
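A small experiment makes the point concrete. The sketch below builds two made-up recordings, one with a 1000 Hz note followed by a 2000 Hz note and one with the same notes in the opposite order, then compares their frequency-domain amplitudes. The sample rate, note lengths, and helper names are assumptions for illustration, and a plain discrete Fourier transform stands in for a true FFT routine.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def note(frequency_hz, n_samples):
    """A short, steady sine-wave 'note'."""
    return [math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def dft_amplitudes(samples):
    """Amplitude at each frequency bin from 0 Hz up to half the sample rate."""
    n = len(samples)
    return [
        2 * abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


# Same two notes, opposite order (each note lasts 5 ms).
ascending = note(1000, 40) + note(2000, 40)
descending = note(2000, 40) + note(1000, 40)

# The time-domain data differ, but the peak heights match bin for bin.
same_spectrum = all(
    abs(a - b) < 1e-9
    for a, b in zip(dft_amplitudes(ascending), dft_amplitudes(descending))
)

print("Same time-domain data?       ", ascending == descending)
print("Same frequency-domain graph? ", same_spectrum)
```

The two recordings are clearly different in the time domain, yet their amplitude graphs are identical. The ordering information is not in the peak heights, so the graph alone cannot say which note came first.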
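The “series of snapshots” idea can be sketched in a few lines. The code below cuts a made-up recording (a 1000 Hz note followed by a 2000 Hz note) into strips, makes one frequency-domain snapshot per strip, and keeps the snapshots in order. The sample rate, strip length, note choices, and helper name are assumptions for illustration. A plain discrete Fourier transform stands in for a true FFT routine, and the strips do not overlap, which real programs often allow.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)
FRAME_LENGTH = 80   # samples per strip: 10 ms of sound, bins every 100 Hz

# Made-up recording: 20 ms of a 1000 Hz note, then 20 ms of a 2000 Hz note.
recording = (
    [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(160)]
    + [math.sin(2 * math.pi * 2000 * i / SAMPLE_RATE) for i in range(160, 320)]
)


def dft_amplitudes(samples):
    """Amplitude at each frequency bin from 0 Hz up to half the sample rate."""
    n = len(samples)
    return [
        2 * abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


# Cut the recording into back-to-back strips and take one snapshot per strip.
frames = [recording[start:start + FRAME_LENGTH]
          for start in range(0, len(recording) - FRAME_LENGTH + 1, FRAME_LENGTH)]
spectrogram = [dft_amplitudes(frame) for frame in frames]  # one column per strip

# For each strip, find the frequency with the largest amplitude.
bin_spacing = SAMPLE_RATE / FRAME_LENGTH
strongest_per_frame = [
    max(range(len(column)), key=lambda k: column[k]) * bin_spacing
    for column in spectrogram
]

for index, frequency in enumerate(strongest_per_frame):
    start_ms = 1000 * index * FRAME_LENGTH / SAMPLE_RATE
    print(f"strip starting at {start_ms:4.0f} ms: strongest frequency {frequency:.0f} Hz")
```

Unlike a single FFT of the whole recording, the list of strips shows the 1000 Hz note in the first two strips and the 2000 Hz note in the last two, so the order of the notes is preserved.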
Spectrograms are especially useful for examining human speech and music. The figure below shows how spectrograms can be used to spot characteristic sounds in speech. Notice that the spectrogram is especially good for identifying differences between different vowel sounds.
If you read music, you might find spectrograms intuitive. Notes in Western music notation are arranged in chronological order from left to right, just like events on the time axis of a spectrogram. In musical notation, notes are placed vertically according to pitch: the higher the pitch, the farther up the staff the note sits. Frequency and pitch are not exactly the same (as you’ll see next chapter), but they are closely related: the higher the frequency, the higher the pitch.
The spectrogram of a recording of the opening of Bach’s Passacaglia and Fugue in c minor (played on pipe organ) illustrates some of these ideas.
Notice how the contour of the notes on the musical staff matches the shape of the bottom-most set of dashes on the spectrogram. You may notice that at each time on the graph (at t=5 seconds, for example) there are multiple frequencies present. This shows that each organ note is made up of multiple frequencies: the fundamental and several overtones.
Amplitude is not easy to show on a spectrogram because there’s no axis for amplitude. Some spectrograms (like the one for the Bach fugue above) use grayscale, with darker lines indicating bigger amplitude. Some spectrograms use color. Others, called waterfall plots, are in three dimensions with amplitude on the third axis.
Time resolution vs. frequency resolution
Spectrograms would seem to be the best of both worlds, capturing both time and frequency information on the same graph. There must be a catch, right? Spectrograms involve compromise. Remember that a spectrogram is a series of FFT “snapshots.” To make a spectrogram, a computer program splits the entire recording into short sections and an FFT is made for each section. If the recording is split into lots of short samples, the resulting spectrogram has great information about when each FFT occurred (good time resolution). However, each FFT will have imprecise frequency information (poor frequency resolution), since it’s based on a short sample. Cutting the recording into fewer, longer samples improves the frequency resolution (sharpens up the peaks on each FFT) but degrades the time resolution. The key to producing a useful spectrogram is finding the “sweet spot.”
Watch some musical not-quite-spectrograms. As music plays, these videos plot fundamental frequency (musical note, really) versus time. The result should look very familiar to those who read music. The videos don’t show overtones (like spectrograms do), hence “not-quite.”
- Bach’s Little Fugue in g minor (3:45 min YouTube)
- Liszt’s Hungarian Rhapsody Nr. 2 (8:58 min YouTube)
- a whimsical Classical Music Mashup (6:08 min YouTube)
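The compromise can be put into numbers. The sketch below assumes the recording is cut into back-to-back sections with no overlap and that each section feeds one FFT. The function name spectrogram_resolution, the 1-second recording, and the 8000 samples-per-second rate are made up for illustration.

```python
def spectrogram_resolution(recording_seconds, sample_rate, section_length):
    """Return (number of sections, seconds per section, Hz between frequency bins).

    Assumes back-to-back sections with no overlap; section_length is in samples.
    """
    total_samples = int(recording_seconds * sample_rate)
    n_sections = total_samples // section_length
    seconds_per_section = section_length / sample_rate  # time resolution
    hz_between_bins = sample_rate / section_length      # frequency resolution
    return n_sections, seconds_per_section, hz_between_bins


# A 1-second recording at 8000 samples per second, cut three different ways.
for section_length in (80, 400, 800):
    sections, seconds, hertz = spectrogram_resolution(1.0, 8000, section_length)
    print(f"{section_length:4d} samples per section: {sections:3d} sections, "
          f"{1000 * seconds:5.1f} ms each, frequency points every {hertz:5.1f} Hz")
```

Short sections of 80 samples pin events down to within 10 ms but place frequency points 100 Hz apart. Sections of 800 samples place frequency points 10 Hz apart but blur timing to 100 ms. In this simple setup, the section duration multiplied by the frequency spacing always equals 1, so improving one necessarily worsens the other.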
- Frequency domain graph for a simple harmonic oscillation with a frequency of 1000 Hz. Created by David Abbott using Desmos.com.
- Frequency domain plot of a complex vibration with a fundamental frequency of 1 kHz and overtones at 2 kHz, 3 kHz, etc. Created by David Abbott using Desmos.com.
- Frequency domain graph of non-periodic vibration (a.k.a. noise). Created by David Abbott using Audacity.
- Frequency and time domain graphs of the spoken phrase “understanding sound.” Created by David Abbott using Audacity.
- Opening to Bach’s Passacaglia in c minor (BWV 582). Created by David Abbott using Audacity.
- Malinowski, S. (2013, July 12). Bach "Little" Fugue in G minor, Organ. Retrieved from https://youtu.be/ddbxFi3-UO4
- Fillebrown, A. (2012, October 10). Liszt Hungarian Rhapsody 2. Retrieved from https://youtu.be/m6xWGVhZl1g
- Woolard, G. (2016, January 12). Classical Music Mashup. Retrieved from https://youtu.be/7OYkWSW7u4k
- Cornell Ornithology Lab (n.d.). Bird Song Hero. Retrieved from https://academy.allaboutbirds.org/features/birdsong/bird-song-hero-training
- Sandler, D. (2018, December 10). What is a Fourier Series? (Explained by drawing circles)- Smarter Every Day 205. Retrieved from https://youtu.be/ds0cmAV-Yek
- Constantinsen, B. (2014, May 10). Timbre: Why different instruments playing the same note sound different. Retrieved from https://youtu.be/VRAXK4QKJ1Q