When a tone is frequency modulated by another tone at a frequency above the threshold of pitch discrimination, a complex sound containing sidebands above and below the carrier signal is produced. Frequency modulation synthesis, referred to simply as FM, uses this property to produce complicated sounds. When only sine waves are used with only one carrier and modulator, the process is called simple FM synthesis. Frequency modulation synthesis was invented by John Chowning in the late 1960s and implemented by Yamaha in its DX-7 synthesizer, which was the most successful synthesizer ever at the time it was introduced.
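First, it is necessary to define some terms. The tone that is modulated is called the carrier, and the tone that does the modulating is called the modulator; their frequencies are the carrier frequency (c) and the modulating frequency (m). The amount of modulation, or peak frequency deviation (d), is expressed by the modulation index (also called the modulating index), I, which is measured in relation to the modulating frequency: I = d/m. A modulation index of 1 means that the frequency deviation is equal to the modulating frequency; for example, with m = 100 Hz and I = 1, the carrier frequency swings 100 Hz above and below its nominal value.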
The sidebands produced in FM synthesis occur above and below the carrier frequency at intervals of the modulating frequency. Those above the carrier form the upper sideband, and those below form the lower sideband. In mathematical terms, the sideband frequencies in simple FM, with sine tones for both carrier and modulator, are given by c ± km, where k is an integer that determines the order of the sidebands. The amplitudes of the components are determined by Bessel functions of the kth order, whose argument is the modulation index (as defined above). Putting an envelope (called the spectral envelope) on the modulator causes the amplitudes of the sidebands, and thus the timbre of the signal, to vary dramatically. (Graphs of the first six Bessel functions are shown on page 118 in the Dodge-Jerse book.)
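As a rough illustration of the formula above, the following Python sketch lists the sideband frequencies and the Bessel-function value that scales each order. The function names (`bessel_j`, `simple_fm_components`) and the 100 Hz example values are assumptions made for this sketch, not part of Csound or of the texts cited here. The Bessel function is computed from its power series, and the sign change that applies to odd-order lower sidebands is left out.

```python
import math

def bessel_j(k, x, terms=40):
    """Bessel function of the first kind J_k(x) for integer k >= 0 (power series)."""
    return sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + k)) * (x / 2) ** (2 * j + k)
        for j in range(terms)
    )

def simple_fm_components(c, m, index, max_order):
    """Return (k, c - k*m, c + k*m, J_k(index)) for k = 0 .. max_order."""
    return [(k, c - k * m, c + k * m, bessel_j(k, index)) for k in range(max_order + 1)]

# Hypothetical example: 100 Hz carrier, 100 Hz modulator, modulation index 1.
for k, lower, upper, amp in simple_fm_components(100.0, 100.0, 1.0, 3):
    print(f"k={k}: lower={lower:7.1f} Hz  upper={upper:7.1f} Hz  J_k(I)={amp:+.4f}")
```

With an index of 0 only the k=0 term is non-zero, which corresponds to the unmodulated sine-wave carrier.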
In their book FM Theory and Applications, John Chowning and David Bristow state three rules for FM synthesis:
(1) The frequency components resulting from values of c and m follow the pattern c ± km, for k = 0, 1, 2, 3, ..., n. The formula c ± km works for the frequencies themselves, but it also works for ratios of frequencies. Ratios have the advantage that they can be transposed to any pitch, so that the ratio values are in effect partials of the pitch. Although c and m are usually expressed as a ratio c:m, they can also be expressed directly as frequencies.
(2) To determine the number of sidebands that should be calculated for an FM pair with modulation index of I, add 2 to the value of the index I, and that's about how big k should become. This means that, if the modulation index of an example were 4, we would need to consider sidebands of up to the 6th order.
(3) The bandwidth (BW) of the spectrum can be estimated approximately from the following relations (the symbol "*" means multiplication):
(a) Where there are reflections about 0 Hz: BW approx. = c + m * (I + 2).
(b) Where the carrier is greater than the modulator, and there are no reflections: BW approx. = 2m * (I + 2).
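Rules 2 and 3 amount to a few lines of arithmetic, sketched here in Python. The function names are hypothetical, and the test used to decide whether reflections occur (whether (I + 2) * m exceeds c) is an assumption of this sketch rather than something the rules state explicitly.

```python
import math

def highest_order(index):
    """Rule 2: consider sidebands up to about I + 2 (rounded up for fractional I)."""
    return math.ceil(index) + 2

def estimated_bandwidth(c, m, index):
    """Rule 3: approximate bandwidth, in the same units as c and m."""
    k = index + 2
    if k * m > c:        # assumed test: lower sidebands cross 0 Hz and reflect
        return c + m * k
    return 2 * m * k     # no reflections: sidebands spread evenly around c

print(highest_order(4))                    # 6
print(estimated_bandwidth(100, 100, 4))    # 700
print(estimated_bandwidth(1000, 100, 4))   # 1200
```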
Rule 3 takes account of the fact that, when km is larger than c, the lower sidebands fall at negative frequencies. In addition, the Bessel functions take negative values for some values of the index, which gives the affected components negative amplitudes. A negative frequency is identical in sound to the positive frequency of the same value but is phase shifted by 180 degrees, which has the same effect as a negative amplitude. This means that, if a positive frequency of the same value is present, some cancellation of that frequency can occur. Complete cancellation can occur only when the amplitudes are identical, which rarely occurs.
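The reflection and cancellation just described can be sketched as follows. This is a minimal illustration under stated assumptions: sine-phase components (so that reflecting a negative frequency inverts its sign), a positive carrier, and the standard sign (-1)^k on the lower sideband of order k. The names `bessel_j` and `folded_spectrum` are hypothetical.

```python
import math
from collections import defaultdict

def bessel_j(k, x, terms=40):
    """Bessel function of the first kind J_k(x) for integer k >= 0 (power series)."""
    return sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + k)) * (x / 2) ** (2 * j + k)
        for j in range(terms)
    )

def folded_spectrum(c, m, index, max_order):
    """Sum the signed amplitudes that land on each positive frequency."""
    spectrum = defaultdict(float)
    spectrum[c] += bessel_j(0, index)               # the carrier itself (k = 0)
    for k in range(1, max_order + 1):
        amp = bessel_j(k, index)
        for freq, a in ((c + k * m, amp), (c - k * m, (-1) ** k * amp)):
            if freq < 0:                            # reflect about 0 Hz: 180-degree shift
                freq, a = -freq, -a
            if freq == 0:                           # a sine at 0 Hz contributes nothing
                continue
            spectrum[freq] += a
    return dict(sorted(spectrum.items()))

# c:m = 1:1 with index 1: every low partial mixes an upper and a reflected lower sideband.
for partial, amp in folded_spectrum(1, 1, 1.0, 6).items():
    print(f"partial {partial}: {amp:+.4f}")
```

For the 1:1 ratio, partial 1 comes out as J_0(I) - J_2(I), a partial cancellation, while partial 2 comes out as J_1(I) + J_3(I), a reinforcement.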
When simple integers are used as c:m ratios, harmonic spectra are generated. The following generalizations apply to this process:
The modulator determines the partials in the series, and the carrier determines the starting positions of the partials in their orders. A modulator of 1 produces all partials, starting from the carrier. Modulators of 2 and 4 produce spectra of only odd-numbered partials, as in square and triangle waves. Other even-numbered modulators produce subsets of the odd-partial series. Other modulators produce spectra that are clustered around multiples of the modulator without including the modulator. For example, 1:3 produces 2 and 4, 5 and 7, 8 and 10, 11 and 13, etc., or upper and lower neighbors of multiples of 3. 1:5 produces 4 and 6, 9 and 11, 14 and 16, 19 and 21, 24 and 26, etc. or upper and lower neighbors of multiples of 5.
The c:m ratio must be reduced to its lowest terms as a fraction; otherwise, it will simply produce the series of the reduced ratio transposed to a higher pitch. For example, 2:4 produces the partials 2, 6, 10, 14, etc., which is the odd-partial series of 1:2 transposed up an octave.
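The generalizations above can be checked with a short Python sketch, assuming integer c and m. The helper names `reduce_ratio` and `partials` are hypothetical. Lower sidebands are reflected to positive values, and a component at 0 is dropped.

```python
from math import gcd

def reduce_ratio(c, m):
    """Return the c:m ratio in lowest terms, plus the common factor removed."""
    g = gcd(c, m)
    return c // g, m // g, g

def partials(c, m, max_order):
    """Sorted partial numbers |c +/- k*m| for k = 0 .. max_order, without 0."""
    found = set()
    for k in range(max_order + 1):
        for p in (c + k * m, abs(c - k * m)):
            if p != 0:
                found.add(p)
    return sorted(found)

print(partials(1, 3, 4))    # [1, 2, 4, 5, 7, 8, 10, 11, 13]
print(reduce_ratio(2, 4))   # (1, 2, 2)
print(partials(2, 4, 3))    # [2, 6, 10, 14]
print(partials(1, 2, 3))    # [1, 3, 5, 7]
```

The last two lines show that 2:4 yields the 1:2 series with every partial number doubled.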
When spectra contain a number of low harmonic partials that are fairly close together (some researchers suggest a minimum of three consecutive harmonic partials are necessary), they meld into a tone with a complex tone color, with the fundamental frequency perceived as the pitch and the partials being perceived as the timbre. If the components are widely spaced and do not contain consecutive elements of a harmonic series, they do not meld into a timbre but are perceived as a cluster of separate tones. Such clusters can have qualities of "timbre" but lack the quality of a fundamental frequency.
In order to clarify the components produced by a simple FM process, the sidebands are written out in a table with the upper sideband on the right and lower sideband on the left, and the partials written down in the corresponding order, as follows:
| c:m ratio 1:1 | lower sideband (c - km) | upper sideband (c + km) |
| 0th order (k=0), carrier | 1 | |
| 1st order (k=1) | 0 | 2 |
| 2nd order (k=2) | -1 | 3 |
| 3rd order (k=3) | -2 | 4 |
| 4th order (k=4) | -3 | 5 |
| 5th order (k=5) | -4 | 6 |
Other examples of harmonic spectra are as follows:
| order | 1:2 lower | 1:2 upper | 1:3 lower | 1:3 upper | 1:4 lower | 1:4 upper |
| k=0 (carrier) | 1 | | 1 | | 1 | |
| k=1 | -1 | 3 | -2 | 4 | -3 | 5 |
| k=2 | -3 | 5 | -5 | 7 | -7 | 9 |
| k=3 | -5 | 7 | -8 | 10 | -11 | 13 |
| k=4 | -7 | 9 | -11 | 13 | -15 | 17 |
| k=5 | -9 | 11 | -14 | 16 | -19 | 21 |
| order | 1:5 lower | 1:5 upper | 1:6 lower | 1:6 upper | 1:7 lower | 1:7 upper |
| k=0 (carrier) | 1 | | 1 | | 1 | |
| k=1 | -4 | 6 | -5 | 7 | -6 | 8 |
| k=2 | -9 | 11 | -11 | 13 | -13 | 15 |
| k=3 | -14 | 16 | -17 | 19 | -20 | 22 |
| k=4 | -19 | 21 | -23 | 25 | -27 | 29 |
| k=5 | -24 | 26 | -29 | 31 | -34 | 36 |
| order | 1:8 lower | 1:8 upper | 1:9 lower | 1:9 upper | 1:10 lower | 1:10 upper |
| k=0 (carrier) | 1 | | 1 | | 1 | |
| k=1 | -7 | 9 | -8 | 10 | -9 | 11 |
| k=2 | -15 | 17 | -17 | 19 | -19 | 21 |
| k=3 | -23 | 25 | -26 | 28 | -29 | 31 |
| k=4 | -31 | 33 | -35 | 37 | -39 | 41 |
| k=5 | -39 | 41 | -44 | 46 | -49 | 51 |
| order | 2:1 lower | 2:1 upper | 3:1 lower | 3:1 upper | 3:2 lower | 3:2 upper |
| k=0 (carrier) | 2 | | 3 | | 3 | |
| k=1 | 1 | 3 | 2 | 4 | 1 | 5 |
| k=2 | 0 | 4 | 1 | 5 | -1 | 7 |
| k=3 | -1 | 5 | 0 | 6 | -3 | 9 |
| k=4 | -2 | 6 | -1 | 7 | -5 | 11 |
| k=5 | -3 | 7 | -2 | 8 | -7 | 13 |
| order | 3:4 lower | 3:4 upper | 3:5 lower | 3:5 upper | 4:1 lower | 4:1 upper |
| k=0 (carrier) | 3 | | 3 | | 4 | |
| k=1 | -1 | 7 | -2 | 8 | 3 | 5 |
| k=2 | -5 | 11 | -7 | 13 | 2 | 6 |
| k=3 | -9 | 15 | -12 | 18 | 1 | 7 |
| k=4 | -13 | 19 | -17 | 23 | 0 | 8 |
| k=5 | -17 | 23 | -22 | 28 | -1 | 9 |
| order | 4:3 lower | 4:3 upper | 5:4 lower | 5:4 upper | 6:5 lower | 6:5 upper |
| k=0 (carrier) | 4 | | 5 | | 6 | |
| k=1 | 1 | 7 | 1 | 9 | 1 | 11 |
| k=2 | -2 | 10 | -3 | 13 | -4 | 16 |
| k=3 | -5 | 13 | -7 | 17 | -9 | 21 |
| k=4 | -8 | 16 | -11 | 21 | -14 | 26 |
| k=5 | -11 | 19 | -15 | 25 | -19 | 31 |
Some examples of "inharmonic" spectra are as follows:
| order | 9:11 lower | 9:11 upper | 7:11 lower | 7:11 upper | 11:7 lower | 11:7 upper |
| k=0 (carrier) | 9 | | 7 | | 11 | |
| k=1 | -2 | 20 | -4 | 18 | 4 | 18 |
| k=2 | -13 | 31 | -15 | 29 | -3 | 25 |
| k=3 | -24 | 42 | -26 | 40 | -10 | 32 |
| k=4 | -35 | 53 | -37 | 51 | -17 | 39 |
| k=5 | -46 | 64 | -48 | 62 | -24 | 46 |
| order | 1:1.41 lower | 1:1.41 upper | 1:1.05 lower | 1:1.05 upper | 0.5:1.6 lower | 0.5:1.6 upper |
| k=0 (carrier) | 1 | | 1 | | 0.5 | |
| k=1 | -0.41 | 2.41 | -0.05 | 2.05 | -1.1 | 2.1 |
| k=2 | -1.82 | 3.82 | -1.1 | 3.1 | -2.7 | 3.7 |
| k=3 | -3.23 | 5.23 | -2.15 | 4.15 | -4.3 | 5.3 |
| k=4 | -4.64 | 6.64 | -3.2 | 5.2 | -5.9 | 6.9 |
| k=5 | -6.05 | 8.05 | -4.25 | 6.25 | -7.5 | 8.5 |
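For trying further c:m ratios, rows of this kind can be generated mechanically from c ± km. The Python sketch below is one way to do it; the function name `sideband_table` is hypothetical, and the rounding to two decimal places is an assumption made only to keep non-integer ratios readable.

```python
def sideband_table(c, m, max_order=5):
    """Return (k, lower, upper) rows; the k = 0 row holds the carrier in both slots."""
    rows = [(0, c, c)]
    for k in range(1, max_order + 1):
        rows.append((k, round(c - k * m, 2), round(c + k * m, 2)))
    return rows

# Print one harmonic and one inharmonic ratio, lower sideband on the left.
for c, m in ((3, 4), (1, 1.41)):
    print(f"c:m = {c}:{m}")
    for k, lower, upper in sideband_table(c, m):
        print(f"  k={k}: {lower:>7} {upper:>7}")
```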
The envelope of the modulator, called the spectral envelope, causes the timbre to change over the course of the tone. Once the basic elements of the series have been determined, the spectral envelope allows movement up and down the full collection of relevant Bessel functions, producing a range of change from a sine wave alone, when the modulation index is zero, up to some maximum value, passing through every intervening value. The higher the modulation index, the more partials in the timbre, and the "brighter" the sound. Note, however, that the different c:m ratios shown above have different partial numbers in their respective orders, so the timbre shift may not be the same for a different series. Some c:m ratios, like 1:1 and 1:2, have partial cancellations (or reinforcements) between the upper and lower sidebands of nearly every component (except the highest upper sidebands), and others, like 1:3 and 1:4, have no cancellations at all. For higher values of m, higher orders produce very high partial numbers.
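The link between the modulation index and brightness can be made concrete with a small Python sketch. It steps a linear spectral envelope from 0 to 1, scales it by a maximum index of 4, and reports the highest sideband order whose Bessel value is still significant. The 0.01 threshold and the function names are assumptions of this sketch.

```python
import math

def bessel_j(k, x, terms=40):
    """Bessel function of the first kind J_k(x) for integer k >= 0 (power series)."""
    return sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + k)) * (x / 2) ** (2 * j + k)
        for j in range(terms)
    )

def highest_significant_order(index, threshold=0.01, max_order=30):
    """Largest k in 0 .. max_order with |J_k(index)| >= threshold."""
    return max(k for k in range(max_order + 1) if abs(bessel_j(k, index)) >= threshold)

max_index = 4.0
for envelope in (0.0, 0.25, 0.5, 0.75, 1.0):      # points along a linear ramp
    index = max_index * envelope
    order = highest_significant_order(index)
    print(f"envelope={envelope:.2f}  index={index:.1f}  highest order={order}")
```

At an index of 0 only order 0 survives, which is the sine wave alone. The highest order then grows with the index, broadly in line with the I + 2 rule of thumb.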
Spectral envelopes are often similar to amplitude envelopes, with the same kinds of rise-sustain-decay shape, but there is no reason why they have to begin or end at zero. A zero value produces a sine wave as the spectrum, and a non-zero value a more complex spectrum of some kind. Using stored functions allows a range of shapes to be tried.
The csound program includes a special unit-generator called foscil that produces simple FM synthesis by itself. It is described as follows:
ar foscil xamp,kcps,kcar,kmod,kndx,ifn[,iphs]
where xamp is the amplitude, kcps is the frequency, kcar and kmod are the carrier and modulator expressed as a ratio, kndx is the modulation index, and ifn designates a sine wave (for sine waves only, foscili, the interpolating version of the unit, is not necessary). It is important to understand that this implementation is identical to the following series of csound statements:
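kc = kcps*kcar ; carrier frequency in Hz
km = kcps*kmod ; modulator frequency in Hz
amd oscil kndx*km,km,ifn ; modulator: peak deviation kndx*km at frequency km
ar oscil xamp,amd+kc,ifn ; carrier: frequency kc plus the modulator output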
(Here the multiplications kc and km are performed first to clarify the operations; these calculations could be written out in the inputs themselves.) Since kndx is multiplied by both kcps and kmod, its range of values is not usually too great (the manual mentions 0 to 4, but the DX-7 produces values as high as 13).
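For readers who want to see the same signal flow outside Csound, here is a minimal Python sketch of the two-oscillator arrangement. It is not how foscil is implemented internally. The function name `simple_fm`, the sample-by-sample phase accumulation, and the default sample rate are assumptions of the sketch; the variable names mirror the Csound statements above.

```python
import math

def simple_fm(amp, cps, car, mod, ndx, dur, sr=44100):
    """Return a list of samples for one simple-FM tone of `dur` seconds."""
    kc = cps * car                       # carrier frequency in Hz
    km = cps * mod                       # modulator frequency in Hz
    car_phase = 0.0
    mod_phase = 0.0
    samples = []
    for _ in range(int(dur * sr)):
        amd = ndx * km * math.sin(mod_phase)        # modulator output, in Hz
        samples.append(amp * math.sin(car_phase))   # carrier output
        car_phase += 2 * math.pi * (kc + amd) / sr  # carrier frequency is kc + amd
        mod_phase += 2 * math.pi * km / sr
    return samples

# Hypothetical example: 220 Hz base frequency, c:m = 1:2, index 4, one tenth of a second.
tone = simple_fm(amp=10000, cps=220.0, car=1, mod=2, ndx=4, dur=0.1)
print(len(tone), max(abs(s) for s in tone) <= 10000)
```

With ndx set to 0 the modulator output is always 0, and the result is a plain sine wave at kc.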
1. Use the following csound instrument to experiment with different c:m ratios and spectral envelopes:
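instr 1
i1 = cpspch(p5) ; pitch in 8ve.pc form converted to Hz
k1 oscil1 0,p8,p3,2 ; spectral envelope: function 2 read once over p3, scaled by p8
k2 linen p4,.025,p3,.025 ; amplitude envelope with 25 ms rise and decay
a1 foscil k2,i1,p6,p7,k1,1 ; simple FM with c:m = p6:p7 and index k1
out a1
endin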
This instrument uses p4 for the amplitude of the sound, p5 for the pitch in 8ve.pc form, p6 and p7 for the c and m ratios, respectively, and p8 for the modulation index. Function 1 is a sine wave and function 2 a spectral envelope. Try the following for your first spectral envelope:
f2 0 512 7 0 512 1
and use a maximum value of the modulation index of 4. Using these values, try each of the c:m ratios shown above. Try this with short notes and longer notes, high and low notes, and also use some larger and smaller values of the modulation index. Also try some further c:m ratios. These experiments will give you an idea of the process. Are there any of the given spectral families that remind you of any particular acoustic instruments? If so, what are they? If not, what qualities are lacking that would be needed to help identify any particular instrument?
2. Instead of the spectral envelope given, try each of the following with a selection of the c:m ratios given above:
(1) an A-R envelope:
f2 0 512 7 0 64 1 384 1 64 0
(2) a sharp exponential decay:
f2 0 512 5 1 512 .001
(Try this one with both harmonic and inharmonic spectra.)
(3) a diminuendo-crescendo shape (exponential), which falls from 1 to .5 and rises back to 1:
f2 0 512 5 1 256 .5 256 1
Do any of these give a more interesting spectral envelope? Do they remind you of any particular instruments?
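To preview what a spectral-envelope function looks like before running Csound, the breakpoint lists can be expanded in Python. The sketch below only approximates GEN07 (straight lines) and GEN05 (exponential curves): it ignores guard points and rescaling, and the function names `gen07` and `gen05` are hypothetical helpers, not Csound calls. Arguments alternate value, segment length, value, and so on, as in the f statements above.

```python
def gen07(size, *args):
    """Straight-line segments through the breakpoints value, length, value, ..."""
    values, lengths = args[0::2], args[1::2]
    table = []
    for start, end, n in zip(values, values[1:], lengths):
        table.extend(start + (end - start) * i / n for i in range(n))
    return table[:size]

def gen05(size, *args):
    """Exponential segments; every breakpoint value must be non-zero and of one sign."""
    values, lengths = args[0::2], args[1::2]
    table = []
    for start, end, n in zip(values, values[1:], lengths):
        table.extend(start * (end / start) ** (i / n) for i in range(n))
    return table[:size]

ramp = gen07(512, 0, 512, 1)        # as in: f2 0 512 7 0 512 1
decay = gen05(512, 1, 512, .001)    # as in: f2 0 512 5 1 512 .001

# Modulation index at the start, midpoint, and end of a note whose maximum index is 4.
for name, table in (("ramp", ramp), ("decay", decay)):
    print(name, [round(4 * table[i], 3) for i in (0, 256, 511)])
```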
3. Generate the tones described on pages 123-126 of the Dodge-Jerse book for producing a bell-like sound, a wood-drum like tone, a clarinet-like tone and a brass-like tone.