Frequency Modulation (FM) Synthesis

by Hubert Howe

FM synthesis is a technique for synthesizing sounds that was invented by John Chowning in the late 1960s and subsequently implemented in musical instruments by Yamaha, which also patented the implementation and licensed it to other companies. It has been used widely in computer music as well as in keyboard-based electronic music. The method is economical, because only sine wave oscillators and envelope generators are needed to produce rich, time-varying spectra.

Simple FM Synthesis

To create frequency modulation (FM), the output of one sine wave oscillator (the modulator) is added to the frequency of another (the carrier), producing a complex spectrum consisting of sidebands above and below the carrier frequency at intervals of the modulator frequency. (Other types of modulation, such as AM, will also work, and these have been investigated by other researchers and composers.) Much of the theory of FM is devoted to understanding what sidebands are produced and how the amplitudes of the components change as the amplitude of the modulator is varied. When only one sine wave modulates another, it is called simple FM. Other more complex types are discussed below.

The formula for computing sidebands in simple FM with sine waves for both carrier and modulator is c+-km, where k is an integer that determines the order of the sidebands. Additions form the upper sideband, and subtractions the lower sideband. The amplitudes of the components are determined by kth-order Bessel functions. The argument to the Bessel functions is the modulation index, which is directly proportional to the amplitude of the modulating signal. Putting an envelope (called the spectral envelope) on the modulator causes the amplitudes of the sidebands (timbre of the sound) to be varied dynamically.
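As a rough illustration of the arrangement just described, the following Python sketch computes samples of a simple FM tone. It assumes the common phase-modulation form sin(2*pi*c*t + I*sin(2*pi*m*t)), in which the index I appears directly; the function name, sample rate and parameter names are illustrative assumptions and are not taken from any particular instrument.

```python
import math

def simple_fm(carrier_hz, modulator_hz, index, duration_s=0.1, sample_rate=44100):
    """Return samples of sin(2*pi*c*t + I*sin(2*pi*m*t)) as a list of floats."""
    samples = []
    for n in range(round(duration_s * sample_rate)):
        t = n / sample_rate
        # The modulator output, scaled by the index, is added to the carrier phase.
        modulation = index * math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + modulation))
    return samples

# c = 220 Hz, m = 440 Hz, I = 4
tone = simple_fm(220.0, 440.0, 4.0)
```

With an index of 0 the modulator has no effect, and the result is a plain sine wave at the carrier frequency.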

In their book FM Theory and Applications, John Chowning and David Bristow state three rules for FM synthesis:

(1) Frequency components resulting from values of c and m will follow a pattern according to c+-km, for k=0, 1, 2, 3 ... n.

(2) To determine the number of side bands that should be calculated for an FM pair with a modulation index of I, add 2 to the value of the index I, and that's about how big k should become.

(3) The bandwidth (BW) of the spectrum can be estimated approximately according to the following relations:

(1) Where there are reflections about 0 Hz:

BW approx. = c + m * (I + 2).

(2) Where the carrier is greater than the modulator, and there are no reflections:

BW approx. = 2m * (I + 2).
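Rules 2 and 3 are simple enough to express directly in code. The sketch below is one possible reading of them: it assumes that "reflections about 0 Hz" occur whenever the lowest sideband c - (I+2)m falls below zero. The function names are hypothetical.

```python
def sideband_orders(index):
    """Rule 2: the highest sideband order worth computing, about I + 2."""
    return int(round(index)) + 2

def estimated_bandwidth(c, m, index):
    """Rule 3: approximate bandwidth, in the same units as c and m."""
    k_max = index + 2
    if c - k_max * m < 0:
        # Lower sidebands cross 0 Hz and reflect, so the spectrum
        # extends from 0 up to the highest upper sideband.
        return c + m * k_max
    # No reflections: the sidebands spread symmetrically around c.
    return 2 * m * k_max
```

For c = 220, m = 440 and I = 4, estimated_bandwidth gives 220 + 440 * 6 = 2860, which matches the highest component in the detailed example later in the text.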

While the formula c+-km works for frequencies themselves, it also works for ratios of frequencies, and these have the advantage that they can be transposed to any pitch that can be determined externally (i.e., by pressing a key). When km is larger than c, negative frequencies are produced. Negative frequencies are also produced when the Bessel functions produce negative amplitude values for positive frequencies. Negative frequencies are identical in sound to positive frequencies but are phase-shifted by 180 degrees, so that they can cancel with positive frequencies. Complete cancellation can occur only if the amplitudes are identical, which rarely happens.

There is one further complication to all these negative frequencies and negative amplitudes of positive (and negative) frequencies: odd order lower sidebands have a negative sign, so that when k=1, 3, 5, etc. the lower sideband's amplitude coefficient is multiplied by -1. This will be indicated by an "odd order" column in the tables below. To summarize, the only important aspect of negative frequencies, or positive frequencies with negative amplitudes, is how they interact with positive amplitudes to produce cancellations. In effect, move the sign to the amplitude and add or subtract it with the other values.

Bessel Functions

The graphs in Figure 1, which reproduces figure 4.3 from Chowning and Bristow's book, show the first six Bessel functions. The graphs show amplitude against the value of the modulation index, which varies from 0 to 16. The modulation index is the amplitude of the modulator, measured in terms of the modulating frequency (i.e., when the amplitude of the modulator and the modulating frequency are the same, the modulation index has a value of 1). Bessel functions, named J₀, J₁, ..., J_n, are natural mathematical functions (discovered by a man named Bessel) that explain the relationship between the modulation index and the amplitudes of the frequency components. That they work is a fact of nature; it is not necessary to understand anything about them other than that they account for these phenomena. You can use them if you want to know precisely what components are produced.

As you can see from the graphs, only the zeroth-order Bessel function has a positive value at zero, and all the others begin at zero and briefly hold that value until beginning a sine-like oscillation above and below zero, with maximum levels of about +.5 to -.3. Higher order functions hold the zero value longer, which is where Chowning and Bristow derive their second rule that the number of sidebands should be about I+2.
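For readers who would rather compute the Bessel values than read them from a graph or table, here is a minimal sketch that sums the standard power series for J_n(x). The function name and the number of terms are assumptions; the series is adequate for the modest index values used in this article, and a numerical library would normally be preferred for large arguments.

```python
import math

def bessel_j(order, x, terms=40):
    """Bessel function of the first kind, J_order(x), for integer order >= 0.

    Uses the power series sum_j (-1)^j / (j! (j+order)!) * (x/2)^(2j+order).
    """
    return sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + order))
        * (x / 2) ** (2 * j + order)
        for j in range(terms)
    )

# Print J_0 through J_6 for a modulation index of 4.
for k in range(7):
    print(k, round(bessel_j(k, 4.0), 3))
```

At an index of 0 only J₀ is non-zero (it equals 1), which is why the spectrum reduces to the carrier alone.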

A Detailed Example

Let us use the information gleaned thus far to work through an example of the spectra produced by simple FM synthesis. In this example, our carrier frequency f_c is 220 Hz, the modulator f_m is 440 Hz, and the modulation index I is 4 (which means that the amplitude of the modulator will be 1760, since 1760/440=4). Rule 1 states that the frequency components are c+-km. It is convenient to show the upper and lower sidebands next to each other in a table, as follows:

Figure 1: The first six Bessel functions J₀ through J₅. The modulation index is on the horizontal axis of each graph and the amplitude on the vertical axis. The intersection of the Bessel function with a value for the index yields the amplitude scaling factor for the appropriate sidebands.

				c:m = 220:440
k	odd order	lower 			upper 		Jn for I=4
                        sideband                sideband
0					220			-.397
1	(-1)		 -220			 660		-.066
2			 -660			1100		 .364
3	(-1)		-1100			1540		 .430
4			-1540			1980		 .281
5	(-1)		-1980			2420		 .132
6			-2420			2860		 .049

The frequency for k=0 is shown in the middle because it is not a sideband. The order k is shown at the left, and the right column shows the values of the Bessel functions (taken from a table). Following rule 2, we show only the first six orders of sidebands (because I=4, and I+2=6). Note that this table shows only what is happening when I equals 4, not when it is changing.

The table shows that we have frequencies from 220 to 2860, which all happen to be odd-numbered harmonic partials of the fundamental frequency of 220 Hz. Each positive frequency in the upper sideband is matched by a negative one in the next order of the lower sideband; all the values of the lower sideband are negative. In the right column we also see that the values of J₀ and J₁ are negative, which means that 220 in the zeroth order has a negative amplitude, -220 in the first order (where the negative J₁ is multiplied by the odd-order -1) has a positive amplitude, and the first-order 660 has a negative amplitude. In order to determine the complete spectrum of the sound, we need to add the components together. Showing each component with its signed amplitude in the same tabular format as above, we get:

		lower sideband			upper sideband
				 220 at -.397
		 -220 at  .066			 660 at -.066
		 -660 at  .364			1100 at .364
		-1100 at -.430			1540 at .430
		-1540 at  .281			1980 at .281
		-1980 at -.132			2420 at .132
		-2420 at  .049			2860 at .049

Reflecting each negative frequency to the corresponding positive frequency with its sign reversed, and adding the components together, we get:

	partial number		frequency		amplitude
		 1		  220			 -.463
		 3		  660			 -.430
		 5		 1100			  .794
		 7		 1540			  .149
		 9		 1980			  .413
		11		 2420			  .083
		13		 2860			  .049

It is best to think of the amplitudes as relative to one another. This shows that the fifth partial is the strongest, the first, third and ninth are next, and the seventh, eleventh and thirteenth are the weakest.
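The bookkeeping in this example (odd-order sign changes, reflection of negative frequencies, and summing of coinciding components) can be sketched in Python as follows. This is an illustration of the procedure described in the text, not a reference implementation; the function names are hypothetical and the Bessel values come from a plain power series.

```python
import math
from collections import defaultdict

def bessel_j(order, x, terms=40):
    """J_order(x) for integer order >= 0, from its power series."""
    return sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + order))
        * (x / 2) ** (2 * j + order)
        for j in range(terms)
    )

def simple_fm_spectrum(c, m, index, max_order=None):
    """Return {frequency: signed amplitude} for simple FM."""
    if max_order is None:
        max_order = int(round(index)) + 2        # rule 2
    spectrum = defaultdict(float)
    for k in range(max_order + 1):
        amp = bessel_j(k, index)
        spectrum[c + k * m] += amp               # carrier (k=0) or upper sideband
        if k > 0:
            lower_amp = -amp if k % 2 else amp   # odd-order lower sidebands change sign
            freq = c - k * m
            if freq < 0:                         # reflect about 0 Hz, reversing the sign
                freq, lower_amp = -freq, -lower_amp
            spectrum[freq] += lower_amp
    return dict(sorted(spectrum.items()))

for freq, amp in simple_fm_spectrum(220, 440, 4).items():
    print(freq, round(amp, 3))
```

Because only the ratio c:m matters for the pattern of partials, calling simple_fm_spectrum(1, 2, 4) gives the same amplitudes indexed by partial number instead of frequency.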

Now let us imagine that the spectrum changes by applying a varying modulation index from 0 to 4 over the course of the duration of the tone. The following tables show the spectra that are produced when I=0, 1, 2 and 3, and they also change the frequency values to ratios, thus showing partial numbers rather than absolute frequencies (so that you can see that you will get the same spectrum regardless of the frequency used).

When I=0, the only significant value is that of the fundamental, so that the spectrum is a sine wave of amplitude 1.

When I=1, we obtain the following sidebands:

k	odd order	lower 		upper 		Jn for I=1
                        sideband	sideband
0				   1			 .765
1	(-1)		-1		3		 .440
2			-3		5		 .115
3	(-1)		-5		7		 .020

The spectrum is as follows:

	partial number		amplitude
	1			.765+.44 = 1.205
	3			.44-.115 = .325
	5			.115+.02 = .135
	7			.02

At this point, the spectrum has 4 partials, with 1 very strong and 3 about a quarter of that, with the others less significant.

When I=2, we have the following:

k	odd order	lower 		upper 		Jn for I=2
			sideband	sideband
0				    1			 .224
1	(-1)		-1		3		 .577
2			-3		5		 .353
3	(-1)		-5		7		 .129
4			-7		9		 .033

	partial number		amplitude
	      1			.224+.577 = .801
	      3			.577-.353 = .224
	      5			.353+.129 = .482
	      7			.129-.033 = .096
	      9			.033

Now there are five partials, with 1 the strongest, 5 about half that, 3 about half of 5, and the others less significant.

When I=3, we have:

k	odd order	lower 		upper 		Jn for I=3
			sideband	sideband
0				    1			 -.260
1	(-1)		-1		 3		  .339
2			-3		 5		  .486
3	(-1)		-5		 7		  .309
4			-7		 9		  .132
5	(-1)		-9		11		  .043

	partial number		amplitude
	      1			-.260+.339 = .079
	      3			 .339-.486 = -.147
	      5			 .486+.309 = .795
	      7			 .309-.132 = .177
	      9			 .132+.043 = .175
	     11		         .043

At this point, the 5th partial is very strong, the 3rd, 7th and 9th less so, and the fundamental and 11th partials are weak.

Now we can generalize about the spectral change that occurs as the modulation index is varied from 0 to 4. The tables provide snapshots of the spectrum at five distinct points, but if we were working with a continuously changing modulation envelope, the changes would have been continuous. The spectrum starts as a sine wave and progresses to a rich series including up to the thirteenth partial; but the amplitudes of the individual partials change gradually, moving up and down as the values of the Bessel functions interact with each other. The fundamental started at 1, moved to 1.205, .801, .079, and ended at -.463. The third partial entered gradually and reached an amplitude of .325 at I=1, then moved to .224, -.147, and ended at -.430. The other partials made different changes, as shown in the tables.
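A continuously changing modulation index can be sketched by letting the index follow an envelope while the samples are generated. The example below assumes a simple linear ramp as the spectral envelope and the phase-modulation form of FM; the function name and defaults are illustrative only.

```python
import math

def fm_index_sweep(carrier_hz, modulator_hz, start_index, end_index,
                   duration_s=1.0, sample_rate=44100):
    """Simple FM whose modulation index moves linearly over the tone."""
    total = round(duration_s * sample_rate)
    samples = []
    for n in range(total):
        t = n / sample_rate
        # Linear spectral envelope from start_index to end_index.
        index = start_index + (end_index - start_index) * n / max(total - 1, 1)
        modulation = index * math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + modulation))
    return samples

# The spectrum evolves from a sine wave (I=0) to the rich I=4 spectrum.
sweep = fm_index_sweep(220.0, 440.0, 0.0, 4.0, duration_s=0.5)
```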

These generalizations apply to all spectral envelopes. As the value of the index approaches zero, the spectrum becomes a sine wave, and as it increases, new sidebands are brought in up to the order of about I+2. The main difference is that different c:m ratios cause interactions between different groups of partials. This leads us to investigate the types of spectra produced by different groups of c:m ratios.

FM Spectral Families

When simple integers are used as c:m ratios, harmonic spectra are generated. The following generalizations apply to this process:

The modulator determines the partials in the series. The carrier determines the starting positions of the partials in their orders. A modulator of 1 produces all partials, starting from the carrier. Modulators of 2 and 4 produce spectra of only odd-numbered partials (as in square waves and triangle waves). Other even-numbered modulators produce subsets of the odd-partial series. Other modulators produce spectra that are clustered around multiples of the modulator without including the modulator. For example, 1:3 produces 2 and 4, 5 and 7, 8 and 10, 11 and 13, etc. or upper and lower neighbors to multiples of 3. 1:5 produces 4 and 6, 9 and 11, 14 and 16, 19 and 21, 24 and 26, etc. or upper and lower neighbors to multiples of 5.

The c:m ratio must be reduced to lowest terms as a fraction; otherwise, it will simply produce the series of the reduced ratio transposed up to a higher pitch.
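The effect of reducing a ratio can be checked with a few lines of Python. In this sketch (with hypothetical function names, and integer ratios assumed), partials lists the absolute values of c+-km, so reflected negative frequencies are folded onto their positive counterparts.

```python
from math import gcd

def reduce_ratio(c, m):
    """Reduce an integer c:m ratio to lowest terms; also return the common factor."""
    d = gcd(c, m)
    return c // d, m // d, d

def partials(c, m, max_order=5):
    """Sorted partial numbers |c + k*m| for k = -max_order .. max_order."""
    found = {abs(c + k * m) for k in range(-max_order, max_order + 1)}
    found.discard(0)          # a component at 0 is not an audible partial
    return sorted(found)

print(reduce_ratio(2, 4))     # 2:4 reduces to 1:2 with a common factor of 2
print(partials(1, 2, 2))      # partials of 1:2
print(partials(2, 4, 2))      # the same series, transposed up by the factor 2
```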

When spectra contain a number of low harmonic partials that are fairly close together (some researchers suggest a minimum of three consecutive harmonic partials are necessary), they meld into a tone with a complex tone color, with the fundamental frequency being perceived as the pitch and the partials being perceived as the timbre. If the components are widely spaced and do not contain consecutive elements of a harmonic series, they do not meld into a timbre, but are perceived as a cluster of separate tones. Such clusters can have qualities of "timbre" but lack the fundamental pitch quality of a harmonic series.

The following examples demonstrate some simple c:m ratios:

Harmonic Spectra

	1:1				1:2				1:3
	 1				 1				 1
 0		2		-1		 3		 -2		 4
-1		3		-3		 5		 -5		 7
-2		4		-5		 7		 -8		10
-3		5		-7		 9		-11		13
-4		6		-9		11		-14		16

	1:4				1:5				1:6
	 1				 1				 1
 -3		 5		 -4		 6	 	-5		 7
 -7		 9		 -9		11		-11		13
-11		13		-14		16		-17		19
-15		17		-19		21		-23		25
-19		21		-24		26		-29		31

	1:7				1:8				1:9
	 1				 1				 1
 -6		 8		 -7		 9		 -8		10
-13		15		-15		17		-17		19
-20		22		-23		25		-26		28
-27		29		-31		33		-35		37
-34		36		-39		41		-44		46

	1:10				2:1				3:1	
 	 1				 2				 3
 -9		11		 1		3		  2		 4
-19		21		 0		4		  1		 5
-29		31		-1		5		  0		 6
-39		41		-2		6		 -1		 7
-49		51		-3		7		 -2		 8

	3:2				3:4				3:5
	 3				 3				  3
  1		 5		 -1		 7		 -2		 8
 -1		 7		 -5		11		 -7		13
 -3		 9		-9		15		-12		18
 -5		11		-13		19		-17		23
 -7		13		-17		23		-22		28

	4:1				4:3				4:7
	 4				 4				 4
 3		 5		 1		 7		 -3		11
 2		 6		-2		10		-10		18
 1		 7	 	-5		13		-17		25
 0		 8	 	-8		16		-24		32
-1		 9		-11		19		-31		39

"Inharmonic" Spectra

	9:11				7:11				11:7
	 9				  7				 11
 -2		20		-4		18		  4		18
-13		31		-15		29		 -3		25
-24		42		-26		40		-10		32
-35		53		-37		51		-17		39
-46		64		-48		62		-24		46

	1:1.41				1:1.05				0.5:1.6
	 1				 1				  0.5
 -.41		2.41		 -.05		2.05		-1.1		2.1
-1.82		3.82		-1.1		3.1		-2.7		3.7
-3.23		5.23		-2.15		4.15		-4.3		5.3
-4.64		6.64		-3.2		5.2		-5.9		6.9
-6.05		8.05		-4.25		6.25		-7.5		8.5

Feedback FM

We have seen that a c:m ratio of 1:1 produces all partials from 1 up to some limit. In order to increase the power of the algorithms in its FM instruments, Yamaha developed feedback FM as a way of producing a complex spectrum from a single oscillator, by (rescaling and) feeding the output of an oscillator back into its own frequency input. The only limitation of feedback FM is that the spectral envelope must be identical to the amplitude envelope. Despite this limitation, feedback FM is useful in a number of situations.
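The idea of feedback FM can be sketched with a single oscillator whose previous output sample, scaled by a feedback amount, is added to its own phase. This is only an assumed, simplified model for illustration; it is not a description of Yamaha's actual circuit, and the function name and the feedback parameter are hypothetical.

```python
import math

def feedback_fm(freq_hz, feedback, duration_s=0.1, sample_rate=44100):
    """One sine oscillator modulated by its own (rescaled) previous output."""
    samples = []
    previous = 0.0
    for n in range(round(duration_s * sample_rate)):
        phase = 2 * math.pi * freq_hz * n / sample_rate
        # The rescaled previous output is fed back into the phase.
        previous = math.sin(phase + feedback * previous)
        samples.append(previous)
    return samples

bright = feedback_fm(220.0, 1.0)
```

With feedback set to 0 the oscillator produces a plain sine wave; raising it adds progressively more partials.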

Complex FM: Parallel Modulators

Parallel modulation occurs when two (or more) oscillators are summed and fed into the frequency of a single carrier. In this case, the frequencies of the modulators interact to produce more complex results. The formula for determining the resulting frequencies is c +- k₁m₁ +- k₂m₂, for k₁ = 0, 1, 2, ..., n₁ and k₂ = 0, 1, 2, ..., n₂. The amplitude scaling factors are multiplicative: Jk₁(I₁) times Jk₂(I₂). In order to compute the results for a single example, we will need several tables similar to each single table for simple FM. The following example shows what is needed when n₁ and n₂ both equal 3:

Parallel Modulation: c : m₁ : m₂, where n₁ and n₂ = 3


k1	k2	odd order	Amplitude Coefficients		Simple Sidebands
0	0			J0(I1) * J0(I2)					c
1	0	(-1)		J1(I1) * J0(I2)			c-m1			c+m1
2	0			J2(I1) * J0(I2)			c-2 m1			c+2 m1
3	0	(-1)		J3(I1) * J0(I2)			c-3 m1			c+3 m1 

0	1	(-1)		J0(I1) * J1(I2)			c-m2			c+m2
0	2			J0(I1) * J2(I2)			c-2m2			c+2m2
0	3	(-1)		J0(I1) * J3(I2)			c-3m2			c+3m2

		 odd		 Amplitude		 odd		Combination
k1	k2	order		Coefficients		order		  Sidebands
1	1	(-1)		J1(I1) *  J1(I2)			c-m1+m2		c+m1+m2
		(-1)(-1)				(-1)	c-m1-m2		c+m1-m2
2	1			J2(I1) *  J1(I2)			c-2m1+m2		c+2m1+m2
		(-1)					(-1)	c-2m1-m2		c+2m1-m2
3	1	(-1)		J3(I1) *  J1(I2)			c-3m1+m2		c+3m1+m2
		(-1)(-1)				(-1)	c-3m1-m2		c+3m1-m2

1	2	(-1)		J1(I1) *  J2(I2)			c-m1+2m2		c+m1+2m2
		(-1)						c-m1-2m2		c+m1-2m2
2	2			J2(I1) *  J2(I2)			c-2m1+2m2		c+2m1+2m2
								c-2m1-2m2		c+2m1-2m2
3	2	(-1)		J3(I1) * J2(I2)			c-3m1+2m2		c+3m1+2m2
		(-1)						c-3m1-2m2		c+3m1-2m2

1	3	(-1)		J1(I1) *  J3(I2)			c-m1+3m2		c+m1+3m2
		(-1)(-1)				(-1)	c-m1-3m2		c+m1-3m2
2	3			J2(I1) *  J3(I2)			c-2m1+3m2		c+2m1+3m2
		(-1)					(-1)	c-2m1-3m2		c+2m1-3m2
3	3	(-1)		J3(I1) * J3(I2)			c-3m1+3m2		c+3m1+3m2
		(-1)(-1)				(-1)	c-3m1-3m2		c+3m1-3m2

Comparing this to the tables above regarding simple FM, it is obvious that this simple process produces spectra of great complexity, even though the same amount of computation is involved in the production of each oscillator. This is a demonstration of the power of FM synthesis.
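In the time domain, parallel modulation amounts to summing the two modulators before they reach the carrier. The following sketch assumes the phase-modulation form of FM, with each modulator scaled by its own index; the function and parameter names are illustrative.

```python
import math

def parallel_fm(c_hz, m1_hz, m2_hz, index1, index2,
                duration_s=0.1, sample_rate=44100):
    """Carrier modulated by the sum of two independent sine modulators."""
    samples = []
    for n in range(round(duration_s * sample_rate)):
        t = n / sample_rate
        modulation = (index1 * math.sin(2 * math.pi * m1_hz * t)
                      + index2 * math.sin(2 * math.pi * m2_hz * t))
        samples.append(math.sin(2 * math.pi * c_hz * t + modulation))
    return samples

tone = parallel_fm(500.0, 100.0, 10.0, 1.0, 0.5)
```

Setting either index to 0 removes that modulator and leaves simple FM with the other one.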

Let us compute a spectrum based on one carrier and two parallel modulators for the values c : m1 : m2 = 500 : 100 : 10 and indices of I₁=1 and I₂=0.5. Placing values into the table, we arrive at the following:

Parallel Modulation: c : m1 : m2 = 500 : 100 : 10, I1=1 and I2=0.5


		odd	Amplitude
k1	k2	order	Coefficients				Simple Sidebands
0	0		J0(1) * J0(.5) = .76				500
1	0	(-1)	J1(1) * J0(.5) = .42		500-100=400		500+100=600
2	0		J2(1) * J0(.5) = .10		500-200=300		500+200=700
3	0	(-1)	J3(1) * J0(.5) = .03		500-300=200		500+300=800
0	1	(-1)	J0(1) * J1(.5) = .2		500-10=490		500+10=510
0	2		J0(1) * J2(.5) = .02		500-20=480		500+20=520

		 odd	 Amplitude		 odd		Combination
k1	k2	order	Coefficients		order		 Sidebands
1	1	(-1)	J1(1) *  J1(.5)=.11		500-100+10=410	500+100+10=610
		(-1)(-1)			(-1)	500-100-10=390	500+100-10=590
2	1		J2(1) *  J1(.5)=.03		500-200+10=310	500+200+10=710
		(-1)				(-1)	500-200-10=290	500+200-10=690
3	1	(-1)	J3(1) *  J1(.5)=.008		500-300+10=210	500+300+10=810
		(-1)(-1)			(-1)	500-300-10=190	500+300-10=790

1	2	(-1)	J1(1) *  J2(.5)=.01		500-100+20=420	500+100+20=620
		(-1)					500-100-20=380	500+100-20=580
2	2		J2(1) *  J2(.5)=.003		500-200+20=320	500+200+20=720
							500-200-20=280	500+200-20=680
3	2	(-1)	J3(1) *  J2(.5)=.001		500-300+20=220	500+300+20=820
		(-1)					500-300-20=180	500+300-20=780

Sorting out the spectrum (and showing only the relative amplitudes of the components), we obtain the following:


frequency	amplitude			frequency	amplitude
 180		 .001				 510		 .2
 190		 .008				 520		 .02
 200		 .03				 580		 .01
 210		 .008				 590		 .11
 220		 .001				 600		 .42
 280		 .003				 610		 .11
 290		 .03				 620		 .01
 300		 .1				 680		 .003
 310		 .03				 690		 .03
 320		 .003				 700		 .1
 380		 .01				 710		 .03
 390		 .11				 720		 .003
 400		 .42				 780		 .001
 410		 .11				 790		 .008
 420		 .01				 800		 .03
 480		 .02				 810		 .008
 490		 .2				 820		 .001
 500		 .76

As you can see, the partials are clustered 10 and 20 Hz above and below multiples of 100, from 200 to 800 (the second through eighth partials of a 100 Hz tone, which is what we would expect from c : m₁), with the loudest point at the carrier frequency itself and harmonic partials significantly higher than non-harmonic ones (because I₁ is greater than I₂).
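The whole parallel-modulation table can be generated programmatically. In the sketch below, the "odd order" sign changes are handled by letting k₁ and k₂ run over negative as well as positive orders and using the identity J_-n(x) = (-1)^n J_n(x); negative frequencies are reflected with their sign reversed. The function names are hypothetical, and the computed products may differ slightly from the rounded figures in the table above.

```python
import math
from collections import defaultdict

def bessel_j(order, x, terms=40):
    """J_order(x) for any integer order, from the power series."""
    n = abs(order)
    value = sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + n))
        * (x / 2) ** (2 * j + n)
        for j in range(terms)
    )
    return -value if order < 0 and n % 2 else value   # J_-n = (-1)^n J_n

def parallel_fm_spectrum(c, m1, m2, i1, i2, n1=3, n2=3):
    """Return {frequency: signed amplitude} for two parallel modulators."""
    spectrum = defaultdict(float)
    for k1 in range(-n1, n1 + 1):
        for k2 in range(-n2, n2 + 1):
            amp = bessel_j(k1, i1) * bessel_j(k2, i2)
            freq = c + k1 * m1 + k2 * m2
            if freq < 0:                  # reflect about 0 Hz
                freq, amp = -freq, -amp
            spectrum[freq] += amp         # coinciding components add or cancel
    return dict(sorted(spectrum.items()))

spectrum = parallel_fm_spectrum(500, 100, 10, 1.0, 0.5, n1=3, n2=2)
for freq, amp in spectrum.items():
    print(freq, round(abs(amp), 3))
```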

In this example, neither of the indices is large enough to produce negative frequencies. (It is also unusual that the values of the indices are so low that they only have positive values.) Also, no cancellations occur, because each of the expressions produces a unique value. In the following example, we change the value of m₂ to 500, the same as c, and we leave the indices at I₁ = 1 and I₂ = .5. We will also express the frequencies in terms of ratios. The values produced are as follows:

c : m1 : m2 = 5 : 1 : 5, I₁=1 and I₂=0.5

		odd	Amplitude
k1	k2	order	Coefficients				Simple Sidebands
0	0		J0(1) * J0(.5) = .76				5
1	0	(-1)	J1(1) * J0(.5) = .42		5-1 = 4			5+1 = 6
2	0		J2(1) * J0(.5) = .10		5-2 = 3			5+2 = 7
3	0	(-1)	J3(1) * J0(.5) = .03		5-3 = 2			5+3 = 8
0	1	(-1)	J0(1) * J1(.5) = .2		5-5 = 0			5+5 = 10
0	2		J0(1) * J2(.5) = .02		5-10 = -5		5+10 = 15

		 odd	 Amplitude		 odd		   Combination
k1	k2	order	Coefficients		order		     Sidebands
1	1	(-1)	J1(1) *  J1(.5)=.11		5-1+5 = 9		5+1+5 = 11
		(-1)(-1)			(-1)	5-1-5 = -1		5+1-5 = 1
2	1		J2(1) *  J1(.5)=.03		5-2+5 = 8		5+2+5 = 12
		(-1)				(-1)	5-2-5 = -2		5+2-5 = 2
3	1	(-1)	J3(1) *  J1(.5)=.008		5-3+5 = 7		5+3+5 = 13
		(-1)(-1)			(-1)	5-3-5 = -3		5+3-5 = 3

1	2	(-1)	J1(1) *  J2(.5)=.01		5-1+10 = 14		5+1+10 = 16
		(-1)					5-1-10 = -6		5+1-10 = -4
2	2		J2(1) *  J2(.5)=.003		5-2+10 = 13		5+2+10 = 17
							5-2-10 = -7		5+2-10 = -3
3	2	(-1)	J3(1) *  J2(.5)=.001		5-3+10 = 12		5+3+10 = 18
		(-1)					5-3-10 = -8		5+3-10 = -2

Sorting out the entire spectrum, and taking cancellations into account, we arrive at the following spectrum:

partial		amplitude			partial		amplitude
   1		.11 + .11 = .22			   10		   .2
   2		.03+.03-.03 = .03		   11		   .11
   3		.1[+-.008]-.003 = .092	  	   12		   .03-.001 = .029
   4		.42 - .01 = .41			   13		   .008 - .003 = .005
   5		.76 - .02 = .74			   14		   .01
   6		.42 - .01 = .41			   15		   .02
   7		.1 + .008 - .003 = .105	 	   16		   .01
   8		.03 + .03 - .001 = .059	 	   17		   .003
   9		.11

The fifth partial is the strongest by far, followed by the fourth and sixth that are a little more than half of the fifth, and the fundamental and tenth are about half of the fourth and sixth, then come the ninth and eleventh, and the rest are very weak. In a more general sense, the amplitudes are clustered around the fifth most prominently, followed by the tenth, and to a much lesser extent the fifteenth. This is because of the ratios of c : m₁, and that I₁ = 1, which is higher than I₂. If you adjust the envelope of m₁ and m₂ so that they have different shapes, you will see how the two modulators interact to bring out different aspects of the spectrum.

Cascade Modulators

When three oscillators are connected together in a cascade manner, the top modulator m₂ modulates the middle oscillator m₁, which in turn modulates the carrier. In effect, the carrier is modulated by a complex wave. The frequencies produced are the same as with parallel modulation; the main difference is the amplitudes of the components. The order k₁ of the modulator m₁ is used to scale the index I₂ of the modulator m₂. The frequencies are determined by c +- k₁ m₁ +- k₂ m₂, and the amplitudes are J_k1 (I₁) * J_k2 (k₁ * I₂), k₁ = 0, 1, 2, ..., n₁ (where n₁ = I₁ + 2) and k₂ = 0, 1, 2, 3, ..., n₂ (where n₂ = I₂ + 2). Chowning and Bristow list three important differences between parallel and cascade modulation:

(1) The amplitude of the carrier is determined by the index I₁ only (unless there are reflected side frequencies that fall at the carrier).

(2) Because orders of I₁ greater than 1 (J₂(I₁) ... Jk₁(I₁)) cause I₂ to be even larger (Jk₂ (2*I₂) ... Jk₂ (k₁*I₂)) there is greater energy in the higher order combination sidebands when compared to parallel modulation.

(3) There are no simple sidebands around the carrier resulting from m₂ (c +- 1m₂ ... c +- k₂ m₂). The order for I₁ in this case is 0, while the order for I₂ ranges from 1 to k₂. Since the index I₂ is multiplied by 0, the effective index is 0, and for all orders of J greater than 0 the resulting coefficient is 0.

When we show the formula for computing the sidebands in cascade modulation c : m₁ : m₂, where n₁ and n₂ = 3, the result is as follows:

Cascade Modulation: c : m₁ : m₂, where n₁ and n₂ = 3

k1	k2	odd order	Amplitude Coefficients		Simple Sidebands
0	0			J0(I1) * J0(0*I2)				c
1	0	(-1)		J1(I1) * J0(1*I2)		c-m1			c+m1
2	0			J2(I1) * J0(2*I2)		c-2 m1			c+2 m1
3	0	(-1)		J3(I1) * J0(3*I2)		c-3 m1			c+3 m1 

0	1	(-1)		J0(I1) * J1(0*I2)		c-m2			c+m2
0	2			J0(I1) * J2(0*I2)		c-2m2			c+2m2
0	3	(-1)		J0(I1) * J3(0*I2)		c-3m2			c+3m2

k1	k2	 odd 		 Amplitude 		 odd		Combination
		order		Coefficients		order		  Sidebands
1	1	(-1)		J1(I1) *  J1(1*I2)		c-m1+m2			c+m1+m2
		(-1)(-1)				(-1)	c-m1-m2			c+m1-m2
2	1			J2(I1) *  J1(2*I2)		c-2m1+m2		c+2m1+m2
		(-1)					(-1)	c-2m1-m2		c+2m1-m2
3	1	(-1)		J3(I1) *  J1(3*I2)		c-3m1+m2		c+3m1+m2
		(-1)(-1)				(-1)	c-3m1-m2		c+3m1-m2

1	2	(-1)		J1(I1) *  J2(1*I2)		c-m1+2m2		c+m1+2m2
		(-1)						c-m1-2m2		c+m1-2m2
2	2			J2(I1) *  J2(2*I2)		c-2m1+2m2		c+2m1+2m2
								c-2m1-2m2		c+2m1-2m2
3	2	(-1)		J3(I1) *  J2(3*I2)		c-3m1+2m2		c+3m1+2m2
		(-1)						c-3m1-2m2		c+3m1-2m2

1	3	(-1)		J1(I1) *  J3(1*I2)		c-m1+3m2		c+m1+3m2
		(-1)(-1)				(-1)	c-m1-3m2		c+m1-3m2
2	3			J2(I1) *  J3(2*I2)		c-2m1+3m2		c+2m1+3m2
		(-1)					(-1)	c-2m1-3m2		c+2m1-3m2
3	3	(-1)		J3(I1) *  J3(3*I2)		c-3m1+3m2		c+3m1+3m2
		(-1)(-1)				(-1)	c-3m1-3m2		c+3m1-3m2

Since the Bessel function coefficient for an index of 0 for any order greater than zero is zero, there is no energy in the second set of simple sidebands. The amplitudes of the combination sidebands also differ from parallel modulation in that they increase slightly for higher orders of Jk₁.
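A time-domain sketch of cascade modulation follows. It assumes the arrangement implied by the amplitude formula J_k1(I₁) * J_k2(k₁ * I₂): the oscillator at m₂ (index I₂) modulates the oscillator at m₁, whose output (index I₁) modulates the carrier. The phase-modulation form and all names are illustrative assumptions.

```python
import math

def cascade_fm(c_hz, m1_hz, m2_hz, index1, index2,
               duration_s=0.1, sample_rate=44100):
    """Carrier modulated by m1, which is itself modulated by m2."""
    samples = []
    for n in range(round(duration_s * sample_rate)):
        t = n / sample_rate
        top = index2 * math.sin(2 * math.pi * m2_hz * t)             # m2 stage
        middle = index1 * math.sin(2 * math.pi * m1_hz * t + top)    # m1 stage
        samples.append(math.sin(2 * math.pi * c_hz * t + middle))    # carrier
    return samples

tone = cascade_fm(500.0, 100.0, 10.0, 1.0, 0.5)
```

With index2 set to 0 the middle oscillator is an unmodulated sine wave, and the result is simple FM between c and m₁.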

Now we will use the same values that we used for parallel modulation, c : m1 : m2 = 500 : 100 : 10 and indices of I₁=1 and I₂=0.5, in order to show the components for cascade modulation. Compare the following table to the corresponding table for parallel modulation above:

Cascade Modulation: c : m1 : m2 = 500 : 100 : 10, I₁=1 and I₂=0.5


		odd	Amplitude
k1	k2	order	Coefficients				Simple Sidebands
0	0		J0(1) * J0(0*.5) = .8				500
1	0	(-1)	J1(1) * J0(1*.5) = .42		500-100=400		500+100=600
2	0		J2(1) * J0(2*.5) = .10		500-200=300		500+200=700
3	0	(-1)	J3(1) * J0(3*.5) = .165		500-300=200		500+300=800
0	1	(-1)	J0(1) * J1(0*.5) = 0		500-10=490		500+10=510
0	2		J0(1) * J2(0*.5) = 0		500-20=480		500+20=520

		 odd	 Amplitude		 odd			Combination
k1	k2	order	Coefficients		order			  Sidebands
1	1	(-1)	J1(1) *  J1(1*.5)=.11		500-100+10=410	500+100+10=610
		(-1)(-1)			(-1)	500-100-10=390	500+100-10=590
2	1		J2(1) *  J1(2*.5)=.04		500-200+10=310	500+200+10=710
		(-1)				(-1)	500-200-10=290	500+200-10=690
3	1	(-1)	J3(1) *  J1(3*.5)=.017		500-300+10=210	500+300+10=810
		(-1)(-1)			(-1)	500-300-10=190	500+300-10=790

1	2	(-1)	J1(1) *  J2(1*.5)=.01		500-100+20=420	500+100+20=620
		(-1)					500-100-20=380	500+100-20=580
2	2		J2(1) *  J2(2*.5)=.01		500-200+20=320	500+200+20=720
							500-200-20=280	500+200-20=680
3	2	(-1)	J3(1) *  J2(3*.5)=.008		500-300+20=220	500+300+20=820
		(-1)					500-300-20=180	500+300-20=780

Sorting out the spectrum (and showing only relative amplitudes), we obtain the following:

frequency	amplitude			frequency	amplitude
 180		 .008				 580		 .01
 190		 .017				 590		 .11
 200		 .165				 600		 .42
 210		 .017				 610		 .11
 220		 .008				 620		 .01
 280		 .01				 680		 .01
 290		 .04				 690		 .04
 300		 .1				 700		 .1
 310		 .04				 710		 .04
 320		 .01				 720		 .01
 380		 .01				 780		 .008
 390		 .11				 790		 .017
 400		 .42				 800		 .165
 410		 .11				 810		 .017
 420		 .01				 820		 .008
 500		 .8

As you can see, the amplitudes are a bit higher, and the harmonic partials higher, than before.
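The cascade amplitudes can be computed in the same way as the parallel ones, with the one change that the argument of the second Bessel function is k₁ * I₂. The sketch below lets both orders run over negative and positive values, so the sign changes come from the identities J_-n(x) = (-1)^n J_n(x) and J_n(-x) = (-1)^n J_n(x), which the power series satisfies automatically. Function names are hypothetical, and the computed products may differ from the rounded figures in the table.

```python
import math
from collections import defaultdict

def bessel_j(order, x, terms=40):
    """J_order(x) for any integer order and real x, from the power series."""
    n = abs(order)
    value = sum(
        (-1) ** j / (math.factorial(j) * math.factorial(j + n))
        * (x / 2) ** (2 * j + n)
        for j in range(terms)
    )
    return -value if order < 0 and n % 2 else value

def cascade_fm_spectrum(c, m1, m2, i1, i2, n1=3, n2=2):
    """Return {frequency: signed amplitude} for cascade modulation."""
    spectrum = defaultdict(float)
    for k1 in range(-n1, n1 + 1):
        for k2 in range(-n2, n2 + 1):
            # The order k1 scales the index of the second Bessel function.
            amp = bessel_j(k1, i1) * bessel_j(k2, k1 * i2)
            freq = c + k1 * m1 + k2 * m2
            if freq < 0:
                freq, amp = -freq, -amp
            spectrum[freq] += amp
    return dict(sorted(spectrum.items()))

spectrum = cascade_fm_spectrum(500, 100, 10, 1.0, 0.5)
```

In the result, the components at 480, 490, 510 and 520 have zero amplitude, in line with the third difference listed by Chowning and Bristow.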

In general, cascade modulation produces the most complex types of spectra. One reason Yamaha decided not to include a noise generator in the DX-7 was that it is quite simple to obtain a noise-like spectrum simply by using cascade modulation with unrelated-looking elements. For example, try setting the ratios as follows: c : m₁ : m₂ = 53.63 : 0.72 : 0.58, and set the indices (output levels) of the modulators to about 1.5 and 9. Then play a key around middle C. Varying the frequency of the carrier down to 0.9 will produce many varieties of "noise" from "hissing steam" to "thunder".

Implementation in Yamaha Instruments

Yamaha devised a system of algorithms in implementing FM for real-time performance. Each FM instrument contains a number of operators (six for the more expensive instruments like the DX-7 and the SY77, and four for the less expensive ones like the DX-21), each consisting of an envelope generator and a sine wave oscillator. The algorithm shows how the operators interconnect in order to produce FM. These connections are shown by diagrams embossed on the front panel of the instrument (for the DX instruments) or displayed in the edit mode (SY instruments). If the output of an operator simply goes out, the result is a sine wave. If the output of one operator is fed back into its input, the result is feedback FM. If one operator goes into another and the output of that operator goes out, the result is simple FM. If three or four operators are connected in series, the result is cascade FM, and if two (or more) go into a third (or fourth) operator the result is parallel FM. Since all six operators are always used (although the programs may zero out the amplitudes of components that are unnecessary), most sounds consist of a mixture of two to six tones (all sine waves in the latter case).

When investigating a sound, the user should first determine how many constituent tones there are and listen to each separately. The operators can be turned on and off by means of buttons in the edit mode. The position of the operator in the algorithm diagram shows whether the operator is a carrier or modulator. Carriers determine the frequency of the tone, and the carrier envelope is the amplitude envelope. The modulator(s) determine the timbre of the tone, and the modulator envelope is the spectral envelope. The user needs to understand the algorithm, or the parameters may be meaningless.

In the edit mode, all parameters of the FM process may be set. Algorithm allows the user to set the algorithm; of course, if the algorithm of a preset sound is changed, the result may not be meaningful, although some interesting results may be discovered by such a process! The oscillator parameter allows the user to set the coarse frequency either as a ratio (from .5 to 31) or as a fixed value (in this case, the frequency is the same regardless of which key is pressed). The fine adjustment allows the value to be adjusted in terms of 1/100th of the amount (.01 for 1, .02 for 2, etc.). The detune feature (-15 to +15) allows the pitch to be detuned in very fine steps of about 1.17 cents (detune is a much finer adjustment than the fine frequency). In addition to using a sine wave, Yamaha's SY77 allows the user to select from 16 different waveforms, which produce more complex (and much more complicated!) spectra.
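The relationship between these settings and the resulting operator frequency can be approximated as follows. This sketch is an interpretation of the description above, not Yamaha's documented arithmetic: it assumes that each fine step adds 1/100th of the coarse ratio and that each detune step shifts the pitch by about 1.17 cents. The function and parameter names are hypothetical.

```python
def operator_frequency(key_hz, coarse=1.0, fine=0, detune=0, cents_per_step=1.17):
    """Approximate operator frequency in ratio mode.

    key_hz -- frequency of the key being played
    coarse -- coarse frequency ratio (.5 to 31)
    fine   -- fine steps, each adding 1/100th of the coarse ratio
    detune -- detune steps (-15 to +15), about 1.17 cents each (assumed)
    """
    ratio = coarse * (1 + fine / 100)
    # One cent is 1/1200 of an octave, so n cents is a factor of 2**(n/1200).
    return key_hz * ratio * 2 ** (detune * cents_per_step / 1200)

print(operator_frequency(440.0, coarse=2))            # an octave above the key
print(operator_frequency(440.0, coarse=1, detune=7))  # slightly sharp
```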

The operator on/off switches are located on the right switchboard of the SY77, or on the 32 program keys of the DX-7. To hear a tone, the carrier must always be on; if all modulators are off, it will be a sine wave.

The output level of each operator determines the amplitude level of the output. For carriers, this is simply the amplitude level of the sound (often set to the maximum in the program so that the performer can determine the level by velocity or volume pedals). Yamaha has developed an exponential scale for the relationship between the output level and the modulation index. The maximum value of the modulation index that can be obtained is 13, when the level is at the maximum (99 for the DX-7, 127 for the SY77). The exponential curve rises from nearly zero to reach a value of 4 at 85 per cent of the maximum level (85 on the DX-7, 108 on the SY77). Thus, extremely fine adjustments can be made at lower levels.
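Yamaha's actual level-to-index table is not given here, but the shape described can be approximated. The sketch below simply fits an exponential curve through the two points mentioned in the text (an index of 4 at 85 per cent of the maximum level and an index of 13 at the maximum); it is an assumed curve for illustration only, and the function name is hypothetical.

```python
import math

def modulation_index(level_fraction, max_index=13.0,
                     knee_fraction=0.85, knee_index=4.0):
    """Assumed exponential mapping from output level (0.0 to 1.0) to index.

    The curve passes through knee_index at knee_fraction and
    max_index at 1.0; it is not Yamaha's actual table.
    """
    growth = math.log(max_index / knee_index) / (1.0 - knee_fraction)
    return max_index * math.exp(growth * (level_fraction - 1.0))

for level in (0, 50, 85, 99):          # DX-7 style levels out of 99
    print(level, round(modulation_index(level / 99), 3))
```

On this assumed curve the index stays very small over the lower part of the level range, which is consistent with the remark that extremely fine adjustments can be made at lower levels.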

The envelope generators (EGs) on both the DX-7 and SY77 are four-position contour generators with five levels and four rates, where L0 is the beginning of the sound and L4 the level reached after the final decay is finished. For uniformity, all levels may be set to any value (0-63 on the SY77, 0-99 on the DX-7). For carriers, if L0 or L4 is not zero, the sound will "leak through" and never go off (AWM EGs do not allow non-zero values for these levels). It is perfectly reasonable for modulation EGs to begin and end at non-zero values.

EG attack and decay times are set in terms of rates rather than times, and there is no documentation explaining the relationship between rates and times. The SY77 has rates between 0 (the slowest rate, and basically an infinite time) and 63 (the fastest, almost instantaneous). Researchers have measured time values for the DX-7, which allows rates from 0 to 99, and they have discovered that there is something like an exponential scale.

Output levels can be modified by up to 4 user-selectable break points to allow the keyboard to produce greater or lesser values in different regions. The offsets can be set from -127 to +127, so the changes can be dramatic.

In addition to having the spectral envelopes shape the timbre of the sound, Yamaha has also built two filters into the SY77 (not into the DX-7), which work exactly like those in the AWM mode. Filter 1 can be set to Thru (not used), LPF or HPF (low- or high-pass filter), while filter 2 can only be an LPF. The resonance value converts LP-type filters into bandpass filters, which can also be achieved by setting filter 1 to HP and 2 to LP and adjusting the frequencies. There is also a control EG and LFO to produce time-varying responses (for "wa-wa" and "yeow" effects) as well as sensitivity settings. This process applies analog synthesis technology to FM.

The AFM job directory in the SY77's edit mode also has settings for sensitivity, LFO, and pitch EG. These are the same as in AWM synthesis.

Further Investigation

There is an extensive literature on FM. Chowning's original article, published in the Journal of the Audio Engineering Society in 1973, has been reprinted several times (including in the Computer Music Journal), and other researchers have also investigated the technique extensively. Several articles by the Canadian composer Barry Truax explain compositional applications and spectral theories. There are also many books on the Yamaha instruments. While these do not go very deeply into theory, they are valuable for learning the practical aspects of programming these interesting and complex instruments.