Synthesis Methods - A Brief Review

I suppose it is possible that many who have occasion to read this article online are unaware of the variety that exists in models of sound synthesis. It is ironic that the word synthesiser, in popular vocabulary, is nearly always associated only with subtractive synthesisers based on a modular design. FM synthesis, as implemented in Yamaha's machines, also exists within this conceptual framework. But to say that the true variety of synthesis methods is enormous would be an understatement; to say that some of the techniques are mathematically esoteric is pointing in the right direction. This article does not cover the full range of synthesis methods, but rather concentrates on a few pertinent areas so as to provide the reader with a useful introduction.

Further articles, to be published on this site, will follow on from this one and give greater detail on some of the subjects alluded to here. Until those articles are published, one of the most useful sources of free online information on synthesis and other audio topics is the Helsinki University of Technology (http://www.acoustics.hut.fi/). To view their papers you will need a PostScript viewer; try Ghostscript and its companion interface, GhostView.


Modular (Subtractive) Synthesis.

When small, affordable synthesisers such as the Minimoog first appeared in the early 1970s, the most cost-effective method of synthesis was a rigid, modular design with voltage-controlled filters. The rigid nature of the standard modular design was the idea of Robert Moog, who sought to create an affordable instrument. The ancestors of this design provided jacks and patch plugs for the connection of modules; one of the cost-saving ideas was to eliminate these in favour of a hard-wired system. Another advantage is that a rank beginner can begin making music immediately, whereas previously a full understanding of the instrument was required. Because this was such a boon to musicians who previously could not afford to own a synthesiser, the design quickly became the industry standard, not only for instrument makers but also for musicians. It is arguable, however, that now that more elaborate and flexible methods are possible at an affordable price, the ubiquity of this design is out of proportion to its worth. Nevertheless, many computer applications exist for the generation of sounds by this sort of algorithm.

As to the common modular design itself, briefly: you start with one of a selection of pre-designed wave shapes. An oscillator, or any number of oscillators (which together allow for polyphony), produces a signal according to the chosen waveform. The wave is then sent through an amplifier and a filter, which are controlled by either an envelope generator or a low-frequency oscillator (LFO); together these produce a semblance of expression and tonal variation. The pitch of the generated sound is controlled by the frequency input device (a keyboard, a sequencer, or an algorithm), by a portamento control if one is provided, and by another envelope generator and LFO. An on-board DSP unit may be available to provide additional effects, although the classic design did not incorporate this. It is a useful inclusion, especially when the unit is able to accept external input, but to my mind, unless the unit is capable of extreme contortions it is a case of over-engineering.
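
To make the signal path concrete, here is a minimal sketch of such a hard-wired chain in Python: an oscillator feeding a low-pass filter and then an amplifier shaped by an ADSR envelope. This is only an illustration of the architecture described above, not of any particular instrument; it assumes the numpy and scipy libraries, and every name and parameter value is arbitrary.

    import numpy as np
    from scipy.signal import butter, lfilter

    SR = 44100  # sample rate in Hz

    def sawtooth(freq, duration):
        """Oscillator: a naive (non-band-limited) sawtooth wave."""
        t = np.arange(int(SR * duration)) / SR
        return 2.0 * (t * freq - np.floor(0.5 + t * freq))

    def envelope(n, attack=0.05, decay=0.1, sustain=0.7, release=0.2):
        """Envelope generator: a simple linear ADSR over n samples."""
        a, d, r = int(SR * attack), int(SR * decay), int(SR * release)
        s = max(n - a - d - r, 0)
        env = np.concatenate([np.linspace(0, 1, a),
                              np.linspace(1, sustain, d),
                              np.full(s, sustain),
                              np.linspace(sustain, 0, r)])
        return np.pad(env, (0, max(n - len(env), 0)))[:n]

    def voice(freq, duration=1.0, cutoff=1200.0):
        """One voice: oscillator -> fixed low-pass filter -> amplifier with envelope."""
        wave = sawtooth(freq, duration)
        b, a = butter(2, cutoff / (SR / 2), btype="low")  # the filter module
        return lfilter(b, a, wave) * envelope(len(wave))  # the amplifier module

    note = voice(220.0)  # a one-second A3

A keyboard or sequencer would simply supply the freq argument, and an LFO would be one more slow oscillator used to modulate the cutoff or the pitch in the same fashion.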


Wave Shaping Synthesis.

There is more than one form of wave shaping synthesis around. A form developed by Risset is used quite effectively in several wave-editing programs, such as Cool Edit (Syntrillium Software), as an algorithm for the creation of a wide variety of distortion effects. For some reason, Cool Edit provides an input interface that allows only a form of input suited to distortion, even though the algorithm may be used to create other effects as well. Because of its usefulness for distortion, it is detailed in another article on this site, Distortion and Civilised Behaviour.

It will be more productive, however, to look for now at the application of Chebyshev shaping functions to soundwaves, as developed by Arfib and Le Brun. It has been demonstrated that the Chebyshev polynomials can be used to add specific harmonics to a constant sinusoid. A shaping function is derived from the input wave (a sine or cosine, whose values lie between -1 and +1) and an appropriate member of the set of Chebyshev polynomials. Taking T_k to indicate one of these polynomials, and k to be the ordinal value of the polynomial within the set, the shaping function, w, can be described like this:

w = T_k(cos q) = cos(k * q)

This is another way of saying that the wave generated by applying T_k to the input has a frequency k times that of the fundamental, or input, wave. To put it yet another way, the output is the kth harmonic.

As an example, to take a steady cosine wave and add a 2nd harmonic at 0.4 of the fundamental amplitude and a 3rd harmonic at 0.2 of the fundamental amplitude, a shaping function like this would be used:

w = T_1 + (0.4 * T_2) + (0.2 * T_3)

The results, evaluated over a range of x values between -1 and +1, can then be placed in a transfer-function wavetable (or, to put it in programming jargon, a lookup table). An input cosine wave mapped through this table then contains the harmonics whose k values appear in the formula; the T_1 term simply passes the fundamental through unchanged.
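
As a sketch of this procedure (assuming numpy; the table size and frequencies are arbitrary), the shaping function above can be evaluated over x between -1 and +1 to fill a lookup table, and an input cosine mapped through it:

    import numpy as np

    TABLE_SIZE = 4096
    x = np.linspace(-1.0, 1.0, TABLE_SIZE)

    # Chebyshev polynomials of the first kind:
    # T_1(x) = x, T_2(x) = 2x^2 - 1, T_3(x) = 4x^3 - 3x
    w = x + 0.4 * (2 * x**2 - 1) + 0.2 * (4 * x**3 - 3 * x)

    def shape(signal, table):
        """Map a signal in [-1, 1] through the transfer-function wavetable."""
        idx = ((signal + 1.0) / 2.0 * (len(table) - 1)).astype(int)
        return table[np.clip(idx, 0, len(table) - 1)]

    sr = 44100
    t = np.arange(sr) / sr
    fundamental = np.cos(2 * np.pi * 220.0 * t)  # input cosine at 220 Hz
    output = shape(fundamental, w)               # now carries 220, 440 and 660 Hz

Driving the same table with a gliding or recorded input instead of a steady cosine produces the effects described next.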

The foregoing is only the foundation of the theory. Arfib showed that an input wave of changing frequency (as opposed to the constant frequency above) yields inharmonic partials and formant structures. More important from a musician's point of view is the effect generated when the input wave is a complex recorded sound: the result is similar to phase shifting, as undulating harmonics are generated.


Physical Modelling.

One of the broadest, most fascinating, and as luck would have it, most awkward areas of sound synthesis is physical modelling synthesis. Through physical modelling, a soundwave is created that is very close to a natural sound. For example, the acoustically significant properties of a clarinet, and the way these properties affect one another, are translated into program code. Such a program may be created to take input of some kind - a series of notes from a MIDI device, a data file, or perhaps even a device based on an instrument, which would enable a musician to provide input for playing style.

Depending on the capabilities of the system on which such a program is run, one just might hear it happening in real time. It is far more likely, however, that on the average desktop computer of today (that is, the 27th of February 1999, perhaps not in a week or two), a worthwhile physical modelling application will create its output so slowly as to be almost useless.

An application that models a real sound source generates waves from functions and procedures based not only on physical laws but also on experimental data. The output then represents a sort of statistical average of a multitude of smaller units and relations (molecules, atoms, internal structures and their influence on each other), so that it does not seem at all incongruent with sounds occurring in the real world.
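
The article names no particular algorithm, but one of the simplest physical models, the Karplus-Strong plucked string, gives the flavour of the approach: a burst of noise (the pluck) circulates in a delay line (the string) and is gradually smoothed by an averaging filter (the energy loss). A minimal sketch, assuming numpy and with all parameters chosen only for illustration:

    import numpy as np

    def pluck(freq, duration=2.0, sr=44100, damping=0.996):
        """Karplus-Strong: a noise burst decaying in a circular delay line."""
        period = int(sr / freq)                  # delay length ~ string length
        line = np.random.uniform(-1, 1, period)  # initial excitation (the pluck)
        out = np.empty(int(sr * duration))
        for i in range(len(out)):
            out[i] = line[i % period]
            # averaging adjacent samples models the string's energy loss
            line[i % period] = damping * 0.5 * (line[i % period] +
                                                line[(i + 1) % period])
        return out

    string_note = pluck(110.0)  # roughly an open A string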


Dynamic Stochastic Synthesis.

Dynamic Stochastic Synthesis (DSS) is a brainchild of the composer Iannis Xenakis and is explained in greater depth in his book Formalized Music. It is included here not for its importance so much as for its interest value. In a certain sense, it is the exact reverse of the now-clichéd modular synthesis formula. Even though subtractive synthesis removes elements from a waveform with filters, it can be seen as adding complexity: an ordered waveform becomes less ordered. Additive techniques, which involve mixing, cross-modulation, active filtering, and the addition of individual harmonics (whether by a resonant low-pass filter or otherwise), obviously also fall into this category; the result comes about by a process of disordering. DSS follows the reverse path: it begins with a wave based on a very complex pseudorandom function built from probability distributions such as the Poisson, exponential, Gaussian, uniform, Cauchy, arcsine and logistic distributions, and then introduces restrictions that lend a semblance of order to the resulting sound.

There are five means by which this is done.

  1. Alteration of amplitude and time variables through functions based on elastic forces and random numbers.
  2. Use of random numbers that bounce back and forth between elastic boundaries.
  3. Using probability functions to generate the values of parameters of other probability functions, for the creation of waveforms.
  4. Recognising an ordered hierarchy of classes of wave shape characteristics, as generated by the probability functions.
  5. Random usage of the available elements.

The technique, then, would be rather complicated to use. It is possibly best suited to the creation of art music. The accompanying figure exemplifies the "input" and final output stages of DSS. 
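
As an illustration of the first two of those means (a sketch only, assuming numpy; the distributions, step sizes and limits are arbitrary), a waveform can be built from a handful of breakpoints whose amplitudes and durations each perform a random walk, bouncing off elastic boundaries:

    import numpy as np

    def reflect(value, lo, hi):
        """Bounce a wandering value back inside the elastic boundaries [lo, hi]."""
        span = hi - lo
        value = (value - lo) % (2 * span)
        return lo + (value if value <= span else 2 * span - value)

    def dss(n_points=12, n_cycles=400):
        rng = np.random.default_rng()
        amps = np.zeros(n_points)     # amplitude of each breakpoint
        durs = np.full(n_points, 18)  # samples between breakpoints (about 200 Hz to start)
        out = []
        for _ in range(n_cycles):
            for k in range(n_points):
                # random-walk steps: a Cauchy step for amplitude, uniform for duration
                amps[k] = reflect(amps[k] + 0.05 * rng.standard_cauchy(), -1.0, 1.0)
                durs[k] = reflect(durs[k] + rng.integers(-2, 3), 4, 80)
            # linear interpolation between breakpoints gives one cycle of the wave
            for k in range(n_points):
                nxt = amps[(k + 1) % n_points]
                out.append(np.linspace(amps[k], nxt, int(durs[k]), endpoint=False))
        return np.concatenate(out)

    wave = dss()  # a few seconds of slowly mutating, roughly pitched sound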

Granular Synthesis.

The basis of granular synthesis is the construction of sounds from a mass of very short segments of sound. These sounds may be generated waves, or they may be windowed segments of a sample file or tape recording, shaped with bell-shaped envelope functions like those used in the Fourier transform. These segments are known as grains. The duration of a grain might be on the order of 1 ms to 100 ms.
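
A single grain, then, is just a short windowed slice. A rough sketch, assuming numpy and with sample standing for an audio array loaded elsewhere:

    import numpy as np

    SR = 44100

    def make_grain(source, start, duration=0.03):
        """Cut a grain of `duration` seconds from `source`, starting at `start` seconds."""
        n = int(SR * duration)                      # e.g. 30 ms -> 1323 samples
        begin = int(SR * start)
        segment = source[begin:begin + n]
        return segment * np.hanning(len(segment))   # bell-shaped fade in and out

    # e.g. grain = make_grain(sample, start=1.25, duration=0.05)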

A piece of music generated by granular synthesis will usually involve a massive layering of grains, each of which may repeat at varied intervals and in varied forms, to create a rich and changing sonic texture. Granular synthesis therefore has a great deal of power to offer the composer, bearing in mind that human sonic perception responds not only to the shape, amplitude and major period of a wave, but also to patterns of pitch change over short periods.

Serious applications of granular techniques can involve thousands of parameters for the creation of each individual grain (and a higher-level interface to simplify the input process). The massive amount of calculation involved makes this sort of process very computationally demanding indeed. The basic concept, however, allows for great breadth of definition, so simplified forms might be attempted so as to create satisfactory results in a shorter time.

Some granular techniques are based on Fourier and wavelet analyses, which deconstruct sampled sounds according to the frequencies and phases of basic components. The grains are composed of the waves that form the output of such an analysis. Thus, the input source may be used to govern the type of music that is produced: whether it is dissonant or harmonious, sharp or mellow, and so forth.

Pitch Synchronous Granular Synthesis also involves spectrum analysis, but only as one part of the process. The first stage involves pitch detection. A separate grain is copied for each wave-period of any detected pitch. Each grain is subjected to a spectrum analysis by impulse response. The analysis is used to set parameters for a series of FIR filters, which process the grain. The grain is then added to the output stream in such a way as to overlap with the previous grain. Extensions to this method exist. One such allows the separation of the harmonic and inharmonic components of the sound.

Asynchronous Granular Synthesis (AGS) might be said to epitomise granular synthesis, since its approach clearly discards the analysis-and-resynthesis methodology common to the Fourier and wavelet transforms. AGS begins with a given time frame, that is, the duration of the finished piece. Sonic grains are scattered according to a statistical function across this time period. The dimension of time is used as a co-ordinate along with pitch to create what is known as a cloud, and the cloud is the basic unit that the composer works with. A specific range of pitches can be described along the time axis, and this cloud-area is then filled with grains. The density of grains per second can be specified and changed over time, as can the amplitude of the grains. The most powerful feature is the ability to specify the waveform of the grains, which may be either sampled or generated waves. Panning attributes can also take advantage of the number of available output channels.
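
A hedged sketch of such a cloud, assuming numpy and using simple windowed sine tones as grains (real implementations offer far more parameters than this): grains are scattered at random onsets over the time frame, with a chosen density, a pitch range, and an amplitude that changes over the course of the cloud.

    import numpy as np

    SR = 44100

    def sine_grain(freq, duration=0.04, amp=0.3):
        """A single grain: a short, Hann-windowed sine tone."""
        t = np.arange(int(SR * duration)) / SR
        return amp * np.hanning(len(t)) * np.sin(2 * np.pi * freq * t)

    def cloud(length=6.0, density=80.0, pitch_range=(200.0, 800.0)):
        """Fill a `length`-second frame with `density` grains per second."""
        rng = np.random.default_rng()
        out = np.zeros(int(SR * length))
        for _ in range(int(length * density)):
            onset = int(rng.uniform(0, len(out)))   # scatter in time
            freq = rng.uniform(*pitch_range)        # scatter in pitch
            fade = 1.0 - onset / len(out)           # e.g. let the cloud die away
            g = sine_grain(freq, amp=0.3 * fade)
            end = min(onset + len(g), len(out))
            out[onset:end] += g[:end - onset]       # overlap-add into the frame
        return out

    piece = cloud()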

References

Arfib, Daniel, 1979.
"Digital synthesis of complex spectra by means of multiplication of non-linear distorted sine waves" in Journal of the Audio Engineering Society, 27(10), p757-779.

Le Brun, Marc, 1979.
"Digital Waveshaping Synthesis" in Journal of the Audio Engineering Society, 27(4), 1979, p250-266.

Risset, Jean-Claude, 1969.
Catalog of Computer Synthesized Sound, Murray Hill, Bell Telephone Laboratories.

Xenakis, Iannis, 1992.
Formalized Music, Revised Edition, Pendragon Press, New York.
