Rackmode Vocoder

The Rackmode Vocoder replicates and expands on a classic 16 Channel Vocoder, originally released in 1978. Interestingly, the original was something of an "OEM" rebadge, as it was originally marketed as the Bode model 7702. As far as we know, the two are functionally identical, and both sound awesome. The original was used extensively on moustache 'n' shades innovator, Giorgio Moroder's release "E=MC²" - a spectacularly awful-yet-fantastic time capsule of late-70s electronic music.

If you're not familiar with how a vocoder operates, the basic concept is that a vocoder imparts the spectral characteristics of one sound (usually a human voice) upon a constant tone source (usually a synthesizer playing bright, static tones). This voice-modulating-synth setup results in the classic robot tones associated with vocoders, but vocoders are actually capable of a wide variety of effects dependent upon the audio sources used.

How A Vocoder Works

We're going to simplify the building blocks of a vocoder somewhat in order to make things easy to conceptualize. After this, you'll totally understand how vocoders work and you'll be the life of the party - you can thank us later!

As mentioned, a vocoder imparts the frequency spectrum of one sound onto another. In the diagram above, a microphone's frequency spectrum (i.e., speech), is imparted onto the sound of a synthesizer playing a basic sawtooth-wave patch. Let's break down what's happening:

At the bottom of the image, a synthesizer plays constant notes using a harmonically rich sawtooth wave. This is referred to as the "carrier" signal. The synthesizer output is multed to three bandpass filters (aka, BPFs). Because each filter only allows a certain region of the audio frequencies to pass through, this effectively splits the signal into low-, mid-, and high-frequency components (an actual vocoder has many more individual bandpass filters which we'll discuss later, but for our example, we've simplified the circuit to three bandpass filters). Each of these feeds an individual voltage-controlled amplifier (VCA), which acts as a volume control that gets louder or softer depending on a separate incoming control voltage (CV). If the VCAs' CV inputs are not currently receiving a control voltage, the VCAs are "closed," that is, no sound is allowed to travel through. This is why a vocoder doesn't make any sound unless you're holding down keys and speaking into the mic.

Now let's examine what happens when the split-up synthesizer carrier signal is playing into the VCAs and the mic is spoken into. When the user speaks into the microphone, the mic signal is multed into an identical second set of bandpass filters and also split into frequency bands - this is called the "modulator" signal (the original refers to this as the "program" signal, as does Rackmode Vocoder), and the modulator BPFs are referred to as "analyzer" filters, because they're effectively analyzing the incoming mic signal. The low-, mid-, and high-frequency components each travel to individual envelope followers. The envelope followers convert the constantly oscillating, up and down AC audio signal voltages to single-polarity DC control voltages proportional to each frequency band's overall signal volume. These control voltages are routed to the carrier signal VCA CV ins and (here's where the magic happens) cause the carrier (synth) bandpass filter volumes to mimic the volume and frequency spectrum of the modulator signal filters. The audio outputs of all VCAs are then mixed together and travel to the vocoder output.
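For the code-curious, the signal flow described above can be sketched in a few lines of Python. This is only an illustration of the general bandpass / envelope follower / VCA idea, not Rackmode Vocoder's actual DSP. The function names, the biquad filter design, the three band centers used in the usage note, and the Q and release values are all assumptions chosen for the example.

```python
import math

def bandpass(signal, center_hz, q, sample_rate):
    """Second-order (biquad) bandpass filter with 0 dB peak gain."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(signal, release_s, sample_rate):
    """Envelope follower: rectify, then instant attack and exponential release."""
    coeff = math.exp(-1.0 / (release_s * sample_rate))
    level = 0.0
    out = []
    for x in signal:
        rectified = abs(x)
        level = rectified if rectified > level else level * coeff
        out.append(level)
    return out

def vocode(modulator, carrier, centers_hz, q=4.0, release_s=0.01, sample_rate=16000):
    """Each analyzer band's envelope acts as the CV for the matching carrier band's VCA."""
    n = min(len(modulator), len(carrier))
    output = [0.0] * n
    for center in centers_hz:
        cv = envelope(bandpass(modulator[:n], center, q, sample_rate), release_s, sample_rate)
        band = bandpass(carrier[:n], center, q, sample_rate)
        for i in range(n):
            output[i] += band[i] * cv[i]  # the VCA: band audio scaled by its CV
    return output
```

Feeding this a sawtooth carrier and a silent modulator, e.g. vocode([0.0] * 2000, saw, [300, 1000, 3000]), returns pure silence, which is the "closed VCA" behavior described earlier.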

The preceding is a simplified explanation. First and foremost, if we want good fidelity from our vocoder (voice intelligibility is the main factor here), more than three sets of bandpass filters (bands) are necessary. Most commercially available vocoders have at least 10 bands, and Rackmode Vocoder has 16 (we're not impressed though - Woodstock had 32 bands). Considering that this means 32 individual bandpass filters, 16 VCAs, and 16 envelope followers, it's easy to see why hardware vocoders have historically been pricey affairs.

There are a few other things going on under the hood that help with speech intelligibility and general fidelity of a vocoder, including adding highpassed noise and modulator signal to clarify S sounds (the Hiss you see on the front panel), weird EQ curves, etc. Suffice to say, theoretically making a vocoder isn't too complicated, but making a really great-sounding vocoder is indeed tricky business.

An important aside is that a vocoder does not detect specific note pitches in the modulator signal, nor will they be audible in the vocoder output. This means that if you're using a mic as a modulator source, it makes no difference whatsoever whether you speak words or sing, on key, off key, or otherwise. The actual notes you'll hear will always be those of the carrier signal. When using a microphone (or recorded speech), the best results are usually obtained by speaking very clearly and exaggerating diction. TL;DR: singing ability has no effect on vocoder results. (Hooray for lousy singers!)

Rackmode Vocoder Instrument vs. FX Version
(Important, please read!)

As mentioned, vocoders make use of two separate audio inputs, a modulator (Program) signal, and a carrier signal. As a result, it's very important to understand how these two signals get routed within DAW software; otherwise, it could lead to frustration, head scratching, tech support tickets, and those clowns bugging me, which no one wants.

Rackmode Vocoder includes two different plug-in versions. Rackmode Vocoder Instrument includes a one "rack-space" MIDI-controlled dual-oscillator synthesizer, which acts as the carrier signal source. This is hard-wired to Rackmode Vocoder's carrier input and Level knob. It features ramp, square, and a very special "glottal" wave that enables exceptional sounding voice synthesis. The instrument version appears in your DAW's instrument menu like any other virtual instrument. Its onboard dual-oscillator synth is always the carrier signal, and the modulator signal is derived from your DAW's sidechain input (this routing is configured within a DAW using a popup menu in the effect window). If you're primarily using Rackmode Vocoder to create robot voices or synthetic choirs, the instrument version may be all you'll ever need.

Conversely, the Rackmode Vocoder FX version appears in your DAW's effects menu; it does not include the single rack-space onboard synthesizer. Carrier and modulator sources are completely up to you; the carrier signal is the audio content of the current track (either a prerecorded audio region or live incoming audio); as with the instrument version, the modulator signal is derived from the sidechain input. Rackmode Vocoder FX version offers endless possibilities, because any audio source may be used as the modulator or carrier input. For example, a recording of an orchestral string section could be vocoded using a drum loop as the modulator signal.

Your DAW must support sidechain signal routing to use either vocoder version. Most professional DAW software should include sidechain capabilities; GarageBand is a notable exception.

Alright, that's enough bold-print and exclamation points for now, let's get rockin':

Main Vocoder Controls

The instrument version consists of two sections - the top synthesizer section, and the bottom vocoder section. The FX version consists of the bottom vocoder section only, which is identical in both versions. We'll review its controls in a way that makes most sense from an operational standpoint (as opposed to moving directly across the front panel):

Program / Level- Sets the volume of the incoming modulator signal from DAW sidechain input. Note that "program" is oddball 70s terminology - in just about any other context, this signal will be referred to as the "modulator" signal, and we will refer to it as such in this user guide.

If you're using the vocoder to create standard "robot voice" effects, the modulator signal should be either a live microphone input or an audio track containing spoken or sung audio. (Drum loops or rhythmic guitars also make excellent modulator signals). Set the level so the overload lamp (OL) only flashes occasionally; optimizing the modulator level will result in the best speech intelligibility.

Carrier / Level- Sets the volume of the incoming carrier signal, aka, the signal that gets modulated. If the instrument version is being used, the carrier signal will be the onboard synthesizer; if the FX version is used, the carrier signal will be the audio on the DAW track (or real-time live audio routed to that DAW track).

By its nature, a vocoder is a subtractive synthesizer - that is, it applies numerous filters to the carrier that constantly remove sections of the audio spectrum. Because of this, it's best to use a carrier signal with a dense harmonic spectrum - sawtooth waves are used most frequently because they're rich in both even- and odd-order harmonics. Square and pulse also make good carrier tones, and Rackmode Vocoder's unique "glottal" wave excels at creating realistic vocalesque tones. Conversely, we don't recommend using dull or excessively thin sounds as carrier signals.

Hiss & Buzz

One caveat of vocoders is that they don't reproduce non-pitched sibilant sounds very well (i.e., anything with an "S" sound). These S sounds lie at the top of the audible frequency range, and they're very important for intelligibility (feel free to make a drinking game out of the use of the word intelligibility in this manual).

There are two ways vocoders compensate for this:

  • The modulator signal is run through a steep highpass filter and this signal triggers a highpass filtered white noise generator to "fill in" the S frequencies.

  • The modulator signal is run through a steep highpass filter, leaving only the very highest "hissy" parts of the signal. This highpassed direct modulator signal (typically a microphone) then bypasses the main vocoder bandpass filter band and is mixed directly into the vocoder's master output.

The original made use of both of these techniques. Non-pitched sibilants are referred to as hiss, whereas pitched "body" sounds are referred to as buzz. The clever bit (as our English friends say) is that the vocoder automatically detects whether the incoming modulator audio is a non-pitched hiss sound or a pitched buzz sound, and instantaneously switches between these signals, resulting in excellent intelligibility and overall fidelity. This switching is always active, with the current signal mode displayed by the green and red LEDs adjacent to the Hiss and Buzz mode switch positions. There are a couple of controls that enable, disable, and set the mix level of each component:

Switched/Direct - 5080-15000 Hz- When in the Switched (left) position, only the highpassed white noise signal is heard when S sounds are detected in the modulator signal (i.e. the green Hiss LED is glowing).

When in the Direct (right) position, highpass-filtered direct modulator audio is mixed into the output in addition to the aforementioned highpassed white noise signal, resulting in more pronounced sibilance. If you're speaking or singing into a mic, you'll hear a faint "ghost" version of your voice in the vocoder output (this is easiest to hear with the Mode knob set to Hiss position). Note that the highpass filtered direct audio is always on if the Direct switch is enabled - that is, it doesn't turn on and off via the hiss detector circuit like the highpassed white noise signal.

Balance- Sets the mix of hiss and buzz signals when the Mode knob is in the Hiss & Buzz position. It has no effect if the Mode knob is set to the Hiss or Buzz positions.

Mode- Determines whether hiss, buzz, or both are heard in the vocoder output. In the Hiss & Buzz position, the hiss detector will alternate between signals depending on the incoming modulator audio content; in the Hiss position, the highpassed white noise signal is constantly audible, i.e., it's not switched by the hiss detector circuit. (We don't find the Hiss position to be super useful for final vocoder audio, but it's an effective way to "solo" and hear exactly what's happening in the hiss component.)

For most situations, we recommend parking the Mode switch at the Hiss & Buzz position and using the Balance knob to dial in the desired amount of hiss vs. buzz signals. The buzz signal is usually where the sonic interest comes from, but hiss can be very helpful for intelligibility.
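To get a feel for what a hiss detector has to decide, here's a minimal Python sketch that labels a short chunk of audio as hiss or buzz. We're assuming a zero-crossing-rate test purely for illustration; this guide doesn't describe how the real detector circuit works, and the function names, the 0.25 threshold, and the two stand-in test tones are all hypothetical.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0.0) != (b < 0.0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, threshold=0.25):
    """Label a frame "hiss" (noisy, high-frequency) or "buzz" (pitched, low-frequency)."""
    return "hiss" if zero_crossing_rate(frame) > threshold else "buzz"

SAMPLE_RATE = 16000
# Stand-ins for real speech: a low pitched tone and a very high "S-like" tone.
vowel_like = [math.sin(2.0 * math.pi * 150.0 * i / SAMPLE_RATE) for i in range(512)]
ess_like = [math.sin(2.0 * math.pi * 6000.0 * i / SAMPLE_RATE) for i in range(512)]
```

Sibilants change sign far more often per second than pitched vowel sounds, so even this crude measure separates the two stand-in signals.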

Formant and Formant Mod

As we've mentioned, a vocoder consists of two identical banks of bandpass filters, and these correlate exactly to each other by default. For example, if you spoke the word "WAHHH" into the mic with a honky, nasal tone, the four modulator "analyzer" bandpass filters from 317 Hz to 800 Hz might open to varying degrees; this in turn would open the corresponding carrier bandpass filters in the exact same way.

But let's say an offset was applied between the two filter banks. Instead of the 317 Hz analyzer filter causing the 317 Hz carrier filter to open, the 317 Hz analyzer filter would cause the 400 Hz carrier filter to open. This is what the Formant knob does - it gradually "shifts" the relationship of the two filter banks up or down relative to each other. We call this Formant, because when used for vocal sounds, the audible effect is of the vocal tract growing larger, i.e. more manly, or shrinking, i.e., child-like (or at extremes, like a chipmunk, but a kickass funky robot chipmunk).

This control isn't on the original and it's a very nice effect. In development, we noticed it sounded particularly cool when we moved the knob back and forth, so we added a syncable LFO modulation section to automatically modulate knob movements.

Formant- Adjusts the amount of up or down offset shift between the modulator analyzer and carrier BPF banks.
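The offset idea can be pictured as a simple lookup from analyzer band to carrier band. The Python sketch below shifts in whole-band steps, which is a simplification on our part (the real knob shifts gradually); the function name is hypothetical, and dropping bands that fall off either end of the bank is an assumption.

```python
def shifted_routing(num_bands, offset):
    """Map each analyzer band index to the carrier band it controls.

    Bands shifted past either end of the bank are simply left out
    (an assumption made for this sketch).
    """
    routing = {}
    for analyzer in range(num_bands):
        carrier = analyzer + offset
        if 0 <= carrier < num_bands:
            routing[analyzer] = carrier
    return routing
```

With 16 bands, shifted_routing(16, 0) is the default one-to-one relationship, and shifted_routing(16, 1) makes every analyzer band open the next carrier band up, like the 317 Hz to 400 Hz example above.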

Formant Mod / Rate- Adjusts the rate of formant modulation, from 0.1 to 10 Hz (with Sync switch off) or from 8 beats up to 1/64th note triplets (Sync switch on). The flashing red LED indicates the current mod rate.

Formant Mod / Sync- The Sync switch locks the LFO to DAW host tempo. The green LED illuminates to indicate sync mode is active.
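If you're wondering how synced settings compare to the free-running Hz range, the arithmetic is short. This Python sketch assumes one beat equals a quarter note and that an LFO cycle lasts exactly the selected note length; the function name is hypothetical.

```python
def synced_rate_hz(tempo_bpm, length_in_beats):
    """LFO frequency in Hz when one cycle lasts `length_in_beats` quarter-note beats."""
    beats_per_second = tempo_bpm / 60.0
    return beats_per_second / length_in_beats

# A 1/64th note is 1/16 of a beat; a triplet lasts two thirds of that.
SIXTY_FOURTH_TRIPLET = (1.0 / 16.0) * (2.0 / 3.0)
```

At 120 BPM, synced_rate_hz(120, 8) works out to 0.25 Hz for the 8-beat setting, while synced_rate_hz(120, SIXTY_FOURTH_TRIPLET) lands at roughly 48 Hz.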

External Patch and Jacks

Each vertical jack pair represents one of the 16 pairs of bandpass filters. The numbers to the left and right of each jack indicate the lower and upper frequencies it passes (like any analog filters, they have a "slope" on either side, so adjacent BPF frequencies will overlap to some degree depending on the Resonance control setting). For example, the first BPF on the left passes from 50 Hz to 159 Hz, the second BPF passes from 159 Hz to 200 Hz, etc. The LEDs between the jacks illuminate with varying intensity, dependent on the current signal strength of each band, i.e., the VCA control voltage amount.

Rackmode Vocoder's jack section is closely related to the Formant knob explained above, but instead of a gradual shift in relationship between the modulator and carrier BPF banks, enabling the External Patch switch completely disconnects the modulator-side control voltages from the corresponding carrier-side VCAs (the blue arrows in the How A Vocoder Works overview diagram at the top). This allows the modulator and carrier BPF filters to be rerouted or "cross-patched" in any desired configuration, opening many creative possibilities. External patch jacks will only function when the External Patch switch is enabled.

To patch a cable, simply click and hold on a jack then drag to another jack and release the mouse button. Cables can be routed from an input to an output, or from an output to an input. When a cable is being dragged, all potential destinations will appear normal, and non-destinations will gray out - this prevents inputs from being routed to other inputs and outputs from being routed to other outputs.

Note that Rackmode Vocoder utilizes the same highly developed cabling system as Cherry Audio's Voltage Modular virtual modular synth, so it's got a number of tricks up its sleeve that aren't immediately obvious - please check out External Patch Cabling Tips and Tricks at the very bottom of this section for more information about cable behavior (or call your local cable operator if it's been on the fritz).

The patching configuration shown above is identical to the normal "hardwired" internal patching when the External Patch switch is disabled.
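Conceptually, a patch is just a table saying which analyzer output feeds which carrier VCA. Here's a minimal Python sketch of that idea; the function names and the dictionary layout are hypothetical, and leaving unpatched VCAs closed (a CV of zero) is an assumption based on the closed-VCA behavior described in How A Vocoder Works.

```python
def default_patch(num_bands=16):
    """The "hardwired" routing: analyzer band N drives carrier band N."""
    return {band: band for band in range(num_bands)}

def route_control_voltages(analyzer_cv, patch):
    """Return the CV arriving at each carrier VCA.

    `patch` maps a carrier input index to the analyzer output index
    feeding it. Carrier inputs with no cable stay closed (0.0).
    """
    carrier_cv = [0.0] * len(analyzer_cv)
    for carrier_in, analyzer_out in patch.items():
        carrier_cv[carrier_in] = analyzer_cv[analyzer_out]
    return carrier_cv
```

Passing default_patch() leaves the control voltages untouched, whereas a patch like {0: 15, 15: 0} lets the highest analyzer band open the lowest carrier band and vice versa.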

Additional Top Toolbar Controls

We covered the majority of the top purple menu strip controls way back yonder in the Top Toolbar and Preset Browser section, but there are a few extra top toolbar controls associated with the patch cables described in the preceding section.

Cable Transparency- Clicking the checkerboard icon displays the cable transparency slider. Slide this to the left for more transparent cables, or to the right for more opaque cables.

Cable Color Select- Click this to select the global cable color, i.e. the color of any newly patched cable. Clicking Random randomly chooses a color for each new cable.

Show/Hide Cables- Clicking this hides or shows all cables. It has no effect on sounds, and its status does not save with patches. Cables can also be shown or hidden using the key shortcuts [CTRL-D] (Windows) or [⌘-D] (Mac).

Special Cool Cable Color Select Feature- The color of any existing cable can be changed by right-clicking in the jack area. Right-clicking on a jack that doesn’t have a cable plugged in will change the global cable color (i.e. the same as changing the color with the toolbar button).

––––––––––––––––

Resonance- Sets the width or “Q” of all bandpass filters of both banks simultaneously. Narrow bandwidths let less audio through, whereas wider bandwidths let more audio through for a denser sound. A good analogy would be to imagine water running through a comb with wider or narrower tooth spacing.
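The relationship between Q and width is easy to put numbers on. The Python sketch below uses the standard textbook formulas for the -3 dB edges of a second-order bandpass filter; how the Resonance knob maps to actual Q values isn't documented here, so the function name and the example Q values are assumptions.

```python
import math

def band_edges(center_hz, q):
    """Lower and upper -3 dB edges of a second-order bandpass filter."""
    half = 1.0 / (2.0 * q)
    root = math.sqrt(1.0 + half * half)
    return center_hz * (root - half), center_hz * (root + half)

def bandwidth_hz(center_hz, q):
    """Width between the -3 dB edges: center frequency divided by Q."""
    low, high = band_edges(center_hz, q)
    return high - low
```

For a band centered at 1000 Hz, a Q of 4 gives a width of 250 Hz, and doubling the Q to 8 halves the width to 125 Hz, so less of the carrier squeezes through each band.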

Articulation- This sets the decay/release time of the modulator side envelope followers. In use, this equates to the overall responsiveness of the vocoder to modulator signals - fast articulation times equate to a tight, responsive sound, whereas slower times result in "lazier" recovery from transients. Setting to max Freeze position is essentially the same as enabling the Hold switch described below.

Hold- Enabling the Hold button freezes the current state of the carrier VCAs. This is useful if you'd like to infinitely sustain a carrier filter curve without turning blue in the face endlessly singing a note with a mic. Specifically, it's useful for sustained, choral "aah" type pads.

The Hold switch is especially handy when a sustain pedal is mapped to control it (super easy: right-click the Hold switch, select MIDI Learn, and tap the sustain pedal). The LED illuminates when Hold is active.

Chorale

Chorale is a great-sounding stereo chorus effect. We highly recommend it for choral pad patches (hence the name). The original did not have a built-in chorus, but just about every Roland vocoder ever made does, and it's easy to hear why.

In/Out- Enables and disables the chorus effect.

Amount- Increases intensity of chorus effect as the knob setting is increased.

Output

Mix- Sets the balance of the signal between the dry modulator signal (i.e. incoming sidechain signal) and vocoded sound. Typically this would be set all the way to the right, but it's useful for monitoring the modulator signal or as an easy way to mix vocoder with "dry" modulator signal.

Level- Sets the master output level. It's a good idea to carefully optimize the Program, Carrier, and Output gain for best fidelity.
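As a rough picture of the Mix knob, here's a Python sketch of a dry/wet blend followed by an output level. We're assuming a plain linear crossfade; the actual knob curve isn't specified in this guide, and the function name and parameter ranges are hypothetical.

```python
def mix_output(dry, wet, mix, level=1.0):
    """Blend dry modulator and vocoded audio, then apply the output level.

    mix = 0.0 is all dry modulator, mix = 1.0 is all vocoded signal.
    """
    return [level * ((1.0 - mix) * d + mix * w) for d, w in zip(dry, wet)]
```

With mix at 1.0 (knob fully right in this sketch), none of the dry modulator reaches the output; at 0.5, both signals contribute equally.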

Synthesizer Oscillator Section

The instrument version of Rackmode Vocoder includes a synthesizer oscillator section that is hardwired to the vocoder carrier section. Oscillator pitches are controlled by incoming MIDI just like any standard virtual instrument. The Rackmode Vocoder synthesizer oscillator section consists of two polyphonic oscillators only. It does not contain filters (that's what the vocoder is for), or amplitude envelopes - the incoming modulator signal from the sidechain input controls amplitude via the carrier-side VCAs (and likely, the "signal" coming from your face).

Tune- Allows fine-tuning of master pitch by 100 cents, up or down.

Glide- Also known as "portamento," glide delays the voltage change between pitches for a sliding effect. Keyboard Glide works in mono and poly oscillator modes.

Vibrato / Delay- Delays the onset of vibrato from 0.1 up to 5 seconds.

Vibrato / Rate- Sets the vibrato modulation rate from 2 to 18 Hz. The flashing LED above indicates the current rate.

Vibrato / Depth- Sets the vibrato depth from 0 to 100% (about a whole-step).

Oscillators / Mono/Poly- When set to Mono, the oscillators play one note at a time with last-note priority, i.e., the most recently played note takes priority. In the Poly position, up to 16 notes may be played simultaneously.

Osc 1 Range- Sets oscillator 1's overall coarse pitch range in standard organ footage settings of 32', 16', 8', and 4'.

Osc 2 Range- Sets oscillator 2's overall coarse pitch range in standard organ footage settings of 32', 16', 8', and 4'. The Off position disables oscillator 2 (hence the clever name).

Detune - Tuning control for Oscillator 2 only. This can be used to fatten up sounds by detuning a small amount, or for "building-in" a set interval. Detune range is up or down a fifth (seven half-steps). The Detune knob is disabled if Osc 2 Range is set to the Off position.
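For the mathematically inclined, the footage, Tune, and Detune settings boil down to simple pitch ratios. The Python sketch below assumes the usual organ convention that 8' plays at written pitch and that each doubling of footage drops an octave; the function name and parameter names are hypothetical.

```python
def oscillator_hz(note_hz, footage=8, tune_cents=0.0, detune_semitones=0.0):
    """Frequency after applying range (footage), fine tune, and detune.

    100 cents = 1 semitone, 12 semitones = 1 octave.
    """
    octave_ratio = 8.0 / footage  # 16' -> 0.5, 8' -> 1.0, 4' -> 2.0
    return note_hz * octave_ratio * 2.0 ** (tune_cents / 1200.0 + detune_semitones / 12.0)
```

Playing A440 with the range at 16' gives 220 Hz, and detuning oscillator 2 up a full fifth (seven half-steps) from A440 gives roughly 659.26 Hz.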

Wave- Selects ramp, variable-width pulse, or Rackmode Vocoder's exclusive glottal wave. The glottal wave selection engages a vocal cord wave model widely used in speech synthesizers. Each cycle of the wave is divided into two sections in time referred to as the "open" and "closed" phases. When air passes the vocal cords which have been brought together to produce voiced speech or song, the cords open and close once each pitch cycle admitting a single puff of air, similar to a trumpeter's lips. 

Tone- Emphasizes bass or treble frequencies, center setting is neutral. The Tone knob affects all three waveforms.

Width- When the pulse wave is selected, this affects the pulse width, with maximum width (square) at the center setting.

When the glottal wave is selected, Width adjusts the relative time between open and closed phases, similar to the duty-cycle of the pulse wave. During the open phase, air passes through the opening between the vocal cords, so at higher settings, you'll hear a lot of "air." When the Width is at maximum, the vocal cords are open for almost all of each cycle and lots of air is used. Narrow widths (i.e. lower Width settings) are associated with holding the vocal cords in a tense manner, thus the open phase is short and you don't hear much air.

Try this yourself - speak with your throat very open and relaxed and you'll find that you will use up your air supply very quickly. When creating high voices, especially with the Formant control turned to the right, you will find that using a wide Width and some Breath dialed to taste will create a very realistic sounding voice.
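To visualize the open and closed phases, here's a Python sketch that draws one cycle of a glottal-style pulse. It uses a raised-cosine rise and a cosine fall, a shape commonly used in speech synthesis texts, but this is only our stand-in: the guide doesn't specify which model the plug-in uses, and the function name, the rise_fraction parameter, and the mapping from the Width knob to open_fraction are all assumptions.

```python
import math

def glottal_cycle(num_samples, open_fraction, rise_fraction=0.67):
    """One pitch cycle: an open phase (rise, then fall) followed by a closed phase.

    open_fraction is the share of the cycle the vocal cords are open
    (greater than 0, up to 1); the rest of the cycle is silent.
    """
    open_len = open_fraction * num_samples
    rise_len = rise_fraction * open_len
    fall_len = open_len - rise_len
    cycle = []
    for i in range(num_samples):
        if i < rise_len:
            # Cords opening: airflow swells smoothly from 0 to 1.
            cycle.append(0.5 * (1.0 - math.cos(math.pi * i / rise_len)))
        elif i < open_len:
            # Cords closing: airflow falls back toward 0.
            cycle.append(math.cos(0.5 * math.pi * (i - rise_len) / fall_len))
        else:
            # Closed phase: no airflow.
            cycle.append(0.0)
    return cycle
```

Comparing glottal_cycle(100, 0.25) to glottal_cycle(100, 0.75) shows the idea: the wider setting keeps air flowing for three quarters of each cycle, while the narrow one is mostly closed phase.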

If the ramp wave is selected, the Width knob is disabled and grays out.

Breath- Sets the volume of the "air" heard at higher Width settings. When the ramp or pulse wave is selected, the Breath knob is disabled.
