DRUM ACADEMY · listen with headphones
Ten Nouns teach you sound.
Ten interactive lessons. Each one has a few sliders and a small oscilloscope. Real audio built on Web Audio primitives. No marketing, just the actual signal.
A stereo signal is just two mono channels played simultaneously into your left and right ears. Panning is the volume balance between them. Pan all the way left and only your left ear hears the sound; pan center and both ears hear it equally.
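The balance described above can be sketched as a pair of channel gains. This is a minimal illustration, not the lesson's own code: the function names are hypothetical, and the equal-power curve is an assumption (a plain linear balance would also match the description).

```python
import math

def pan_gains(pan):
    """Return (left_gain, right_gain) for pan in [-1.0, +1.0].

    Equal-power law (an assumption): the two gains trace a quarter
    circle, so left**2 + right**2 stays at 1 and the perceived
    loudness holds roughly steady as the sound moves.
    """
    pan = max(-1.0, min(1.0, pan))        # clamp to the valid range
    angle = (pan + 1.0) * math.pi / 4.0   # -1..+1 maps to 0..pi/2
    return math.cos(angle), math.sin(angle)

def pan_mono(samples, pan):
    """Place a mono signal in the stereo field as (left, right) pairs."""
    left, right = pan_gains(pan)
    return [(s * left, s * right) for s in samples]
```

At pan = -1 the right gain is zero, so only the left ear hears the kick; at pan = 0 both gains are equal, at about 0.707 each.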
Tap play. A kick repeats. Drag the slider — the kick walks from your left ear to your right. Watch the L/R meters move with it.
L/R METERS
Try this: enable auto-walk and close your eyes. Most listeners can localize a kick to within about 10°.
Play a 200 Hz tone in your left ear and a 204 Hz tone in your right ear. There is no 4 Hz signal anywhere in the air — but your brain, trying to merge the two ears into one perception, generates a perceived 4 Hz pulse called a binaural beat.
The illusion only works on headphones. On speakers, the two tones mix in the room and you hear them as a regular acoustic beat (which is real, not perceptual).
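The difference between the headphone case and the speaker case can be checked numerically. The sketch below is illustrative only: the sample rate and function names are assumptions. Each ear's tone keeps a constant level, while the summed "room" signal swells and fades at the difference frequency.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed)

def binaural_pair(left_hz, right_hz, seconds):
    """Return (left, right) sample lists: one pure tone per ear."""
    n = int(SAMPLE_RATE * seconds)
    left = [math.sin(2 * math.pi * left_hz * i / SAMPLE_RATE) for i in range(n)]
    right = [math.sin(2 * math.pi * right_hz * i / SAMPLE_RATE) for i in range(n)]
    return left, right

def speaker_mix(left, right):
    """What a room does: sum both tones into one pressure wave."""
    return [0.5 * (l + r) for l, r in zip(left, right)]

def peak(samples):
    """Largest absolute sample value in a stretch of audio."""
    return max(abs(s) for s in samples)

if __name__ == "__main__":
    left, right = binaural_pair(200.0, 204.0, 0.25)
    mix = speaker_mix(left, right)
    # Look at a 10 ms window around t = 0.125 s, where the two tones
    # are exactly out of phase and cancel in the air.
    a, b = int(0.120 * SAMPLE_RATE), int(0.130 * SAMPLE_RATE)
    print("left ear peak:", peak(left[a:b]))
    print("room mix peak:", peak(mix[a:b]))
```

On headphones neither channel ever dips, so any pulse you perceive is produced by your auditory system rather than by the signal.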
brainwave bands: 1–4 Hz delta · 4–8 Hz theta · 8–13 Hz alpha · 13–30 Hz beta
SIGNAL · L (top) · R (bottom)
Try this: set the beat to 1 Hz and you'll hear a slow throb. Set it to 15 Hz and the perception changes from "slow pulse" to "buzz." Above ~30 Hz the brain stops merging and you hear two separate tones.
The frequency of a wave decides the pitch. The shape decides the timbre — what makes a violin sound different from a flute even at the same note.
A pure sine wave has only its fundamental frequency — no overtones. A square wave is the fundamental + odd harmonics (3rd, 5th, 7th…) at decreasing volume — that's the buzzy, hollow sound. A sawtooth includes all harmonics — that's the bright, brassy sound. A triangle has odd harmonics like a square but they fall off faster — gentler.
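The harmonic recipes above can be written down as relative amplitudes per harmonic. This is a small sketch with a hypothetical function name; it covers only the magnitudes shown in a spectrum view and ignores phase (a true triangle wave also alternates the sign of its harmonics).

```python
def harmonic_amplitudes(shape, count=8):
    """Relative amplitude of harmonics 1..count for an ideal waveform."""
    amps = []
    for n in range(1, count + 1):
        odd = n % 2 == 1
        if shape == "sine":
            amps.append(1.0 if n == 1 else 0.0)          # fundamental only
        elif shape == "square":
            amps.append(1.0 / n if odd else 0.0)         # odd harmonics, 1/n
        elif shape == "sawtooth":
            amps.append(1.0 / n)                         # every harmonic, 1/n
        elif shape == "triangle":
            amps.append(1.0 / (n * n) if odd else 0.0)   # odd harmonics, 1/n^2
        else:
            raise ValueError(f"unknown shape: {shape}")
    return amps

if __name__ == "__main__":
    for shape in ("sine", "square", "sawtooth", "triangle"):
        row = " ".join(f"{a:.3f}" for a in harmonic_amplitudes(shape, 7))
        print(f"{shape:9s} {row}")
```

The triangle's 3rd harmonic sits at 1/9 of the fundamental where the square's sits at 1/3, which is the "falls off faster" in numbers.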
TIME · waveform
FREQUENCY · spectrum
Try this: switch between sine and sawtooth at the same pitch. The pitch is identical — your brain tracks the fundamental. The color changes wildly because of the harmonics.
AM = amplitude modulation. A high-frequency carrier (radio station: 1090 kHz, here we use 800 Hz so you can hear it) has its volume modulated — slowly raised and lowered — by a modulator signal. That modulator is the audio you want to broadcast: voice, music, anything.
When your radio receives the wave, it strips away the carrier and leaves only the modulator's volume curve, which is the original audio. You're hearing the shape of the carrier's envelope, not the carrier itself.
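Both halves of the AM story, modulating and then recovering the envelope, fit in a few lines. This is a rough sketch under assumptions: the sample rate, the depth parameter, and the block-peak "receiver" are illustrative choices, not how the lesson or a real radio is built.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (assumed)

def am_signal(carrier_hz, mod_hz, depth, seconds):
    """Carrier whose volume is raised and lowered by a slower modulator."""
    out = []
    for i in range(int(SAMPLE_RATE * seconds)):
        t = i / SAMPLE_RATE
        envelope = 1.0 + depth * math.sin(2 * math.pi * mod_hz * t)
        out.append(envelope * math.sin(2 * math.pi * carrier_hz * t))
    return out

def envelope_detect(samples, window):
    """Crude receiver: rectify, then keep the peak of each block.

    The carrier's fast wiggle disappears; the slow volume curve remains.
    """
    rectified = [abs(s) for s in samples]
    return [max(rectified[i:i + window])
            for i in range(0, len(rectified) - window + 1, window)]

if __name__ == "__main__":
    wave = am_signal(carrier_hz=800.0, mod_hz=5.0, depth=0.5, seconds=0.4)
    env = envelope_detect(wave, window=441)  # 10 ms blocks
    print(" ".join(f"{e:.2f}" for e in env))
```

With depth 0.5 the recovered curve swings between roughly 0.5 and 1.5 five times per second: the modulator, minus the carrier.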
CARRIER × MODULATOR · the broadcast wave
Try this: set MOD HZ to 5 for a guitar tremolo effect. Set MOD HZ to 8 and the tremolo speeds up into a rapid flutter. Around 15+ Hz it crosses toward ring modulation territory, where the modulation starts to be heard as extra tones beside the carrier rather than as a volume wobble.
FM = frequency modulation. Instead of modulating the carrier's volume (AM), you modulate its pitch. A 440 Hz carrier whose frequency is wobbled by another oscillator generates a complex wave with multiple sidebands — and that's how you get bell sounds, electric pianos, plucky basses, glassy pads.
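A two-oscillator FM voice reduces to one line of math per sample. The sketch below is an assumption-laden illustration: it uses the phase-modulation form common in digital FM synths, and the names `ratio` and `index` are borrowed from the slider labels in this lesson rather than from any confirmed implementation.

```python
import math

def fm_sample(t, carrier_hz, ratio, index):
    """One sample of two-oscillator FM at time t (seconds).

    The modulator runs at carrier_hz * ratio and pushes the carrier's
    phase back and forth; index sets how hard it pushes.
    """
    mod_hz = carrier_hz * ratio
    modulator = math.sin(2 * math.pi * mod_hz * t)
    return math.sin(2 * math.pi * carrier_hz * t + index * modulator)

def sideband_frequencies(carrier_hz, ratio, pairs):
    """Frequencies where sidebands land: carrier +/- k * modulator.

    Negative results fold back to their absolute value.
    """
    mod_hz = carrier_hz * ratio
    freqs = {abs(carrier_hz + k * mod_hz) for k in range(-pairs, pairs + 1)}
    return sorted(freqs)

if __name__ == "__main__":
    print(sideband_frequencies(440.0, 1.0, 3))   # whole-number ratio
    print(sideband_frequencies(440.0, 3.5, 2))   # non-integer ratio
```

A ratio of 1.0 puts every sideband on a multiple of 440 Hz, so the result sounds pitched; a ratio of 3.5 scatters them off the harmonic series, which is where the bell-like clang comes from.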
This is what powered the Yamaha DX7 in 1983, the synth that defined 80s music. Its electric piano (an imitation of the Rhodes), its marimba, and the bass patches heard on countless 80s pop records: all FM.
RESULTING WAVE
Try this: set RATIO to 1.0× and INDEX low → soft tone. Set RATIO to 3.5×, INDEX 3.0 → bell. Set RATIO to 0.5×, INDEX 4.0 → electric piano. The DX7 had 6 of these "operators", which you could route into each other using one of 32 preset arrangements called algorithms.
Every room has a tail. Clap once in a bedroom and the sound is gone in 0.3 seconds; clap in a cathedral and you'll still hear yourself two or three seconds later. Reverb is the thousands of tiny reflections off walls, ceilings, and objects — arriving in such dense succession that your ear hears them as a single fading wash rather than discrete echoes.
A ConvolverNode bakes a whole room into a short audio clip called an impulse response: literally what one click sounds like in that space. Convolve any signal with that clip and the signal "puts on" the room. Below, we synthesize an IR from white noise + an exponential decay, then run a kick drum through it.
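The two steps described above, synthesizing an impulse response and convolving a signal with it, can be sketched in plain Python. Several things here are assumptions: the low sample rate (chosen to keep the slow direct convolution tolerable), the reading of the room time as the point where the tail has dropped 60 dB, and all function names. A ConvolverNode does the same job far more efficiently.

```python
import math
import random

def synth_impulse_response(seconds, sample_rate=8000, seed=0):
    """White noise shaped by an exponential decay.

    Assumption: `seconds` is the time for the tail to fall by 60 dB,
    an amplitude factor of 1/1000, so rate = ln(1000) / seconds.
    """
    rng = random.Random(seed)
    rate = math.log(1000.0) / seconds
    n = int(seconds * sample_rate)
    return [rng.uniform(-1.0, 1.0) * math.exp(-rate * i / sample_rate)
            for i in range(n)]

def convolve(signal, ir):
    """Direct convolution: every input sample launches a scaled copy of the IR."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue                      # silent samples add nothing
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

if __name__ == "__main__":
    bedroom = synth_impulse_response(0.3)
    click = [1.0]
    # Convolving a single click with the IR returns the IR itself:
    # "literally what one click sounds like in that space".
    print(convolve(click, bedroom) == bedroom)
```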
bedroom 0.3 s · large room 1.0 s · concert hall 2.0 s · cathedral 4.0 s
IMPULSE RESPONSE · decay envelope
Try this: hit sweep — over two seconds the kick fades from completely dry to completely wet, the same trick mix engineers use on snare hits to make them "bloom" without drowning the mix.
A compressor is an automatic volume knob that turns itself down whenever the signal gets too loud. The threshold is the level where it starts working; the ratio is how aggressively it works (4:1 means every 4 dB the input rises above the threshold comes out as only 1 dB above it); the attack is how fast it clamps down; the release is how fast it lets go.
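The four controls above map onto two small functions: a static curve for threshold and ratio, and a smoother for attack and release. This is a simplified sketch with hypothetical names, assuming a hard knee and one-pole smoothing; a real compressor such as the Web Audio DynamicsCompressorNode adds a soft knee and other refinements.

```python
import math

def compress_db(level_db, threshold_db, ratio):
    """Static curve: output level in dB for a given input level in dB."""
    if level_db <= threshold_db:
        return level_db                               # below threshold: untouched
    return threshold_db + (level_db - threshold_db) / ratio

def gain_reduction_db(level_db, threshold_db, ratio):
    """How far the knob is turned down (0 or negative), in dB."""
    return compress_db(level_db, threshold_db, ratio) - level_db

def smooth_gain(targets_db, sample_rate, attack_s, release_s):
    """Ease toward each target: fast when clamping down, slow when letting go."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    current, out = 0.0, []
    for target in targets_db:
        coeff = attack if target < current else release   # more reduction = attack
        current = coeff * current + (1.0 - coeff) * target
        out.append(current)
    return out

if __name__ == "__main__":
    # Hypothetical loud and soft notes at -6 dB and -18 dB.
    for level in (-6.0, -18.0):
        print(level, "->", compress_db(level, threshold_db=-30.0, ratio=20.0))
```

With the threshold at -30 dB and a 20:1 ratio, two notes that started 12 dB apart come out well under 1 dB apart.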
Compression is why a pop vocal sits forward in the mix without ever spiking, why kick drums punch but never clip, and why the music on every radio station sounds roughly the same loudness. Hit the button below to hear loud-soft-loud-soft alternating notes — then slam the threshold down and listen to the dynamics flatten.
GAIN REDUCTION · dB over time
Try this: set ratio to 20:1, threshold to -30 dB, attack 1 ms — the loud and soft notes become almost the same volume. That's how a vocal stays on top of a dense mix.
Here is the entire algorithm: take a short burst of random noise, send it into a delay line, route the delay output back to the delay input through a low-pass filter. The noise circulates. Each pass, the filter shaves off a little high-frequency content. After a few hundred milliseconds, only the resonant frequency of the delay length remains — and that's the pitch you hear.
This is Karplus-Strong synthesis, published in 1983 by Kevin Karplus and Alex Strong, two researchers at Stanford. It captures something almost philosophical about plucked strings: the sound is noise that hasn't decayed yet, recirculating through a loop. Delay length sets the pitch (delay seconds = 1 / frequency). The filter is the air resistance.
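The whole loop described above fits in about a dozen lines. This sketch makes two assumptions worth flagging: the low-pass filter is the simplest possible one (an average of two neighbouring samples) rather than a filter with an adjustable cutoff, and a small per-pass `decay` factor is added so the note always dies away. Function names are hypothetical.

```python
import random

def delay_length(freq_hz, sample_rate=44_100):
    """Samples in the delay line: delay seconds = 1 / frequency."""
    return max(2, round(sample_rate / freq_hz))

def pluck(freq_hz, seconds, sample_rate=44_100, decay=0.996, seed=0):
    """Noise burst recirculating through a delay line and a gentle low-pass."""
    rng = random.Random(seed)
    n = delay_length(freq_hz, sample_rate)
    line = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # the burst of noise
    out = []
    pos = 0
    for _ in range(int(seconds * sample_rate)):
        out.append(line[pos])
        neighbour = line[(pos + 1) % n]
        # Feedback path: averaging two samples shaves off a little
        # high-frequency content on every trip around the loop.
        line[pos] = decay * 0.5 * (line[pos] + neighbour)
        pos = (pos + 1) % n
    return out

if __name__ == "__main__":
    note = pluck(220.0, 1.0)
    print(len(note), "samples,", delay_length(220.0), "sample delay line")
```

The first pass through the line is pure noise; every later pass is a slightly smoother, slightly quieter copy of the one before, which is why the output settles into a pitch.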
WAVEFORM · note decay
Try this: drop damping to 800 Hz for a dull nylon string; push it to 7 kHz for a bright steel string. Burst length controls how hard you "pick": short = soft, long = scrapey.
Every note has four phases. Attack: silence → peak. Decay: peak → sustain level. Sustain: the held level for as long as the key is down. Release: sustain → silence after you let go. Change these four numbers and you change the instrument.
A plucked string is fast attack, fast decay, zero sustain, medium release. A pipe organ is fast attack, no decay, full sustain, instant release. A bowed violin is slow attack, no decay, full sustain, medium release. The same sawtooth oscillator wears all three costumes — only the envelope changes.
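The four phases can be expressed as one function that returns the envelope level at any moment. This is a sketch under assumptions: segments are straight lines (real envelopes are often exponential), times are in milliseconds, sustain is a fraction from 0 to 1, and `gate_ms` is a hypothetical name for how long the key is held.

```python
def adsr_level(t_ms, attack_ms, decay_ms, sustain, release_ms, gate_ms):
    """Envelope level (0..1) at time t_ms for a key held for gate_ms."""

    def held(t):
        """Level while the key is still down."""
        if t < attack_ms:
            return t / attack_ms                      # attack: silence -> peak
        if t < attack_ms + decay_ms:
            frac = (t - attack_ms) / decay_ms
            return 1.0 + (sustain - 1.0) * frac       # decay: peak -> sustain
        return sustain                                # sustain: hold

    if t_ms < 0:
        return 0.0
    if t_ms <= gate_ms:
        return held(t_ms)
    elapsed = t_ms - gate_ms
    if elapsed >= release_ms:
        return 0.0
    # release: fade from wherever the envelope was when the key lifted
    return held(gate_ms) * (1.0 - elapsed / release_ms)

if __name__ == "__main__":
    # A pluck-like shape: fast attack, fast decay, zero sustain.
    for t in (0, 1, 151, 301, 400):
        print(t, adsr_level(t, 1, 300, 0.0, 200, gate_ms=500))
```

Multiply any oscillator's output by this level, sample by sample, and the same waveform takes on the pluck, organ, or violin contour.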
ENVELOPE · drawn live as you move sliders
Try this: make a pluck: A=1, D=300, S=0, R=200. Make an organ: A=10, D=10, S=100, R=10. Make a pad: A=1500, D=10, S=80, R=1500. Same oscillator, three instruments.
Take a buzzy sawtooth and run it through a filter — say a lowpass set to 400 Hz. Only the bass survives, the rest is shaved off. Now animate the filter cutoff with a slow oscillator (a low-frequency oscillator, or LFO) and the timbre breathes open and closed in time with the LFO.
That is how every synth bass wobbles. That is how dubstep wobbles. That is the rising whoosh of a riser before a drop — an LFO at sub-audio rate (under 20 Hz) modulating something audible. Same idea modulates pitch (vibrato), amplitude (tremolo), or pan (auto-pan). Below: a 110 Hz sawtooth, LFO-driven cutoff, choice of filter type.
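An LFO-swept lowpass over a 110 Hz sawtooth can be sketched as follows. Much of this is assumed for illustration: the cutoff range of 80 Hz to 4000 Hz, the logarithmic sweep, and the one-pole filter (a much gentler slope than the filters a synth or a BiquadFilterNode would use). Function names are hypothetical.

```python
import math

def lfo_cutoff(t, rate_hz, depth, min_hz=80.0, max_hz=4000.0):
    """Filter cutoff in Hz at time t, swept by a sine LFO.

    depth 0.0 parks the cutoff at the midpoint; depth 1.0 sweeps the
    full min..max range. The sweep is logarithmic, like pitch.
    """
    lfo = 0.5 * (1.0 + math.sin(2 * math.pi * rate_hz * t))   # 0..1
    position = 0.5 + depth * (lfo - 0.5)
    return min_hz * (max_hz / min_hz) ** position

def saw(t, freq_hz):
    """Naive sawtooth in the range -1..1."""
    return 2.0 * ((t * freq_hz) % 1.0) - 1.0

def wobble(seconds, rate_hz, depth, sample_rate=44_100, note_hz=110.0):
    """Sawtooth through a one-pole lowpass whose cutoff follows the LFO."""
    out, y = [], 0.0
    for i in range(int(seconds * sample_rate)):
        t = i / sample_rate
        cutoff = lfo_cutoff(t, rate_hz, depth)
        alpha = 1.0 - math.exp(-2 * math.pi * cutoff / sample_rate)
        y += alpha * (saw(t, note_hz) - y)    # low cutoff = small alpha = duller
        out.append(y)
    return out

if __name__ == "__main__":
    samples = wobble(seconds=2.0, rate_hz=0.5, depth=0.9)
    print(len(samples), "samples of wobble")
```

Pointing the same `lfo` value at pitch, gain, or pan position instead of cutoff gives vibrato, tremolo, or auto-pan.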
CUTOFF · live filter cutoff over time
Try this: lowpass + rate 0.5 Hz + depth 90% = classic dubstep wobble. Switch to highpass + rate 0.3 Hz = riser breath. Bandpass + rate 4 Hz = vocal wah.