Tutorial: Creating scratchable audio with the Web Audio API


This is the first part of a multi-part tutorial on creating real-time scratchable audio: audio that can easily be sped up, slowed down and played backwards, with a playhead whose position and velocity can be moved around freely, changing the tempo and direction of the audio.

In 2021, I created this simple proof of concept:
https://www.johncotterell.me/2021/11/scratchable-boris-video.html

In that version I synced it with a rotating video clip, which didn't work too well; I suspect the combination of a rotating video element and the audio worklet was putting too much strain on the browser's resources.

So I've tidied it up a lot, used a piece of music, and replaced the video with a PNG of a vinyl disc. Here's the result:

https://www.johnc.pro/scratchableaudio/

Code is here:

https://github.com/Cotterzz/scratchableaudio

In this version, I'm using a rotating image.

The next version will place it in a 3D environment, and then show how that can be used in a VR or AR application.

Firstly though we'll cover the Web Audio API and its most exciting component, the AudioWorkletProcessor.

Teaching the whole API is outside the scope of this article, but I will summarise it briefly and then move on to writing the custom audio node.

Getting the audio API up and running involves creating an audio context and populating it with audio nodes - at least one to generate a signal or stream audio data, and one connected to the audio context's output, if you want to play sound. There are also some rules about only generating sound after a user interaction.
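
A minimal sketch of that setup (the 'start' button id and the 440Hz tone are just for illustration):

// The context starts out suspended; browsers only let it make sound after a
// user gesture, so the tone is started from inside a click handler.
const audioContext = new AudioContext();

document.getElementById('start').onclick = async () => {
  await audioContext.resume();
  const osc = audioContext.createOscillator();   // a node that generates a signal
  osc.frequency.value = 440;
  osc.connect(audioContext.destination);         // connected to the context's output
  osc.start();
};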

There are several built-in audio nodes: oscillators for generating tones, filters for modifying an audio signal, and others for analysing sound, including Fourier analysis for converting a waveform into frequency data for spectral analysis.
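
For example, a few of those built-in nodes chained together (the constructors are the standard Web Audio API ones; the parameter values are just illustrative):

const ctx = new AudioContext();
const osc = new OscillatorNode(ctx, { type: 'sawtooth', frequency: 220 });
const filter = new BiquadFilterNode(ctx, { type: 'lowpass', frequency: 800 });
const analyser = new AnalyserNode(ctx, { fftSize: 2048 });

// oscillator -> filter -> analyser -> speakers
osc.connect(filter).connect(analyser).connect(ctx.destination);
osc.start();

// read the current spectrum: one value per frequency bin
const bins = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(bins);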

In short, the Web Audio API is a full-featured, node-based digital signal processor and sound synthesiser.

Or at least, it has those capabilities. The built-in native nodes have very limited functionality, there isn't even a white noise generator; unless you write custom audio nodes, you're limited to playing audio data through it or generating tones with simple waveforms.

You do that by extending a rather unusual kind of class that has special status in the JS runtime. It's a worklet, a lightweight cousin of the web worker, and there are only a few types of worklet, each one catering to a specific use case for performant code.

To create a custom Audio Node, you need to extend the AudioWorkletProcessor class.
The extended class has to reside in its own file, with no other code.
When executed it runs in its own thread, with its own allocated memory.
You can't call its methods directly from the main thread; it has specific methods for specific purposes, plus a message-port mechanism for sending and receiving custom data, which you use instead of your own public functions (though you can still write regular internal methods).

Weird huh? It's all so the JS runtime can give it extra power to run highly performant code off the main thread. The processor executes its main processing loop in real time, per chunk of data, per channel: you loop through the channels (usually two, but you can have more) and then through each sample point, giving you a loop body that executes once per sample, which at 48kHz stereo is 96,000 samples per second.

So it needs all the juice it can get.
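
To make that loop structure concrete, here's a minimal sketch of a processor, a white-noise generator (which, as mentioned earlier, the API doesn't ship with); the file name and the registered name are just placeholders:

// noise-processor.js - must live in its own file
class NoiseProcessor extends AudioWorkletProcessor {
  process (inputs, outputs, parameters) {
    const output = outputs[0];
    // outer loop: one pass per channel (usually two)
    for (const channel of output) {
      // inner loop: one pass per sample in this render block (typically 128 samples)
      for (let i = 0; i < channel.length; i++) {
        channel[i] = Math.random() * 2 - 1;
      }
    }
    return true; // keep the processor alive
  }
}
registerProcessor('noise-processor', NoiseProcessor);

On the main thread you'd load it with audioContext.audioWorklet.addModule('noise-processor.js') and create a node with new AudioWorkletNode(audioContext, 'noise-processor'), exactly as the ScratchableAudio class does further down.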

It's about the fastest you'll see regular JS running; I suspect it gets close to WebAssembly speeds.

The only thing that's considerably faster in the browser is GLSL, which runs C-like shader code in parallel, per pixel, per frame, on dedicated hardware.

When the Web Audio API was first deployed there was no AudioWorkletProcessor, only the ScriptProcessorNode, which did a similar job but with no separate thread or native optimisation. The AudioWorklet is the biggest, most important development in the API since it was made available.

The class that extends AudioWorkletProcessor needs to reside in its own JS file, so for this first part of the tutorial we'll fit everything into three JS files:

main.js - sets up the UI and starts the main audio class.

ScratchableAudio.js - a regular class that sets up the audio context and holds all the audio code outside the worklet, including the playhead logic.

CustomWaveProcessor.js - this is where the actual audio data processing happens.

This is main.js:

import { ScratchableAudio } from './ScratchableAudio.js'

// The audio file path plus a callback the audio class uses to rotate the disc.
var scratchable = new ScratchableAudio('./tune.mp3', setRotation);

var buttons = document.getElementById("overlay_left");
buttons.innerHTML = "<button type='button' id='button1'>START</button><br/>";
var disc = document.getElementById("vc");

document.getElementById("button1").onclick = function () { 
	scratchable.setupSample();
	document.addEventListener( 'touchmove', (event) => {onDocumentMouseMove(event)}, false );
	document.addEventListener( 'touchstart', (event) => {onDocumentMouseDown(event)}, false );
	document.addEventListener( 'touchend', (event) => {onDocumentMouseUp(event)}, false );
	document.addEventListener( 'mousemove', (event) => {onDocumentMouseMove(event)}, false );
	document.addEventListener( 'mousedown', (event) => {onDocumentMouseDown(event)}, false );
	document.addEventListener( 'mouseup', (event) => {onDocumentMouseUp(event)}, false );
	buttons.innerHTML = "";
}

function onDocumentMouseDown(event){ event.preventDefault(); scratchable.down(event.pageX);}
function onDocumentMouseUp(event){ event.preventDefault(); scratchable.up();}
function onDocumentMouseMove(event){ event.preventDefault(); scratchable.move(event.pageX);}

// Called back from the audio class so the disc follows the audio playhead.
function setRotation(angleTo){
	disc.style.transform = 'rotate('+angleTo+'deg)';
}

This imports and instantiates the main audio class and sets up the UI to handle the down, up and move events for the scratching, as well as a start event so that starting the audio complies with the browser's autoplay rules.

I pass the audio path and a feedback function to the audio class. The feedback function is there so the disc can sync its movement to the audio itself, rather than to the input movement.

This is the ScratchableAudio Class:

export class ScratchableAudio {
	audioContext; customWaveNode; rawAudio; mp3Data;
	totalTime; rotationTime; rotationFunction;
	rotations = 4;
	holdingVelocity = 0;
	holdingOldPosition = 0;
	holdingPlayHead = false;
	constructor(data, rotatorFunction){
	    this.mp3Data = data;
	    this.rotationFunction = rotatorFunction;
	}

	down (pos) {
	    this.holdingPlayHead = true;
	    this.holdingOldPosition = pos;
	    // (pos - this.holdingOldPosition) is zero at the moment of grabbing, which halts the playhead
	    this.sendData({label:'playheadspeed', velocity: (pos - this.holdingOldPosition)/5});
	}

	up (){
	    this.holdingPlayHead = false;
	    this.sendData({label:'playheadstatus', status: "free"});
	}

	move(pos){
	    if(this.holdingPlayHead){
		this.sendData({label:'playheadspeed', velocity: (pos - this.holdingOldPosition)});
		this.holdingOldPosition = pos;
	    }
	}

	async setupSample() {
	    this.audioContext = new AudioContext();
    	    const filePath = this.mp3Data;
    	    const sample = await this.getFile(this.audioContext, filePath);
    	    this.rawAudio = sample;
    	    this.totalTime = sample.length/48000;    // assumes 48kHz audio
    	    this.rotationTime = this.totalTime/this.rotations;
    	    this.setUpAudio();
	}

	async setUpAudio(){
	    await this.audioContext.audioWorklet.addModule('CustomWaveProcessor.js');
	    this.customWaveNode = new AudioWorkletNode(this.audioContext, 'CustomWaveProcessor')
	    this.customWaveNode.connect(this.audioContext.destination);
	    this.customWaveNode.port.onmessage = (e) => {this.setFrame(e.data)};
	    this.sendData({label:'raw', rawdata:this.rawAudio.getChannelData(0)});
	}

	async getFile(audioContext, filepath) {
	    const response = await fetch(filepath);
	    const arrayBuffer = await response.arrayBuffer();
	    const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);
	    return audioBuffer;
	}

	setFrame(samples){
	    // convert the playhead position (in sample frames) to seconds, then to a rotation angle
	    var currentTime = samples/48000;
	    var angleTo = 360-(360*(currentTime/this.rotationTime));
	    this.rotationFunction(angleTo);
	}

	sendData(object){
	    this.customWaveNode.port.postMessage(object)
	}
}

There's a lot of async code going on here: the class loads and decodes the audio, and then there's the rather peculiar way the CustomWaveProcessor is loaded and instantiated, via addModule() and the AudioWorkletNode constructor. This isn't a module-loading pattern you'll see very often.

Also, because you can't call the CustomWaveProcessor's methods directly, this class uses the message port provided by the API to talk to it.
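
Concretely, it's just the two ends of a MessagePort (the velocity value here is only an example):

// Main thread (ScratchableAudio): post a plain object to the worklet node's port
this.customWaveNode.port.postMessage({label:'playheadspeed', velocity: 2});

// Worklet thread (CustomWaveProcessor): listen on its own port
this.port.onmessage = (e) => { this.receiveMessage(e.data); };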

Lastly, here's the CustomWaveProcessor class itself:

class CustomWaveProcessor extends AudioWorkletProcessor {

  sampleData = null;
  sampleLength = 0;
  sampleLoaded = false;

  held = false;

  playOffset = 0;
  playVelocity = 1;
  normalVelocity = 1;

  blockLength = null;

  constructor(...args){
    super(...args);
    this.port.onmessage = (e) => {
      this.receiveMessage(e.data);
    }
  }

  process (inputs, outputs, parameters) {
    const output = outputs[0]
    if(this.sampleLoaded){
      output.forEach(channel => {
        this.blockLength = channel.length;
        for (let i = 0; i < channel.length; i++) {
          // read from the sample buffer at the playhead, stepping by the current velocity
          channel[i] = this.sampleData[Math.round(this.playOffset+(i*this.playVelocity))];
        }
      })
      this.playOffset += this.blockLength*this.playVelocity;
      // when the disc isn't held, ease the velocity back towards normal playback speed
      if(!this.held){this.playVelocity = ((this.normalVelocity + (this.playVelocity*100))/101)}

      // wrap the playhead around either end of the sample
      if(this.playOffset>this.sampleLength){ this.playOffset = this.playOffset % this.sampleLength;}
      if(this.playOffset<0){ this.playOffset += this.sampleLength;}
      // report the playhead position back to the main thread so the disc can follow it
      this.port.postMessage(this.playOffset);
    }

    return true
  }

  // Unused helper in this version (note: this.waveData is never assigned).
  getBufferValues(bOffset, bLength){
    var vArray = this.waveData.slice(bOffset, bOffset+bLength);
    return vArray;
  }

  // Handle messages posted from the main thread over the port.
  receiveMessage(data){
    if(data.label=='raw'){
      this.sampleLoaded = true;
      this.sampleLength = data.rawdata.length;
      this.sampleData = data.rawdata;
    } else if(data.label=='playheadspeed'){
      this.playVelocity = data.velocity;
      this.held = true;
    } else if(data.label=='playheadstatus'){
      this.held = false;
    }
  }

  // Unused helper that builds a sine-wave test buffer.
  createWaveBuffer(waveType, wResolution, bResolution){
    var waveData = new Array(bResolution);
    for (let i = 0; i < bResolution; i++) {
        waveData[i] = Math.sin((i*2*Math.PI)/wResolution);
    }
    return waveData;
  }
}

registerProcessor('CustomWaveProcessor', CustomWaveProcessor)

Notice the supplied method for registering the custom audio node at the end.

Anyway, there's a lot to unpack here, and this is turning into a long post already.

So I'm going to wrap it up for now.

In Part 2 I'll take a deeper dive and walk through the code, and in Part 3 I'll build another version of this in a 3D environment, with some more exciting ways to interact with it.
