How to Work With Sound In JS: Custom Audio Player With Web Audio API (Part 1)

Recently I had a chance to work with sound on a project. My task was to create and visualize a custom audio player with React.js and the Web Audio API. I had to dig deeper into this topic, and now I want to share my knowledge with you.

I will start with some theory and then proceed to real-life examples and practical tips on how to create, manipulate, and visualize sound with JavaScript.

The nature of sound

First of all, let's get to the basics and talk about what sound is. In physics, sound is a vibration that typically propagates as an audible wave of pressure through a transmission medium such as a gas, liquid, or solid. If we represent sound graphically, it looks like a waveform f(t), where t is a point in time.

Sound

The next question is: how do our devices reproduce this wave? For this purpose, digital audio is used: a method for storing sound in the form of a digital signal. Since a sound wave has a certain value at each moment of time, those moments can be picked out and saved as samples (numerical values of the waveform at certain points in time).

sound-nature

Each sample is a set of bits (each with a value of 0 or 1). Usually, 16 or 24 bits per sample are used. The number of samples per second is determined by the sample rate (discretization frequency), measured in hertz. The higher the sample rate, the higher the frequencies the sound signal may contain.

In a nutshell, you can imagine sound as a large array of sound vibrations: bytes on disk, or numerical values between -1 and 1 after decoding. For example, [0, -0.018, 0.028, 0.27, ... 0.1]. The length of the array depends on the sample rate. For example, if the sample rate is 44100, the array holds 44100 elements per second of recording.
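
To make the relationship concrete, here is a tiny sketch (plain arithmetic, not tied to any API) of how many samples a clip holds at a given sample rate:

```javascript
// number of samples = sample rate (samples per second) * duration (seconds)
const samplesFor = (sampleRate, durationSec) =>
  Math.floor(sampleRate * durationSec);

console.log(samplesFor(44100, 1));   // 44100 samples for one second of audio
console.log(samplesFor(44100, 2.5)); // 110250
```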

How sound is saved on devices

Now that you know the theory of sound waves, let's see how sound is stored on a device. For this purpose, audio file formats are used. Each audio file consists of two parts: data and a header.

The data is our sound wave, the raw sample array (also known as the .raw format). The header is the additional information needed to decode the data. It contains the sample rate, the number of recording channels, the album and author, the date of recording, etc.

sound-storage

The difference between .wav and .mp3 is that mp3 is a compressed format.

Now let's get to practice. I have used Express + React for all the examples; however, the main approaches I mention are not tied to any particular framework.

How to load the sound from the server

First of all, you have to retrieve the file you will work with. You can fetch it either on the client or from the server. On the client, you can use a file input element. Here is how to serve the file from the server with Express.js. You can read the whole code here.

All the examples for this article are stored in the repository.

const express = require('express');
const http = require('http');
const path = require('path');
const fileSystem = require('fs');

const app = express();
const api = express();

api.get('/track', (req, res) => {
  // generate file path
  const filePath = path.resolve(__dirname, './private', './track.wav');
  // get file size info
  const stat = fileSystem.statSync(filePath);

  // set response header info
  res.writeHead(200, {
    'Content-Type': 'audio/mpeg',
    'Content-Length': stat.size
  });
  // create a read stream and pipe it into the response stream
  const readStream = fileSystem.createReadStream(filePath);
  readStream.pipe(res);
});

// register api calls
app.use('/api/v1/', api);

const server = http.createServer(app);
server.listen(3001, () => console.log('Server app listening on port 3001!'));

In general, there are three main steps: read the file and its metadata, set the response headers, and pipe the file stream into the response. You can serve any file with this approach. All you need is to make a request to the url api/v1/track.

How to work with sound on client

Now that you know how to load files from the server, the next step is to get the file on the client. If we simply make a GET request in the browser, we will receive the file. However, we want to use it on our page somehow. The simplest way to do this is with the audio element, in the following way. See the example here.

<audio controls>
  <source src="/api/v1/track" type="audio/mpeg" />
</audio>

How to work with sound in the background

It's great that the browser API gives us such simple elements out of the box. However, it would be good to have more control over the sound inside our code. Check out how to create a custom audio player like this one.

This is where the Web Audio API, a set of tools for working with sound in the browser, comes to help. What should you start with? Let's answer several questions.

How to play the audio file

First of all, you need to load the file from the server. For this purpose, you can use the fetch method or a library (for example, I use axios).

const response = await axios.get(url, {
   responseType: 'arraybuffer', // <- important param
});

Note: set responseType: 'arraybuffer' in the request config so the browser knows it is loading a binary buffer, not JSON.
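
If you prefer not to pull in axios, the same request can be made with the built-in fetch API. Here is a sketch, assuming the /api/v1/track endpoint from the previous section:

```javascript
// load the track as a raw binary buffer, ready for decodeAudioData
const loadTrack = async (url) => {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  // arrayBuffer() is the fetch equivalent of responseType: 'arraybuffer'
  return response.arrayBuffer();
};
```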

To play the file, you need to create an instance of the AudioContext class.

const getAudioContext = () => {
  AudioContext = window.AudioContext || window.webkitAudioContext;
  const audioContext = new AudioContext();
  return audioContext;
};

Here is an important thing to remember. Some browsers allow using an AudioContext only after the user has interacted with the page. If the user has not performed any action on the page, an error will occur. That's why you need getAudioContext as a separate method you can call inside an event handler.
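
A common pattern for dealing with this restriction is a small helper (a sketch; the helper name is my own) that resumes a suspended context:

```javascript
// resume a suspended AudioContext; safe to call more than once
const unlockAudio = (audioContext) => {
  if (audioContext.state === 'suspended') {
    return audioContext.resume();
  }
  return Promise.resolve();
};
```

Call unlockAudio(audioContext) from a click or keypress handler before source.start(); browsers treat those handlers as user gestures.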

To play the file, you need to create a BufferSource, using the createBufferSource method of the AudioContext.

After this, the BufferSource requires an audioBuffer. We can get it from our file using the decodeAudioData method. Here is how it all looks together:

// load audio file from server
const response = await axios.get(url, {
 responseType: 'arraybuffer',
});
// create audio context
const audioContext = getAudioContext();
// create audioBuffer (decode audio file)
const audioBuffer = await audioContext.decodeAudioData(response.data);

// create audio source
const source = audioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(audioContext.destination);

// play audio
source.start();

After this, you only have to call the source.start() method.

How to stop playback

To stop the playback, just call the source.stop() method. You also need to save the moment when you pressed the stop button; it will come in handy when you resume the audio after a pause. In that case, you have to call source.start() with an offset parameter.

// start playback
let startedAt = Date.now();
let pausedAt = null;
source.start();

// stop playback
source.stop();
pausedAt = Date.now() - startedAt;

// resume from where we stopped
// a BufferSource can be started only once, so create a fresh one
source = audioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(audioContext.destination);
startedAt = Date.now() - pausedAt;
source.start(0, pausedAt / 1000); // the offset is in seconds

How to display the process of playback

Here you can choose between two approaches. One is to use the createScriptProcessor method and its onaudioprocess callback.

const audioBuffer = await audioContext.decodeAudioData(response.data);
// create progress source
const scriptNode = audioContext.createScriptProcessor(4096, audioBuffer.numberOfChannels, audioBuffer.numberOfChannels);
scriptNode.connect(audioContext.destination);
scriptNode.onaudioprocess = (e) => {
  const rate = parseInt((e.playbackTime * 100) / audioBuffer.duration, 10);
};

To calculate the percentage of the audio that has been played, you need two things: the track duration audioBuffer.duration and the current e.playbackTime.

One drawback is that when calling source.stop(), you need to nullify this callback yourself. The other approach is to save the playback start time and run an update every second.
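
That cleanup can be bundled into a small helper. This is a sketch assuming the source and scriptNode variables from the snippet above:

```javascript
// stop playback and detach the progress callback so it stops firing
const stopPlayback = (source, scriptNode) => {
  scriptNode.onaudioprocess = null; // nullify the callback
  scriptNode.disconnect();          // detach from the destination
  source.stop();
};
```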

const audioBuffer = await audioContext.decodeAudioData(response.data);
...
const startedAt = Date.now();
const duration = audioBuffer.duration;
source.start();

setInterval(() => {
  const playbackTime = (Date.now() - startedAt) / 1000;
  const rate = parseInt((playbackTime * 100) / duration, 10);
}, 1000);

How to rewind the audio to a certain point

Here the situation is somewhat reversed. First, you need to determine the rate, and then, based on it, calculate the playbackTime. To determine the rate, you can use the width of the progress element and the position of the mouse click relative to it.

onProgressClick: (e) => {
  // click position relative to the progress element
  const position = e.clientX - e.target.getBoundingClientRect().left;
  const rate = (position * 100) / e.target.offsetWidth;
  const playbackTime = (audioBuffer.duration * rate) / 100;

  source.stop();
  // a BufferSource can be started only once, so create a fresh one
  source = audioContext.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioContext.destination);
  source.start(0, playbackTime);
  // don't forget to change the startedAt time
  // startedAt = Date.now() - playbackTime * 1000;
}

The important thing here is not to forget to update startedAt, or your progress will not be displayed properly.

How to control the volume

For this purpose, you need to create a gainNode by calling the audioContext.createGain() method. After this, you can easily write a setVolume method.

const gainNode = audioContext.createGain();
...
source.connect(gainNode);
gainNode.connect(audioContext.destination);
const setVolume = (level) => {
  gainNode.gain.setValueAtTime(level, audioContext.currentTime);
};
setVolume(0); // mute
setVolume(1); // full volume
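
Jumping the gain instantly can produce audible clicks. A gentler variant (a sketch of my own, using the standard AudioParam ramp methods) fades to the target level:

```javascript
// fade the gain to `level` over `seconds` instead of jumping to it
const fadeTo = (gainNode, audioContext, level, seconds = 0.1) => {
  const now = audioContext.currentTime;
  gainNode.gain.cancelScheduledValues(now);
  gainNode.gain.setValueAtTime(gainNode.gain.value, now);
  gainNode.gain.linearRampToValueAtTime(level, now + seconds);
};
```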

So, now you know everything you need to write your own component for audio playback. You only need to customize it and add features such as a playlist, the name of the track, switching to the next track, and so on.

How to generate your own sound

What about generating your own sounds? Here is a small example of how to generate tones of different frequencies. Check out the whole code here. To generate a tone of a given frequency, use the createOscillator method.

const getOscillator = (startFrequency) => {
  const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
  const oscillator = audioCtx.createOscillator();
  oscillator.type = 'square';
  oscillator.frequency.setValueAtTime(startFrequency, audioCtx.currentTime);
  oscillator.connect(audioCtx.destination);

  const start = () => oscillator.start();
  const stop = () => oscillator.stop();
  const change = frequency =>
    oscillator.frequency.setValueAtTime(frequency, audioCtx.currentTime); // value in hertz

  return { start, stop, change };
};

let frequency = 100;
const oscillator = getOscillator(frequency);
oscillator.start();

const interval = setInterval(() => {
  frequency = frequency + 100;
  oscillator.change(frequency);
}, 100);
setTimeout(() => {
  clearInterval(interval);
  oscillator.stop();
}, 2000);
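
Instead of stepping the frequency from an interval, the whole sweep can also be scheduled up front. This sketch (a variant of my own) uses exponentialRampToValueAtTime, which tends to sound smoother for pitch because we perceive frequency logarithmically:

```javascript
// schedule a continuous frequency sweep on an oscillator
const sweep = (oscillator, audioCtx, fromHz, toHz, seconds) => {
  const now = audioCtx.currentTime;
  oscillator.frequency.setValueAtTime(fromHz, now);
  // exponential ramps require strictly positive values
  oscillator.frequency.exponentialRampToValueAtTime(toHz, now + seconds);
};
```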

How to use a microphone

Using the microphone in the browser is a common thing nowadays. To access it, call the getUserMedia method (nowadays exposed as navigator.mediaDevices.getUserMedia). You will get access to a stream object, from which you can create an audio source. To get chunks of data from the mic, you can use createScriptProcessor and its onaudioprocess callback.

// get permission to use the mic
navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
    const audioContext = new AudioContext();
    // get the mic stream
    const source = audioContext.createMediaStreamSource(stream);
    const scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
    source.connect(scriptNode);
    scriptNode.connect(audioContext.destination);
    // output to the speaker
    // source.connect(audioContext.destination);

    // on process event
    scriptNode.onaudioprocess = (e) => {
      // get mic data
      console.log(e.inputBuffer.getChannelData(0));
    };
}).catch(console.log);

You can also use audioContext.createAnalyser() with the microphone to get the spectral characteristics of the signal.
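
Wiring the analyser in looks roughly like this (a sketch; the helper name is my own, and the source could be the mic source from the snippet above):

```javascript
// route a source (e.g. the mic) through an analyser node so its
// spectrum can be read in real time with getByteFrequencyData
const attachAnalyser = (audioContext, source, fftSize = 256) => {
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = fftSize;
  source.connect(analyser);
  return analyser;
};
```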

How to visualize a sound

In this chapter, I will show you how to improve our audio player by adding visualization of the sound waveform (sinewave) and of the spectral characteristics, i.e. an equalizer (let's call them audio bars). You can see the example here, and the whole code right here.

So what should you start with? You will need two canvases. Write them in html and get the access to them in js.

// in html
<div className="bars-wrapper">
  <canvas className="frequency-bars" width="1024" height="100"></canvas>
  <canvas className="sinewave" width="1024" height="100"></canvas>
</div>
...
// in js
const frequencyC = document.querySelector('.frequency-bars');
const sinewaveC = document.querySelector('.sinewave');

We will come back to them later. We have already learned how to use AudioContext to decode a file and play it. To get more detailed information, we will use an AnalyserNode. So we have to modify the getAudioContext method a little.

const getAudioContext = () => {
  AudioContext = window.AudioContext || window.webkitAudioContext;
  const audioContext = new AudioContext();
  const analyser = audioContext.createAnalyser();

  return { audioContext, analyser };
};

Now let’s do the same with our method loadFile:

const loadFile = (url, { frequencyC, sinewaveC }) => new Promise(async (resolve, reject) => {
   const response = await axios.get(url, {  responseType: 'arraybuffer' });
   const { audioContext, analyser } = getAudioContext();
   const audioBuffer = await audioContext.decodeAudioData(response.data);
   ...
   let source = audioContext.createBufferSource();
   source.buffer = audioBuffer;
   source.start();

As you can see, this method receives our canvases as parameters. We have to connect the analyser to the source in order to use it with our audio file. Then we call the two methods drawFrequency and drawSinewave to build the audio bars.

source.connect(analyser);
drawFrequency();
drawSinewave();

To build the sinewave, you need to know two things: how to get the data and how to visualize it. In the first chapter, I described what sound is and how it is stored on devices. I mentioned that digital audio is a set of sampled values at points in time. Let's decode and display these points now.

For this purpose, we use the getByteTimeDomainData method. As this method writes into a preallocated array, let's create our array first.

...
const audioBuffer = await audioContext.decodeAudioData(response.data);
analyser.fftSize = 1024;
let sinewaveDataArray = new Uint8Array(analyser.fftSize);
const sinewaveCanvasCtx = sinewaveC.getContext('2d');

// draw sinewave
const drawSinewave = function() {
  // get sinewave data
  analyser.getByteTimeDomainData(sinewaveDataArray);
  requestAnimationFrame(drawSinewave);

  // canvas config
  sinewaveCanvasCtx.fillStyle = styles.fillStyle;
  sinewaveCanvasCtx.fillRect(0, 0, sinewaveC.width, sinewaveC.height);
  sinewaveCanvasCtx.lineWidth = styles.lineWidth;
  sinewaveCanvasCtx.strokeStyle = styles.strokeStyle;
  sinewaveCanvasCtx.beginPath();

  // draw wave
  const sliceWidth = sinewaveC.width * 1.0 / analyser.fftSize;
  let x = 0;

  for (let i = 0; i < analyser.fftSize; i++) {
    const v = sinewaveDataArray[i] / 128.0; // normalize a byte (0..255) around 128
    const y = v * sinewaveC.height / 2;

    if (i === 0) {
      sinewaveCanvasCtx.moveTo(x, y);
    } else {
      sinewaveCanvasCtx.lineTo(x, y);
    }
    x += sliceWidth;
  }

  sinewaveCanvasCtx.lineTo(sinewaveC.width, sinewaveC.height / 2);
  sinewaveCanvasCtx.stroke();
};

Here are two main things to remember.

The first is the analyser.fftSize parameter. It determines the resolution with which the audio data is sampled, in other words, the length of the sinewaveDataArray array. Try changing this value and see how the shape of the wave changes. The second is requestAnimationFrame(drawSinewave), which schedules our function to run before the browser's next repaint.
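
The byte-to-pixel mapping inside the loop can be isolated as a pure function (a sketch of the same arithmetic): a byte sample of 128 means silence and lands on the canvas midline.

```javascript
// map a time-domain byte (0..255) to a y coordinate on the canvas
const sampleToY = (byteValue, canvasHeight) => {
  const v = byteValue / 128.0; // 1.0 corresponds to silence
  return (v * canvasHeight) / 2;
};

console.log(sampleToY(128, 100)); // 50, the midline
console.log(sampleToY(0, 100));   // 0, the top edge
```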

All the rest is just simple canvas code. To build the equalizer, let's write the drawFrequency function. Its implementation is similar to the previous one; the only differences are calling analyser.getByteFrequencyData(frequencyDataArray) and the canvas code (now we draw rectangles, not a line).

analyser.fftSize = styles.fftSize;
let frequencyDataArray = new Uint8Array(analyser.frequencyBinCount);
const frequencyCanvasCtx = frequencyC.getContext('2d');

const drawFrequency = function() {
  // get equalizer data
  analyser.getByteFrequencyData(frequencyDataArray);
  requestAnimationFrame(drawFrequency);

  // canvas config
  frequencyCanvasCtx.fillStyle = styles.fillStyle;
  frequencyCanvasCtx.fillRect(0, 0, frequencyC.width, frequencyC.height);
  frequencyCanvasCtx.beginPath();

  // draw frequency bars
  const barWidth = (frequencyC.width / analyser.frequencyBinCount) * 2.5;
  let barHeight;
  let x = 0;

  for (let i = 0; i < analyser.frequencyBinCount; i++) {
    barHeight = frequencyDataArray[i];

    frequencyCanvasCtx.fillStyle = styles.strokeStyle;
    frequencyCanvasCtx.fillRect(x, frequencyC.height - barHeight / 2, barWidth, barHeight / 2);

    x += barWidth + 1;
  }
};

Now you know how to build a simple sound visualization. If you have worked with canvas before, adding new effects is not a big deal.

In the second part of the article, you will learn useful tips and tricks on how to stream an audio file. Don't hesitate to check it out right now!

Part two: How to work with sound in JS: Audio streaming