How to Work With Sound In JS: Audio File Streaming (Part 2)

This is the second part of the article "How to Work with Sound In JS: Custom Audio Player With Web Audio API". Here you will learn the ins and outs of the audio streaming process.

Let's dive in.

How to stream an audio file

You may have noticed that so far we have always waited for the entire file to load. On a slow connection, that feels like ages. What about playing and loading the file simultaneously? You can see an example of this approach here, and the whole implementation code right here.

Unfortunately, the Fetch API doesn't support streaming responses. However, similar functionality can be implemented with the help of sockets. To split our file into separate chunks and send them to the client in parts, we will use the socket.io-stream library. Below you can see what the implementation looks like; check out the entire code here.


const express = require('express');
const http = require('http');
const path = require('path');
const fileSystem = require('fs');
const ss = require('socket.io-stream');

const app = express();
const server = http.createServer(app);
const io = require('socket.io').listen(server);

io.on('connection', client => {
  const stream = ss.createStream();
  client.on('track', () => {
    const filePath = path.resolve(__dirname, './private', './track.wav');
    // get file info (the size is needed on the client to calculate loading progress)
    const stat = fileSystem.statSync(filePath);
    const readStream = fileSystem.createReadStream(filePath);
    // pipe the file read stream into the socket stream
    readStream.pipe(stream);
    ss(client).emit('track-stream', stream, { stat });
  });
});

This code first waits for a client connection. Once the client emits the track event, the server reads the file, pipes it into the stream, and emits a track-stream event back to the client together with the stream and the file stats. This gives the client access to the stream object and its on('data') events. Now we have to change the loadFile method so it works over the socket.

import socketClient from 'socket.io-client';
import ss from 'socket.io-stream';

// `url` is the address of the streaming server
const socket = socketClient(url);

const loadFile = ({ frequencyC, sinewaveC }, styles, onLoadProcess) =>
  new Promise(async (resolve, reject) => {
    // ask the server to start streaming the track
    socket.emit('track', () => {});
    ss(socket).on('track-stream', (stream, { stat }) => {
      stream.on('data', (data) => {
        // calculate loading progress rate
        const loadRate = (data.length * 100) / stat.size;
        onLoadProcess(loadRate);
        // next step here
      });
    });
  });

This code connects to our server. After that, we receive the stream via ss(socket).on('track-stream') and, as the data arrives, get each chunk via stream.on('data'). Knowing the size of the file and the length of each chunk, we can calculate the loading progress.

One of the most difficult parts now is figuring out how to play the entire file when we only have its separate parts. There are several challenges.

The first one is that we can't simply write the following:

audioContext.decodeAudioData(data); // will throw an exception here

The main reason is that socket.io-stream sends the data in raw format, and decodeAudioData can't process it. The instructions on how to interpret the data are stored in the header. In the first part, I showed an example of the file structure and described the header. This is where it comes in handy.

There are two ways to solve this problem: cut the header off on the server and send it to the client, or generate it on the client (which is simpler, in my opinion). For that, you can use the withWaveHeader function (the entire code is available here). Now our code will look as follows:

 const audioBufferChunk = await audioContext.decodeAudioData(withWaveHeader(data, 2, 44100));
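If you choose to generate the header on the client, below is a minimal sketch of what such a withWaveHeader helper could look like. This is my own assumption based on the standard 44-byte RIFF/WAVE header and raw 16-bit PCM data; the actual implementation in the linked repository may differ.

// Sketch: prepend a 44-byte RIFF/WAVE header to a chunk of raw 16-bit PCM data
// so that decodeAudioData can parse it (hypothetical helper, not the repo code)
const withWaveHeader = (data, numberOfChannels, sampleRate) => {
  const header = new ArrayBuffer(44);
  const view = new DataView(header);
  const writeString = (offset, str) => {
    for (let i = 0; i < str.length; i++) view.setUint8(offset + i, str.charCodeAt(i));
  };
  const blockAlign = numberOfChannels * 2;           // 2 bytes per 16-bit sample
  writeString(0, 'RIFF');
  view.setUint32(4, 36 + data.byteLength, true);     // overall chunk size
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);                      // fmt sub-chunk size
  view.setUint16(20, 1, true);                       // audio format: PCM
  view.setUint16(22, numberOfChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true);                      // bits per sample
  writeString(36, 'data');
  view.setUint32(40, data.byteLength, true);         // data sub-chunk size
  // concatenate header + raw data into a single ArrayBuffer
  const result = new Uint8Array(44 + data.byteLength);
  result.set(new Uint8Array(header), 0);
  result.set(new Uint8Array(data), 44);
  return result.buffer;
};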

How to play the entire file

Since our data is now decoded, we can play it. The problem is that this is only a small part of the data. If we try to play the chunks as soon as they become available, the result will be a mess. Try executing the following code:

stream.on('data', async (data) => {
  const audioBufferChunk = await audioContext.decodeAudioData(withWaveHeader(data, 2, 44100));
  source = audioContext.createBufferSource();
  source.buffer = audioBufferChunk;
  source.connect(audioContext.destination);
  source.start();
...

You will hear a set of random sounds, but not the song. The thing is, the data arrives faster than each chunk of the audio file can be played. So each piece of data should be played with a certain delay. But which one? The delay should be equal to the duration of the previous chunk of data. Try changing just one line of code:

  source.start(source.buffer.duration);

This line means: play the next audio chunk after a source.buffer.duration time period. However, this approach is not perfect. Here are some questions you may want to ask: How do we visualize the playback progress? How do we keep the playback going? The first problem is easy to solve.

As we already know from the previous part, to display the playback progress we have to know its start time and total duration. Defining the start time is easy. To calculate the total duration, you have to know the duration of the received piece of data and what percentage of the whole track it represents.

ss(socket).on('track-stream', (stream, { stat }) => {
  stream.on('data', async (data) => {
    // calculate loading progress rate
    const loadRate = (data.length * 100) / stat.size;
    const audioBufferChunk = await audioContext.decodeAudioData(withWaveHeader(data, 2, 44100));
    source = audioContext.createBufferSource();
    source.buffer = audioBufferChunk;
    // total duration of the track, extrapolated from this chunk
    const duration = (100 / loadRate) * audioBufferChunk.duration;
  });
});

The second problem is a bit harder. Each time we get new data, we have to reinitialize the source. We can call source.start or source.stop only on the instance we have. Moreover, if we save firstSource at the moment of calling firstSource.start, it will play only the amount of data it had received by the time it started, and after that it will simply stop. That's why we have to merge all the chunks of data and keep playing the audio even if the current source has stopped.

To merge the chunks, let's use the appendBuffer utility (the whole code is available here):

const newAudioBuffer = (source && source.buffer)
  ? appendBuffer(source.buffer, audioBufferChunk, audioContext)
  : audioBufferChunk;
source = audioContext.createBufferSource();
source.buffer = newAudioBuffer;
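For reference, here is a rough sketch of what an appendBuffer utility might look like. This is my own guess based on how it is used above, so the actual code in the repository may differ.

// Sketch: concatenate two AudioBuffers into a new one (assumes the same
// sample rate; hypothetical utility, not necessarily the repo's code)
const appendBuffer = (buffer1, buffer2, context) => {
  const numberOfChannels = Math.min(buffer1.numberOfChannels, buffer2.numberOfChannels);
  const merged = context.createBuffer(
    numberOfChannels,
    buffer1.length + buffer2.length,
    buffer1.sampleRate
  );
  for (let channel = 0; channel < numberOfChannels; channel++) {
    const channelData = merged.getChannelData(channel);
    channelData.set(buffer1.getChannelData(channel), 0);
    channelData.set(buffer2.getChannelData(channel), buffer1.length);
  }
  return merged;
};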

To resume playback after the current buffer runs out, let's record source.buffer.duration at the moment playback starts and set up an interval that checks whether that much time has elapsed. If it has, we start playing a new source, offset by the duration we recorded.

Once loading has finished, we clear the interval and work with the complete file. The whole code is available here:

const whileLoadingInterval = setInterval(() => {
  if (startAt) {
    // how long we have been playing, in seconds
    const inSec = (Date.now() - startAt) / 1000;
    if (playWhileLoadingDuration && inSec >= playWhileLoadingDuration) {
      // the previous buffer has run out: continue from that offset
      playWhileLoading(playWhileLoadingDuration);
      playWhileLoadingDuration = source.buffer.duration;
    }
  } else if (source) {
    // the first chunk has arrived: remember its duration and start playback
    playWhileLoadingDuration = source.buffer.duration;
    startAt = Date.now();
    playWhileLoading();
  }
}, 500);
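For completeness, here is a hypothetical sketch of what playWhileLoading might do, assuming the module-level source and audioContext from the earlier snippets; the real implementation is in the linked repository.

// Sketch: start the current (merged) buffer from the given offset so playback
// continues where the previous source left off (hypothetical helper)
const playWhileLoading = (offsetInSeconds = 0) => {
  source.connect(audioContext.destination);
  // start(when, offset): begin immediately, skipping the part already played
  source.start(0, offsetInSeconds);
};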

How to stream a server mic

As I've already mentioned, a file is not the only possible source of a stream. Another one is, for example, the server's microphone. It isn't used that often, but there are cases for it. For example, I once dealt with streaming sound from a Raspberry Pi and visualizing it on a phone. I just want to emphasize that it's possible and there are libraries for that. One of them is mic.

const mic = require('mic');

const micInstance = mic({ rate: '44100', channels: '2' });
const micInputStream = micInstance.getAudioStream();
micInstance.start();

micInputStream.on('data', function(data) {
  console.log('Received Input Stream: ' + data.length);
});
// or pipe the mic stream into a socket.io-stream, just like the file
const stream = ss.createStream();
micInputStream.pipe(stream);

Conclusion

In this series of articles, I have shown you how to work with sound in JavaScript and how to create and visualize a custom audio player with React and the Web Audio API. If you have any questions, don't hesitate to leave a comment below.

Also, if you are building an app and wondering how much the development process costs in a professional software development company, try out our App Cost Calculator. You'll quickly find out the approximate number of hours (and prices) needed to develop the essential features for your web or mobile app.
