Fast Sub-band Adaptive Filtering (FSAF) - an introduction

A long time ago, sound experts had to rely on their ears to do their job, which was hard. They wanted a better way to measure speakers and rooms. More advanced math could have helped, but the powerful computers it needed did not exist back then.

In 1985, the Intel 80386 CPU had just been released. Doug Rife founded DRA Labs in 1986 and released MLSSA in 1987. It was software and hardware designed to go hand in hand on the PC: it came with a sound card with a 12-bit ADC that fit into the computer's ISA slot and was capable of sampling in excess of 100 kHz.
(IBM and compatible PCs didn't have ADCs at that time.)

Around the same time, a company in Italy called Audiomatica released Clio, which used its own HR-2000 hardware that plugged into a PC's ISA slot, plus software that ran in DOS.

These hardware/software solutions needed a 32-bit PC, as well as an 80387 math co-processor (sold as an optional extra, which cost US$800 in 1987, or about US$2,000 in today's money), just for MLSSA to run.

At this time computers were in their infancy, but anyone doing any real work had to opt for that math co-processor. When the 80486 CPU arrived in 1989, it had a built-in math co-processor, and a complete PC with monitor and keyboard cost around US$3,000 (US$7,500 in today's money). That sounds like a lot, but it seemed like a bargain compared to the original IBM PC, which had meagre performance.

Moving forward to 2000, Angelo Farina's AES paper discussed the logarithmic sine sweep, aka the log chirp, which borrows from radar technology: the chirp was created by Sidney Darlington in 1947. It was better than the previous method… yet people still said these measurements didn't match what they actually heard.

In the early 2000s, Michael Tsiroulnikov invented a "divide and conquer" method that allowed acousticians to use better math without spending too much money or computing resources, and named it FSAF.

What is FSAF?

Fast Sub-band Adaptive Filtering (FSAF) is a technique used in audio processing to measure the music and separate it from everything else. It is a proven technology that has been tested and used in the real world for acoustic echo cancellation.

Have you experienced this technology before? Chances are you have. Have you ever witnessed someone talking to a smart speaker, saying for example:

“Hey Google, what’s the time?” whilst the smart speaker was still playing music?
How was the microphone able to hear what was said, above all the loud music that was being played?

Here's how it works in the context of listening to, and testing audio devices, like speakers.



1. Splitting the Signal:
FSAF divides the audio signal into smaller frequency bands, called sub-bands. Think of it as dividing a picture into small pieces, like pieces of a jigsaw puzzle. Each sub-band contains a specific range of frequencies, making it easier to analyze.

2. Adaptive Filtering:
The technique uses adaptive filters that can adjust themselves in real-time to changes in the audio signal. This helps in accurately measuring each sub-band.

3. Measuring Distortions:
In each sub-band, the distortions in the signal can be treated as additive noise. Using a mathematical method called Least Squares, we can then estimate the true response of the speaker and the room for that specific sub-band. This estimate gives us a clear picture of how the audio should sound without the added character, effects or noise from the speaker.

4. Combining Results:
After analyzing each sub-band individually, the results are combined to create a full-band response. Think of it like putting all the puzzle pieces back together to see the complete picture. The added character, effects and noise calculated for each sub-band are actually discarded in this step.

5. Full-Band Distortion Measurement:
The full-band response, which is now a more accurate representation of the speaker and room's true sound, is used to identify the character, effects and noise. These distortions are the leftover “residuals” after applying linear filtering.
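
The five steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not Tsiroulnikov's algorithm or REW's implementation: each DFT bin of a short frame stands in for one sub-band, the device is a made-up 3-tap filter with a cubic distortion term (`toy_device`, `h_true` and `k3` are hypothetical names), and filtering wraps around at the frame edges so that the per-band math stays exact.

```python
import cmath
import math
import random

def dft(x, sign=-1):
    """Plain O(N^2) DFT; sign=+1 gives the unscaled inverse transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_filter(x, h):
    """Filter one frame with FIR taps h, wrapping around at the frame edges."""
    n = len(x)
    return [sum(h[j] * x[(t - j) % n] for j in range(len(h))) for t in range(n)]

def toy_device(x, h, k3):
    """Hypothetical device under test: linear FIR response plus cubic distortion."""
    return [v + k3 * v ** 3 for v in circular_filter(x, h)]

def estimate_response(frames_x, frames_y):
    """Steps 1-4: per-band least squares, then recombine into one impulse response."""
    n = len(frames_x[0])
    spectra_x = [dft(f) for f in frames_x]   # step 1: split every frame into bands
    spectra_y = [dft(f) for f in frames_y]
    response = []
    for k in range(n):                       # steps 2-3: one least-squares fit per band
        num = sum(X[k].conjugate() * Y[k] for X, Y in zip(spectra_x, spectra_y))
        den = sum(abs(X[k]) ** 2 for X in spectra_x)
        response.append(num / den)
    # step 4: back to the time domain as a single full-band impulse response
    return [v.real / n for v in dft(response, sign=+1)]

def residual(frames_x, frames_y, h_est):
    """Step 5: what is left of the recording after removing the linear part."""
    out = []
    for fx, fy in zip(frames_x, frames_y):
        linear = circular_filter(fx, h_est)
        out.append([a - b for a, b in zip(fy, linear)])
    return out

def energy(frames):
    return sum(v * v for f in frames for v in f)

random.seed(0)
N, FRAMES = 32, 8                 # toy sizes, chosen only to keep the example fast
h_true = [1.0, 0.5, 0.25]         # assumed linear response of the toy device
frames_x = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(FRAMES)]

# Without distortion, the fit should recover the taps of h_true.
clean = [toy_device(f, h_true, 0.0) for f in frames_x]
h_clean = estimate_response(frames_x, clean)

# With distortion, the residual holds what the linear model cannot explain.
distorted = [toy_device(f, h_true, 0.1) for f in frames_x]
h_est = estimate_response(frames_x, distorted)
ratio = energy(residual(frames_x, distorted, h_est)) / energy(distorted)
print("first taps of clean fit:", [round(v, 6) for v in h_clean[:4]])
print("residual / recording energy:", ratio)
```

One thing the toy makes visible: the part of the cubic term that correlates with the input gets absorbed into the fitted response, so the residual only contains what a linear filter genuinely cannot reproduce.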


Thus, FSAF allows one to play music through a DAC/ADC, an amplifier, or a speaker, record the result, and “subtract” the original input (after the estimated linear response has been applied to it) from the recording.

What you have left over is the “residual” - parts that weren’t in the original audio file.
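
To see why “subtract” is in quotes, here is a rough sketch comparing a naive sample-by-sample subtraction with a subtraction done after fitting a linear model. It is not how FSAF itself does the fit: the linear model is deliberately reduced to a single delay and gain, and `toy_chain` with its `gain`, `delay` and `k2` values is a hypothetical playback chain invented for the example.

```python
import math
import random

def toy_chain(x, gain, delay, k2):
    """Hypothetical playback chain: delay, gain, plus a little second-order distortion."""
    delayed = [0.0] * delay + x[:len(x) - delay]
    return [gain * v + k2 * v * v for v in delayed]

def best_delay(x, y, max_delay):
    """Pick the lag with the largest cross-correlation magnitude between x and y."""
    def xcorr(d):
        return abs(sum(x[t - d] * y[t] for t in range(d, len(y))))
    return max(range(max_delay + 1), key=xcorr)

def fit_gain(x, y, delay):
    """One-tap least squares: the gain g minimizing sum (y[t] - g * x[t - delay])^2."""
    num = sum(x[t - delay] * y[t] for t in range(delay, len(y)))
    den = sum(x[t - delay] ** 2 for t in range(delay, len(y)))
    return num / den

def rms(v):
    return math.sqrt(sum(s * s for s in v) / len(v))

random.seed(1)
x = [random.uniform(-1, 1) for _ in range(2000)]   # stand-in for the music file
y = toy_chain(x, gain=0.5, delay=3, k2=0.02)       # stand-in for the recording

# Naive: subtract the file from the recording directly.
naive = [b - a for a, b in zip(x, y)]

# Fitted: line up and scale the file first, then subtract.
d = best_delay(x, y, max_delay=10)
g = fit_gain(x, y, d)
fitted = [y[t] - g * x[t - d] for t in range(d, len(y))]

print("estimated delay:", d, "estimated gain:", round(g, 3))
print("naive rms:", rms(naive), "fitted rms:", rms(fitted))
```

The naive difference is dominated by the delay and level mismatch, while the fitted residual is left with little more than the distortion term. A real measurement needs a full frequency-dependent response rather than one gain, which is the job of the sub-band fit.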

So what can you expect to hear with FSAF when measuring your speakers?

For starters, yes, you will get your frequency response.

You will also be able to listen to distortion. Harmonic distortion, intermodulation distortion… all audio that isn't the original file being played back. This includes noise, the Barkhausen effect, ringing, echoes… even faint sounds that you have grown accustomed to and that your brain may have tuned out, e.g. cars driving by or the steady tick of the second hand of a distant wall-mounted clock. It's all there.

For a long time FSAF was used in industry for acoustic echo cancellation, but as of June 2024 it has been incorporated into REW 5.40 beta32. It is still in beta testing...
 
I would look at FSAF as primarily a time domain analysis, which can catch or identify things that are not easily found using a frequency domain analysis like a sine sweep, or even a multitone spectrum analysis. As I've mentioned multiple times now, the ability to create audio that contains only the residual distortion + noise in the recording path allows for great insight into loudspeaker performance and analysis. It also allows for measurement with real audio, so you no longer have the problem of using a synthetic signal like a sine wave and then requiring multiple measurements at multiple magnitudes to capture enough detail about what occurs with real audio. You may use your favourite tracks that you've heard 1000 times before; the crest factor benefit here is real. Being able to listen to the distortion residual on its own can make it a lot easier to pick out during normal playback, since you will now know what you are listening for. Those small nuances can be identified a lot more easily this way.
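
For anyone unsure what crest factor means here: it is the ratio of a signal's peak level to its RMS level. A quick sketch, where the `bursty` signal is only a made-up stand-in for music-like material (a quiet tone with occasional full-scale transients), not real audio:

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio of a signal, in decibels."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

RATE, N = 48000, 4800   # 0.1 s at 48 kHz, i.e. ten full periods of a 100 Hz tone
sine = [math.sin(2 * math.pi * 100 * t / RATE) for t in range(N)]

# Hypothetical music-like stand-in: the same tone at low level,
# with one full-scale transient every 480 samples.
bursty = [0.1 * s for s in sine]
for t in range(0, N, 480):
    bursty[t] = 1.0

sine_cf = crest_factor_db(sine)      # about 3.01 dB for any pure sine
bursty_cf = crest_factor_db(bursty)
print(f"sine:   {sine_cf:.2f} dB")
print(f"bursty: {bursty_cf:.2f} dB")
```

A sine is stuck at roughly 3 dB no matter its level, whereas real music typically has a much higher crest factor, so a measurement made with music exercises the speaker with the peak-to-average ratio it actually sees in use.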
 