Removing Loudspeaker Group Delay using reverse-IIR filtering

Correct. This technique and FIR have many things in common. They are not exactly the same and the calculation is done differently, so it is good that we are doing this exercise to compare them. Honestly, I only learned about this algorithm about 10 days ago, so I am also still getting to know it.

One area where I assume this technique may be better than FIR is at lower frequencies, where the FIR kernel must be long. At least that is the case when you are implementing a linear phase filter and also need the FIR to modify the amplitude properly: this requires more and more taps as the frequency drops. IIR filtering uses the same structure at any frequency, so it can implement low frequency filters efficiently. The reverse-IIR filter must also have a relatively long impulse, but as with FIR the length can be truncated, here to a power of 2. Since it is an allpass filter I assume that the impulse length will not influence the frequency response, so it may be possible to truncate more with RIIR. A fairer comparison would be the RIIR allpass versus an FIR filter that is only doing delay compensation, i.e. with flat amplitude, but again when the FIR kernel is strongly truncated there will also be frequency response artifacts (errors) created, at least based on my limited experience synthesizing FIR kernels.
 
Well for one, you don't need to use FIR at all. That's a plus in my book!
Wait, it's still FIR as the impulse response has been truncated (and must be windowed, of course). But the processing used is not our simple well-known convolution, folding the signal with the convolution kernel.
The convolution process is what eats FPU power, notably when using direct time-domain convolution**), whose cost literally explodes with longer kernels. Even with the workaround of switching to the frequency domain via FFT, multiplying the spectra and switching back to the time domain via iFFT, the computational effort is large.

**) FWIW, I'm using this algorithm which reduces effort to O(N^log2(3)) as I needed a 100% artifact-free convolution which FFT/iFFT is not (especially not when using only 32bit floating point).
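For anyone who wants to see the two convolution routes side by side, here is a minimal sketch (Python/NumPy, my own illustration, not code from anyone in this thread). The signal and kernel are random placeholder data, and the helper names are made up. In double precision the two routes agree to within rounding error; how large that rounding error gets in 32-bit float is a separate question that this sketch does not settle.

```python
import numpy as np

def direct_convolve(x, h):
    """Time-domain convolution: roughly len(x) * len(h) multiply-adds."""
    return np.convolve(x, h)

def fft_convolve(x, h):
    """Same linear convolution via FFT, spectrum multiply, inverse FFT."""
    n = len(x) + len(h) - 1              # full length of the linear convolution
    nfft = 1 << (n - 1).bit_length()     # next power of two, avoids circular wrap-around
    spectrum = np.fft.rfft(x, nfft) * np.fft.rfft(h, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)            # placeholder signal
h = rng.standard_normal(512)             # placeholder kernel
max_diff = np.max(np.abs(direct_convolve(x, h) - fft_convolve(x, h)))
print(max_diff)                          # size of the rounding difference
```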
 
Below are some runs for a 100Hz LR4 crossover, using a variety of different latencies (impulse lengths) in the reverse-IIR processing on a 96kHz audio signal. You can observe the effect of the latency, producing zero output for the first N samples. Latencies of 2048 and 1024 correctly EQ the delay. Shorter impulse lengths give rise to a glitch for a single sample, but still manage to correct the group delay pretty well overall.

100Hz_LR4_various_RIIR_latencies.png
 
Correct! It's not using ANY convolution at all. The processing is done in what I call "stages". Each stage is like a 2-tap FIR filter in reverse and consists of two equations of the following generalized form:
output_sample = a * input_sample +/- b * another_input_sample + the_input_sample_2^N_ago
The processing consists of calculating the N "stages" one after the other, feeding the output from stage N into stage N-1. Only the one sample from "2^N samples ago", held in a circular buffer, is used in the calculation of each stage. So for each stage there are only 4 multiplies and 4 additions, for a total of 8*N operations IN TOTAL per input sample. There are 2^N samples held in the circular buffer for each stage, and I am listing the number of samples in the longest one, e.g. for 2048 samples N=11.

In a way it is like running only the first stage of a partitioned convolution N times, but never needing to run the longer parts at all.
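To make the stage idea concrete, here is a minimal sketch in Python/NumPy for a single pole. It is my own illustration based on the description above, not the LADSPA plugin code. The function name, the pole value and the stage count are all assumptions. Each stage computes y[n] = p^(2^k) * x[n] + x[n - 2^k], and cascading the stages reproduces the truncated one-pole impulse response played backwards. With a complex pole, the one complex multiply plus one complex add per stage works out to 4 real multiplies and 4 real additions, which is consistent with the count quoted above.

```python
import numpy as np

def reverse_one_pole(x, p, n_stages):
    """Truncated, time-reversed one-pole filter built from n_stages delay stages.

    Cascading the stages gives the impulse response p**(L-1-m), m = 0..L-1,
    with L = 2**n_stages, i.e. the first L samples of p**m in reverse order.
    """
    y = np.asarray(x, dtype=complex)
    for k in range(n_stages):
        d = 2 ** k                                    # delay of this stage in samples
        delayed = np.concatenate([np.zeros(d, dtype=complex), y[:-d]])
        y = (p ** d) * y + delayed                    # one complex multiply, one complex add
    return y

p = 0.9 * np.exp(1j * 0.3)     # hypothetical complex pole inside the unit circle
n_stages = 4                   # impulse length (latency) L = 16 samples
L = 2 ** n_stages
impulse = np.zeros(64)
impulse[0] = 1.0
h_rev = reverse_one_pole(impulse, p, n_stages)[:L]
h_expected = p ** (L - 1 - np.arange(L))   # truncated one-pole response, reversed
print(np.allclose(h_rev, h_expected))
```

The stages are plain linear filters, so in this offline sketch their order does not change the result. A real-time version would keep one circular buffer per stage instead of shifting whole arrays.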
 
As far as I understand it, there should be no difference between standard convolution and the new method, both using the same impulse response. The advantage of convolution is that you can use any kernel directly, without restrictions, whereas the new method seems to put some constraints on it and requires pre-processing.

This is what we can arrive at with rePhase, for a LR4/1kHz @96kHz correction kernel, 192 samples long:
1726301557606.png


With 128 samples, the errors (notably the phase error) get larger and the choice of window has a larger impact:
1726301793792.png
 
Correct! It's not using ANY convolution at all. The processing is done in what I call "stages". Each stage consists of two equations of the following generalized form:

output_sample = a * input_sample +/- b * another_input_sample + the_input_sample_2^N_ago

The processing consists of calculating the N "stages" one after the other, feeding the output from stage N into stage N-1. Only the one sample from "2^N samples ago", held in a circular buffer, is used in the calculation of each stage. So for each stage there are only 4 multiplies and 4 additions, for a total of 8*(N+1) operations IN TOTAL per input sample.
Well this sounds fantastic, almost too good to be true.

But currently I'm lost when it comes to the mentioned preprocessing of the given impulse response. As far as I can see it needs to be given in analytical form and then must be factorized, quoting the paper:
For higher order IIR filters there are various options to break them down into single-pole and c.c.-pole units. One obvious way is straight factorization, another is partial fraction expansion.
That would be a quite severe practical drawback.
Say, we have a composite excess phase from LR4@80Hz + LR4@300 + LR6@3kHz, how do we arrive at the final coefficients for the processing?
 
Let me explain what I mean by "pre-processing":

The crossover filters are changing the amplitude and phase of the input signal. In a loudspeaker crossover, the LP and HP outputs are fed to drivers that convert the electrical signal into pressure waves. These propagate out into air and "sum" together. The crossover is designed so that on some "reference axis" this summation occurs properly. The crossover sum has some group delay response because of non-linear phase response of the "analog style" filters used to create the crossover. This is the group delay response we want to flatten (or linearize the phase of).

There are two ways to do this:
1. Using TWO corrections, one on the LP filter output and one on the HP filter output:
Determine the group delay response for the LP filter, design an allpass filter to have the same group delay response, implement this allpass filter as a reverse-IIR filter and place that filter AFTER the LP filter.
Similarly for the HP filter, determine its group delay response, design another allpass filter to have the same group delay response, and implement this second allpass filter as a reverse-IIR filter and place that filter AFTER the HP filter.
2. Using ONE correction on the input to the crossover:
Determine the group delay response for the CROSSOVER SUM, design a single allpass filter to have the same group delay response, implement this allpass filter as a reverse-IIR filter and place that filter BEFORE the crossover filters.

I am advocating for method #2 because it requires half the number of RIIR filters, but either way could work. When you put the RIIR filter in front of the crossover it is applying an impulse response that only has pre-ringing. It's just the impulse response of a normal allpass filter (which has only post-peak ringing) reversed in time. I call this "pre-distorting" the signal, or "pre-processing" the signal. Take your pick. It's just some word-smithing on my part with no strict meaning per se.
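Here is a small sketch of why the time-reversed allpass cancels the phase. It is my own illustration in Python/NumPy, using a first-order digital allpass with a made-up coefficient rather than a real crossover. Convolving the allpass impulse response with its own time-reversed, truncated copy leaves (very nearly) a single delayed impulse, so the magnitude is untouched and only a pure delay of L-1 samples remains.

```python
import numpy as np

def allpass1_impulse(a, length):
    """Impulse response of A(z) = (a + z^-1) / (1 + a z^-1), with |a| < 1."""
    x = np.zeros(length)
    x[0] = 1.0
    y = np.zeros(length)
    for n in range(length):
        x1 = x[n - 1] if n > 0 else 0.0
        y1 = y[n - 1] if n > 0 else 0.0
        y[n] = a * x[n] + x1 - a * y1     # difference equation of the allpass
    return y

a = -0.9                      # hypothetical allpass coefficient (pole at z = 0.9)
L = 256                       # truncation length, i.e. the latency in samples
h = allpass1_impulse(a, L)    # forward allpass: ringing after the peak only
h_rev = h[::-1]               # reversed copy: ringing before the peak only
combined = np.convolve(h, h_rev)
peak = int(np.argmax(np.abs(combined)))
print(peak, combined[peak])   # position and height of the remaining impulse
```

If L is chosen too short for the pole in question, the tail that gets cut off shows up as a residual error around the main impulse.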

Sorry if that caused some confusion. I could use different terminology if that made things clearer.

What you quoted from the paper, about factorizing higher order transfer functions, is not necessary. That is only if you were trying to design a higher order RIIR filter so that it could be implemented in one step. I have already used the method explicitly shown in the paper for complex conjugate poles (a second order transfer function) and the first part of the paper talks about real poles, which is the case for a first order transfer function. Higher order filters are simply a series cascade of first and second order filters, and the same applies to reverse filters.

When you have what you call a "composite phase response" from multiple crossovers, you just need to synthesize an allpass group delay response that closely approximates the group delay from all of the filters, and then apply those allpass filters in reverse, upstream of all of the crossover filters. I will try to come up with an example of this.
 
Hmm, I think I didn't get my point across:
Determine the group delay response for the CROSSOVER SUM, design a single allpass filter to have the same group delay response, implement this allpass filter as a reverse-IIR filter and place that filter BEFORE the crossover filters.
(Bold mine).
The first two steps are exactly what rePhase does, spitting out an impulse response in numerical form after giving it the XO parameters. But that is already restrictive as it allows only LR-type XO functions. If I have something else, like Butterworth or some hybrid non-textbook filter function (I sometimes use Bu/LR hybrids with constant 60deg phase offset between ways), then I have to use other tools, and in the most general form the excess phase is given as a measured response (properly smoothed etc, of course).

So the general question #1 would be: Is it possible at all with the new method to use arbitrary IR data in numerical form?

And question #2: Let's assume I can create an analytical allpass model (in form of cascaded digital biquads) by curve-fitting to the measured phase response, how would I proceed from there?

The processing itself is simple and I think I fully understood it, but the derivation of the coefficients is not... my math skills are a bit rusty, I will have to add.
 
@KSTR Regarding implementation: you are welcome to use my LADSPA plugin or adapt the code to your own needs and I can give you advice on how to do that. As of today, that is the only implementation of the RIIR algorithm that I am aware of. I'm still cleaning up the code so haven't released it publicly but you can PM me if you want to get something immediately. If you can pipe or route (e.g. ALSA loopback) audio data from one program to another you can use my LADSPA plugin upstream from the software you use to implement your crossovers. It's not optimal but it will work.

Also, if you have a filter kernel as numerical data, I believe you can compute the amplitude and phase from it and then get the group delay from the phase response. I'm not exactly sure what tools you have or use for DSP/filter design so I cannot recommend an exact path to you. Then it is a matter of constructing the all-pass filters to match the group delay response. Maybe others can chime in on software that can do these steps?
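As a rough sketch of that path from numerical kernel to group delay (my own illustration in Python/NumPy, with a made-up function name and a trivial test kernel): take the FFT of the kernel, unwrap the phase, and differentiate with respect to frequency. Where the magnitude is close to zero the phase is unreliable, so a real kernel would need some care in those regions.

```python
import numpy as np

def group_delay_from_kernel(h, nfft=8192):
    """Group delay in samples: -d(phase)/d(omega) of the kernel's frequency response."""
    H = np.fft.rfft(h, nfft)
    omega = 2 * np.pi * np.arange(len(H)) / nfft    # frequency axis in rad/sample
    phase = np.unwrap(np.angle(H))                  # remove the 2*pi jumps
    return omega, -np.gradient(phase, omega)

# Sanity check with a hypothetical kernel: a pure delay of 37 samples
h = np.zeros(128)
h[37] = 1.0
omega, gd = group_delay_from_kernel(h)
print(gd.min(), gd.max())     # both should be very close to 37 samples
```

To get milliseconds, divide the group delay in samples by the sample rate and multiply by 1000. The frequency axis in Hz is omega * fs / (2*pi).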

Of course you could measure the group delay of the working loudspeaker and then design the allpass filters based on that.

I wrote my own tools for filter and crossover design about 10 or 15 years ago. They are available on the web on my website, called ACD Tools. They are some Excel spreadsheets that are linked together. There is a tutorial but the learning curve is not easy. I generate the group delay response in the ACD Tools, so that is what I typically use. It's just designing analog style filters so it's probably not suited to your own needs.
 
Hmm, didn't check the paper or other stuff yet, but as I understood FIR filter has symmetric impulse which means pre-echo, but which cancels out with output from the other device if it's a crossover. On a point source system pre-echo is thus not an issue, but on a stacked multiway the pre-echo would show up vertical off-axis. How about with the RIIR, does it have the pre-echo and trade-off in this sense?
An IIR filter has the ringing after the impulse (not symmetric like that phase linear FIR filter, a minimum phase FIR filter is behaving like IIR), so guess where it lands if you reverse an IIR filter.

PLParEQ was a phase linear EQ that used reverse IIR processing. I've used it for exactly this job in the past. No longer in development though.
 

I looked up PLParEQ (thanks for mentioning it). The description says:
Phase Linear Operation is achieved by processing your sound in both the forward-time and reverse-time directions through classic filters - all in realtime.
So that is an example of forward-backward IIR processing probably using the block processing method. The method I am promoting is superior to that in terms of latency (has lower latency than the block processing method).

PLParEQ only seems to offer linear phase forward-time filters, so you cannot use it to perform group delay correction on other filters or loudspeakers with passive crossovers.
 

True, but I've compared PLParEQ to FIR solutions and came to the conclusion there were no real world differences. So rePhase should be able to get you the same results (as PLParEQ).
This would make for a great A/B/X listening test on the audibility of group delay! Let's see if all of those "linear phase" proponents can actually tell the difference in a blind test...

There's still the room to consider. The room alone will mess up the time results if it isn't accounted for. Removing the room, as with headphones, strips away a large part of the senses we use to evaluate. We feel as well as hear the sounds around us. So better make sure the speaker + room work together.
 
Say, we have a composite excess phase from LR4@80Hz + LR4@300 + LR6@3kHz, how do we arrive at the final coefficients for the processing?
Getting back to your example of a four-way loudspeaker:
The group delay responses of these crossover filters are cumulative, that is they add. The group delay for any LR4 is exactly the same as the group delay of a second order allpass filter with the same Fc and Q=0.707. The LR6 = 3rd order Butterworth Filter * 3rd order Butterworth Filter, so its group delay should be the same as a first order allpass with the same Fc + second order allpass with the same Fc and Q=1.0 (since the 3rd order Butterworth uses these values).
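The two equivalences above can be checked numerically with the normalized analog transfer functions. This is my own sketch in Python/NumPy. Frequency is normalized so the crossover sits at w = 1. For the LR6 case I am assuming the usual practice of inverting one of the two outputs, since with both in the same polarity the sum has a null at the crossover frequency.

```python
import numpy as np

w = np.logspace(-2, 2, 400)       # normalized frequency, crossover at w = 1
s = 1j * w

# LR4 = squared 2nd-order Butterworth; LP + HP sum vs 2nd-order allpass, Q = 0.707
bw2 = s**2 + np.sqrt(2) * s + 1
lr4_sum = 1 / bw2**2 + s**4 / bw2**2
ap2_q0707 = (s**2 - np.sqrt(2) * s + 1) / (s**2 + np.sqrt(2) * s + 1)

# LR6 = squared 3rd-order Butterworth; LP - HP (one output inverted, an assumption)
bw3 = (s + 1) * (s**2 + s + 1)
lr6_sum = 1 / bw3**2 - s**6 / bw3**2
ap1 = (1 - s) / (1 + s)                       # 1st-order allpass, same Fc
ap2_q1 = (s**2 - s + 1) / (s**2 + s + 1)      # 2nd-order allpass, same Fc, Q = 1
ap3 = ap1 * ap2_q1

print(np.allclose(lr4_sum, ap2_q0707), np.allclose(lr6_sum, ap3))
```

Since the complex responses match, the phase and therefore the group delay match as well.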

To EQ the phase from this hypothetical four way loudspeaker you place in series and upstream of any crossover filters the following RIIR filters:
2nd order allpass @ 80Hz, Q=0.707
2nd order allpass @ 300Hz, Q=0.707
3rd order allpass (as described above) with Fc=3kHz.

The latencies from these filters would also add - the overall latency is the sum of the latencies of each filter. That might be something like 1024 + 512 + 128 = 1664 samples @ 96kHz, or 17.3 milliseconds.
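The latency arithmetic, as a tiny Python sketch (the helper name is made up; the sample counts are the ones from the example above):

```python
def latency_ms(latencies_in_samples, sample_rate_hz):
    """Total latency of cascaded stages in milliseconds (the latencies simply add)."""
    return 1000.0 * sum(latencies_in_samples) / sample_rate_hz

stage_latencies = [1024, 512, 128]      # samples, one entry per RIIR filter
total_samples = sum(stage_latencies)
total_ms = latency_ms(stage_latencies, 96_000)
print(total_samples, round(total_ms, 1))
```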
 
Here is an example of a 3-way system, using LR4 crossovers at 300Hz and 2kHz. It's something that might be found on a 3-way speaker project.

As shown in the pic, I processed a 20Hz square wave through both crossovers to produce the signal with group delay shown at the top of the image. Next I used two 2nd-order reverse-IIR allpass filters to "reverse" the group delay, producing the lower signal. On a real system you would place the reverse-IIR stages before the crossover, but the order doesn't matter for this simulation.

Remember that the group delay produced by the sum of an Nth order crossover's LP and HP outputs is the same as the group delay of an allpass filter of order N/2 having the same Q value. That is why I keep running the LR4 - it's an easy example to show complete cancellation of group delay because its GD response is just a 2nd order allpass with Q=0.707.

The latencies produced by the REV-IIR filters were 512 and 64 samples @ 96kHz.


3way_300-2kHz_RIIR_example.png
 
Below are some runs for a 100Hz LR4 crossover, using a variety of different latencies (impulse lengths) in the reverse-IIR processing on a 96kHz audio signal. You can observe the effect of the latency, producing zero output for the first N samples. Latencies of 2048 and 1024 correctly EQ the delay. Shorter impulse lengths give rise to a glitch for a single sample, but still manage to correct the group delay pretty well overall.

Thanks for that!

I compared a standard FIR linear-phase 100Hz LR4, using a 96kHz sample rate and the same set of samples/latencies.
Latencies of 2048 and 1024 looked similarly fine, with slippage vs ideal creeping in at 512, and 256 showing relatively strong slippage.
Both FIR and RIIR seem in line with each other.
 
At a higher level of categorization, I think the techniques for achieving linear-phase/no group delay, fall into two bins:
Done individually per channel, like your #1 way
Or done globally across all channels, like your #2 way.

Under the individual channel category, I use a #3 way (which I've probably talked about way too much), which is simply to use FIR with linear-phase xovers on each individual channel.

I can't really understand why I would use an IIR along with a RIIR when a FIR linear-phase high pass or low pass accomplishes those two steps in one easier-to-implement step, with less room for error.
Another reason I can't see an advantage in using IIR+RIIR vs FIR in individual channel implementations is that, for the same latency and the same number of processing channels, FIR matches the IIR+RIIR response and offers much more capacity for adding additional driver corrections into the FIR file.


With regard to ONE filter applied globally (#2), my experience has been that this works great in 1D electrical space (where it has to, if done correctly).
My attempts to do this on speakers on top of existing processing, both passive and analog active, have been a case of "why bother".
Most of the reports from folks who have also tried to use FIR as a global correction on existing speakers seem to say the same thing.
I think doing this rarely accomplishes more than becoming a fancy EQ, and for global corrections one might as well stay IIR.


The individual channel route has been where I hear and measure clear improvements.

Gonna end this post here for brevity and to keep to one topic. Will offer a FIR comparison to your 3-way next.
Again, thx for this thread...this stuff is near and dear to a lot of my DIY efforts...
 
Here is an example of a 3-way system, using LR4 crossovers at 300Hz and 2kHz. It's something that might be found on a 3-way speaker project.
The latencies produced by the REV-IIR filters were 512 and 64 samples @ 96kHz.
Here's the same electrical 3-way, with LR4 crossovers at 300Hz and 2kHz, using FIR linear-phase xovers. Same 512 sample latency @ 96kHz.
Each section's impulse, with magnitude and phase below.

1726334255450.png


Flat phase of course equaling excellent square waves.
 
There is no magic in signal processing, only math. Basically, it all boils down to:

1) If you follow an IIR filter with a FIR filter that approximates its magnitude and phase response, but reversed in time, you end up with linear phase (must be delayed for causality), and double the filter order. For example, a 4th-order IIR filter treated in this way results in an 8th-order response.

2) If you can approximate the phase response of the forward IIR filter with an allpass FIR filter, and follow the IIR filter with the time-reversed FIR phase approximation, your filter magnitude response remains the same but you get linear phase (must be delayed for causality).

3) The minimum amount of causality delay is determined by the length of the FIR filter.

4) There exist ways to implement FIR convolution that are more efficient than brute-force. BruteFIR comes to mind as an example.
https://torger.se/anders/brutefir.html#whatis

5) If you have full control over the signal path, then it's probably easier just to implement your desired filter as a linear-phase FIR filter. (Symmetry can reduce the number of operations by a significant amount.)

6) If part of your signal path is not in your control, e.g., you want to linearize the phase of a loudspeaker driver, then option #2, above, may be effective.

EDIT: Added symmetry comment to #5.
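For the symmetry remark in #5, here is a minimal sketch in Python/NumPy of how a symmetric (linear-phase) kernel lets you add the two mirrored input samples first and multiply once. The function name and the test kernel are my own choices for illustration. This roughly halves the multiplies compared with brute-force convolution, while the number of additions stays about the same.

```python
import numpy as np

def symmetric_fir(x, h):
    """Filter x with a symmetric (linear-phase) kernel h using about half the multiplies."""
    M = len(h)
    assert np.allclose(h, h[::-1]), "kernel must be symmetric"
    xp = np.concatenate([np.zeros(M - 1), x, np.zeros(M - 1)])   # zero-pad both ends
    n_out = len(x) + M - 1
    y = np.zeros(n_out)
    for k in range(M // 2):
        # taps k and M-1-k share one coefficient: add the two samples, multiply once
        a = xp[M - 1 - k : M - 1 - k + n_out]    # x[n - k]
        b = xp[k : k + n_out]                    # x[n - (M-1-k)]
        y += h[k] * (a + b)
    if M % 2:                                    # odd length: lone centre tap
        c = M // 2
        y += h[c] * xp[M - 1 - c : M - 1 - c + n_out]
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)      # placeholder signal
h = np.hanning(31)                 # placeholder symmetric kernel
y = symmetric_fir(x, h)
print(np.allclose(y, np.convolve(x, h)))
```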
 
I've always embraced IIR filtering. I've never embraced FIR filtering. I think the latter has too many pitfalls; at least I know that there are some that I don't really understand, and that turns me off the technique. I have seen FIR filtering abused and misused by many well-meaning but poorly informed DIYers. IIR filtering, on the other hand, is something that I can understand. I like how lightweight IIR filtering is compared to FIR filtering, and that it works the same at low frequencies or high (apart from very near Nyquist). I don't need to worry about kernels or windowing or any of the tricks that have been developed over time by thousands of practitioners or engineers. IIR just plain works straight out of the box.

Perhaps it is because reverse-IIR filtering is a sort of half-IIR, half-FIR beast, yet still a lightweight means of linearizing phase, that it really appeals to me. It's new and I find the algorithm interesting, so I thought I would share it here. It doesn't bother me that you can achieve the same thing with FIR. Up until now I have been using the trivial case of LR4 crossovers, just because they make for an easy case to talk about and to correct with RIIR filtering. I could certainly offer up a very steep IIR crossover, and show that it can be made linear phase with a few RIIR stages. But this is really not about correcting LR4 crossovers at all. It is about stretching what is possible with IIR filtering. So that is the main message here. If one person out there finds it useful, then my job has been done. Well, one more apart from myself at least.