rePhase, a loudspeaker phase linearization, EQ and FIR filtering tool

Actually at LF, it is simple to get pressure response flat. The smaller the room, the easier it is.

There are 3 regions for EQ of speakers in rooms.
  • At HF, the speaker is sufficiently far from boundaries to enable EQ of the 'anechoic' response. This has been the focus of most Digital EQ efforts in the last decade or so. It is known to give good results.
  • At LF, when the room dimensions are the same as or smaller than the wavelength, the room is a 'point' and very simple EQ gives good results.
  • The difficult part is the 'midrange' where there are significant reflections which can't be distinguished from the direct response but cause complicated response changes.

It is the 3rd region where I believe room & speaker EQ still needs a lot of research.

We know that speakers in anechoic chambers sound terrible. So what are we trying to achieve? With respect to Bohdan and other workers in this field, I think the jury is still out on this question.

I have not done serious work on this for well over a decade so am interested in what others have found. This millennium, the DSP power available to us is far greater than in the early '90s. But what should we be using it for?

Barley, how different are the EQs for the 2 channels on your final system? Can you post a curve(s)?

You mean you can get linear phase interconnects? :eek:

All the interconnects I've measured have been Minimum Phase. They satisfy one of the necessary AND sufficient criteria.

How do you measure your interconnects?

Even when room is point phase rolls off. Room acts as large closed coupler; when doing calibration work coupler is chosen with dimensions <1/6th wavelengths studied for flattening phase response.
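As a rough worked example of the 1/6th-wavelength rule of thumb mentioned above, the sketch below computes the frequency below which a given dimension stays under one sixth of a wavelength. The speed of sound (343 m/s) and the example dimensions are assumptions for illustration only.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degC

def coupler_limit_hz(largest_dimension_m, fraction=1 / 6):
    """Frequency below which the dimension is shorter than `fraction` of a wavelength."""
    # dimension < fraction * wavelength  <=>  f < fraction * c / dimension
    return fraction * SPEED_OF_SOUND / largest_dimension_m

for dim in (3.0, 5.0, 8.0):  # hypothetical room dimensions in metres
    print(f"{dim:.1f} m -> below {coupler_limit_hz(dim):.1f} Hz")
```

For a 5 m dimension this comes out at roughly 11 Hz, which shows how low the frequency must be before a typical room meets the criterion.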

Above sub as band pass system passes fundamental and three next terms of square wave at 15Hz resulting in recognizable square wave:

[Attachment: sq 15Hz.gif]

Going down to 8Hz more terms are passed and square form is still fairly good:

[Attachment: sq 8Hz.gif]

Phases of output components remain largely coherent to source stimulus.
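To see why a handful of phase-coherent terms already looks like a square wave, here is a minimal sketch that sums the fundamental and the next three odd harmonics of an ideal square wave. It assumes "terms" means the odd harmonics of the Fourier series (15, 45, 75 and 105 Hz for a 15 Hz square wave); the sample rate is an arbitrary choice and nothing here models the actual sub.

```python
import numpy as np

def bandlimited_square(f0, n_terms, fs=48000):
    """Sum the first n_terms odd harmonics of a unit square wave at f0 Hz (two periods)."""
    t = np.arange(int(round(2 * fs / f0))) / fs
    y = np.zeros_like(t)
    for k in range(n_terms):
        n = 2 * k + 1                      # odd harmonic number: 1, 3, 5, ...
        y += (4 / np.pi) * np.sin(2 * np.pi * n * f0 * t) / n
    return t, y

t, y = bandlimited_square(15.0, 4)         # 15, 45, 75 and 105 Hz
```

The waveform only keeps its flat-topped shape if the harmonics keep their relative phase; shifting the phase of any one term changes the shape even though the amplitude spectrum is unchanged.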

Above is obtainable at >2m. For listener, sweet spot at these frequencies is big.

I think Bohdan and my goals are fairly clear: Improving temporal fidelity.



Integrated with two-way that does square waves <60Hz, result is phase coherent system all the way to HF cut off. Midrange difficulty averted; this of course is by thinking along lines of JohnK concerning listener and source transfer in producing measurements for correction, and manner of integrating components together.

I've no idea how long it would take to get this result with rePhase, if possible at all. Prior to using Kirkeby I've done lots of iterative tweaking with individual filters as found in rePhase, and typical of boxes such as DCX2496, flattening responses and optimizing delays, and know it is easy to blow a lot of time, and get nowhere near the results obtained with mathematical inversion.
 
Prior to using Kirkeby I've done lots of iterative tweaking with individual filters as found in rePhase, and typical of boxes such as DCX2496, flattening responses and optimizing delays, and know it is easy to blow a lot of time, and get nowhere near the results obtained with mathematical inversion.

This I'm very interested in. In this document,

http://pcfarina.eng.unipr.it/Public/Presentations/audioprecision-workshop.pdf

the Kirkeby transform is described as:

1) The IR to be inverted is FFT transformed to frequency domain: H(f) = FFT [h(t)]
2) The computation of the inverse filter is done in frequency domain: [formula not copied], where ε(f) is a small, frequency-dependent regularization parameter
3) Finally, an IFFT brings back the inverse filter to time domain: c(t) = IFFT [C(f)]

(plus a couple of formulae that haven't copied across)
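For readers who want to try the three steps, here is a minimal NumPy sketch. It assumes the missing formula is the commonly published form of Kirkeby regularisation, C(f) = Conj[H(f)] / (Conj[H(f)]·H(f) + ε(f)); the function name, the test impulse response and the ε value are illustrative choices, not taken from the linked slides.

```python
import numpy as np

def kirkeby_inverse(h, eps, n_fft=None):
    """Regularised inverse of impulse response h (assumed Kirkeby form)."""
    h = np.asarray(h, dtype=float)
    if n_fft is None:
        n_fft = len(h)
    H = np.fft.rfft(h, n_fft)                        # step 1: H(f) = FFT[h(t)]
    eps = np.broadcast_to(np.asarray(eps, dtype=float), H.shape)
    C = np.conj(H) / (np.conj(H) * H + eps)          # step 2: regularised inversion
    return np.fft.irfft(C, n_fft)                    # step 3: c(t) = IFFT[C(f)]

# Toy example: direct sound plus a half-amplitude echo one sample later.
n = 256
h = np.zeros(n)
h[0], h[1] = 1.0, 0.5
c = kirkeby_inverse(h, eps=1e-9)

# Circular convolution of h with its inverse should be close to a unit impulse.
check = np.fft.irfft(np.fft.rfft(h) * np.fft.rfft(c), n)
```

With ε close to zero this is plain spectral division; raising ε at a given frequency pushes the filter gain there towards zero instead of towards 1/H(f), which is what limits the boost at the band extremes.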

It is supposedly an automatic way to invert the impulse response without excessive gain at the driver's frequency extremities. Does this kind of thing work even if the system is non-minimum phase?

(I ask, because complete inversion of the impulse response does actual magic, doesn't it? A time domain echo can be cancelled out with an appropriately-timed further impulse from the speaker, and then the first cancellation impulse is cancelled out with another impulse, and then that impulse is cancelled out with a further impulse, and so on. In a non-minimum phase system it is possible for this to go unstable if the impulses are not diminishing - I think that's the general idea.)
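The chain of cancelling impulses described above can be written down for the simplest case, a direct sound plus one echo of relative amplitude a arriving d samples later. The exact inverse is then the series 1, -a, a², -a³, ... spaced d samples apart. The sketch below (names and numbers are illustrative assumptions) truncates that series and reports the largest leftover term.

```python
import numpy as np

def echo_canceller(a, d, n_terms):
    """First n_terms of the inverse of h = delta[n] + a*delta[n-d]."""
    c = np.zeros((n_terms - 1) * d + 1)
    c[::d] = (-a) ** np.arange(n_terms)    # 1, -a, a^2, -a^3, ... every d samples
    return c

def residual(a, d, n_terms):
    """Largest error left after applying the truncated canceller."""
    h = np.zeros(d + 1)
    h[0], h[d] = 1.0, a
    out = np.convolve(h, echo_canceller(a, d, n_terms))
    out[0] -= 1.0                          # remove the wanted unit impulse
    return np.max(np.abs(out))

print(residual(0.5, 7, 10))   # echo weaker than direct sound: terms die away
print(residual(1.2, 7, 10))   # echo stronger than direct sound: terms grow
```

The leftover term has magnitude |a| raised to the number of terms, so the series only converges when the echo is weaker than the direct sound. That is the minimum-phase case; with |a| > 1 the causal inverse grows without bound, which is the instability the question refers to.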

@Barleywater, when using the Kirkeby transform, what do you personally use for ε(f) for the various drivers you're testing?
 
I've no idea how long it would take to get this result with rePhase, if possible at all. Prior to using Kirkeby I've done lots of iterative tweaking with individual filters as found in rePhase, and typical of boxes such as DCX2496, flattening responses and optimizing delays, and know it is easy to blow a lot of time, and get nowhere near the results obtained with mathematical inversion.

It will take some time with current version of rePhase because you will have a lot of correct/convolve/measure iterations to do (or simpler correct/convolve if using the C=A*B thingy in HOLM or REW), but it will soon be easier when loading measurements will be available in a next version (soon to be released... :wchair: ).

If doing a measurement inversion is your goal then DRC-FIR is probably the best tool to use (with a given target curve) as it will automatically take care of a lot of potential pitfalls related to frequency variable windowing and correction limits (or PORC for MP correction and then textbook FIR filters...).
But even with these precautions this kind of automated correction requires a lot of care and knowledge in the measurement procedure.
In all cases, inversion based on a single measurement will likely not be valid for all frequencies, nor for all listening positions, and can cause problems greater than the ones it corrects in some situations...

I am sure you know how to do it properly of course (even without DRC-FIR) but for most users I find that these automated approaches often promise a lot of things but give disappointing results to the unsuspecting user.
In this regard I find manual corrections to be a much more coherent and safer approach for most situations. It also forces the user to really ponder each correction and know exactly what he is doing, and I think it is a good thing...

Anyway, rePhase is focused on manual corrections, so this inversion discussion is quite OT :p
 
In all cases, inversion based on a single measurement will likely not be valid for all frequencies, nor for all listening positions, and can cause problems greater than the ones it corrects in some situations...

Could you elaborate on that further? I would be most interested in correcting a driver from a nearfield measurement, I think. Would this be so close to a minimum phase system, that complete impulse response inversion would be equivalent to hand-tailored rePhase EQ, anyway?

Yes, DRC seems to be the way for room correction (but I have never yet got good results with it - my own shortcomings for sure, not the program's).
 
Could you elaborate on that further? I would be most interested in correcting a driver from a nearfield measurement, I think. Would this be so close to a minimum phase system, that complete impulse response inversion would be equivalent to hand-tailored rePhase EQ, anyway?
For narrow bands it should be quite easy, as it would just be a matter of finding the right measurement technique (including impulse post processing) and distance.
But some frequency bands are more difficult than others, like midbass, where the room starts to dominate, and the baffle also typically starts not to be large enough to restrict the radiation to <180°...
In such situations the only practical solution is to average measurements taken at several positions in the listening "window", as anechoic measurements would not really help even if they could be achieved.
Low frequencies are much easier (close mic), but then it is quite meaningless in a room anyway...
The high frequency driver can be accurately dealt with using an anechoic measurement (windowed), but then if it is not a directive device (and even then) you will have to deal with diffraction artifacts that are not stable with measurement position.

A brutal impulse response inversion, without any processing done to the impulse, would be the worst case.
When you see perfectly linear amplitude (and phase) responses you know that it was measured at exactly the same position as the correction was calculated from. Move the mic by XX centimeters and guess what happens...

Yes, DRC seems to be the way for room correction (but I have never yet got good results with it - my own shortcomings for sure, not the program's).
Room is part of the system anyway, and automated frequency-dependent windows are the best way to deal with its effect.
Jean-Luc Ohl, the author of Align2 (Jlo on this forum) is working on a multi-point version of Align2 that will use DRC-FIR. I bet it will give very nice results.
 
A brutal impulse response inversion, without any processing done to the impulse, would be the worst case.
When you see perfectly linear amplitude (and phase) responses you know that it was measured at exactly the same position as the correction was calculated from. Move the mic by XX centimeters and guess what happens...
... In this regard I find manual corrections to be a much more coherent and safer approach for most situations. It also forces the user to really ponder each correction and know exactly what he is doing, and I think it is a good thing...
I second all that ;)

pos, you are right that the midrange is the most difficult and I'm still not sure what is the best strategy for it. Averaging over an area is somewhat naive but I haven't got a better plan. :mad:

What I DO know is that you don't want to achieve 'anechoic' results.

Kirkeby

I've never used Kirkeby regularisation, Angelo Farina's favourite method, for speakers or rooms. But I have a lot of experience of it for microphones.

It's really a method to avoid trying to EQ 30dB dips in response. I was at the previous millennium IoA conference where it was first proposed.

Prof. Farina has an AGM/DPA4 soundfield microphone. He has never been able to get good EQ for it using Kirkeby regularisation. (A soundfield microphone has substantial EQ). In 2008, he visited me in Cooktown and brought his DPA4.

I used old-fashioned techniques, slightly updated with 21st century digits, to devise an IIR EQ for him and for the first time, he got good sound from the mike. :)

He also thinks IIRs are evil while I think FIRs are evil. So I devised some evil FIRs for him based on my evil IIRs so everyone was happy. :D
________________

Coppertop, you are right about the cancelling of echoes. I first did this in the late '70s ("Use of Tapped Delay Lines in Speaker Work"), well before supa dupa digits were easily available.

The stability criterion is a bit complicated. IIRC, the best explanation is "Invertibility of a Room Impulse Response - Neely" or something. It's not in AES.
________________

Barley, I'm still interested in the Left & Right EQs for your system. Can you post a curve or two?
 
@Barleywater, when using the Kirkeby transform, what do you personally use for ε(f) for the various drivers you're testing?

It is best to experiment and learn.

Pos: Sourceforge DRC is clogged up with author's prescriptions for all the over-bandied words concerning linear phase: pre-ringing. Years ago I started using computer as many do, looking to assist in making passive crossovers behave better. Along the way Rod Elliott, Linkwitz, and Smith's dspguide.com got me going with DSP, after having already explored the more technical side of Cool Edit Pro to point of using own band pass bursts for speaker measurements. This led to DCX2496, and continued study. I waded into muck of DRC, and already knew enough that the various published results, and all the tweaks and iterations of practitioners, were fishy. I finally brushed up with Farina's various papers, and got to bending head around swept sine measurements, and saw Kirkeby in:

http://pcfarina.eng.unipr.it/Public/Papers/226-AES122.pdf

From here I explored Farina's Aurora plugins, and also based on Farina's descriptions built swept sine pairs for measurements using his time domain approach. This always leads to measurement system with windowed sinc function representing the bandwidth, and pre-ringing as viewed in time domain. Farina's plugins also provided my first access to MLS, and I puzzled over how it does DC to Nyquist.

DRC package uses measure like Farina's, and author's awareness failed to address this. They are not alone; other packages use this. Holm, Audiolense, REW, and others use better method, by building swept sine inverse in frequency domain, and using FFT to get time domain, but this is like Kirkeby.

Examination of DRC source code shows a section with Kirkeby name. It's the nuts and bolts behind the curtain. Author's insisting on listening position solutions without full understanding of reciprocity led to lots of kludges to get around kgrlee's aforementioned "midrange" problem.

The whole correct-the-listening-position thing led to variety of "frequency dependent windowing" approaches, making some ease of application, and lots of sacrifice to potential.

Bodzio UE describes merging nearfield and far field measurements to get around the midrange problem; this is JohnK's baby. Still, in terms of reciprocity something doesn't jibe quite right in my mind.

And you point to it again here with near field for woofer, and gated far field for >200Hz.

I go other way, near as possible for driver array integration of mains, and listening position for woofer, because as mentioned in recent posts, room becomes like point with decreasing frequency, and near field/far field distinction becomes lost with the narrowing phase margin.

All modes for room have origin at speaker. When measurement microphone placed outside of basis mode about speaker, locally applicable solution is obtained, often with "head in vise sweet spot". With microphone within 1/4 wave of speaker for all frequencies of interest, inverse is correct solution for modal peaks of given frequency throughout room... With microphone within inches of speaker, sonogram still shows distinct reflections within room, and often from within speaker enclosure.

Mantra: Linear time invariant system is periodic and has an inverse.

Kirkeby may be used to generate inverse for swept sine measurements that lets swept sine effectively measure DC to Nyquist. Kirkeby may be applied to MLS to generate inverse too. Extremely powerful....solution of millions of floating point samples. 128bit encryption pales in comparison.

Convolve a piece of music with MLS signal and it sounds like noise. Convolve again with time reversed MLS, and out comes music, with lead in and lead out, not magic, information science.
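The MLS scramble-and-recover idea can be checked numerically. The sketch below is a simplified illustration, not the procedure described in the post: it uses circular convolution on one MLS period (so there is no lead in or lead out), a random vector as a stand-in for the music, and `scipy.signal.max_len_seq` to generate the sequence.

```python
import numpy as np
from scipy.signal import max_len_seq

def circ_conv(x, y):
    """Circular convolution of two equal-length real signals via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)))

nbits = 10
mls = 2.0 * max_len_seq(nbits)[0] - 1.0      # map {0, 1} to {-1, +1}, length 2**nbits - 1
n = len(mls)

rng = np.random.default_rng(0)
music = rng.standard_normal(n)               # stand-in for a block of music

scrambled = circ_conv(music, mls)            # sounds like noise
mls_reversed = np.roll(mls[::-1], 1)         # circular time reversal: m[(-k) mod n]
recovered = circ_conv(scrambled, mls_reversed)

# The circular autocorrelation of an MLS is n at lag 0 and -1 elsewhere, so the
# result is the input scaled by (n + 1) minus a constant offset.
restored = (recovered + music.sum()) / (n + 1)
```

The offset term is the sum of the input block (subtracted from every sample); for a signal with no DC content it vanishes and the recovered block is simply the original scaled by n + 1.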

In a transmission system with periodic IR, if it is sufficiently linear, then the mantra applies. For sampled system with given time bandwidth, all samples define arbitrary impulse response, thus possibility of IR recovery of same result using swept sine, MLS, and potentially any broad band signal.
 
Barley, I'm still interested in the Left & Right EQs for your system. Can you post a curve or two?


My room, as most, has issues, and to keep perspective, I use same correction for left and right, and usually sum for sub. My goal is getting original signals into room, and letting it behave as such. Good room placement of corrected speaker yields excellent sound stage, and spectral content of reflections correlates well, leaving mind to concentrate on direct sound.
 
... Still, in terms of reciprocity something doesn't jibe quite right in my mind.
I don't understand your use of the word reciprocity. What does it have to do with measurement?

Mantra: Linear time invariant system is periodic and has an inverse.
Can you explain this mantra? It is a VERY strange LTI system that (has a) periodic (impulse response?)

Kirkeby may be used to generate inverse for swept sine measurements that lets swept sine effectively measure DC to Nyquist. Kirkeby may be applied to MLS to generate inverse too.
I'm not sure why you need Kirkeby to measure a system. It is an EQ method. In fact as you say
..thus possibility of IR recovery of same result using swept sine, MLS, and potentially any broad band signal.
No need for Kirkeby to do this.
__________________

My room, as most, has issues, and to keep perspective, I use same correction for left and right,
I sorta suspected this from my own experience. Can you post the 2 frequency responses for left & right at your listening position?

Did you try getting both left & right 'perfect' using different EQ for the 2 channels? What did it sound like?
 
My intention is not to correct listening position, but speaker. Idea is to get original waveform into room. Entry point is not the listening position. A soloist performs in front of listener, and of course room affects sound. Soloists sound bad in anechoic space too.

Reciprocity: Same result would be obtained if microphone and speaker position are swapped. True for omni speaker and microphone, thus my interest in Pluto type speaker.

LTI has same output for same input every time. As length of stimulus signal is shortened to single sample, output is impulse response of system.

You don't need Kirkeby to measure system, it allows you to build convolution pairs that produce nearly perfect correlation result for desired number of samples. When this sample period is greater than a linear time invariant system's impulse response, the impulse response of the system is returned without aliasing.

Correction EQ with individual filters for each peak and dip requires best fit methods with iteration. Kirkeby inverse is simultaneous solution for phase and amplitude of all frequency bins in FFT of time domain IR, and happens to be great correction EQ.

If I didn't think sound at listening position is fantastic, I wouldn't be sharing this. My hope is that scientifically minded individual would attempt to duplicate my results so comparison and discussion of results are possible.
 
My intention is not to correct listening position, but speaker. Idea is to get original waveform into room. Entry point is not the listening position.
Can you clarify what you are doing?

Are you EQing the 'anechoic' response of the speaker using Kirkeby inversion?

Is this 'anechoic perfect' speaker then placed in your room with no further efforts to compensate for the room?
 
Hi,

Some comments on Kirkeby and Inverse HBT.

Kirkeby inverse equalization method involves frequency-dependent regularization parameter E(freq). The role of E(freq) is to maintain the inverse filter as calculated, but only within the frequency band of interest. Outside this frequency band, the E(freq) must have such value, that it extinguishes the inverting filter’s operation.

The effect of convolving loudspeaker response with such inverted filter produces flat response within the frequency band of interest, and returns to the original frequency response outside this band.

In order to accomplish this, the E(freq) function obviously must include transition bands on both sides of the frequency band of interest. There is possibly an infinite number of ways that E(freq) may transit from one level to another. Farina proposed logarithmic transition over 1/3 octave in AES convention paper “Implementation of a double StereoDipole system on a DSP board – Experimental validation and subjective evaluation inside a car cockpit”. Kirkeby in his AES Preprint 4916 gave some hints, but nothing specific.

The point here is, that in any case, within the transition bands, the E(freq) is a manually imposed function, that has nothing to do with the original loudspeaker frequency response.
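To make the "manually imposed" nature of the transition bands concrete, here is a minimal sketch of one possible E(freq). It assumes a small value inside the band, a large value outside, and a ramp that is linear on log-log axes over 1/3 octave on each side. The band edges, the two levels and the exact ramp shape are all illustrative assumptions and are not taken from the Farina or Kirkeby papers.

```python
import numpy as np

def regularisation(freqs, f_lo, f_hi, eps_in=1e-4, eps_out=1.0, width_oct=1 / 3):
    """Frequency-dependent regularisation with log-log ramps outside [f_lo, f_hi]."""
    freqs = np.asarray(freqs, dtype=float)
    w = np.ones(freqs.shape)                 # 1 = fully outside the band (also used at DC)
    pos = freqs > 0
    f = freqs[pos]
    below = np.log2(f_lo / f) / width_oct    # distance below f_lo in units of the ramp width
    above = np.log2(f / f_hi) / width_oct    # distance above f_hi in units of the ramp width
    w[pos] = np.clip(np.maximum(below, above), 0.0, 1.0)
    return eps_in * (eps_out / eps_in) ** w  # geometric interpolation between the two levels

test_freqs = [0.0, 100 * 2 ** (-1 / 6), 1000.0, 5000 * 2 ** (1 / 3)]
eps = regularisation(test_freqs, f_lo=100.0, f_hi=5000.0)
```

Changing `width_oct` or the two levels changes the filter inside the transition bands without any reference to the measured loudspeaker response, which is the point being made above.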


In order to understand Inverted HBT difference, please have a look into http://www.bodziosoftware.com.au/Square_Wave.pdf and focus on Figure 5.

The dark-blue curve is the inverted filter SPL from 91Hz to 5220Hz. The orange curve is the corresponding inverted phase response. But something interesting happens on the high-side of 5220Hz and low-side of 91Hz.

In both instances, the phase response tapers off to 0deg (as you would expect). The shape of the phase response within both transition bands is mathematically related to (calculated from) where the original loudspeaker frequency/phase response was at the transition points, and also includes what happens on both sides of the transition points. So, at 91Hz, the phase response tapers off between 100Hz and 40Hz (more than 1 octave). Above 5220Hz, the phase response tapers off all the way to 50kHz (2.2 octaves). All this is based on the original loudspeaker frequency response, and is not an arbitrarily selected transition function.

This is the power and mathematical elegance of Inverted HBT. It gives you mathematically correct amplitude/phase response of the inversion filter across the whole bandwidth.

Best Regards,
Bohdan
 
Some comments on Kirkeby and Inverse HBT.

... In order to understand Inverted HBT difference, please have a look into http://www.bodziosoftware.com.au/Square_Wave.pdf and focus on Figure 5.
... This is the power and mathematical elegance of Inverted HBT. It gives you mathematically correct amplitude/phase response of the inversion filter across the whole bandwidth.
Bohdan, if I understand you correctly, you
  1. select a bandwidth to EQ and straighten the amplitude response outside these limits.
  2. apply the Hilbert Transform to this 'truncated' (in freq. domain) amplitude response to get phase. But this is also called getting the Minimum Phase !.
  3. You invert (in complex freq) this Minimum Phase response to get your EQ
  4. This leaves you with an amplitude response which is flat in the passband but exhibits the usual Minimum Phase behaviour. ie a non-flat (but minimum) phase response

  5. You take this non-flat phase response and use it to make an all-pass network with the same phase response but flat amplitude
  6. You take this resultant impulse response and time reverse it to get a further EQ which when applied to the Minimum Phase EQ in 2 turns it into a Linear Phase EQ
 
Not Bohdan, but to answer your question, that is basically the idea, but since it is all done with FIR filters the minimum phase EQ and the phase linearization are both done with a single convolution in the frequency domain.
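Steps 2 and 3 of that list (getting the minimum phase from an amplitude response, then inverting it) can be sketched with the standard cepstral method, which is equivalent to taking the Hilbert transform of the log magnitude. This is only an illustration of the general technique and makes no claim about how the UE implements it; the function name and the toy filter are assumptions.

```python
import numpy as np

def minimum_phase_spectrum(magnitude):
    """Minimum-phase complex spectrum from a full-length FFT magnitude.

    `magnitude` must have even length, be symmetric like the FFT of a real
    signal, and be strictly positive (the log is taken).
    """
    n = len(magnitude)
    cepstrum = np.fft.ifft(np.log(magnitude)).real
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1:n // 2] = 2.0                     # fold the anti-causal half onto the causal half
    fold[n // 2] = 1.0
    return np.exp(np.fft.fft(cepstrum * fold))

# Toy check: h = [1, 0.5] is already minimum phase, so its phase should be recovered.
n = 1024
H = np.fft.fft([1.0, 0.5], n)
H_min = minimum_phase_spectrum(np.abs(H))
eq = 1.0 / H_min                             # step 3: the minimum-phase EQ
```

The remaining steps of the list then deal with what is left after this EQ has flattened the amplitude, namely the residual phase of the equalised system.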
 
Can you clarify what you are doing?

Are you EQing the 'anechoic' response of the speaker using Kirkeby inversion?

Is this 'anechoic perfect' speaker then placed in your room with no further efforts to compensate for the room?

Direct response is what human hearing uses for direction and timbral identification.



What I DO know is that you don't want to achieve 'anechoic' results.


No, I don't listen to live music in anechoic venue, and I don't listen in anechoic chamber at home.



At what point, and under what conditions, is a speaker supposed to reproduce source waveform?
 
You cannot measure in anechoic situation at all frequencies in a room anyway, so you have to deal with multiple measurement methods for different frequency ranges, and deal with the different shortcomings...

The main gripe I have with the automated inversion methods leading to a completely flat response (correcting even high Q disparities) as shown above, is that they rely entirely on the accuracy of the (single) measurement.
If the measurement is not spatially averaged over at least a small angle around the listening axis it will have a lot of artifacts specific to its exact position, such as diffraction for example (coming from objects as well as from the box, or the driver frame itself).
"Correcting" these artifacts is not a good idea IMHO...
I would be curious to see the look of the perfectly flat corrected curve with another mic position than the one used to construct the correction.

For automated corrections I find DRC-FIR, or PORC (using a finite number of biquads) much more "stable" approaches.
And I even prefer the manual "full-fledged" approach :D (which is the topic of that thread ;) )
 
Hi kgrlee,

Yes, as John commented, this is essentially the gist of Inverted HBT.
Please note, that this approach offers you the option of running your equalized system (loudspeaker+equalizer) in minimum-phase or linear-phase modes, and outside the equalized bandwidth, the whole system is still clearly defined as linear-phase or minimum-phase, depending on your choice of phase characteristics.

Now, back to Kirkeby. I do not own a device, that is based on this algorithm, therefore I am struggling to understand the implications of their method.


  • When you create a system (loudspeaker+Kirkeby equalizer) to equalize the loudspeaker between F1 and F2 – is the system between F1 and F2 “linear phase”, or “minimum phase”?


  • As I understand, such system has transition regions below F1 and above F2. So my next question is: within the transition regions, is the system “minimum-phase”, or “linear-phase” or “undetermined-phase”?


  • Do the transition regions fall into the audio bandwidth for any driver? If yes, are they audible? How do you deal with this issue?


  • Can you run Kirkeby equalizer in minimum-phase mode, or does it impose linear-phase mode straight away?

Can anybody answer these questions?

Best Regards,
Bohdan
 
Bohdan,

Just to make things perfectly clear. The UE allows the system to run in linear phase mode, or with just minimum phase EQ applied. When just the EQ is applied the system phase remains essentially that of the specified crossover. It will reduce to a minimum phase system only if the crossover does not introduce its own nonlinear phase, such as a 1st order acoustic crossover. I just wanted to make that clear.

POS, while I understand what you are saying about equalizing flat based on a single measurement, what happens off axis is largely dependent on the design of the speaker. If constant directivity is maintained off axis then the single point reference for eq is actually pretty good. If you don't want to EQ every small ripple you can start with a smoothed response, like 1/3 octave smoothing applied to the reference measurement.

In any event it is still possible to take any number of measurements and average them with the UE. The danger with that approach is that such an averaged response can actually result in a reference response that is worse than the measurement at any of the points contributing to the average. This is because you are not dealing with a simple average of amplitude but of a vector sum. This becomes very sensitive at higher frequency where the averaged response may exhibit comb-filtering effects.
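The difference between a vector average and an amplitude average is easy to show with two idealised measurements that are identical apart from a small path-length difference. In the sketch below both "measurements" are perfectly flat; the sample rate and the 8-sample offset (about 5.7 cm at an assumed 343 m/s) are arbitrary illustrative values.

```python
import numpy as np

fs = 48000
n = 4096
freqs = np.fft.rfftfreq(n, 1 / fs)

def delayed_flat_response(delay_samples):
    """Complex response of a perfectly flat system with a pure delay."""
    return np.exp(-2j * np.pi * freqs * delay_samples / fs)

a = delayed_flat_response(0)      # mic position 1
b = delayed_flat_response(8)      # mic position 2, 8 samples further away

vector_avg = np.abs((a + b) / 2)              # complex (vector) average
magnitude_avg = (np.abs(a) + np.abs(b)) / 2   # average of amplitudes only

first_null_hz = fs / (2 * 8)      # 3000 Hz: the two responses are in antiphase here
```

The amplitude average stays flat, while the vector average follows |cos(π·f·τ)| and drops to zero at 3 kHz, so an inverse built from it would try to fill a notch that exists in neither measurement.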