A convolution based alternative to electrical loudspeaker correction networks

Impulse response correction can certainly achieve this, but I think it generally requires a higher degree of resolution which carries an increased risk of artifacts away from the measurement point.
I think that is one of the main reasons I find that impulse response averaging works better when the room is fairly live: you can use a longer window without getting the artefacts. What is left is the speaker and the persistent room contribution. A single point measurement used as the basis of my current correction does not sound good to me.

I do hope to get some feedback on this..
Why do you think a single point measurement is needed for your new correction?

Are you referring to the number of cycles of correction, and if so, how many do you think are required for an improved impulse response? Does the number of taps used have an effect?
The number of taps sets the overall length of the filter (in time) and the resolution of the filter at low frequencies. A 65,536 tap filter is usually enough at a 44.1 kHz sample rate. A lower tap count will make the correction less fine grained at low frequencies, which may or may not matter depending on the window lengths and other parameters. If the filter is 1 second long and the impulse is centred, you can correct up to a 500 ms window of time. So the two are interlinked. The downside to a longer filter is latency. You could also have a filter with lots of taps for high frequency resolution but low latency if you changed the impulse centre and used minimum phase filtering.
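To put rough numbers on the tap count discussion, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the "half the filter length" rule simply restates the centred-impulse case described above.

```python
def filter_duration_s(taps: int, fs: float) -> float:
    """Length of the FIR filter in seconds."""
    return taps / fs


def bin_spacing_hz(taps: int, fs: float) -> float:
    """Spacing between frequency points the filter can resolve (fs / taps)."""
    return fs / taps


def centred_window_s(taps: int, fs: float) -> float:
    """Time available on each side of the impulse when it sits in the middle."""
    return filter_duration_s(taps, fs) / 2.0


if __name__ == "__main__":
    fs = 44100.0
    for taps in (8192, 65536):
        print(taps, "taps:",
              round(filter_duration_s(taps, fs), 3), "s long,",
              round(bin_spacing_hz(taps, fs), 3), "Hz per bin,",
              round(centred_window_s(taps, fs) * 1000.0, 1), "ms per side")
```

With 65,536 taps at 44.1 kHz the filter is about 1.49 s long, so a centred impulse leaves roughly 743 ms on each side.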
 
Hi, Spartacus:

When I used the term "resolution" I was referring to how smooth or detailed the response of the correction filter itself is. And yes, a more detailed response in the frequency domain means more resonance or time domain ringing - hence more freq cycles/periods the filter is potentially correcting over. Since impulse response inversion treats peaks and dips equally, achieving a flat freq response envelope will result in a more detailed filter than psychoacoustic based EQ.

BTW, I don't mean to imply that I feel the goal for everyone should be a flat freq response envelope or that psychoacoustic EQ is automatically superior to impulse response inversion. Response dips, while they may not "color" the timbre of a sound to the degree that peaks will, may still carry information about location, and at the very least can skew phase response. If someone has a highly unsymmetrical speaker/room setup, and is mainly concerned about the response at the sweet spot, then they may be better off with impulse response inversion. I haven't experimented with such a situation personally.

As for a sufficient resolution for impulse response cleanup, I don't personally feel that going beyond ~4.3 cycles or so (1/6 oct frequency resolution) is necessary (this would mean setting lower/upper window lengths to 19000/19 samples). For a minimum phase filter, this would require at least 8192 taps or so (for a tight squeeze and a slight loss of low freq resolution), and should be plenty for enjoying music; one could certainly push things further with the goal of having an out of body experience :).
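As a rough check of those numbers, here is a small sketch that converts a window length in samples into milliseconds and into cycles at a given frequency. The 44.1 kHz sample rate and the 10 Hz / 10 kHz reference frequencies are assumptions on my part (they are the frequencies at which 19000 and 19 samples both come out near 4.3 cycles); they are not taken from the DRC documentation.

```python
def window_ms(samples: int, fs: float = 44100.0) -> float:
    """Window length in milliseconds for a length given in samples."""
    return 1000.0 * samples / fs


def cycles_in_window(samples: int, freq_hz: float, fs: float = 44100.0) -> float:
    """Number of full periods of freq_hz that fit inside the window."""
    return samples / fs * freq_hz


def samples_for_cycles(cycles: float, freq_hz: float, fs: float = 44100.0) -> float:
    """Window length in samples needed to span `cycles` periods of freq_hz."""
    return cycles * fs / freq_hz


if __name__ == "__main__":
    # Lower window: 19000 samples, checked at an assumed 10 Hz
    print(window_ms(19000), cycles_in_window(19000, 10.0))
    # Upper window: 19 samples, checked at an assumed 10 kHz
    print(window_ms(19), cycles_in_window(19, 10000.0))
```

Both windows work out to roughly 4.3 cycles under those assumptions, which matches the figure quoted above.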
 
Hi, fluid:

I don't think a single point measurement is absolutely needed for my config, I just don't think a multipoint avg is necessary. The prefiltering removes a significant portion of the room's contribution, and the EQ stage ignores a significant part of what remains.

My goal is to optimize for the main listening position while minimizing filter resonance that might be noticed from other locations. I also want this process to be simple and repeatable. I'm ok with compensating for a problem that only occurs at the sweet spot as long as the compensation doesn't make things sound too bad elsewhere in the room. Averaging requires more work, and may introduce resonances to the measurement that are foreign to the sweet spot.

I probably don't have as much experience with averaged measurements as you (I have tried it), however, I am able to (quasi) anechoically correct my speakers, and I can say that my current method gives me a subjectively better result with a similar overall level of filter resolution.
 
I found a new plugin that might be interesting.

It's called Pulse from Lancaster Audio.

It looks cool as you can load a right and left correction file.
It also does sample rate conversion on the fly.
And they mention it adds zero latency.

I haven't tested it much, but the price is just right for those who would like to try it... it's free!

Pulse - The only Free IR Loader for Pro Tools

Looks quite interesting, especially for Mac users who could use the plugin with a VST loader like Hosting AU, which lets you route the system sounds through it, so it is possible to measure before and after DSP.

Hosting AU
 
Hello!


This thread is very informative, but I have some problems.


First of all, I'm not sure DRC is finding the impulse center properly. It's detected at sample 12000 because that's the highest peak, but I think it may actually be located at 11998. If someone with more experience could take a look at the impulse in Audacity and tell me whether DRC is right and I'm overthinking it, or whether it's wrong, I would be very grateful.
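To make the "highest peak vs. first peak" question concrete, here is a minimal sketch that reports both candidates for an impulse response held in a NumPy array. This is not how DRC locates the center; the 0.5 relative threshold is an arbitrary assumption, and the test impulse is synthetic.

```python
import numpy as np


def highest_peak(ir: np.ndarray) -> int:
    """Index of the sample with the largest absolute value."""
    return int(np.argmax(np.abs(ir)))


def first_significant_peak(ir: np.ndarray, rel_threshold: float = 0.5) -> int:
    """Index of the first local maximum of |ir| that reaches
    rel_threshold times the highest peak (threshold is an assumption)."""
    mag = np.abs(ir)
    limit = rel_threshold * mag.max()
    for i in range(1, len(mag) - 1):
        if mag[i] >= limit and mag[i] >= mag[i - 1] and mag[i] > mag[i + 1]:
            return i
    return highest_peak(ir)


if __name__ == "__main__":
    # Synthetic example: a strong first arrival followed by a larger peak
    ir = np.zeros(20000)
    ir[11998] = 0.8
    ir[11999] = -0.3
    ir[12000] = -1.0
    print(highest_peak(ir), first_significant_peak(ir))
```

If the two indices differ, it is worth inspecting the impulse by eye, since reflections, crossovers and ports can all push the largest sample away from the first arrival.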


I've tried changing the parameters in the config file to M and 11998, but from what I understand from the DRC Designer logs, the left channel filter compiles properly while the right channel complains that the impulse center is not found.



I suspect it's a bug in the DRC Designer frontend, which seems unable to pass the correct impulse center to the right channel if the impulse mode is set to manual instead of auto. I've looked at the bat files, and it seems that when set to auto, the software uses auto for the left channel and then somehow takes that value and sends it to the right channel bat, which is set to M plus the sent value. At least that is what I understand from opening the various bat and config files in Notepad++, but I'm not a software engineer.


Long story short, I would have liked to listen to the correction with the changed impulse center to hear whether it's better or not, but I can't, so I've come here to ask you all!


Dropbox - LeftSpeakerImpulseResponse44100.pcm
Dropbox - RightSpeakerImpulseResponse44100.pcm


The second problem I have is that the first pages of this thread refer to config files that seem to have changed in the opening post (for example, the discussion is about 4cycles.drc, and now there is a prefilter.drc and a psycho.drc). I would like to know how I can use the new files, possibly in the DRC Designer frontend, or the exact procedure I have to follow with DRC at the command prompt to apply this two pass configuration properly.



Thanks to whoever will help and sorry for the thread resurrection!
 
Mindscapes,

I would agree that the first peak (11998) is probably the center. It can be tricky to assess sometimes due to reflections, xovers, ports, etc. This is one reason I would suggest starting out with a minimum phase filter.

I'm not a DRC Designer user, so I'm afraid I can't offer much help there. You could always try using the provided scripts/outline...

The instructions for the current custom filter are included; the file serves as an addendum to the main instructions. No need to worry about impulse center location with this method. This filter is meant to be a sort of "room compensated speaker contour network" and has basically no resolution in the bass range. Currently I'm working on a 1/3 octave configuration; I could make the filter for you if you want to try it out...
 
Mindscapes...

Thanks for your reply!

In the end I made it work by using the original DRC from cmd. The problem was that I used the wrong options when importing the PCM filter into Audacity, which produced a clipped impulse. Importing it as raw 32-bit float with no endianness fixed the problem! It seems to be better with the impulse center set to 11998!
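For anyone checking such files outside Audacity, here is a minimal sketch of reading a headerless 32-bit float PCM file with NumPy. It assumes little-endian data (typical on x86 machines); the helper name and the synthetic demo file are made up for illustration. The demo also shows why a wrong import format gives garbage: the same bytes read as 16-bit integers yield twice as many samples with unrelated values.

```python
import os
import tempfile

import numpy as np


def read_raw_float32(path: str, byte_order: str = "<") -> np.ndarray:
    """Read a headerless 32-bit float PCM file.
    byte_order: "<" little-endian (assumed default), ">" big-endian."""
    return np.fromfile(path, dtype=np.dtype(byte_order + "f4"))


# Round-trip demo on a synthetic impulse (not a real measurement)
impulse = np.zeros(16, dtype="<f4")
impulse[5] = 1.0
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pcm")
    impulse.tofile(demo_path)
    as_float = read_raw_float32(demo_path)            # correct interpretation
    as_int16 = np.fromfile(demo_path, dtype="<i2")    # wrong interpretation

print(len(as_float), int(np.argmax(np.abs(as_float))))
print(len(as_int16))
```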

For sure, I would like to try the filter. Can you also explain your reasoning in making it? It helps me learn some new things.

I use PSNormFactor = 2.5 and PSNormType = S instead of 1.0 and E when generating the standard soft filter, because otherwise it clips. I don't know if it matters in your filter!

Thank you for the help
 
The purpose of the custom (two pass) configuration is to generate a filter based on a single (listening position) measurement which brings the perceived direct sound into alignment, while minimizing (room induced) audible filter resonances away from the measurement location.

The prefilter stage discards the excess phase information, and applies a short (and fairly sharp) frequency dependent window. These combined steps remove a fairly significant portion of the "room sound".
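As an illustration of what "discarding the excess phase" means, here is a minimal sketch that rebuilds a minimum phase impulse response from the magnitude spectrum using the real cepstrum. This is a textbook homomorphic method written with NumPy; it is not DRC's code, and the FFT padding factor is an arbitrary assumption.

```python
import numpy as np


def minimum_phase(h: np.ndarray) -> np.ndarray:
    """Return a minimum phase impulse response with (approximately)
    the same magnitude spectrum as h, via the folded real cepstrum."""
    n = len(h)
    # Zero-pad generously to limit time-domain aliasing of the cepstrum
    n_fft = 8 * 2 ** int(np.ceil(np.log2(n)))
    spectrum = np.fft.fft(h, n_fft)
    log_mag = np.log(np.maximum(np.abs(spectrum), 1e-12))
    cepstrum = np.fft.ifft(log_mag).real
    # Fold the anti-causal half onto the causal half
    folded = np.zeros(n_fft)
    folded[0] = cepstrum[0]
    folded[1:n_fft // 2] = 2.0 * cepstrum[1:n_fft // 2]
    folded[n_fft // 2] = cepstrum[n_fft // 2]
    h_min = np.fft.ifft(np.exp(np.fft.fft(folded))).real
    return h_min[:n]


if __name__ == "__main__":
    # A delayed two-tap response: the delay is pure excess phase
    h = np.zeros(64)
    h[10] = 1.0
    h[11] = 0.5
    print(np.round(minimum_phase(h)[:4], 6))
```

The delay in front of the example response disappears while the magnitude response is preserved, which is the part of the measurement the prefilter stage keeps.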

The second stage calculates (and inverts) the frequency response envelope, and applies a target response. This further reduces the possibility of excessive room correction (room induced dips are largely ignored) while providing good perceived channel matching (due to the well matched response envelopes).

The current (Bark resolution) configuration is meant to clean up the speaker response, while optimizing frequency balance and channel matching. The 1/3 octave configuration I'm working on does a bit more "room EQ".

If you want me to make you a 1/3 oct filter, I would need L/R impulse responses and mic calibration file.
 
Hi Gmad, and thanks for the explanation. I understand: basically you try to correct only the basic speaker frequency response, a bit like applying EQ to a close miked measurement, but without the close miking part. In theory this should make the correction work over a large area, given that it becomes basically room agnostic/speaker focused, but it should also have the drawback of being less precise in the sweet spot. Or we could say it doesn't try to solve the problems of the room, right?


I will have a go at it. I have a sweet spot in the room that I can't change because the acoustic treatment in place makes it fixed, and I sit there when I have to do critical listening. But I also enjoy watching movies from my bed, which is off center (the sound system is in my bedroom), so I would appreciate a filter that preserves good quality even outside the sweet spot area.


Personally I've found that with the standard soft configuration's FDW I get vastly improved performance even in that off-center position outside the sweet spot, but maybe with this filter I can improve it even more.


Unfortunately I don't have a calibration for the mic. I have some cal files around (it's an ECM8000), but I figured that due to production variance it was better not to use any of them. I plan to buy a Dayton, which comes already calibrated, sometime in the future. I suppose that even without calibration the result isn't that bad, because the difference before and after correction is like night and day and obviously better, and the FR doesn't sound skewed. But obviously a calibrated mic will make it perfect! I attached the impulses in my previous post, if you want to have a go at them with the new method you are developing.



In the meantime I will try the one attached in your original post and see how I like it!
 
I've tried the psycho filter in the opening post.


There's no doubt that it's much better at imaging and soundstaging than the standard soft filter. The stage is enveloping, and if I close my eyes I could very well be in the place depicted in the recordings. Bjork's live albums are especially impressive: when the crowd applauds or gets excited while she's singing, I can really hear the different distances between her and the audience. Very impressed.


What I would like to try, given that I listen mostly in the same spot, is finding out how far we can push the amount of room correction before this clarity and imaging get worse or artifacts become apparent.


The frequency tilt of the filter is perhaps too flat compared to the slanted one I used before, and I would like a bit more correction in the lows, but even like this I prefer the spaciousness of this sound compared to the more analytical filter I used before.
 
Mindscapes, I put a link in post #1 to a filter made with my 1/3 octave psychoacoustic configuration. Two target responses are used - one of which defines the high pass characteristics. In this case I chose 32Hz LR24. For the downward slope I used "PsychoTarget" from post #1. I'm guessing you might prefer a stronger slope here though...
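For reference, here is a small sketch of the magnitude response of a 32 Hz LR24 high pass, assuming "LR24" means a 4th-order Linkwitz-Riley alignment (two cascaded 2nd-order Butterworth sections). It only evaluates the textbook magnitude formula and says nothing about how DRC applies target curves.

```python
import math


def lr24_highpass_db(freq_hz: float, fc_hz: float = 32.0) -> float:
    """Magnitude in dB of a 4th-order Linkwitz-Riley high pass.
    |H| = r^4 / (1 + r^4) with r = f / fc, valid for freq_hz > 0."""
    r4 = (freq_hz / fc_hz) ** 4
    return 20.0 * math.log10(r4 / (1.0 + r4))


if __name__ == "__main__":
    for f in (16.0, 32.0, 64.0, 128.0):
        print(f, "Hz:", round(lr24_highpass_db(f), 2), "dB")
```

The response is 6 dB down at the corner frequency and falls at 24 dB per octave below it.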
 

Attachments: MindscapesPrefilteredFR.png, MindscapesCorrectedFR.png, MindscapesWavelet.png
Hi Gmad, and thanks for the explanation. I understand: basically you try to correct only the basic speaker frequency response, a bit like applying EQ to a close miked measurement, but without the close miking part. In theory this should make the correction work over a large area, given that it becomes basically room agnostic/speaker focused, but it should also have the drawback of being less precise in the sweet spot. Or we could say it doesn't try to solve the problems of the room, right?

That's basically it. After the prefiltering, much of the remaining room effect is seen as dips in the modal range, and we can ignore this to a degree since we're processing in the frequency domain. With the Bark resolution filter, the resolution rapidly diminishes in this range, so there is very little room EQ taking place, provided the speakers are not too close to a wall. Of course we're giving up some precision with such a smooth filter response, but the response is still optimized for the sweet spot as long as a single measurement is used.

BTW, the prefiltering window of the 1/3 oct config is 4.3 cycles (1/6 oct), and the EQ resolution is 1/3 oct over the whole range.
 
I did a bit of listening and I think I prefer the original psycho filter. The new one seems to lose a bit of that holographic quality. By the way, thanks for taking the time to make it for me.



I use another DSP after the correction which removes speaker crosstalk by sending out from each channel a phase inverted copy of the opposite channel, delayed by the interaural time difference. This has the effect of removing the comb filtering inherent in stereo reproduction and greatly enhances realism, but I suspect it also makes the system very fussy about phase. Maybe that's why the changes in the filter don't work for me. It's by no means bad, but with the other one I find myself more immersed in the music. I will give it another try tomorrow with fresh ears, after gain matching it with the older one.
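A single-pass version of that crosstalk scheme can be sketched as follows. The delay in samples and the attenuation are placeholder assumptions, and real implementations are often recursive (each cancellation signal is itself cancelled), so treat this only as an illustration of the idea described above.

```python
import numpy as np


def delay(x: np.ndarray, n: int) -> np.ndarray:
    """Delay x by n samples, keeping the original length."""
    return np.concatenate([np.zeros(n), x])[:len(x)]


def cancel_crosstalk(left: np.ndarray, right: np.ndarray,
                     delay_samples: int, gain: float):
    """Add to each channel an inverted, delayed, attenuated copy
    of the opposite channel (one pass only)."""
    out_left = left - gain * delay(right, delay_samples)
    out_right = right - gain * delay(left, delay_samples)
    return out_left, out_right


if __name__ == "__main__":
    left = np.array([1.0, 0.0, 0.0, 0.0])
    right = np.zeros(4)
    # Hypothetical values: 2 samples of delay, half amplitude
    print(cancel_crosstalk(left, right, delay_samples=2, gain=0.5))
```

Because the cancellation signal depends on exact timing between the channels, any phase change introduced by the correction filter feeds straight into it, which is consistent with the suspicion that this setup is fussy about phase.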



What parameters do I have to change in the prefilter and psycho configs to reintroduce a bit more room correction for the bass frequencies? I would like to try!
 
OK, thanks for trying it.

We can take the correction from Bark resolution up to ERB with the following edits:

Prefilter config

MPLowerWindow = 19000
MPUpperWindow = 19

Filter config

PTReferenceWindow = 38000
PTBandwidth = -2

The increased low freq resolution may lead to more boosting of the low freq range.
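To see why ERB resolution is finer than Bark at low frequencies, here is a small sketch comparing the two bandwidths using commonly cited approximations (Glasberg and Moore for ERB, Zwicker and Terhardt for the critical bandwidth). Whether DRC uses exactly these formulas internally is an assumption I have not checked.

```python
def erb_hz(freq_hz: float) -> float:
    """Equivalent rectangular bandwidth (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def bark_bandwidth_hz(freq_hz: float) -> float:
    """Critical bandwidth (Zwicker & Terhardt approximation)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (freq_hz / 1000.0) ** 2) ** 0.69


if __name__ == "__main__":
    for f in (50.0, 100.0, 200.0, 500.0, 1000.0, 5000.0):
        print(f, "Hz: ERB", round(erb_hz(f), 1),
              "Hz, Bark", round(bark_bandwidth_hz(f), 1), "Hz")
```

Around 100 Hz the ERB is roughly a third of the Bark critical bandwidth, so an ERB-based smoothing follows more low frequency detail.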


Note: in my current secondary (filter) config files I have reduced MPLowerWindow/MPUpperWindow to 2205/2 to further reduce the influence of the inversion stage.
 
Applying these settings causes a weird bump in the correction and in the corrected frequency response at 2.5 kHz... strange! This happens even if I change back to the Bark scale, so it definitely has to do with one of the other three parameters.

 
Great. I guess that setting was too aggressive for all cases. This is why feedback is important.

I have made some edits to the Bark package including incorporating the PsychoTarget.


Happy to give feedback ;)



I should try changing that parameter even with the default psycho config file, because the resulting FR curve without applying aggressive target correction was actually slanted upward instead of flat (basically I had to end the curve at -12 instead of -5 to get some resemblance to the B&K slope). Maybe this will fix that problem (unless it was intended, but I think it had too much upward slant to be intentional; it was like a reversed B&K).



I have listened to the mp19000/9 38000 ERB file and compared it to the 16000/6 32000 Bark one. Unfortunately I applied a slightly different target to the 16000 one, with more mid emphasis (it starts to slope down at 600 Hz instead of sloping gradually from the beginning).


Anyway, the impression is that with the ERB scale file the sound maintains the holographic quality to an extent but becomes more laser focused, clinical and precise, and I don't know if I can say it's better. With the original config the sound seems more diffuse and gives the impression of being larger, but slightly less precise.


I use this track to test for things like this: YouTube. This one also has a little intruder that sounds very, very real at the beginning if everything is tuned properly: YouTube (obviously I listen to FLAC files and not YouTube). I know it's not everyone's cup of tea, but the first part, with the field recordings and electronic bits that fly around and move front to back in the soundstage, really helps to evaluate holographic imaging. It's a bit more precise with the ERB file, and sounds bigger and has better tonality with the Bark scale.


I also often use Placebo's MTV Unplugged to evaluate the "realness" of voices and instruments, and with the Bark one I like the tonality more and get more of an "in the crowd" feeling. But that could be because of the different target curve here.


I have some questions:



1) What are the advantages of the Bark scale over the ERB? I've seen the graph in the DRC manual, but I'm more interested in subjective experience. Do you find it more musical?


2) Is 19000 a number that works well with the ERB scale, or can I test the subjective differences between the two while keeping the rest of the parameters the same (so 16000-16 in the prefilter and 32000 in the second filter)?



3) How do MPUpperWindow and MPLowerWindow interact between the two config files, and what exactly happens if I lower the two to 2205/2 in the second filter?



4) How do you calculate how many cycles the filter corrects at a given sample rate, and how many ms the windowing is for any given frequency? And by a cycle, do we mean a full waveform returning to 0? What is the unit of measure of the window numbers in the DRC config files?


5) Last but not least, how do the overcorrection artifacts that are spoken of in the manual sound? I know how the transient smearing due to linear phase EQ sounds in music production. Is this what is referred to, or is it something more apparent? It seems Sbragion speaks about something apparent at some frequencies. Is it like some sort of resonance? With the standard config files I could hear a less natural sound for sure going from minimal to strong, which is why I chose soft before trying your filters, but nothing very apparent. It was usually subtle, and it manifested more as listening fatigue and unrealness after listening for a while.



Sorry for all the questions. I have read the DRC manual many times, but a lot of the concepts are pretty dense, and while I understand the process to some extent I would like to understand it more. Otherwise I'm just changing things and hoping for the best, which becomes tedious fast, not to mention that with subtle differences in sound like these I tend to second guess myself a lot! Your process is also a bit different from the tuning method given in the manual, so bear with me ;)
 