FFT windowing and frequency response resolution

CharlieLaub · 2013-09-22 6:33 am

Because I inadvertently hijacked another thread, I thought I would move the topic here into its own thread. To recap, start reading at this link to the "other" thread and continue for pages and pages.

We've been discussing the process of windowing an impulse response taken on a loudspeaker in a room, where echoes or reflections of the direct sound are observed some time (e.g. 5 ms) after the arrival of the initial impulse. These contaminate the results. In order to create a "quasi anechoic" frequency response from the impulse, it is first "windowed" using a start and end gate. Only the signal between the two gates is used to generate the frequency response.

Because of the properties of the FFT used to go from the time domain (impulse response) to the frequency domain (frequency response), the finite window of data used (e.g. 5 ms) creates a "lower limit" below which the signal contains no frequency information. Even if your signal processing program (e.g. ARTA, HOLM, etc.) spits out some number for the frequency response below the minimum frequency determined by the window, it is nonsense.

The debate is whether or not padding the end of the windowed data with zeros can increase the effective window length, giving you "more" resolution in the frequency domain, at least above the minimum frequency determined by the original window length.

Game on!

CharlieLaub · 2013-09-22 7:23 am

I thought I would post the results of an interesting experiment that I did to look into the windowing effect a little more deeply.

The setup is an outboard A/D-D/A converter and a MiniDSP 2x4. I connected an output channel of the converter to the input of the MiniDSP and then routed the MiniDSP output back to the input of the converter to loop through the MiniDSP. Then I used my Active Crossover Designer tools to create a set of filters that would represent a woofer, including HF rolloff at 5kHz, a breakup peak at 4kHz, 6dB of baffle step, and the ultimate LF rolloff at 50Hz. In this way I can model a driver's frequency response, and generate an impulse from a known response without the presence of reflections or echoes.

Next I used ARTA to obtain the impulse response of the system. This is shown below:

[Figure: impulse of simulated loudspeaker response]

Using a fixed "start" gate at 2.8 ms (just before the appearance of the impulse), I then varied the "end" gate to create several different window lengths. The windows used were:
2.5ms = 400 Hz cutoff
5.0ms = 200 Hz cutoff
12.8ms = 100 Hz cutoff
22.8ms = 50 Hz cutoff
120ms = ~10Hz cutoff
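
The cutoffs in this list are just the reciprocal of the gated length. Here is a minimal sketch of that arithmetic. It assumes the 100 Hz and 50 Hz entries correspond to 10 ms and 20 ms of gated data (which is what end gates at 12.8 ms and 22.8 ms give with the 2.8 ms start gate), and the function name is only for illustration:

```python
def lowest_resolved_frequency_hz(window_ms):
    """Return 1/T in Hz for a gate of length T given in milliseconds."""
    return 1000.0 / window_ms


# Gate lengths in ms. The 10 and 20 ms values are an assumption:
# end gate minus the 2.8 ms start gate.
for window_ms in (2.5, 5.0, 10.0, 20.0, 100.0):
    f_min = lowest_resolved_frequency_hz(window_ms)
    print(f"{window_ms:6.1f} ms gate -> {f_min:6.1f} Hz")
```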

The different frequency responses are shown in the figure below.

There is not really too much that is surprising here. There is no change to the high frequency data, only the low frequency end of the response is experiencing changes.

What if we throw in a high Q peak? I used a Q=20, +6dB peak at 2.1 kHz and repeated the process. First up, the impulse response:

[Figure: impulse with 2k 6dB Q10 peak]

Now we have some ringing in the impulse caused by the high Q peak.
Let's see what happens when we use the same set of window lengths:

[Figure: effect of window size with 2k Hz high Q feature]

Again, the same changes in the low frequency part of the response are seen, but now we also have some changes to the high Q peak. The bandwidth of this peak is pretty narrow, about 200 Hz wide. The 2.5ms (400 Hz) window only partly resolves the peak, but it still picks it up. The 5ms (200 Hz) window does a pretty good job of representing the high Q peak. Longer windows completely capture the peak. This is because enough of the impulse, including the part generated by the high Q peak, falls within the windows used here. Note that not all of the high Q ringing must be captured within the window for the peak to be adequately resolved. For instance, the 5ms window has an end gate at 7.8ms, where the ringing is still present.

What if the peak is at a lower frequency, for instance close to the minimum resolved frequency? To test this I moved the Q=20, +6dB peak to 300 Hz and repeated the process. The impulse response is shown below:

[Figure: impulse with 300Hz 6dB Q10 peak]

Now we have a more challenging problem. The impulse is not dying out very quickly.
Let's see what happens if we repeat our windowing of this impulse:

[Figure: effect of window size with 300Hz high Q feature]

Now the 2.5ms, 5ms, and 10ms windows do not result in the feature being captured adequately. Even a 20ms window isn't great and only the VERY long 100ms window really gets it. In this case the ringing in the impulse response continues for too long to make it possible to capture the feature in the frequency response, if the measurement was taken in this way. Probably only an outdoor ground plane measurement in a LARGE open area would make it possible to gather 100+ ms of reflection free impulse response data.
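
A rough way to see why the 300 Hz peak is so much harder than the 2.1 kHz one is to look at how long each rings. The sketch below assumes the peak behaves like a simple second-order resonance, whose ringing envelope decays as exp(-pi * f0 * t / Q). That is an assumption on my part, since the actual filter in the MiniDSP may differ, and the function names are only for illustration:

```python
import math


def ringing_time_constant_ms(f0_hz, q):
    """Envelope time constant (ms) of a second-order resonance: Q / (pi * f0)."""
    return 1000.0 * q / (math.pi * f0_hz)


def envelope_left_at(gate_ms, f0_hz, q):
    """Fraction of the ringing envelope amplitude still present gate_ms after onset."""
    return math.exp(-gate_ms / ringing_time_constant_ms(f0_hz, q))


for f0 in (2100.0, 300.0):
    tau = ringing_time_constant_ms(f0, 20.0)
    left = envelope_left_at(5.0, f0, 20.0)
    print(f"f0 = {f0:6.0f} Hz: tau = {tau:5.2f} ms, envelope left after 5 ms = {left:.2f}")
```

Under this assumption the Q=20 ringing has a time constant of roughly 3 ms at 2.1 kHz but roughly 21 ms at 300 Hz, so a 5 ms gate cuts off most of the low frequency ringing.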

So, what does all of this mean?

Well, for one, we are lucky that we don't really ever encounter a high Q peak at low frequencies! Some larger woofers with metal cones might have a high Q breakup peak, but it is likely to be above 1-2kHz. Even if the peak is narrow, if it is at a relatively high frequency its impulse will die out fast enough to be captured by a window of practical length, even if the width of the peak is similar to the minimum resolved frequency.

In the vast majority of cases, a real world loudspeaker should behave similarly to the initial example and the windowing effects will be very benign. In fact, here is the impulse response for a real loudspeaker taken in a small (4m x 4m x 2.5m) room, for which the window was about 5ms:

As you can see, the initial impulse has died out except some low frequency oscillation. The 5ms window is perfectly adequate for describing the data.


mabat · 2013-09-22 8:28 am

2.5ms = 400 Hz cutoff
5.0ms = 200 Hz cutoff
12.8ms = 100 Hz cutoff
22.8ms = 50 Hz cutoff
120ms = ~10Hz cutoff

Why do you still call it a cutoff, when it's just the resolution of the whole data set?
With a 5 ms gate there is basically the same amount of information between 0 and 200Hz as there is between 200 and 400Hz, 750 and 950Hz, 2000 and 2200Hz, etc.

That's the reason why it is much less of a problem as the frequency rises - the resolution is the same absolute number (200Hz in the case of a 5ms gate) across the whole bandwidth. At 10 kHz that is just a very small fraction of an octave. Not so at 400Hz. This simply comes from the FFT and there's nothing you can do about that.
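
To put numbers on that, here is a small sketch that expresses a fixed 200Hz resolution as a fraction of an octave at different centre frequencies. The function name is only for illustration:

```python
import math


def resolution_in_octaves(center_hz, resolution_hz):
    """Width, in octaves, of a band resolution_hz wide centred on center_hz."""
    lo = center_hz - resolution_hz / 2.0
    hi = center_hz + resolution_hz / 2.0
    return math.log2(hi / lo)


for f in (400.0, 1000.0, 10000.0):
    print(f"{f:7.0f} Hz: {resolution_in_octaves(f, 200.0):.3f} octave")
```

At 400Hz the 200Hz band is most of an octave wide, while at 10 kHz it is only a few hundredths of an octave.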

If you zero-pad to a much larger data set, there will be no cutoff. If there are 1000 points for a 20kHz bandwidth, the first point after DC (0 Hz) will be at 20Hz, and this is not a cutoff; it's just the next point after DC (which is also computed by the FFT). The next is at 40Hz, etc. It will just "approximate" the above 5ms data with higher data density, i.e. it will look smooth. Take 100,000 points and you have almost a continuous line. But it is still a coarse line with respect to the underlying data, if that data was truncated too soon.
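
A minimal NumPy sketch of this point. The 48 kHz sample rate and the decaying 2 kHz ring used as a stand-in for gated data are made up for illustration. Zero padding only adds points between the original FFT bins; the values at the original bin frequencies do not change:

```python
import numpy as np

fs = 48000                      # sample rate in Hz (assumed)
n_gate = 240                    # 5 ms of gated data at 48 kHz
t = np.arange(n_gate) / fs
# Stand-in for a gated impulse response: a decaying 2 kHz ring (made up).
h = np.exp(-t / 0.002) * np.sin(2 * np.pi * 2000 * t)

pad_factor = 8
H_raw = np.fft.rfft(h)                          # bins every 200 Hz
H_pad = np.fft.rfft(h, n=n_gate * pad_factor)   # zero-padded: bins every 25 Hz

# Every pad_factor-th padded bin lands on an original bin and is identical.
same = np.allclose(H_pad[::pad_factor], H_raw)
print("original bin spacing:", fs / n_gate, "Hz")
print("padded bin spacing:  ", fs / (n_gate * pad_factor), "Hz")
print("padded bins match original bins:", same)
```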

mabat · 2013-09-22 9:24 am

One more thing to add. If the impulse response itself decays to zero within 5ms, it just means that the system itself is very smooth, and the inherent 200Hz smoothing produced by an FFT of 5ms of this data will not further affect its frequency response - you can smooth a straight line however you want, with no effect. Take 100 ms with this impulse at the beginning (i.e. add zeroes) and the frequency response will look the same, of course. Then there's simply nothing to improve by the "higher resolution" of the larger data set.

- I don't feel I have something more to say here.

gedlee · 2013-09-22 5:09 pm

Charlie - thanks, a very enlightening discussion and right on in every regard. Marcel still does not seem to "get it", but maybe he will some day.

mabat · 2013-09-22 5:18 pm

gedlee said:
I have gating, I don't need an anechoic chamber - that's the point!! An anechoic measurement would be identical to mine above 200 Hz.

You can't prove this claim without anechoic chamber, right?

gedlee · 2013-09-22 5:27 pm

I made this comment in the other thread but I will make it again here because it is important. As a speaker gets better and better, i.e. a more and more compact impulse response with a flatter and flatter frequency response, the measurement errors from gating become less and less. This means that poor speakers benefit from these errors, but better speakers do not.

Perhaps one should always show the actual impulse response before windowing so that users of the data can see if there is a potential for a bad window implementation. I would certainly be willing to do this. I'm not trying to hide anything.

gedlee · 2013-09-22 5:31 pm

mabat said:
You can't prove this claim without anechoic chamber, right?

I am not sure that is true, but in any case I am not interested in proving anything. I know the facts.

mabat · 2013-09-22 5:34 pm

No, if you KNEW, you could prove it, right away. But you DON'T really know, because you have only data from the in-room response, gated to 5 ms. That's what this is all about.

d a o · 2013-09-22 6:21 pm

Gating was never promised to be perfect, but
many studies have shown that it is what comes closest.

Not many people have access to an anechoic chamber, and very few chambers are large enough to be used for sub-bass.

PS: nobody lives in an anechoic chamber, so why measure a speaker there?
Gating removes the room influence; in other words, it is probably the best choice anyone can use.

Thanks to R. Heyser

Juhazi · 2013-09-22 6:23 pm

Dear mabat,
before you make a complete fool of yourself, please read this about Dr. Geddes or perhaps read his book

Many of us know from our own experience how difficult it is to make good indoor or even outdoor measurements. This discussion, however, was about the FFT method.

ra7 · 2013-09-22 6:43 pm

Juhazi said:
Dear mabat,
before you make a complete fool of yourself, please read this about Dr. Geddes or perhaps read his book

Many of us know from our own experience how difficult it is to make good indoor or even outdoor measurements. This discussion, however, was about the FFT method.

This does not mean we should blindly accept everything that someone says.

Marcel is right. The resolution of a 5ms gate IS 200 Hz. We can debate whether resonances can exist that are narrow enough to not be detected by that resolution. Toole shows EXACTLY what happens with gating and a high Q narrow resonance and warns against exactly what is being said in this thread.

For your convenience, Markus even put the figure from Toole's book up on the other thread.
http://www.diyaudio.com/forums/mult...directivity-how-important-67.html#post3641163

The FFT process is clearly explained in ARTA and other software. It is quite well understood. The only argument here is whether 200 Hz resolution is enough from 200 Hz up.

Kindhornman · 2013-09-22 6:55 pm

Charlie,
Thanks for the great examples, they show well what is happening. I'm not sure what mabat thinks would show up in an anechoic chamber that you don't already see in a windowed in-room impulse response, except at very low frequencies, which are generally of little consequence in a small room anyway. The room will skew everything at low frequencies whether the speaker has a perfect impulse response or not.

Mabat,
Whatever your analysis setup, with or without windowing, if the window is long enough, what is it that you think is being hidden or missed? You can use a very long window or no windowing at all, and the real result is that at some point the room reflections will become a part of the signal you are measuring. So if you want to see all of the hash added by the room, just increase the window length, or look at the impulse response and try to determine what is what after a long enough interval. If you are using close miking techniques it is easy to separate the two; it becomes obvious when the room contribution becomes part of the signal. Charlie's last chart in post #2 clearly shows this: the initial impulse response has approached zero and the room response becomes obvious after a set time. You could measure the response and, for all practical purposes, tell the distance to the wall for the first reflection, knowing the impulse response that was originally generated.
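
That last point is simple arithmetic. A minimal sketch, assuming a speed of sound of about 343 m/s (the function name is only for illustration):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value, dry air at about 20 C


def extra_path_length_m(delay_ms):
    """Extra distance travelled by a reflection arriving delay_ms after the direct sound."""
    return SPEED_OF_SOUND_M_S * delay_ms / 1000.0


print(extra_path_length_m(5.0))
```

Note that this gives the extra path length of the reflection compared with the direct sound, not the wall distance itself; getting the wall distance from it also requires the speaker and microphone positions.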

I would say that Earl has nothing to hide and there is not much to prove by showing the difference between anechoic measurements and room measurements above a minimum frequency. If there were a real difference, then all the computer based analysis equipment and software would be a misrepresentation, and that I do not believe. Show me one major audio company that does not rely on this type of measurement system; I know of none that would make that claim.

CharlieLaub · 2013-09-22 7:38 pm

ra7 said:
This does not mean we should blindly accept everything that someone says.

Marcel is right. The resolution of a 5ms gate IS 200 Hz. We can debate whether resonances can exist that are narrow enough to not be detected by that resolution. Toole shows EXACTLY what happens with gating and a high Q narrow resonance and warns against exactly what is being said in this thread.

For your convenience, Markus even put the figure from Toole's book up on the other thread.
http://www.diyaudio.com/forums/mult...directivity-how-important-67.html#post3641163

The FFT process is clearly explained in ARTA and other software. It is quite well understood. The only argument here is whether 200 Hz resolution is enough from 200 Hz up.

You and others are, I believe, still too swayed by the general case and are not restricting yourselves to thinking about loudspeaker/driver impulse responses - that is, after all, what we are all trying to measure, not some arbitrary signal. I did mention in the other thread that some loose restrictions apply. These are that the impulse has sufficiently died out such that most or all of it is captured before the reflections begin. Earl has mentioned that in the case of a smooth frequency response (e.g. in the absence of a high Q resonance) this will be the case. I showed with my example in post #2 of this thread that a "typical" woofer response can be processed without a problem using the windowing technique. And yes, one should look at the impulse to see if indeed enough of it exists before the reflections occur, but this is easy enough to see if you increase the gain to see the details around zero.

But perhaps this is a kind of chicken-versus-egg argument. I say that the impulse has to die out sufficiently in order for my argument to hold water. This implies that there are no high Q resonances, at least at a "low" frequency. Another way to say this is that there is a requirement for the response to be "sufficiently smooth", which you have mentioned is like saying that there are no features narrower than the minimum resolved frequency, e.g. 200Hz is typical. All of these imply the same thing in one domain or the other. The only caveat is that I showed that a 400Hz window could partially resolve a 200 Hz wide feature if it is at a high (enough) frequency. Since the FFT operates on a linear (not log) frequency scale, 200Hz is 200Hz no matter where in the frequency spectrum the feature is found. So it would seem that this needs a little more study.

But back to the "loose restrictions" thing. I don't think that it is unreasonable to assume that no high Q resonances will be found in the "lower" frequency part of the response, for loudspeakers anyway. In this case the criterion for a "smooth" response seems to be fulfilled.

Here is a proposal: if you or others think of a type of frequency response that you feel would cause the "smooth" assumption to be violated, if you post a description of what that frequency response would look like (please constrain it to be something that a loudspeaker could possibly generate) I will try to reproduce it using my method from post #2, generate the impulse and then try the various windowing to see what happens to the resulting frequency response. I think that we could all learn from this kind of exercise, since it will provide some more concrete data to talk about.

-Charlie

dumptruck · 2013-09-23 3:00 pm

The "how much is enough" question is why I posted a nearfield measurement in the other thread.

Kindhornman · 2013-09-23 3:51 pm

Dumptruck,
The simple answer to your question is really what type of detail you are looking for. If you are only looking for a general trend you can make the window rather short, but if you are designing a raw frame speaker you want to see everything, you want to see those high Q resonances that would be hidden in a short smoothed response curve. In room response of a finished speaker system would just look messy if you had every perturbation in a response curve, what are you going to do with that much detail if you are just trying to get the position of the cabinet in an optimum place? So I would say there is no one size fits all.

dumptruck · 2013-09-23 4:05 pm

Hey, I didn't ask the question.

gedlee · 2013-09-23 4:26 pm

Charlie has the right idea. It is the high Q resonances that can cause problems, but I need to point out once again that it is only high Q peaks that are problematic, not high Q dips. A dip would have to be very very sharp to have a tail of many ms.

Let me give an example of when this issue can be a problem but also how easy it is to resolve.

In one of my designs the woofer had an exceptionally high resonance well above where I would be using it. This resonance DID have a tail that exceeded my window length, and as such it was not captured correctly. When I designed the crossover for this unit (using this measured data, of course) and then measured the result, it was not correct - owing to the incorrect nature of the windowed response.

However, by simply putting a 6 dB/ octave LP filter in series with the woofer, the tail was decayed sufficiently and the data was corrected. Using this new data in my crossover software, after accounting for the 6 dB/oct filter, I was able to get a crossover design that worked perfectly.
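
A minimal sketch of the "accounting for the filter" step, assuming an ideal first-order low-pass. The 1 kHz corner and the driver response values are made up for illustration, and a real series filter interacts with the driver impedance, so in practice the filter's actual transfer function would be used instead:

```python
import numpy as np


def first_order_lowpass(freq_hz, corner_hz):
    """Complex response of an ideal 6 dB/octave low-pass: 1 / (1 + j*f/fc)."""
    return 1.0 / (1.0 + 1j * np.asarray(freq_hz) / corner_hz)


def remove_filter(measured, freq_hz, corner_hz):
    """Divide the known filter out of a response that was measured through it."""
    return measured / first_order_lowpass(freq_hz, corner_hz)


freqs = np.array([100.0, 1000.0, 5000.0])
driver = np.array([1.0 + 0.0j, 0.9 - 0.2j, 0.3 + 0.4j])   # made-up driver response
measured = driver * first_order_lowpass(freqs, 1000.0)     # response seen through the filter
recovered = remove_filter(measured, freqs, 1000.0)
print(np.allclose(recovered, driver))
```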

It is interesting to note that the final system easily decayed within the time window once the crossover was implemented, because, of course, the crossover did exactly what was required to eliminate the woofer resonance from the response.

So while "errors" can sometimes occur, they can always be detected and always compensated for. If windowing were indeed a resolution limiting effect then this would not be possible. It all comes down to knowing what you are doing versus not knowing what you are doing.

And as I said to Markus before: if I wanted to trick the measurements I could do so in ways that you could never detect. Anyone CAN do this. (Directly manipulate the impulse response in Cool Edit!! No one could ever detect that.) You either trust that the data is done correctly and is being shown in a fair manner or you don't. Maybe this is why I virtually never trust anyone's data but my own. But then I guess no one else should trust mine either. It's a real dilemma - it leaves audio in a state of complete mistrust and total lack of any objective evidence, and feeds the marketing people's claims that "It's what you hear - measurements don't tell the story!" Measurements can tell the story, but just as with economics it all falls apart once trust is removed.

Kindhornman · 2013-09-23 5:04 pm

Sorry Dumptruck, just went off that last posting as if you asked the question.

ScottG · 2013-09-23 5:23 pm

CharlieLaub said:
..Well, for one, we are lucky that we don't really ever encounter a high Q peak at low frequencies!

In the midrange (not really low freq.s)..

How do you know?

Not a break-up resonance, but perhaps something relating to the suspension (particularly in-box operation).

(..of course I'd always look to the impedance trace first before questioning something like this.)

BTW, great post! :up:
