Taps, FFT Length, Resolution, Latency, etc

This is a basic question, but I cannot find a good reference. Can someone point me to an explanation of what is meant by, and the interactions between, 'taps', FFT (Fast Fourier Transform, I assume) length, resolution, and latency? The first two are configurable values used in constructing convolution filters in RePhase, and the latter two are from Equalizer APO's system analysis tab.

I'm aware that generally increasing the number of taps increases the 'accuracy' of a filter (trying to avoid incorrectly using 'resolution', since I don't know its technical meaning), but I don't know what that actually means, nor how it interacts with the other three items above.

What I find particularly confusing is that Equalizer APO's analysis tab allows you to set different 'resolution' values, and doing so seems to show that the relationship between resolution and CPU use is convex: CPU use is highest at the lowest resolution settings, decreases for a while, then increases again. I would have expected something closer to a linear relationship, with CPU use decreasing as resolution decreases; but again, not really knowing what is meant by resolution in a technical sense, I don't know how to make sense of this.

Finally, if Equalizer APO says channel A has a latency of 700 ms, does this roughly mean that the audio signal lags by 700 ms? For example, if all other channels also have a latency of 700 ms, does this mean there's a 700 ms delay before audio starts, but everything is processed and output 'in time', while if all the other channels had a latency of 690 ms, channel A would be 10 ms behind the rest?

Thanks. I'm more than happy to read up on good references, as this is something I'd like to understand anyway. I'm a mathematician, but as an algebraic geometer I never had a need to learn things like the Fourier transform, let alone study signal processing.
 

Did you ever find this stuff out?
 
I'm not an expert but:
https://en.wikipedia.org/wiki/Digital_filter
The analog signal is converted to a FIFO list of sample values, and a calculation sums the values at certain points along that list, called "taps", each multiplied by some (tap) coefficient. The output is (usually) referenced to the sample currently in the middle of the list. For example, if a tap sits at time T from the center, the frequency at which T corresponds to 180 degrees is attenuated. Taps on "old" samples produce a phase lag and taps on "future" samples produce a phase lead; mixing both old and future samples cancels the lead-lag effects, leaving only amplitude changes. The time the stream spends in the list causes a delay (latency), typically half the length of the list. The sample rate limits the highest frequency that can be processed, and the list length limits the lowest.
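
To make that concrete, here is a minimal numpy sketch of that tap-and-coefficient sum (the function name, the 3-tap moving average, and the test signal are just illustrative):

```python
import numpy as np

def fir_filter(samples, coeffs):
    """Direct-form FIR filter: each output sample is a coefficient-weighted
    sum of the most recent input samples (the 'taps')."""
    n = len(coeffs)
    fifo = np.zeros(n)                 # delay line, newest sample at index 0
    out = np.empty(len(samples))
    for i, x in enumerate(samples):
        fifo = np.roll(fifo, 1)        # shift every sample one slot older
        fifo[0] = x
        out[i] = np.dot(coeffs, fifo)  # weighted sum over the taps
    return out

# A 3-tap moving average applied to an impulse: the impulse response is
# spread over 3 samples and centered (3 - 1) / 2 = 1 sample late - the
# half-list delay (latency) mentioned above.
print(fir_filter(np.array([0., 1., 0., 0., 0.]), np.full(3, 1 / 3)))
```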

An FFT is related, but it's used to analyze a signal rather than filter it. The FFT (Fast Fourier Transform) is a fast algorithm for computing the DFT (Discrete Fourier Transform); it dates from a time when computers were slow, so sample multiplications are reused to save time, since each sample is, for example, 45 degrees at frequency f, 90 degrees at frequency 2f, 180 degrees at frequency 4f, etc. Today, many applications simply compute the DFT directly to simplify memory management. Like an oscilloscope, a spectrum analyzer may capture a series of samples of the signal, analyze and display it, then go get another batch, ignoring the signal in the meantime, unless the processing can be done faster than the sample period.

A Fourier transform is based on the fact that the product of two sine waves produces an offset/DC component when the frequencies of the two sine waves match. The same is true for other patterns, and this is used in the digital "correlators" found in CDMA radio and in image/pattern recognition; see "product detector" and "synchronous rectifier". So when you multiply a signal, sample by sample, by a sine wave, you find the amount of that frequency component in the signal. This has to be done twice for each frequency, at 0 and 90 degrees. https://en.wikibooks.org/wiki/Trigonometry/Graph_of_Sine_Squared
https://en.wikipedia.org/wiki/Fast_Fourier_transform
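
As a small illustration of that product-detector idea (all the values here are made up for the example): multiplying the signal by a 0-degree and a 90-degree reference at one frequency and averaging recovers that frequency component, which is exactly what one bin of a DFT computes.

```python
import numpy as np

fs = 1000                          # sample rate in Hz (illustrative)
t = np.arange(256) / fs
x = np.sin(2 * np.pi * 125 * t)    # test tone at 125 Hz

# Product detector for a single frequency: multiply by the 0- and
# 90-degree references and average. Non-matching frequencies average
# out; a matching one leaves a DC component.
f = 125
i_part = np.mean(x * np.cos(2 * np.pi * f * t))
q_part = np.mean(x * np.sin(2 * np.pi * f * t))
print(2 * np.hypot(i_part, q_part))   # ~1.0, the amplitude of the tone

# The DFT does this for every frequency bin at once, and the FFT is a
# fast algorithm for the DFT. Bin 32 = 125 Hz * 256 / 1000:
print(2 * np.abs(np.fft.fft(x)[32]) / len(x))   # also ~1.0
```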
 
If we're talking about a linear-phase FIR filter, there are a few equations for figuring out the filter delay, frequency resolution, etc.

https://dspguru.com/dsp/faqs/fir/properties/ - see the section "What is the delay of a linear phase FIR filter?"

https://www.minidsp.com/support/for...like-opendrc-has-some-new-competition?start=6 - number of taps versus low-frequency resolution and interval spacing.

https://www.minidsp.com/images/documents/fir_filter_for_audio_practitioners.pdf - more on the number of taps and frequency resolution.

Using a 65,536-tap linear-phase FIR filter at a 48 kHz sample rate as an example, the delay through the filter, using the formula from dspguru, would be:
(65,536 - 1) / (2 x 48,000) ≈ 683 milliseconds.

A 65,536-tap linear-phase FIR filter at a 48 kHz sample rate would have a frequency resolution of 48,000 / 65,536 ≈ 0.732 Hz.

Putting some context around that: the frequency range spans from 0 Hz to 24 kHz (fs/2). Thinking of a FIR filter as a graphic equaliser, that gives 24,000 / 0.732 ≈ 32,768 eq sliders for our FIR equaliser, with the sliders spaced 0.732 Hz apart. That is a high-resolution filter.

A rule of thumb to predict the low-frequency limit of the FIR filter is 3 x the frequency resolution, so 3 x 0.732 Hz ≈ 2.2 Hz.
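
A quick sanity check of all those numbers (a sketch; the formulas are the ones from the links above):

```python
n_taps = 65_536
fs = 48_000

delay = (n_taps - 1) / (2 * fs)            # dspguru delay formula
print(f"delay: {delay * 1000:.0f} ms")     # ~683 ms

resolution = fs / n_taps
print(f"resolution: {resolution:.3f} Hz")  # 0.732 Hz

print(f"'eq sliders': {(fs / 2) / resolution:.0f}")      # 32768

print(f"low-frequency limit: {3 * resolution:.1f} Hz")   # ~2.2 Hz
```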

I used a 65,536-tap linear-phase FIR filter as an example because that is typically what is used for Digital Room Correction (DRC). This assumes one is using a linear-phase FIR filter with a software convolver rather than hardware DSP, as the latter is limited in the number of taps, as described in the opendrc link above.

So depending on the convolver used, yes, the filter would delay the audio signal by roughly 683 milliseconds. Some convolvers are "0 ms latency", meaning the convolver itself does not add any additional latency.

One can also have a 65,536-tap "minimum phase" FIR filter, which adds no delay when used with a 0 ms latency convolver: the minimum-phase FIR filter's peak sits at sample 0, whereas a linear-phase FIR filter's peak sits at sample 32,768.
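
A quick way to see the linear-phase peak position and delay (a sketch using scipy's firwin; the 1 kHz cutoff is arbitrary, and the extra tap just gives an exact centre sample):

```python
import numpy as np
from scipy.signal import firwin

fs = 48_000
n_taps = 65_536
h = firwin(n_taps + 1, 1000, fs=fs)   # generic linear-phase lowpass

print(np.argmax(np.abs(h)))           # 32768: the peak sits at the centre tap
print(n_taps / (2 * fs))              # ~0.683 s group delay, as computed above
```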

Typically, linear-phase FIR filters are used for Digital Room Correction because they allow the frequency and phase responses to be controlled independently. One can correct the frequency response of a minimum-phase system directly, but most rooms have non-minimum-phase behaviour, so an independent excess-phase correction is applied to correct the room's low-frequency non-minimum-phase response. A good explanation is John Mulcahy's article on minimum phase: https://www.roomeqwizard.com/help/help_en-GB/html/minimumphase.html
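
For what it's worth, here is a sketch of how the excess phase can be separated out, using the standard cepstral ("homomorphic") construction of a minimum-phase spectrum from a magnitude response; the decaying-noise stand-in for a measured room response is made up for the example:

```python
import numpy as np

def min_phase_spectrum(mag, n_fft):
    """Minimum-phase spectrum with the given magnitude (cepstral method)."""
    cep = np.fft.irfft(np.log(np.maximum(mag, 1e-12)), n_fft)
    fold = np.zeros(n_fft)           # keep the causal part of the cepstrum
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    return np.exp(np.fft.rfft(cep * fold))

# Stand-in for a measured room impulse response (decaying noise).
rng = np.random.default_rng(0)
h_room = rng.standard_normal(2048) * np.exp(-np.arange(2048) / 300)

n_fft = 65_536
H = np.fft.rfft(h_room, n_fft)
H_min = min_phase_spectrum(np.abs(H), n_fft)

# The excess phase is what remains once the minimum-phase part (which
# follows from the magnitude alone) is removed.
excess_phase = np.unwrap(np.angle(H)) - np.unwrap(np.angle(H_min))
```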

PS. Sometimes folks get concerned about linear-phase FIR filter "preringing" effects, but modern DRC software has preringing compensation built in, so it is no longer an audible issue.

Hope that helps.
 
Typically, linear-phase FIR filters are used for Digital Room Correction because they allow the frequency and phase responses to be controlled independently.

This doesn't make sense. By definition, a linear-phase FIR filter does not equalize the phase; it just adds a constant delay. To (sort of) independently control magnitude and phase, you need to make a FIR filter that is neither linear phase nor minimum phase - which is very well possible: you can make any phase response you like, as long as it fits within the available taps.
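
For example, a sketch by frequency sampling (the flat magnitude and the wiggly phase curve are made up): specify any magnitude and any phase on a frequency grid, inverse-FFT to get the taps, and window. The result is neither linear phase nor minimum phase.

```python
import numpy as np

fs = 48_000
n_taps = 4096
n_bins = n_taps // 2 + 1
f = np.linspace(0, fs / 2, n_bins)

magnitude = np.ones(n_bins)                  # flat magnitude
phase = -0.5 * np.sin(2 * np.pi * f / 200)   # arbitrary phase curve (radians)

H = magnitude * np.exp(1j * phase)
h = np.fft.irfft(H, n_taps)    # real impulse response
h = np.roll(h, n_taps // 2)    # centre it so the 'pre' part fits in the taps
h *= np.hanning(n_taps)        # window away truncation ripple

# h is not symmetric (so not linear phase) and its peak is not at sample 0
# (so not minimum phase) - a mixed-phase filter.
```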

PS. Sometimes folks get concerned about linear-phase FIR filter "preringing" effects, but modern DRC software has preringing compensation built in, so it is no longer an audible issue.

Interesting, do you know any description how that works?
 
This doesn't make sense. By definition, a linear-phase FIR filter does not equalize the phase; it just adds a constant delay. To (sort of) independently control magnitude and phase, you need to make a FIR filter that is neither linear phase nor minimum phase - which is very well possible: you can make any phase response you like, as long as it fits within the available taps.

This is in the context of using Digital Room Correction software to create the excess-phase correction, which is then packaged into the linear-phase FIR filter.

Re: preringing, again in the context of DRC. I can't find the exact paper I have at the moment, but I think this will give you an idea:
http://pcfarina.eng.unipr.it/Public/Presentations/aes122-farina.pdf

There is a thread on FIR preringing here on diyAudio, mostly about whether anyone has actually heard it...
 
I see, so you do make a mixed-phase filter, but in two steps.

Regarding the pre-echo, I think I get the picture, though I'm not entirely sure about it yet. I was under the impression that filters with enormous pre-echoes were used for it, but that was probably wrong. Ideally, the measured impulse response starts with the direct response, which you want to keep with room correction, and then the reflections. The room correction filter has to (partly) correct for the reflections, not for the direct response, so it doesn't have to change anything until after the peak corresponding to the direct response. I could imagine that this leads to a correction filter impulse response that has its main peak right at the start, even though it's not minimum phase.