Dr. Lee, I hope you can clear some things up for me.
My understanding is that early reflections (<1 ms), box diffraction, and HOM primarily affect binaural localization. This localization method typically works in the midrange, with decreasing accuracy from 500 to 4000 Hz. Is it your contention that these phase errors produce localization confusion, undermining the pinnae and intensity-difference mechanisms, and that these errors undermine sound quality the most? Since interaural time difference is primarily effective below 2 kHz, would raising the crossover on a horn effectively lessen the perception of HOM?
My general belief is that Toole and Linkwitz both consider the phase errors/group delay of 24 dB LR crossovers inaudible on loudspeakers in real rooms. Do you believe 24 dB LR filters have inaudible group delay? Most lay people use the precedence (Haas) effect to claim that the phase effects of 24 dB LR filters are inaudible, but the Olive and Toole graph shows that below 5 ms the masking is less effective. Do you think using precedence to explain away phase effects in crossovers is correct?
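For scale on the group delay in question, here is a minimal sketch assuming an ideal electrical LR4 (24 dB/octave Linkwitz-Riley) pair whose summed output is a second-order all-pass with Q = 1/sqrt(2). The function name and the 1 kHz crossover are illustrative assumptions, and drivers, baffle and room are ignored. Under these assumptions the low-frequency delay of the sum is about 0.45 ms for a 1 kHz crossover.

```python
import math


def lr4_sum_group_delay(f_hz, fc_hz):
    """Group delay in seconds of the summed LR4 low-pass + high-pass outputs.

    Assumes ideal filters: the sum is a 2nd-order all-pass with
    Q = 1/sqrt(2) centred on the crossover frequency fc_hz.
    """
    q = 1.0 / math.sqrt(2.0)
    w = 2.0 * math.pi * f_hz
    w0 = 2.0 * math.pi * fc_hz
    # All-pass phase is -2 * atan2(w*w0/q, w0^2 - w^2); this is its
    # negative derivative with respect to w.
    num = 2.0 * (w0 / q) * (w0 ** 2 + w ** 2)
    den = (w0 ** 2 - w ** 2) ** 2 + (w * w0 / q) ** 2
    return num / den


if __name__ == "__main__":
    fc = 1000.0  # hypothetical crossover frequency, Hz
    for f in (100.0, 500.0, 1000.0, 2000.0, 5000.0):
        delay_ms = lr4_sum_group_delay(f, fc) * 1e3
        print(f"{f:7.0f} Hz: {delay_ms:.3f} ms")
```

The delay scales inversely with the crossover frequency, so halving the crossover frequency doubles every value.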
I've been scratching my head at the "revelation" of level dependence.
Haven't we known forever about thresholds of perceptibility?
Yes, but how often have you seen such understanding actually applied to, for instance, speaker design or, more generally, music reproduction?
The single greatest burst of application of psychoacoustics to sound reproduction has been by cell phone and mp3 player manufacturers. Their interests aren't the same as ours.
Previously it was the old-line phone companies, but once they got their intelligibility problems solved I think they mostly moved on to other problems after WW2.
I almost forgot the old-time radio networks and manufacturers: NBC/RCA, BBC, ORTF. They probably did some work too, but I bet most of it was empirical.
The other area of research was academic and, until recently, was mostly devoted (I gather from my skimpy reading) to possible medical or defence applications.
The only speaker manufacturers I know of that actually do psychoacoustic research with an aim to applying it to product are Harman and Earl Geddes!!
There must be others, I hope.
(I seem to remember reading about psychoacoustic researchers working for Japanese manufacturers such as Sony and Panasonic but I can't dig up a specific memory).
Sorry, but I don't see the relevance of your answer.
Let me put it this way:
Why is it surprising that a distortion product/HOM/whatever that is tens of dB below the signal would need to reach a certain level before it's audible?
I must have missed some key point.
The existence of some threshold is not surprising; the level at which it sits is. If you look at his paper, at the highest levels they tested, the THD at some frequencies was over 10%. I used to think this MIGHT be OK for a subwoofer so long as it was mostly low order, but I don't think many people thought this much nonlinear distortion was acceptable, let alone inaudible, in the midrange and high frequencies.
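To relate a THD percentage to the "tens of dB below the signal" wording used earlier, a small conversion sketch follows. The function name is an assumption for illustration; the formula is the standard amplitude-ratio conversion.

```python
import math


def thd_percent_to_db(thd_percent):
    """Convert a THD percentage (amplitude ratio) to dB relative to the signal."""
    return 20.0 * math.log10(thd_percent / 100.0)


if __name__ == "__main__":
    for pct in (10.0, 1.0, 0.1):
        print(f"{pct:5.1f}% THD = {thd_percent_to_db(pct):6.1f} dB re signal")
```

On this scale 10% THD sits only 20 dB below the signal, while 1% is 40 dB down.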
Paul W said:
What causes HOMs to be perceived only at higher levels?
What causes this is complex and I'm not sure that I know. It's buried deep in the neurology and physiology of the ear. Deeper than I care to go, that's for sure. Why the ear is so acute to time-related things and so immune to frequency ones is very interesting.
noah katz said:
I've been scratching my head at the "revelation" of level dependence.
Haven't we known forever about thresholds of perceptibility?
You don't seem to follow the discussion (perhaps this is why you are scratching your head). If the "thresholds of perceptibility" differ with SPL level, how has that ever been "known" before? I think that you are confused.
If at 70 dB SPL I cannot hear 0.2 ms of group delay, or let's say the threshold is 1 ms, while at 90 dB SPL the threshold is 0.2 ms, how has this been "known" before? I know of no references to this fact - perhaps you could enlighten me.
noah katz said:
Sorry, but I don't see the relevance of your answer.
Let me put it this way:
Why is it surprising that a distortion product/HOM/whatever that is tens of dB below the signal would need to reach a certain level before it's audible?
I must have missed some key point.
You are missing a key point.
The thresholds are SPL dependent. Thresholds of audibility of some distortion (THD, group delay, etc.) are easily determined, but what if these thresholds differ with absolute signal SPL? Then what do these thresholds even mean? The effect is nonlinear in perception, even for a linear distortion like diffraction. It's far more complex than there simply being a threshold.
mbutzkies said:
Dr. Lee, I hope you can clear some things up for me.
This localization method typically works in the midrange with decreasing accuracy from 500 to 4000 Hz. ... these errors undermine sound quality the most?
My general belief is that Toole and Linkwitz both consider the phase errors/group delay of 24 dB LR crossovers inaudible on loudspeakers in real rooms.
I think that this was meant for me, since Lidia (Dr. Lee) doesn't visit these sites (and can't see why I do!).
I hope that you mean that the accuracy of localization INCREASES in the 500 to 4000 Hz range, because this is what happens, and thus, yes, errors in this range most strongly affect sound quality judgements.
There is a big difference between group delay and non-minimum-phase group delay. What I am talking about with HOM and diffraction is the latter. The former is most likely far less audible than the latter.
Why the ear is so acute to time-related things and so immune to frequency ones is very interesting.
That is fairly easy to come by. The sensory cells of the ear fire only at the in-stroke of the zero crossing of a pressure gradient. The frequency etc are then derived from this original signal input. Therefore, the time domain is the original data domain that the ear is sampling.
I have a reference for this but it's on a crashed hard disk.

Oh yeah and that easily implies why harmonics may be fairly harmless: they're just the same input to the ear as the fundamental, modulo (order of the harmonic).
Ummm... folk haven't thought about it much?
The model people may have had in mind is distortion in the frequency domain, which is generally masked at higher SPL. That is the opposite of diffraction, a time-domain product, which is progressively unmasked at greater SPL, as was discussed a few posts back.
Airy fairy speculation:
Also, I suspect, based on my own listening and some stuff I read a few months back, that high SPLs in live acoustic performances possibly cause our ears to generate some kind of "distortion artifacts". The memory of that sort of experience may well lead listeners of recorded music to think distortion created by, say, speaker box diffraction is just normal high-level musical sound.
MBK said:
That is fairly easy to come by. The sensory cells of the ear fire only at the in-stroke of the zero crossing of a pressure gradient. The frequency etc are then derived from this original signal input. Therefore, the time domain is the original data domain that the ear is sampling.
I have a reference for this but it's on a crashed hard disk.
Except that it's not that simple. The neural firings are synchronous only up to about 500 Hz or so. The neurons have about a 1 ms recharge rate and can't keep up with a sine wave above about 500 Hz. At that point the neural firings start to become random, and the pitch is detected by place along the cochlea and not by the neural firing rate. This is why our hearing changes character at about 500 Hz, being completely different above and below that frequency.
"You don't seem to follow the discussion (perhaps this is why you are scratching your head)."
Yes, I skimmed through while some analyses were running at work.
"If the "thresholds of perceptibility" differ with SPL level, how has that ever been "known" before?"
"If at 70 dB SPL I cannot hear 0.2 ms of group delay, or let's say the threshold is 1 ms, while at 90 dB SPL the threshold is 0.2 ms, how has this been "known" before? I know of no references to this fact - perhaps you could enlighten me."
Ah. I was thinking that thresholds *are* SPL, i.e., the threshold at which x% of distortion is audible.
I missed the part about the SPL required to make a time delay audible.
Thanks for clearing that up.
gedlee said:
I think that this was meant for me, since Lidia (Dr. Lee) doesn't visit these sites (and can't see why I do!).
Half of the world's wealth is concentrated in 2% of the population;
half of the world's loudspeaker knowledge is concentrated in 2% of the world's loudspeaker designers.
Except that it's not that simple. The neural firings are synchronous only up to about 500 Hz or so. The neurons have about a 1 ms recharge rate and can't keep up with a sine wave above about 500 Hz. At that point the neural firings start to become random, and the pitch is detected by place along the cochlea and not by the neural firing rate. This is why our hearing changes character at about 500 Hz, being completely different above and below that frequency.
What the neurons are doing is a different matter, yes, there sure is a lot of not so well studied stuff going on (supposedly parallel processing etc, which is probably shorthand for "we don't quite know").
My point was about the sensory cells. Yes, the location in the cochlea determines frequency by path length of the resonance there. But ultimately it's still a firing pattern of sensory cells there which is then interpreted by the neural network and the brain.
Dr. Geddes, sorry for the wrong salutation. Brain freeze.
I think you misunderstood me a little bit. I will try to explain myself a little better. I am repeating other people's theories, so feel free to correct me if you think my interpretation is wrong.
My primary understanding of HOM is of phase ambiguity.
Localization is done by three mechanisms: ITD, ILD and pinnae. ITD, or phase localization, is primarily a mid-frequency mechanism, while both intensity (level) difference and pinnae mechanisms are primarily high-frequency mechanisms. Most research that I have seen claims ITD produces phase ambiguity above a certain frequency, let's say 1800 Hz, and then the brain uses ILD and pinnae for localization. If the brain shuts off ITD functionality, wouldn't just moving a horn's crossover above this frequency reduce HOM perception?
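As a rough check on where ITD phase ambiguity sets in, here is a minimal sketch assuming the spherical-head (Woodworth) approximation, a head radius of 0.0875 m and a sound speed of 343 m/s. These values and the function names are illustrative assumptions, not figures from the research mentioned above. The ambiguity criterion used is the frequency whose half period equals the ITD, i.e. an interaural phase difference of 180 degrees.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s, assumed room-temperature value


def woodworth_itd(azimuth_deg, a=HEAD_RADIUS_M, c=SPEED_OF_SOUND):
    """Interaural time difference in seconds for a distant source.

    Spherical-head approximation, intended for azimuths from 0 (front)
    to 90 degrees (directly to one side).
    """
    theta = math.radians(azimuth_deg)
    return (a / c) * (theta + math.sin(theta))


def ambiguity_frequency(itd_s):
    """Frequency in Hz at which the interaural phase difference reaches 180 deg."""
    return 1.0 / (2.0 * itd_s)


if __name__ == "__main__":
    for az in (15.0, 30.0, 60.0, 90.0):
        itd = woodworth_itd(az)
        f_amb = ambiguity_frequency(itd)
        print(f"azimuth {az:4.0f} deg: ITD {itd * 1e3:.3f} ms, "
              f"ambiguous above ~{f_amb:.0f} Hz")
```

With these assumptions the largest ITD is roughly 0.66 ms, and the ambiguity frequency rises as the source moves toward the front, so a single cutoff such as 1800 Hz is best read as a representative figure rather than a hard limit.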
People who dismiss phase use precedence as their underpinning but, as I stated before, precedence masking is clearly more effective at 10 ms than at 1 ms. Do you think people misuse the precedence effect when discussing phase errors from very early reflections (<2 ms)?
Thanks
mbutzkies said:
My primary understanding of HOM is of phase ambiguity.
Localization is done by three mechanisms: ITD, ILD and pinnae. ITD, or phase localization, is primarily a mid-frequency mechanism, while both intensity (level) difference and pinnae mechanisms are primarily high-frequency mechanisms. Most research that I have seen claims ITD produces phase ambiguity above a certain frequency, let's say 1800 Hz, and then the brain uses ILD and pinnae for localization. If the brain shuts off ITD functionality, wouldn't just moving a horn's crossover above this frequency reduce HOM perception?
People who dismiss phase use precedence as their underpinning but, as I stated before, precedence masking is clearly more effective at 10 ms than at 1 ms. Do you think people misuse the precedence effect when discussing phase errors from very early reflections (<2 ms)?
Thanks
I'm still having trouble with the "moving the horn's crossover" part, but you do have a pretty good understanding of localization. In my discussions I like to use the words Sound Quality, which I feel encompasses localization, coloration and distortion. These are all manifested differently in the hearing mechanism.
There is no question that people misuse the precedence effect for delays less than about 2 ms. Blauert and Kuttruff both make a clear distinction in delay effects at about this time scale, as the same rules don't apply above and below it. Blauert discusses these very early delays, and Kuttruff just ignores them because they don't happen in concert halls. But he is very clear that the rules for larger delays don't apply at these short time intervals.
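To put the 2 ms boundary in physical terms, a delay can be converted to the extra path length the delayed sound travels. This is a minimal sketch assuming a sound speed of 343 m/s; the function name is illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def extra_path_m(delay_s, c=SPEED_OF_SOUND):
    """Extra path length in metres travelled by sound arriving delay_s late."""
    return c * delay_s


if __name__ == "__main__":
    for delay_ms in (0.2, 1.0, 2.0, 10.0):
        metres = extra_path_m(delay_ms / 1e3)
        print(f"{delay_ms:5.1f} ms -> {metres:.3f} m")
```

A 2 ms delay corresponds to about 0.69 m of extra path, which is the scale of cabinet edges and waveguides rather than concert hall surfaces.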
In my discussions I like to use the words Sound Quality, which I feel encompasses localization, coloration and distortion.
Sorry if this has been mentioned but I have only casually followed this thread.
Did you or someone else write a paper on this? I know that there are many papers/articles dealing with perception, but what I am interested in is a discussion of the level dependence of the audibility thresholds of the different distortion types that was mentioned a few posts before.
Regards
Charles
I am interested in is a discussion of the level dependence of the audibility thresholds of the different distortion types that was mentioned a few posts before.
Check your own thresholds with distorder,
but I should add a kind of virtual source with adjustable level, delay and position to really test precedence audibility (in fact ILD and ITD are already in the software).
phase_accurate said:
Sorry if this has been mentioned but I have only casually followed this thread.
Did you or someone else write a paper on this? I know that there are many papers/articles dealing with perception, but what I am interested in is a discussion of the level dependence of the audibility thresholds of the different distortion types that was mentioned a few posts before.
Regards
Charles
Lidia and I wrote a paper recently on the SPL level dependence of the perception of group delay, which is a linear effect. I know of no one who has done a reasonable study on the SPL dependence of nonlinear distortion - which it clearly has to be.