Geddes on Waveguides

Bohdan

just turning one's head or a slight shift sideways would completely change the "image" - which does not happen.


Hi Earl,

I am not sure if I understand you correctly, but I think it actually does. Relative to your line of sight, of course, the image has shifted according to your head rotation.

So, if the origin of the sound was exactly between the two speakers in a stereo setup and you turned your head so that you are now looking at the right speaker, the image has shifted to your left relative to your current head position.

Best Regards,
Bohdan
 
Bohdan

That is all completely out of the range of what I was talking about. I was not referring to inter-channel time differences, and 1 ms, as quoted above, is certainly believable.

I was referring to the phase within a single signal. I can completely accept Preis's contentions above. I don't accept that anyone can hear a few degrees of phase shift, or the reversal of absolute phase, on a single channel. These are entirely different things. 1 ms is 3600 degrees of phase shift at 10 kHz and 720 degrees at 2 kHz. That's a lot of phase shift. As a threshold of audibility in the midrange (where even I claim the ear has its peak in resolution) these are reasonable numbers at these frequencies. Below 500 Hz might be a different matter, because now we are only getting a few waves and detection becomes more difficult, but Preis acknowledges that.
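As a quick check of the arithmetic, here is a minimal sketch (plain Python; the phase accumulated by a pure time delay is just 360·f·t, and nothing specific to Preis's test conditions is assumed):

```python
# Phase shift (degrees) corresponding to a fixed time delay at a given frequency.
def phase_deg(freq_hz: float, delay_s: float) -> float:
    """Phase shift, in degrees, produced by a pure time delay at one frequency."""
    return 360.0 * freq_hz * delay_s

for f in (500, 2_000, 10_000):
    print(f"{f:>6} Hz, 1 ms delay -> {phase_deg(f, 1e-3):5.0f} degrees")
# 500 Hz -> 180, 2 kHz -> 720, 10 kHz -> 3600
```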

The kinds of things that we have been talking about are very small compared to these numbers.

We also know that group delay is audible under the right circumstances (Moore has shown this) and that it is SPL-dependent (we showed the effects of that), which makes the whole issue much more complex. But these are time delays in the > 0.1 ms range, and that's not just a "phase shift".
 
[Attached image: SPL.jpg]


If I understand your post correctly, you are doing the following:

1) You're using a shallow high-pass filter, possibly as shallow as 6 dB/octave, and you're setting the crossover point at a very high frequency, perhaps as high as 12 kHz.
2) Technically, some might consider this a 12 kHz crossover point. But that's not really true, as the high-pass filter complements the compression driver's response shape. Basically, as you get closer and closer to 1 kHz, the output of the compression driver is rising. So the high-pass filter isn't really 'high-passing' the compression driver; it's flattening it.

3) The net effect is that the output level of the compression driver is lower, but it's flatter and more extended.

This is how I've done the majority of my waveguide/CD-horn crossovers as well. It's only logical to try to get the "high-pass" and the CD EQ to coincide, if possible. More recently, I've been using an overdamped second pole on the filter. Some drivers make this easier than others: the JBL 2426H has a very complex impedance and as such required a couple of notches. Brute-force parallel resistance would have required a LOT of resistive loss throughout the range, and absent thorough correction, either via notches or brute force, one could not properly EQ the power response without significant frequency-response nonlinearity.
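As a rough illustration of the "it's CD EQ, not a high-pass" idea, here is a minimal sketch. The 1 kHz driver roll-off corner, the 12 kHz filter corner, and the idealized first-order slopes are assumptions for illustration, not measurements of any particular driver:

```python
import numpy as np

# Hypothetical constant-directivity compression driver: flat to ~1 kHz, then
# falling at ~6 dB/oct (the usual CD power-response roll-off). A first-order
# high-pass with a very high corner rises at ~6 dB/oct over the same band,
# so the product of the two comes out roughly flat.
f = np.logspace(np.log10(1e3), np.log10(20e3), 6)        # 1 kHz .. 20 kHz
f_roll = 1e3                                              # assumed driver roll-off corner
f_hp = 12e3                                               # assumed high-pass corner

driver = 1.0 / np.sqrt(1.0 + (f / f_roll) ** 2)           # falling driver response
highpass = (f / f_hp) / np.sqrt(1.0 + (f / f_hp) ** 2)    # first-order high-pass magnitude

combined_db = 20 * np.log10(driver * highpass)
for fi, db in zip(f, combined_db):
    print(f"{fi / 1e3:5.1f} kHz : {db:6.1f} dB")
# The combined curve varies only a few dB from 1 kHz to 20 kHz, but sits well
# below the driver's raw sensitivity near 1 kHz - the filter "flattens" the
# driver rather than "high-passing" it.
```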
 
Hi Earl,

I am not sure if I understand you correctly, but I think it actually does. Relative to your line of sight, of course, the image has shifted according to your head rotation.

So, if the origin of the sound was exactly between the two speakers in a stereo setup and you turned your head so that you are now looking at the right speaker, the image has shifted to your left relative to your current head position.

Best Regards,
Bohdan

What about a simple lateral movement? In my system the image does not change much for almost three feet of lateral movement. That's a lot of phase shift.

But seriously, I am losing sight of what we are talking about here. We started out talking about the phase of an Abbey, and all of a sudden we are talking about inter-channel phase distortions of 1 ms or more. I don't see the connection.

Did anyone ever do anything with the individual driver IRs?
 
Bohdan,
Keep bringing it on. I think that this is very informative. Could you also post a link to this information? This still fits with my thinking about phase response. I do not think we can look at steady-state phase response across the frequency band for anything that is useful. It is just too far from what real music is; music is in no way a steady-state condition. When you add in amplitude modulation and frequency modulation along with the harmonic content, I think that phase starts to look a lot more important than the early testing has borne out.

Another way to look at this is the simple example of Doppler shift and how that will modulate a frequency. That is in effect what would seem to be another example of a phase shift modulating a multiplexed signal: the simple motion of the cone in and out at a low frequency affecting the upper frequency response. I am all ears to someone showing me where I can read a vetted paper on this, and not just simple explanations using a musical signal or even voice.
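Not a vetted paper, but the Doppler/FM picture is easy to sketch numerically. The excursion and the two tone frequencies below are arbitrary assumptions chosen only to make the effect visible:

```python
import numpy as np

# Sketch of Doppler (FM) modulation of a higher tone by low-frequency cone motion.
c = 343.0                                   # speed of sound, m/s
f_low, x_peak = 50.0, 5e-3                  # 50 Hz tone, 5 mm peak excursion (assumed)
f_high = 3000.0                             # higher tone radiated by the same cone

v_peak = 2 * np.pi * f_low * x_peak         # peak cone velocity, m/s
df_peak = f_high * v_peak / c               # peak Doppler frequency deviation, Hz
beta = df_peak / f_low                      # peak phase deviation, rad (FM modulation index)

print(f"peak cone velocity   : {v_peak:.2f} m/s")
print(f"peak Doppler shift   : {df_peak:.1f} Hz on the {f_high:.0f} Hz tone")
print(f"peak phase deviation : {beta:.2f} rad")
# The low-frequency excursion frequency-modulates the higher tone, which is the
# "phase shift modulating a multiplexed signal" picture described above.
```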
 
What about a simple lateral movement? In my system the image does not change much for almost three feet of lateral movement. That's a lot of phase shift.


Hi Earl,


I honestly do not know how this is possible.



Source: Sound localization - Wikipedia, the free encyclopedia

Lateral information (left, ahead, right)

For determining the lateral input direction (left, front, right), the auditory system analyzes the following ear-signal information:

• Interaural time differences: sound from the right side reaches the right ear earlier than the left ear. The auditory system evaluates interaural time differences from
o phase delays at low frequencies
o group delays at high frequencies
• Interaural level differences: sound from the right side has a higher level at the right ear than at the left ear, because the head shadows the left ear. These level differences are highly frequency dependent and increase with increasing frequency.

For frequencies below 800 Hz, mainly interaural time differences are evaluated (phase delays); for frequencies above 1600 Hz, mainly interaural level differences are evaluated. Between 800 Hz and 1600 Hz there is a transition zone where both mechanisms play a role.

Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides. Humans can discern interaural time differences of 10 microseconds or less.[5][6]
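For a sense of the scale involved, here is a small sketch using the classic Woodworth spherical-head approximation. The 8.75 cm head radius is an assumed nominal value, not a figure from the article:

```python
import numpy as np

# Woodworth spherical-head approximation for interaural time difference (ITD):
#   ITD(theta) ~ (r / c) * (theta + sin(theta)),  theta = source azimuth in radians.
c, r = 343.0, 0.0875        # speed of sound (m/s), nominal head radius (m)

def itd_us(azimuth_deg: float) -> float:
    th = np.radians(azimuth_deg)
    return (r / c) * (th + np.sin(th)) * 1e6    # microseconds

for az in (1, 5, 15, 30, 90):
    print(f"{az:>3} deg -> ITD ~ {itd_us(az):6.1f} us")
# Roughly 9 us per degree near the front and ~650 us at 90 degrees, which lines
# up with the quoted 1-degree frontal accuracy and ~10 us ITD resolution.
```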

Best Regards,
Bohdan
 
Bohdan,
Now I have to shift over towards Earl's statement. If the device in question has a very even polar response across a great enough area, you can move side to side without noticing much change in response. Yes, at very high frequencies, where it becomes increasingly hard to keep the polar response even very far off axis, you can hear this. But like Earl, I have made horns with very even response across a fairly wide angle. How wide a usable listening angle you can produce really comes down to how well you can keep the wavefront attached to the waveguide and to the included angle of the horn. I can walk a fair bit across a room and still hear the opposite speaker before I hear a drop in high-frequency response. That does take an exceptional design to accomplish, and I wouldn't expect it from most horns, or even from the majority of dome tweeters, which fail that test at a fairly small included angle.
 
Bohdan,
Keep bringing it on. I think that this is very informative.


Hi Kindhornman,

No problem.


Importance of Phase in Transients

Source: http://sound.media.mit.edu/Papers/kdm-phdthesis.pdf

Page 44
“….Since Helmholtz, there has been a figurative tug-of-war between proponents of his “spectral theory” of musical sound and researchers who recognized the importance of sound’s temporal properties. Analysis-by-synthesis research, by trying to discover methods for synthesizing realistic sounds, has revealed several critical limitations of purely spectral theories. Clark demonstrated that recordings played in reverse—which have the same magnitude spectra as their normal counterparts—make sound-source identification very difficult. Synthesis based on Fourier spectra, with no account of phase, does not produce realistic sounds, in part because the onset properties of the sound are not captured (Clark et al., 1963). Although most musical instruments produce spectra that are nearly harmonic—that is, the frequencies of their components (measured in small time windows) are accurately modeled by integer multiples of a fundamental—deviations from strict harmonicity are critical to the sounds produced by some instruments. For example, components of piano tones below middle-C (261Hz) must be inharmonic to sound piano-like (Fletcher et al., 1962). In fact, all freely vibrating strings (e.g., plucked, struck, or released from bowing) and bells produce inharmonic spectra, and inharmonicity is important to the attack of many instrument sounds (Freedman, 1967; Grey & Moorer, 1977). Without erratic frequency behavior during a note’s attack, synthesized pianos sound as if they have hammers made of putty (Moorer & Grey, 1977).

So Helmholtz’s theory is correct as far as it goes: the relative phases of the components of a purely periodic sound matter little to perception. However, as soon as musical tone varies over time — for example, by turning on or off — temporal properties become relevant. In the real world, there are no purely periodic sounds, and an instrument’s magnitude spectrum is but one of its facets…..”

There is more.

Best Regards,
Bohdan
 
Bohdan,
Keep bringing it on. I think that this is very informative.

Hi Kindhornman,

Here is another one.


Confirmation of two-stage processing by the ear.

Source: http://www.hauptmikrofon.de/theile/ON_THE_LOCALISATION_english.pdf

4.3.1 The “law of the first localisation stimulus”

“….For a conventional stereo set-up, a phantom source shifts from α = 0° to α = 30° if the time difference between two broadband loudspeaker signals is increased from zero to about 600 μs. The association model could explain this phenomenon (time- as well as level-based stereophony) by means of psychoacoustic principles of the gestalt association stage. The localisation stimulus arriving at the gestalt association stage first has a greater weight compared to the second stimulus (the equivalent for level-based stereophony would be the localisation stimulus with the higher level). Despite their identity and relative time delay, the localisation stimuli can be discriminated, since each of them is present in the binaural correlation pattern in a complete and discriminable form (see Section 4.1).
Yet, a further increase in the inter-channel time difference leads to an exceedance of the maximal time delay τmax. For stationary broadband signals (continuous noise), this causes a disruption of the localisation stimulus selection, which manifests itself in the form of a reduced suppression of the comb filter effect, for example. In this particular sound field constellation, the law of the first wavefront cannot be observed in accordance with the association model. Analysable wavefronts that would allow for a localisation stimulus selection of the impinging sound components do not exist.
In contrast, for non-stationary impulsive signals (clicks, speech, impulsive tones) an increase in the inter-channel time difference has a different effect. In the association model, evaluation of the amplitude envelope ensures that the primary and the delayed sound (reflection) can be discriminated as localisation stimuli. According to a hypothetical function of the gestalt association stage, the primary localisation stimulus determines the auditory event. It does this even more so the larger the time difference between the arriving localisation stimuli gets. Only when a time difference of about 10 … 30 ms is exceeded will the subsequent localisation stimulus gain in perceptual weight.

Beyond the echo threshold (for a definition see BLAUERT 1974), it will be perceived as a separate auditory event. It appears that the “law of the first wavefront” can be interpreted as the “law of the first localisation stimulus”…..”

“…..6. Summary
According to the association model presented in the preceding chapters, the functioning of the auditory system with respect to spatial hearing is due to two different processing mechanisms. Each of these two processing mechanisms manifests itself in the form of an associatively guided pattern selection.

A current stimulus stemming from a sufficiently broadband sound source gives rise to a location association in the first and to a gestalt association in the second, higher-level processing stage because of auditory experience. Although the two stages work independently of each other, they always determine the properties of one or multiple simultaneous auditory events in a conjoint manner.

The rigorous differentiation of these two stimulus evaluation stages corresponds entirely to the two elementary areas of auditory experience. The received ear signals can be attributed to the two sound source characteristics of “location” and “signal”, which are independent of each other but always occur in a pair-wise fashion. Therefore, the presented association model is in agreement with many phenomena related to localisation in the superimposed sound field……”


As you can see, Theile equates 600 μs to a 30° shift, which means 20 μs -> 1° of shift. These numbers tie in with the previous statements very well.
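As a back-of-the-envelope check of that scaling (this is nothing more than a linear interpolation between Theile's two endpoints, not a model from his paper):

```python
# Linear interpolation of Theile's endpoints: 0 us -> 0 deg, 600 us -> 30 deg.
def phantom_shift_deg(delta_t_us: float) -> float:
    return 30.0 * min(delta_t_us, 600.0) / 600.0

for dt in (20, 100, 300, 600):
    print(f"{dt:>3} us inter-channel delay -> ~{phantom_shift_deg(dt):4.1f} deg image shift")
# 20 us -> ~1 deg, which is where the "20 us is audible" figures come from.
```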

More importantly, the above ties in with the previously posted comments on the importance of providing undistorted waveforms to the ear during the "location" stage.



Best Regards,
Bohdan
 
The phase tests were well done - you can't just wave away the results by claiming that they weren't.

If you still believe that you are right, then do a well-designed blind test and wow the whole rest of the world if it comes out as you say. But for now, the data that is available, perfect or not, does not support your contention.
I very seldom, if ever, refer to specific documents to support my views. As a matter of fact, when my patent attorney did that, it caused lots of trouble and incurred unnecessary cost. The reason is that if I refer to such documents, I would also have to explain all the holes in them, which I have no interest in doing. So I had to instruct my patent attorney to refrain from referencing documents other than those brought up by the patent office.

Listening tests conducted by the sources you cite, however peer-reviewed they may seem to be, carry the same weight to me until my questions about those reviews have been fully answered.

As for the other phase references in the discussion going on, they all seem to ignore the learning factor in humans. Some listeners may ignore phase-variation cues, and some may not. So you will always get different results depending on how the research is conducted.

Man, all this debate going on reminds me of "Of Studies" by Francis Bacon.
 
all of a sudden we are talking about inter-channel phase distortions of 1 ms or more.


Hi Earl,

No, we are talking about an inter-channel timing difference of 20 μs.

"....channel-to-channel time offset equal to one sample period at 48 kHz is audible. This equates to 20 μsec of inter-channel phase distortion across the entire audio band. Holman [10] also mentions, “one just noticeable difference in image shift between left and right ear inputs is 10 μsec”.

BAS was talking about exactly the same thing.


Best Regards,
Bohdan
 
Bohdan,
Keep bringing it on. I think that this is very informative.

Hi Kindhornman,


You may know this one. It talks about the same things as post #5831, and it almost reads as a continuation of it.


Source: http://www.audiophilerecordingstrust.org.uk/articles/speaker_science.pdf

"…..Another area in which loudspeakers are disreputable is in the neglect of the time domain. The traditional view is that all that matters is to be able to reproduce continuous sine waves over the range of human hearing.

A very small amount of research and thought will reveal that this is a misguided view. Frequency response is important, but not so important that the attainment of an ideal response should be to the detriment of realism. One tires of hearing that "phase doesn't matter" in audio or "the ear is phase deaf". These are outmoded views which were reached long ago in flawed experiments and which are at variance with the results of recent psychoacoustic research.

The ear works in two distinct ways, which it moves between in order to obtain the best outcome from the fundamental limits due to the Heisenberg inequality. The Heisenberg inequality states that as frequency resolution goes up, time resolution goes down and vice versa. Real sounds are not continuous, but contain starting transients. During such transients, the ear works in the time domain. Before the listener is conscious of a sound, the time domain analysis has compared the time of arrival of the transient at the two ears and established the direction. Following the production of a transient pressure step by a real sound source, the sound pressure must equalise back to ambient.

The rate at which this happens is a function of the physical size of the source. The ear, again acting in the time domain, can measure the relaxation time and assess the size of the source. Thus before any sound is perceived, the mental model has been told of the location and size of a sound source.

In fact this was the first use of hearing, as a means of perceiving a threat in order to survive. Frequency analysis in hearing, consistent with the evolution of speech and music, came much later. After the analysis of the initial transient, the ear switches over to working in the frequency domain in order to analyse timbre. In this mode, the mode that will be used on steady state signals, phase is not very important. However, the recognition of the initial transient and the relaxation time are critical for realism. Anything in a sound reproduction system which corrupts the initial transient is detrimental.

Whilst audio electronics can accurately handle transients, the traditional loudspeaker destroys both the transient and the relaxation time measurement. Lack of attention to the time domain in crossover networks leads to loudspeakers which reproduce a single input step as a series of steps, one for each drive unit at different times..."
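The "series of steps" point at the end is easy to see numerically. This is a sketch only, assuming a textbook 2nd-order Butterworth crossover with the usual inverted tweeter and a small pure delay standing in for the drivers' acoustic offset; none of these values come from the article:

```python
import numpy as np
from scipy.signal import butter, lfilter

# What a single input step looks like after a simple traditional crossover.
fs = 48_000
n = int(0.01 * fs)                       # 10 ms window
t = np.arange(n) / fs
step = np.ones(n)

wn = 2_000 / (fs / 2)                    # assumed 2 kHz crossover
b_lo, a_lo = butter(2, wn, btype="low")
b_hi, a_hi = butter(2, wn, btype="high")

woofer = lfilter(b_lo, a_lo, step)
woofer = np.concatenate([np.zeros(int(0.0003 * fs)), woofer])[:n]   # 0.3 ms acoustic offset
tweeter = -lfilter(b_hi, a_hi, step)     # tweeter inverted, as usual for 2nd order

summed = woofer + tweeter
for ms in (0.05, 0.2, 0.5, 1.0, 3.0):
    i = int(ms * 1e-3 * fs)
    print(f"t = {ms:4.2f} ms : summed output = {summed[i]: .3f}")
# Instead of one clean step, the acoustic sum arrives in stages: a fast
# (inverted) tweeter edge first, then the delayed woofer contribution - the
# "series of steps" the article complains about.
```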


Best Regards,
Bohdan
 
Bohdan,
Keep bringing it on. I think that this is very informative.


Hi Kindhornman,


My previous post mentioned recent psychoacoustic research. Here is one example.

Confirmation of a need to process timing information:

Source: http://arxiv.org/pdf/1208.4611v2.pdf. It gives the following summary:

"..The time-frequency uncertainty principle states that the product of the temporal and frequency extents of a signal cannot be smaller than 1/(4PI). We study human ability to simultaneously judge the frequency and the timing of a sound. Our subjects often exceeded the uncertainty limit, sometimes by more than tenfold, mostly through remarkable timing acuity. Our results establish a lower bound for the nonlinearity and complexity of the algorithms employed by our brains in parsing transient sounds, rule out simple "linear filter" models of early auditory processing, and highlight timing acuity as a central feature in auditory object processing…."

And further:

"…In many applications such as speech recognition or audio compression (e.g. MP3 [18]), the first computational stage consists of generating from the source sound sonogram snippets, which become the input to latter stages. Our data suggest this is not a faithful description of early steps in auditory transduction and processing, which appear to preserve much more accurate information about the timing and phase of sound components [12, 19, 20] than about their intensity…."

And finally:

"…Early last century a number of auditory phenomena, such as residue pitch and missing fundamentals, started to indicate that the traditional view of the hearing process as a form of spectral analysis had to be revised. In 1951, Licklider [25] set the foundation for the temporal theories of pitch perception, in which the detailed pattern of action potentials in the auditory nerve is used [26, 28], as opposed to spectral or place theories, in which the overall amplitude of the activity pattern is evaluated without detailed access to phase information. The groundbreaking work of Ronken [22] and Moore [23] found violations of uncertainty-like products and argued for them to be evidence in favour of temporal models. However this line of work was hampered fourfold, by lack of the formal foundation in time-frequency distributions we have today, by concentrating on frequency discrimination alone, by technical difficulties in the generation of the stimuli, and not the least by lack of understanding of cochlear dynamics, since the active cochlear processes had not yet been discovered.

Perhaps because of these reasons this groundbreaking work did not percolate into the community at large, and as a result most sound analysis and processing tools today continue to use models based on spectral theories. We believe it is time to revisit this issue….."
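For reference, the 1/(4π) bound quoted above is the Gabor limit, and it is easy to verify numerically on the one signal that actually reaches it, a Gaussian pulse (the 1 ms width below is an arbitrary assumption):

```python
import numpy as np

# Check of the time-frequency (Gabor) uncertainty limit: the rms duration of
# |x(t)|^2 times the rms bandwidth of |X(f)|^2 cannot be smaller than 1/(4*pi),
# with equality only for a Gaussian pulse.
fs, dur = 48_000, 1.0
t = np.arange(int(fs * dur)) / fs
sigma_t = 1e-3                                      # 1 ms Gaussian envelope (assumed)
x = np.exp(-((t - dur / 2) ** 2) / (2 * sigma_t ** 2))

def rms_width(axis, weight):
    w = weight / weight.sum()
    mean = (axis * w).sum()
    return np.sqrt(((axis - mean) ** 2 * w).sum())

X = np.fft.fft(x)
f = np.fft.fftfreq(len(x), 1 / fs)                  # two-sided spectrum

dt = rms_width(t, np.abs(x) ** 2)
df = rms_width(f, np.abs(X) ** 2)
print(f"dt * df = {dt * df:.4f}   (limit 1/(4*pi) = {1 / (4 * np.pi):.4f})")
# A Gaussian sits right at the limit; the paper's point is that listeners' joint
# judgments of timing and frequency can beat this product, ruling out simple
# linear-filter models of early auditory processing.
```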


I am sorry, I have to go now. I may be able to join tomorrow.


Best Regards,
Bohdan
 
Hi Earl,


I honestly do not know how this is possible.


Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides. Humans can discern interaural time differences of 10 microseconds or less.

Best Regards,
Bohdan

You have to be careful with simple explanations like the one in the Wiki. These numbers, from what I understand, might be best-case thresholds under lab conditions using headphones. For typical listeners to two speakers playing music in a reverberant room they could be much different. The 10 μs ITD does seem a bit extreme. I'd like to see where that number came from.
 
Hi Earl,

No, we are talking about an inter-channel timing difference of 20 μs.

"....channel-to-channel time offset equal to one sample period at 48 kHz is audible. This equates to 20 μsec of inter-channel phase distortion across the entire audio band. Holman [10] also mentions, “one just noticeable difference in image shift between left and right ear inputs is 10 μsec”.

BAS was talking about exactly the same thing.


Best Regards,
Bohdan

Preis (in your own post) did not agree with this assessment, and I agree with Preis. His study was much more complete. I do not know the conditions of the Holman claim. I'll stick with Preis and the 1 ms finding.
 
Bohdan

You do not seem to be cognizant of my work on the perception of group delay, because I do not disagree with the importance of time and transient response. What I disagree with is that the ear is capable of phase detection at mid to high frequencies. This is because the neural firings start to become random above about 1 kHz, and by 5 kHz they have no temporal correlation to the stimulus. This means that phase is not detectable at all. That does NOT mean that the ear is not capable of detecting ITD; it is. But phase and ITD are different things.

That "audiophile" rant you posted is misleading and not very accurate. Time is very important and a compact impules response is essential, but that IS NOT phase. Lets not mix up the two things.
 
Source: http://arxiv.org/pdf/1208.4611v2.pdf gave the following summary:

Best Regards,
Bohdan

This is a reliable reference and I do not disagree with any of it, if you read it correctly. You have to know what frequency range they are talking about, because what is true at LFs is the exact opposite at HFs, and they (or you) do not seem to highlight that fact. Localization occurs with both ITD and ILD, but in different frequency ranges.

If you really want to understand this stuff then read Blauert, "Spatial Hearing" - that is the most highly regarded reference. There is another book that I like, "The Cochlea", which gets into the details of the cochlear mechanics, physiology and neurology of hearing, but it can be very difficult to translate that into "stereo imaging". Even Blauert can be a difficult transfer of data to what we are talking about.

I have not been too impressed with your Wiki posts as they have made some errors.

At any rate, what does any of this have to do with waveguides? Should it be in another post? I'd prefer that this one attempt to stay on topic.