Sound Localization Discussion

I'm pretty new to diyaudio.com, but thought it was about time I started my own thread. I've browsed quite a few threads and haven't found one truly dedicated to sound localization yet, so I thought I might start one.

Here's what I know. Sound localization likely comes down to two things: intensity differences between the ears and phase differences. Intensity is a sure thing; I haven't read much that discusses phase, though. I don't believe low frequencies can be localized audibly at all, but perhaps there is some physical localization: if bass gets intense enough you can feel it, and perhaps feel where the force is coming from. I believe the secrets of the ear are truly what will lead us to better sound reproduction equipment.

I've searched around a bit on the internet and haven't found an abundance of information, but I did once read part of a study done by the Air Force that assessed localization of sounds while wearing protective ear muffs. The Air Force has an awesome sound lab from what I saw: a geodesic sphere with 4" Bose full-range drivers at each of its many, many vertices, the whole thing sitting inside an anechoic chamber. It seemed very cool, a true version of surround sound perhaps. What does anyone know about localization?



I like to figure stuff out
 
As food for thought consider binaural recordings over headphones.

I disliked these intensely because everything seemed to be
coming from behind me; even a recording of a plane passing
overhead changed direction over my head.

I think subtle head movement in real life quickly sorts
out front from back, along with, of course, visual cueing.

:) sreten.
 
I think subtle head movement in real life quickly sorts out front from back, along with, of course, visual cueing.

Additionally there is the HRTF (head-related transfer function) that helps with sound localisation, something that is out of service when headphones are worn.

Sound localisation was discussed in a great series of articles in EW+WW, written by John Watkinson. The name of the series was "Stereo from all angles".
Some of the thoughts and conclusions can be found at :
http://www.celticaudio.com/

A common misunderstanding is that the ear uses phase differences to determine the direction of a sound source. It is in fact time delay that is used for that purpose. The two are of course related, but one has to bear in mind that the phase difference can be calculated exactly for a given frequency and time delay, whereas the result is ambiguous when one tries to calculate the time delay from a phase difference.
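To put some numbers on that, here is a minimal Python sketch (the frequency and the delay are example values of my own, not taken from the articles). Going from delay to phase is unique; going from the wrapped phase back to a delay yields a whole family of candidates spaced 1/f apart:

```python
import numpy as np

# One interaural time delay maps to exactly one phase difference at a given
# frequency, but one measured (wrapped) phase maps to many possible delays.
# Frequency and delay below are example values only.

f = 2000.0      # frequency in Hz
itd = 0.0005    # interaural time delay in seconds (0.5 ms)

# Delay -> phase: unambiguous
phase = 2 * np.pi * f * itd
print(f"phase difference: {phase:.3f} rad")

# Phase -> delay: ambiguous; adding any whole number of cycles (1/f) to the
# delay leaves the measurable, wrapped phase unchanged.
wrapped = np.angle(np.exp(1j * phase))
candidates = [(wrapped + 2 * np.pi * k) / (2 * np.pi * f) for k in range(-2, 3)]
print("candidate delays (ms):", [round(d * 1e3, 3) for d in candidates])
```

The true 0.5 ms delay shows up as just one member of the candidate list, which is exactly the ambiguity described above.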


Regards

Charles
 
I remember reading in Yakov Perelman's "Physics For Fun" in my childhood that there is something about turning one's head toward the sound, and that sometimes one can actually become acoustically disoriented by moving the head, i.e. the source suddenly can no longer be pinpointed. Perelman described this experiment with a chirping cricket. Does anybody remember that experiment?
 
phase_accurate said:



A common misunderstanding is that the ear uses phase differences to determine the direction of a sound source. It is in fact time delay that is used for that purpose. The two are of course related, but one has to bear in mind that the phase difference can be calculated exactly for a given frequency and time delay, whereas the result is ambiguous when one tries to calculate the time delay from a phase difference.


So, if there is only one frequency played, and there is a phase difference between the ears, how does the brain figure out the time delay? I mean, the only thing that is available at the ears is the phase difference (for the stationary sinusoid).
 
Diffraction is one more factor.
At very low frequencies the sound waves diffract around our head so we cannot locate the sound source. This leads to near equal intensities at both ears.

Path length difference alone does not give us the cue directly. Differences in path length lead to phase differences, which become apparent as the frequency rises (wavelength reduces). Added to this is the reduction of diffraction around our head at mid and high frequencies, which leads to intensity differences.

A 35 ms time lag can be perceived by human ears, according to the Haas effect. That corresponds to a path length difference of roughly 40 feet. Our ears are only about 6-8 inches apart, I suppose.
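For a sense of scale, here is a rough sketch of the inter-ear delay implied by head geometry, using the Woodworth approximation (the head radius is an assumed typical value, not a measurement):

```python
import numpy as np

# Rough sketch of the interaural time difference (ITD) implied by head
# geometry, using the Woodworth approximation ITD ~= (r/c) * (theta + sin(theta)).

r = 0.0875   # head radius in metres (assumed ~8.75 cm)
c = 343.0    # speed of sound in m/s

for deg in (15, 30, 60, 90):            # source azimuth from straight ahead
    theta = np.radians(deg)
    itd = (r / c) * (theta + np.sin(theta))
    print(f"{deg:3d} deg -> ITD = {itd * 1e6:5.0f} us")

# Even at 90 degrees this tops out around 650-700 us, three orders of
# magnitude below the 35 ms Haas figure, which concerns echoes, not ears.
```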
 
A 35 ms time lag can be perceived by human ears, according to the Haas effect. That corresponds to a path length difference of roughly 40 feet. Our ears are only about 6-8 inches apart, I suppose.

The Haas effect deals with the perception and localisation of delayed sound sources; it has nothing to do with the interaural delay.

The accuracy of interaural perception is on the order of 16 µs!

So, if there is only one frequency played, and there is a phase difference between the ears, how does the brain figure out the time delay? I mean, the only thing that is available at the ears is the phase difference (for the stationary sinusoid).

Some good thoughts. If you think a little further you might come to a conclusion about what implications this has for spatial perception.
Just a hint: ever tried to localise a continuous sinusoidal sound source and compared that to the localisation of human speech?

Regards

Charles
 
Svante said:


So, if there is only one frequency played, and there is a phase difference between the ears, how does the brain figure out the time delay? I mean, the only thing that is available at the ears is the phase difference (for the stationary sinusoid).

The answer is: not very well at all. Localisation of undistorted
single frequencies is very poor; the ear works on time
differences, not phase. The sound of a match being struck,
for example, is extremely easy to locate, due to its rich
spectrum and consistent time delay between the ears.

If you are interested, read about autocorrelation functions. As far
as I understand it this is a good analogy, for starters, as
to how the brain processes timing information, and a
pointer as to why the ear canal is a transmission line.
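A minimal sketch of that correlation idea, with a made-up broadband signal and delay (the sample rate, signal length and delay are assumptions for illustration, not measurements):

```python
import numpy as np

# Estimate the interaural delay by cross-correlating the two ear signals.

fs = 48000                         # sample rate in Hz
n = 4800                           # 0.1 s of signal
true_delay = 20                    # delay in samples (~417 us at 48 kHz)

rng = np.random.default_rng(0)
src = rng.standard_normal(n)       # broadband, "match strike"-like signal
left = src
right = np.concatenate([np.zeros(true_delay), src[:-true_delay]])

# Full cross-correlation; the lag of the peak is the estimated delay
corr = np.correlate(right, left, mode="full")
lags = np.arange(-n + 1, n)
est = lags[np.argmax(corr)]
print(f"estimated delay: {est} samples ({est / fs * 1e6:.0f} us)")

# Replace src with a pure sinusoid and the correlation shows many equal
# peaks, i.e. an ambiguous delay -- matching the point made above.
```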

:) sreten.
 
The candidate has 100 points !:D

The ear does indeed use correlation between the different spectral parts of an acoustical event in order to achieve its high resolution in the measurement of interaural time delay.

It is also the starting transient that we use to perceive direction. So the START of a sinusoid gives us some hint about the direction of the source, while a stationary sinusoid gives us about the least information one could think of.
Ever tried to locate a sinusoidal sound source in a room? It gets even more difficult in a reverberant environment, while it is a fairly easy task when the source is human speech (which has lots of transients over a wide spectrum), even in a heavily reverberant environment.

The correlation function has an additional effect: small sources (the match is a very good example) generate every spectral part at points that are closely located in space. Therefore all of the initial spectral content is closely spaced in arrival time as well. Take a large instrument like a double bass and things look different.
We actually use this temporal distribution of the initial spectral content (amongst other cues like loudness, fundamental frequency etc.) to determine the size of a sound source.
Therefore the acoustic image of sound reproduction is blurred if insufficient care is taken over transient reproduction.

Regards

Charles
 
Delay is the more correct term to use in this case, not phase, although the two are unfortunately used interchangeably.

You can speak of delay with respect to a path-length-difference scenario, because all frequencies will be delayed in arrival time by whatever the speed of sound dictates, given a particular distance.

Or, in phase terms:
all frequencies travel at the speed of sound, regardless of wavelength, so
180 degrees of a high-frequency wave (half a cycle) takes less time to pass than 180 degrees of a low-frequency wave.
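A quick sketch of that, with example numbers of my own: a fixed arrival delay spans a different fraction of a cycle at each frequency, so the delay is constant while the phase shift is not.

```python
# The same 250 us arrival delay is half a cycle at 2 kHz but only 1/20 of a
# cycle at 200 Hz (and two full cycles at 8 kHz, where it wraps around).

delay = 250e-6                       # path-length delay in seconds (assumed)
for f in (200, 2000, 8000):          # example frequencies in Hz
    cycles = delay * f               # fraction of a cycle the delay spans
    print(f"{f:5d} Hz: {cycles:5.2f} cycles = {cycles * 360:6.1f} degrees")
```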

This thread is interesting, in that it's come down to all the dynamics that the ear uses to localize sound...
It's not just "phase or amplitude", but all the things related to the geometry of our own ears, and a lifetime of our brain taking in sounds, processing them, and learning about them.

There's a very interesting article that I read a while back, a "loudspeaker imaging theorem" that really seemed to nail it from the standpoint of recordings.
Basically, since the ear uses so many low-level cues, even lower-level cues than one might imagine, it is important to record at that fine level of detail, play back at that fine level of detail, and have audio gear that can reproduce that fine level of detail, to really get the "is it live... or is it Memorex?" effect that we're obviously trying to get to, here. :cool:

The article seems very much on-topic with respect to what we are talking about here, albeit with that slightly different bent:
http://diyaudiocorner.tripod.com/imaging.htm
 
A few questions and comments I have. If the ears have no perception of phase, then why does everyone go so psycho about it when it comes to crossovers? I guess crossovers exhibit a time delay too that everyone just calls phase. Some people might be right to say crossover phase changes could affect integration, but I've read some stuff in audiophile magazines claiming that changing the phase can totally change the sound. I guess they might have been on crack.

Anyway, head movement seems like it would be our #1 factor in localization. So my idea is this: we all say ****** it to 5.1 and switch to 2.1 using headphones and a subwoofer channel. We could then equip the headphones with some sort of device that monitors head position and accurately changes the mix coming through the phones, so that sounds would come from directions fixed with respect to our environment rather than our head. The subwoofer would cover all of the low frequencies and give the all-over feeling that headphones lack. We need some GOOD headphones though, with great transient response.
 
The SHAPE of the ear can by itself allow us to determine front-to-back localization.

Case in point: a friend mounted two mikes pointing away from each other with the hot ends about 8 inches apart. Onto these mikes he mounted silicone rubber mouldings of his wife's ears, the outsides as well as the inside of the "tunnel". While I was wearing earphones, he, hidden behind others, shook keys behind the array. I was not paying attention to what he was doing, but nonetheless turned completely around and looked behind me, trying to see what the sound was. Note: I had already LOCATED it; I was trying to find out WHAT it was. There was NOTHING behind me, so I suddenly realized what had occurred and looked for him. And yes, I found that he was shaking keys behind the array, but he was, in fact, well in front of me. We later made extremely good recordings using his array. This was, as I recall, in the late '60s.
 
For phase try Rod's pages here:
http://sound.westhost.com/pcmm.htm
and here:
http://sound.westhost.com/ptd.htm

Interestingly (depending on the recording), absolute phase is quite audible on the output of a minimum-phase system. (Note that Rod's statements with regard to absolute phase were in the context of reproducing a square wave.) This of course seems to contradict the notion that the ear only perceives time-based discrepancies. Also consider the notion of phase shifting with processes like Q Sound and SRS: is the phase shift simply providing an audible time delay to alter the sound, or is there something audible in the altered phase response irrespective of time delay?

I'll also add my own theory on the major reason why 1st-order crossovers seem to reproduce better-localized sound.

First of all, have you noticed that most speakers with higher-order crossovers tend to "fall apart" sonically, particularly with regard to imaging, when you increase the SPL? On the other hand, 1st-order designs don't seem to do this as badly (up to a point); moreover they often image better when cranked up a little. Fullrange drivers tend to behave similarly (though they don't get better as the signal gets louder).

1. Remember that the ear is VERY sensitive to time-based deviations, particularly in the 1-6 kHz range.

2. Now look at the excursion levels of the tweeter vs. the midrange or woofer for a given output. Because a 1st-order design has a shallower slope, the tweeter is passed greater low-frequency levels, and as a result its excursion increases. For a moderate increase of SPL in a 1st-order design, the tweeter and the midrange/woofer may well have comparable excursion levels. Of course, if you increase the SPL even higher, the midrange/woofer will likely exceed the excursion of the tweeter, depending on how low the midrange operates.
Because higher-order crossovers limit the low-frequency output to the tweeter, tweeters using these slopes will have less excursion (for equal crossover points), and therefore will likely NOT be comparable at anything past a VERY low SPL.
Fullrange drivers, on the other hand, have their "tweeter" and "midrange/woofer" at nearly comparable excursion levels at all times. (I say "nearly" here because most do not behave as pure pistons as frequency increases.)
Of course, what I'm getting at is that this difference in excursion levels creates a very small, frequency-dependent time-domain error when the SPL is increased on some higher-order-crossover loudspeakers. The way to reduce this problem is a dedicated midrange with limited low-frequency bandwidth and a sizeable surface area (though large surface areas create problems of their own).
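To put rough numbers on that argument, here is a back-of-envelope sketch. The crossover frequency is an assumed example value and the filter attenuation uses the asymptotic slope, so this is an illustration of the trend, not a driver model:

```python
import numpy as np

# Relative tweeter excursion below the crossover for different electrical
# slopes. For constant voltage drive (above resonance) a piston's excursion
# rises about 12 dB/octave as frequency falls; the crossover then attenuates
# the drive by roughly 6 dB/octave per filter order.

fc = 2500.0                                  # crossover frequency in Hz (assumed)
freqs = (312.5, 625.0, 1250.0, 2500.0)       # 3, 2, 1 and 0 octaves below fc

for order in (1, 2, 4):                      # 6, 12 and 24 dB/oct slopes
    print(f"{order * 6:2d} dB/oct high-pass:")
    for f in freqs:
        excursion_db = 40 * np.log10(fc / f)        # +12 dB/oct toward low f
        filter_db = -20 * order * np.log10(fc / f)  # asymptotic attenuation
        net_db = excursion_db + filter_db
        print(f"  {f:6.1f} Hz: {net_db:+6.1f} dB re excursion at fc")
```

With the 6 dB/octave slope the tweeter's excursion keeps climbing below the crossover point, staying in the same ballpark as the midrange/woofer; with the 24 dB/octave slope it collapses, which is the disparity described above.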
 
I read John Watkinson's articles; he states that the ear is very sensitive to localisation, much more so than to distortion, and that loudspeakers generally do not reproduce this information accurately enough. Localisation was important from a survival point of view, back when we had predators.

I had the idea of using some signal processing so that, when wearing headphones, the image is out in front. The disturbing thing I find about headphones is that the music appears to originate inside the head, which feels unnatural. It puts me off using headphones.
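A common first step in that direction is a simple crossfeed network: mix a little of each channel, low-passed and slightly delayed, into the opposite ear. Here is a minimal sketch; the cutoff, delay and level are assumed starting points of my own, not values from Watkinson's articles:

```python
import numpy as np
from scipy.signal import butter, lfilter

def crossfeed(left, right, fs, cutoff=700.0, delay_s=0.0003, level=0.3):
    """Return (left_out, right_out) with simple crossfeed applied."""
    b, a = butter(1, cutoff / (fs / 2))          # gentle 1st-order low-pass
    d = int(round(delay_s * fs))                 # interaural-ish delay in samples

    def feed(x):
        y = lfilter(b, a, x) * level             # low-passed, attenuated copy
        return np.concatenate([np.zeros(d), y[:len(y) - d]])

    return left + feed(right), right + feed(left)

# Example: 1 s of stereo noise at 44.1 kHz
fs = 44100
rng = np.random.default_rng(1)
L = rng.standard_normal(fs)
R = rng.standard_normal(fs)
L_out, R_out = crossfeed(L, R, fs)
```

It won't place the image truly in front (that needs something like HRTF processing), but it does pull the sound out of the middle of the head for many recordings.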
 