Measurement of phantom source localisation

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Can I use one sound card as output and another sound card as input with this software?
You need
- two L and R outputs for playing the sweep file
- one mic input for recording

There is only one sweep file, with L and then R, so that the real timing between L and R is preserved and the PITD stays true. You may use IRs recorded with other software, but then you can never be sure of the timings: software generally aligns each IR automatically to a certain peak.
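To illustrate the "one file, L then R" idea, here is a minimal sketch of how such a stimulus could be built. It is not the actual sweep file of the software: the sweep formula (a standard exponential sine sweep), the frequency range, the duration, the gap length and the names `log_sweep` and `left_then_right` are all assumptions chosen for illustration. The point is only that both sweeps share one time axis, so the offset between them is known exactly.

```python
import math

def log_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz (assumed stimulus, not the software's)."""
    n = int(round(duration * fs))
    k = math.log(f2 / f1)
    return [
        math.sin(2 * math.pi * f1 * duration / k * (math.exp(i / fs / duration * k) - 1.0))
        for i in range(n)
    ]

def left_then_right(sweep, gap_samples):
    """Build (L, R) frames: sweep on L, a silent gap, then the same sweep on R."""
    silence = [0.0] * gap_samples
    left = sweep + silence + [0.0] * len(sweep)
    right = [0.0] * len(sweep) + silence + sweep
    return list(zip(left, right))

FS = 48000                                  # assumed sample rate
sweep = log_sweep(20.0, 20000.0, 1.0, FS)   # assumed 1 s sweep, 20 Hz to 20 kHz
frames = left_then_right(sweep, gap_samples=FS // 2)
r_start = len(sweep) + FS // 2              # exact sample where the R sweep begins
```

Because `r_start` is fixed by construction, any extra delay found between the two recorded responses comes from the acoustic paths, not from per-IR alignment.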
 
I am curious though, how would one distinguish between a phase problem vs amplitude problem?
I have now separated views of PITD and PILD, so you can see both effects.

For example, here I moved the mic only 10 cm to the left: amplitude changes minimally, but time (PITD) moves the phantom image quite a lot to the left.
Blue is the amplitude, red is time and green is combination of both.
pops110.png
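As a rough illustration of separating the two effects, the sketch below takes the L and R impulse responses measured at the mic and returns a time difference and a level difference. This is not the method the software uses (which is not described here): peak picking for the time difference, broadband energy for the level difference, the function names and the synthetic IRs are all assumptions.

```python
import math

def peak_index(ir):
    """Index of the largest absolute sample (crude arrival-time estimate)."""
    return max(range(len(ir)), key=lambda i: abs(ir[i]))

def energy_db(ir):
    """Broadband energy of an impulse response in dB."""
    return 10.0 * math.log10(sum(x * x for x in ir))

def time_and_level_difference(ir_left, ir_right, fs):
    """Return (dt, dl): dt > 0 means R arrives later, dl > 0 means L is louder."""
    dt = (peak_index(ir_right) - peak_index(ir_left)) / fs
    dl = energy_db(ir_left) - energy_db(ir_right)
    return dt, dl

# Synthetic example (assumed values): R arrives 14 samples later and slightly weaker.
fs = 48000
ir_l = [0.0] * 256
ir_l[100] = 1.0
ir_r = [0.0] * 256
ir_r[114] = 0.9
dt, dl = time_and_level_difference(ir_l, ir_r, fs)
```

A real analysis would do this per frequency band, since both quantities vary with frequency. The broadband version only shows that time and level can be read independently from the same pair of IRs.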

I think you really need very high-directivity speakers to keep the image central, and stable across frequency! ;)
 
Clearly bias exists in your setup for high directivity speakers. Your bias is based on previous experiences with toe-in for your speakers.

Rest of experiment is crude reproduction of HRTF data collection. Image location in recordings is well defined as head tracking setups with HRTF corrections demonstrate.

Fuzziness in image is all about the speakers ability to act as single source in space and time. First approximation demonstrating this is tendency of smaller speakers producing more image detail than larger speakers. Smaller speaker inherently has smaller footprint in space and time than larger speaker. For same speaker locations and listener location smaller speaker subtends smaller angle.

When listening triangle is expanded so that larger speaker subtends same angle at listening position as smaller speaker with smaller listening triangle imaging detail of two systems becomes more consistent. However, bigger speaker with bigger listening triangle implies bigger listening space.
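The subtended-angle argument can be put into numbers with simple geometry. This is only a sketch: it treats the speaker as a flat radiating width facing the listener, and the widths and distances are made-up example values, not measurements of any real speaker.

```python
import math

def subtended_angle_deg(width, distance):
    """Angle (degrees) that a source of a given width subtends at the listener."""
    return math.degrees(2.0 * math.atan(width / (2.0 * distance)))

def distance_for_angle(width, angle_deg):
    """Listening distance at which a source of a given width subtends angle_deg."""
    return width / (2.0 * math.tan(math.radians(angle_deg) / 2.0))

# Assumed example: a 0.20 m wide speaker heard from 2.5 m.
small = subtended_angle_deg(0.20, 2.5)
# Same distance, speaker twice as wide: larger angle.
large_same_distance = subtended_angle_deg(0.40, 2.5)
# Distance needed for the 0.40 m speaker to subtend the same angle as the small one.
big_distance = distance_for_angle(0.40, small)
```

With these example numbers the wider speaker needs twice the listening distance to subtend the same angle, which is the "bigger listening space" consequence mentioned above.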

A much better approach is to assess the angular separation of two elements within a phantom image: at what angular separation do two vocalists sound like they are coming from two locations instead of one? This becomes the blur factor of the speaker system. Instead of two vocalists, various signals may be substituted, and the synthesis of locations may be investigated with respect to the panning techniques used to place virtual sources into the phantom image. All results come from the perceptual domain of listeners, which leads to credible results.

A variation of approach: Instead of just two speakers, each is represented as pair of miniature monitors placed side by side. Starting with arrayed monitors touching, the pair of arrays are set up in listening triangle, and may subjectively be toed in/out to get perceived best image in conjunction with optimizing triangle size and location in listening space. From such a baseline the arrays are modified by introducing spacing in arrayed speakers and listening results obtained.

What we hear listening to speaker is convolution of signal with space/time signature of speaker.
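For readers who have not seen it written out, here is a minimal sketch of that convolution in the time domain only. The "speaker signature" used is a made-up two-tap impulse response, purely as an assumption for illustration. A real speaker's response also varies with direction, which this one-dimensional example ignores.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution: each input sample triggers a scaled copy of the IR."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# Assumed toy values: a short signal and a speaker that adds one delayed, weaker copy.
signal = [1.0, 2.0, 3.0]
speaker_ir = [1.0, 0.5]
heard = convolve(signal, speaker_ir)
```

The longer and more spread out the impulse response, the more each input sample is smeared over time in `heard`.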
 
I have now separated views of PITD and PILD, so you can see both effects.

For example, here I moved the mic only 10 cm to the left: amplitude changes minimally, but time (PITD) moves the phantom image quite a lot to the left.
Blue is the amplitude, red is time and green is combination of both.
pops110.png

I think you really need very high-directivity speakers to keep the image central, and stable across frequency! ;)
This seems consistent with the toe-in findings of many. If the speakers are toed in the right way, it seems that imaging and depth improve. But I also find that some people do not prefer this.
 
Administrator
Joined 2004
Paid Member
I had a strange phantom image experience yesterday. I was listening to Firesign Theater's Everything You Know Is Wrong on vinyl. In the part where Nino the Mind Boggler moves from the telephone to the TV, his voice was very precisely 2 feet to the right of my right speaker. Solid.

When Nino was in the telephone on the left his voice was pegged to the left speaker, as I might expect because it's panned hard left. But over on the right - 2 feet past the speaker, as clear and solid as you could want. Weird..... (They don't call him the Mind Boggler for nothing).

I do sometimes hear music that extends past the speakers, but it's nebulous. To hear a voice pegged precisely 2 feet past the speaker was freaky.
 
Here comes my question: how can we objectively measure subjective localisation?

Hi jlo, here are some previous studies with subjective experiments:
In rooms:
http://www.pa.msu.edu/acoustics/rooms1.pdf
https://secure.aes.org/forum/pubs/journal/?elib=6050

In anechoic chamber:
AES E-Library Horizontal Plane Localization Ability and Response Time as a Function of Signal Bandwidth

Here was an attempt at purely objectively assessing localization with a "virtual listener":
AES E-Library Objective Assessment of Phantom Images in a 3-Dimensional Sound Field Using a Virtual Listener

Dave
 
Some years ago I used a bowling ball to get decent data for head diffraction. It was actually reasonably close to the real data, but with a bowling ball the results could also be compared to numerical models. Used bowling balls are dirt cheap, almost free. Any object in the middle of the two IRs would be better than nothing.
I've been wanting to do some measurements similar to this and figured the bowling ball to be a great idea. Or maybe a soccer ball. But then I got to looking at the dimensions and saw that the diameter of a bowling ball or a #5 football (soccer) is larger than the typical distance between the ears by at least 50%. More for the soccer ball.

How much would this matter?
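One way to get a first feel for the size effect is the classic rigid-sphere (Woodworth) approximation for the interaural time difference, ITD = (a / c) * (theta + sin(theta)), with sphere radius a, speed of sound c and source azimuth theta between 0 and 90 degrees. This is only a sketch under assumptions: the model is a high-frequency approximation for a distant source with "ears" on opposite sides of the sphere, the 8.75 cm head radius is a commonly quoted textbook value, and the 1.5 scale factor simply takes the "50% larger" figure above at face value.

```python
import math

def woodworth_itd(radius_m, azimuth_rad, c=343.0):
    """Rigid-sphere ITD in seconds for a distant source, azimuth in [0, pi/2]."""
    return radius_m / c * (azimuth_rad + math.sin(azimuth_rad))

HEAD_RADIUS = 0.0875              # assumed textbook head radius, metres
BALL_RADIUS = 1.5 * HEAD_RADIUS   # assumed: ball 50% larger, per the estimate above

azimuth = math.radians(30.0)      # source direction of a standard stereo speaker
itd_head = woodworth_itd(HEAD_RADIUS, azimuth)
itd_ball = woodworth_itd(BALL_RADIUS, azimuth)
ratio = itd_ball / itd_head
```

In this model the ITD scales linearly with radius, so a ball 50% larger gives time differences 50% larger at every azimuth. Level differences from diffraction would also shift down in frequency, which this formula does not capture.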
 
- should we also measure with non-central phantom sources?
- and most important, does this analysis correlate with perception? How can we check it precisely?


To measure a phantom image you first have to generate it. To generate a phantom image you have to assume a panning law. Which law to use?

How do you know which panning law was used in which recording? You don't. You can always guess, but how precise can a guess be ...

For stereo only three positions can be defined precisely: Center, and extreme left and right. But left and right images are real sources. So for stereo only center image can be defined precisely.
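To make the ambiguity concrete, the sketch below compares two well-known amplitude panning laws, the sine law, sin(phi) / sin(phi0) = (gL - gR) / (gL + gR), and the tangent law, which has tan in place of sin. Here phi0 is the speaker half-angle. The 30 degree base angle, the example gains and the function names are assumptions for illustration, and neither law is claimed to match what any given recording used.

```python
import math

def pan_ratio(g_left, g_right):
    """Normalised gain difference, +1 for hard left, -1 for hard right."""
    return (g_left - g_right) / (g_left + g_right)

def sine_law_angle(g_left, g_right, base_deg=30.0):
    """Predicted image angle (degrees, positive toward left) under the sine law."""
    s = pan_ratio(g_left, g_right) * math.sin(math.radians(base_deg))
    return math.degrees(math.asin(s))

def tangent_law_angle(g_left, g_right, base_deg=30.0):
    """Predicted image angle (degrees, positive toward left) under the tangent law."""
    t = pan_ratio(g_left, g_right) * math.tan(math.radians(base_deg))
    return math.degrees(math.atan(t))

# Same gains, two laws: the predicted position differs for intermediate pans.
mid_sine = sine_law_angle(1.0, 0.5)
mid_tangent = tangent_law_angle(1.0, 0.5)
centre = (sine_law_angle(1.0, 1.0), tangent_law_angle(1.0, 1.0))
hard_left = (sine_law_angle(1.0, 0.0), tangent_law_angle(1.0, 0.0))
```

Both laws agree at centre and at the hard-panned extremes and differ in between, which fits the point that only those three positions are unambiguous.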

But now even the center phantom image is obsolete since we have the center speaker.

It's a lot of fun, sure, but ... :D


- Elias
 
And it can work very, very well. I know a lot of people don't believe it, haven't heard it or won't admit it - but 2 channel stereo can work stunningly well to create a 3D image.

The fun here is in finding out what allows it to do that in some cases, but not in most.
 
And it can work very, very well. I know a lot of people don't believe it, haven't heard it or won't admit it - but 2 channel stereo can work stunningly well to create a 3D image.

The fun here is in finding out what allows it to do that in some cases, but not in most.

You're chasing a ghost. Those are binaural artefacts. This is not stereo. Stereo is two speakers in an (acoustically controlled) small room.
 
Here comes the cavalry ... :)

Step one is to get the images to step clear of the speakers, this is the sufficient quality problem. Trying to measure image localisation otherwise is like trying to test whether tyres rated at 150mph are up to scratch, by using a car with a 100mph top speed ...

Step 2 is having a good enough stereo audio system in a room with no other speakers: people who have no prior knowledge of that room are brought in blindfolded and asked to point to where they perceive a particular sound to be coming from. Perhaps they can be set up with "expectation bias", by the false prior statement that the room will contain a very large array of speakers in all the locations that they will hear sounds coming from, to see if that makes a difference ...
 
You're chasing a ghost. Those are binaural artefacts. This is not stereo. Stereo is two speakers in an (acoustically controlled) small room.

Or he knows how to pick up a book.

" pertaining to a system of sound recording or reproduction using two or more separate channels to produce a more realistic effect by capturing the spatial dimensions of a performance (the location of performers as well as their acoustic surroundings), used especially with high-fidelity recordings and reproduction systems"
 