Can I use one sound card as output and another sound card as input with this software?
You need:
- two L and R outputs for playing the sweep file
- one mic input for recording
There is only one sweep file, with L and then R in sequence, so that the real timing between L and R is known and the true PITD is preserved. You may use IRs recorded with other software, but you will never be sure of the timings: software generally aligns each IR automatically to a certain peak.
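To illustrate why a single L-then-R file keeps the inter-channel timing, here is a minimal sketch. It is not the software discussed in this thread: the function names, the exponential (Farina-style) sweep, the 48 kHz rate and the half-second gap are all assumptions chosen for illustration.

```python
import math

def log_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz lasting `duration` seconds."""
    n = int(duration * fs)
    k = math.log(f2 / f1)
    return [
        math.sin(2 * math.pi * f1 * duration / k
                 * (math.exp(i / fs / duration * k) - 1))
        for i in range(n)
    ]

def stereo_sweep_frames(sweep, gap):
    """Sweep on L, then `gap` silent samples, then the same sweep on R.

    Returns (frames, r_offset): frames is a list of (left, right) pairs and
    r_offset is the exact sample index at which the R sweep starts.
    """
    r_offset = len(sweep) + gap
    frames = [(s, 0.0) for s in sweep]       # left channel only
    frames += [(0.0, 0.0)] * gap             # silence on both channels
    frames += [(0.0, s) for s in sweep]      # right channel only
    return frames, r_offset

fs = 48000  # assumed sample rate
sweep = log_sweep(20.0, 20000.0, 1.0, fs)
frames, r_offset = stereo_sweep_frames(sweep, gap=fs // 2)
print(len(frames), r_offset)
```

Because both sweeps are played and recorded in one pass, the R response starts exactly `r_offset` samples after the L response, so any remaining difference in arrival time comes from the acoustics and not from per-file peak alignment.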
I am curious though, how would one distinguish between a phase problem vs amplitude problem?
I have now separated views of PITD and PILD, so you can see both effects.
For example, here I moved the mic only 10 cm to the left: you can see that the amplitude changes very little, but time (PITD) moves the phantom image quite a lot to the left.
Blue is the amplitude, red is time and green is the combination of both.

I think you really need very high directivity speakers to keep the image central and stable with frequency! 😉
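A rough geometric check of the 10 cm mic shift described above can be sketched as follows. This is only a free-field illustration with assumed values (point-source speakers, 1/r level decay, 343 m/s, a 2.5 m equilateral triangle); it ignores speaker directivity and the room, and it is not the analysis behind the plots.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def arrival_differences(mic_x, mic_y, half_spacing, depth):
    """Return (delay_ms, level_db) of the right speaker relative to the left.

    Speakers sit at (-half_spacing, depth) and (+half_spacing, depth).
    Positive delay means the right speaker arrives later; negative level
    means the right speaker is quieter at the mic.
    """
    d_left = math.hypot(mic_x + half_spacing, depth - mic_y)
    d_right = math.hypot(mic_x - half_spacing, depth - mic_y)
    delay_ms = (d_right - d_left) / SPEED_OF_SOUND * 1000.0
    level_db = 20.0 * math.log10(d_left / d_right)  # 1/r spreading only
    return delay_ms, level_db

side = 2.5                          # assumed triangle side in metres
half = side / 2.0
depth = side * math.sqrt(3.0) / 2.0
delay_ms, level_db = arrival_differences(-0.10, 0.0, half, depth)
print(f"delay {delay_ms:.2f} ms, level {level_db:.2f} dB")
```

With these assumed dimensions the shift gives roughly 0.29 ms of time difference but only about 0.35 dB of level difference, which is consistent with the observation that time moves the image far more than amplitude does.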
Clearly bias exists in your setup for high directivity speakers. Your bias is based on previous experiences with toe-in for your speakers.
Rest of experiment is crude reproduction of HRTF data collection. Image location in recordings is well defined as head tracking setups with HRTF corrections demonstrate.
Fuzziness in image is all about the speaker's ability to act as single source in space and time. First approximation demonstrating this is tendency of smaller speakers to produce more image detail than larger speakers. Smaller speaker inherently has smaller footprint in space and time than larger speaker. For same speaker locations and listener location, smaller speaker subtends smaller angle.
When listening triangle is expanded so that larger speaker subtends same angle at listening position as smaller speaker with smaller listening triangle, imaging detail of two systems becomes more consistent. However, bigger speaker with bigger listening triangle implies bigger listening space.
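The subtended-angle argument can be put into numbers with a small sketch. The baffle widths and distances below are hypothetical, and treating a speaker as a flat width seen head-on is a simplifying assumption.

```python
import math

def subtended_angle_deg(width, distance):
    """Angle (degrees) subtended by an object of `width` seen from `distance`."""
    return math.degrees(2.0 * math.atan(width / (2.0 * distance)))

def distance_for_same_angle(width_large, width_small, distance_small):
    """Distance at which the larger speaker subtends the same angle as the
    smaller one does at `distance_small` (equal width/distance ratio)."""
    return distance_small * width_large / width_small

# Hypothetical widths: 0.15 m mini monitor, 0.30 m floorstander, both at 2 m.
small = subtended_angle_deg(0.15, 2.0)
large = subtended_angle_deg(0.30, 2.0)
matched_distance = distance_for_same_angle(0.30, 0.15, 2.0)
print(f"{small:.1f} deg vs {large:.1f} deg; match at {matched_distance:.1f} m")
```

Doubling the width requires doubling the listening distance to keep the same subtended angle, which is why the larger speaker implies a larger listening space.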
Much better approach is assessing angular separation of two elements within a phantom image: at what angular separation do two vocalists sound like they are coming from two locations instead of one? This becomes blur factor of speaker system. Instead of two vocalists, various signals may be substituted, and synthesis of locations may be investigated as to panning techniques used to place virtual sources into phantom image. All results come from perceptual domain of listeners, and lead to credible results.
A variation of approach: Instead of just two speakers, each is represented as pair of miniature monitors placed side by side. Starting with arrayed monitors touching, the pair of arrays are set up in listening triangle, and may subjectively be toed in/out to get perceived best image in conjunction with optimizing triangle size and location in listening space. From such a baseline the arrays are modified by introducing spacing in arrayed speakers and listening results obtained.
What we hear listening to speaker is convolution of signal with space/time signature of speaker.
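As a toy illustration of the convolution statement above, here is a minimal time-domain sketch. The two impulse responses are made-up numbers, and only the time part of the "space/time signature" is modelled.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

click = [1.0, 0.0, 0.0, 0.0]   # idealised input signal
compact = [1.0]                # hypothetical speaker acting as a single point in time
smeared = [0.6, 0.3, 0.1]      # hypothetical speaker that spreads energy over time

print(convolve(click, compact))
print(convolve(click, smeared))
```

The compact response returns the click unchanged, while the smeared response spreads it over three samples, which is the time-domain counterpart of a larger "footprint".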
This seems consistent with the toe-in findings of many. If the speakers are toed in the right way, it seems that imaging and depth improve. But I also find that some people do not prefer this.
I had a strange phantom image experience yesterday. I was listening to Firesign Theatre's Everything You Know Is Wrong on vinyl. In the part where Nino the Mind Boggler moves from the telephone to the TV, his voice was very precisely 2 feet to the right of my right speaker. Solid.
When Nino was in the telephone on the left his voice was pegged to the left speaker, as I might expect because it's panned hard left. But over on the right - 2 feet past the speaker, as clear and solid as you could want. Weird..... (They don't call him the Mind Boggler for nothing).
I do sometimes hear music that extends past the speakers, but it's nebulous. To hear a voice pegged precisely 2 feet past the speaker was freaky.
Hi Pano
Well that is interesting umm, we have been to the old same place.
Did you notice that when Nino moved to the Tee&Vee that you also heard the old time flyback transformer ring up high?
Best,
Tom
Long time fan of fst
Ha! No Tom, I didn't. But my hearing up that high is gone. 15.75 kHz is a distant memory. I wouldn't hear a flyback now if it bit me in the ear. 😉 I'll listen again.
Here comes my question: how can we objectively measure subjective localisation?
Hi jlo, here are some previous studies based on subjective experiments:
In rooms:
http://www.pa.msu.edu/acoustics/rooms1.pdf
https://secure.aes.org/forum/pubs/journal/?elib=6050
In anechoic chamber:
AES E-Library Horizontal Plane Localization Ability and Response Time as a Function of Signal Bandwidth
Here was an attempt at purely objectively assessing localization with a "virtual listener":
AES E-Library Objective Assessment of Phantom Images in a 3-Dimensional Sound Field Using a Virtual Listener
Dave
Some years ago I used a bowling ball to get decent data for head diffraction. It was actually reasonably close to the real data, but with a bowling ball the results could also be compared to numerical models. Used bowling balls are dirt cheap, almost free. Any object in the middle of the two IRs would be better than nothing.
I've been wanting to do some measurements similar to this and figured the bowling ball to be a great idea.

How much would this matter?
Pano
It would matter some, but options are quite limited unless you get an actual dummy head. They are available for hearing research and not too expensive. I recently used one for Lidia, placing mics in the ears.
Thanks, I'll look around for one. Or maybe one of the kid size (#3) soccer balls could do the trick.
Will report what I find.
Mounting a mic in the ball could be tough, but the size is right.
The heads are cheap, but need some work to install ears and mics.
If you cannot find the head let me know and I'll ask Lidia where she got it.
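For a rough feel of how the size of the sphere affects the interaural time difference, here is a sketch using Woodworth's rigid-sphere approximation, ITD = (a/c)(θ + sin θ). This is a textbook model swapped in for illustration, not the diffraction data mentioned above, and the radii are nominal values assumed here, not measurements.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def woodworth_itd_ms(radius_m, azimuth_deg):
    """Woodworth ITD (ms) for a rigid sphere and a distant source.

    Valid for azimuths between 0 (straight ahead) and 90 degrees (to the side).
    """
    theta = math.radians(azimuth_deg)
    return radius_m / SPEED_OF_SOUND * (theta + math.sin(theta)) * 1000.0

# Nominal radii in metres (assumptions, not measured values).
spheres = {
    "average head": 0.0875,
    "size 3 soccer ball": 0.095,
    "bowling ball": 0.108,
}

for name, radius in spheres.items():
    print(f"{name}: {woodworth_itd_ms(radius, 90.0):.2f} ms at 90 degrees")
```

In this model the ITD scales linearly with radius, so a sphere larger than a head overstates the time difference in proportion to its size.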
- Should we also measure with non-central phantom sources?
- And most important, does this analysis correlate with perception? How can we check it precisely?
To measure a phantom image you first have to generate it. To generate a phantom image you have to assume a panning law. Which law to use?
How do you know which panning law was used in which recording? You don't know it. You can always guess, but how precise can a guess be ...
For stereo only three positions can be defined precisely: Center, and extreme left and right. But left and right images are real sources. So for stereo only center image can be defined precisely.
But now even the center phantom image is obsolete since we have the center speaker.
It's a lot of fun, sure, but ... 😀
- Elias
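As one concrete example of the panning laws referred to above, here is a sketch of the stereophonic tangent law, tan(φ)/tan(φ0) = (gL − gR)/(gL + gR), with constant-power normalisation. Which law a given recording used is not known, as the post says; the ±30° base angle and the sign convention (positive angles toward the left speaker) are assumptions.

```python
import math

def tangent_law_gains(angle_deg, base_deg=30.0):
    """Constant-power (g_left, g_right) placing a source at `angle_deg`.

    `base_deg` is the half-angle between the speakers as seen by the listener.
    """
    r = math.tan(math.radians(angle_deg)) / math.tan(math.radians(base_deg))
    norm = math.sqrt(2.0 * (1.0 + r * r))   # makes g_left**2 + g_right**2 == 1
    return (1.0 + r) / norm, (1.0 - r) / norm

def tangent_law_angle(g_left, g_right, base_deg=30.0):
    """Predicted image angle (degrees) for a given pair of channel gains."""
    r = (g_left - g_right) / (g_left + g_right)
    return math.degrees(math.atan(r * math.tan(math.radians(base_deg))))

g_left, g_right = tangent_law_gains(10.0)
print(g_left, g_right, tangent_law_angle(g_left, g_right))
```

The law only predicts where an image should appear for given gains under idealised conditions; whether listeners actually hear it there is the open question of this thread.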
And it can work very, very well. I know a lot of people don't believe it, haven't heard it or won't admit it - but 2 channel stereo can work stunningly well to create a 3D image.
The fun here is in finding out what allows it to do that in some cases, but not in most.
You're chasing a ghost. Those are binaural artefacts. This is not stereo. Stereo is two speakers in an (acoustically controlled) small room.
Siegfried Linkwitz has many papers and links about phantom imaging on his blog.
Linkwitz Lab
Linkwitz Lab - Loudspeaker Design
Here comes the cavalry ... 🙂
Step 1 is to get the images to step clear of the speakers; this is the sufficient quality problem. Trying to measure image localisation otherwise is like trying to test whether tyres rated at 150mph are up to scratch by using a car with a 100mph top speed ...
Step 2 is having a good enough stereo audio system in a room with no other speakers: people who have no prior knowledge of that room are brought in blindfolded and asked to point to where they perceive a particular sound to be coming from. Perhaps they can be set up with "expectation bias", by the false prior statement that the room will contain a very large array of speakers in all the locations that they will hear sounds coming from, to see if that makes a difference ...
You're chasing a ghost. Those are binaural artefacts. This is not stereo. Stereo is two speakers in a (acoustically controlled) small room.
Or he knows how to pick up a book.
" pertaining to a system of sound recording or reproduction using two or more separate channels to produce a more realistic effect by capturing the spatial dimensions of a performance (the location of performers as well as their acoustic surroundings), used especially with high-fidelity recordings and reproduction systems"