Crossover Group Delay Audibility Testing - please take part!

Been watching this. 🙂 Question, Charlie: it seems odd to me that a 0.9 ms variation between the DC level and the peak is statistically audible whereas a 3 ms difference isn't. But heck, I can't hear any difference at all. Then again, I don't listen for nuances anymore.

Also, I'm guessing, but it appears that the 2nd case, 0.45 ms DC, 0.55 ms peak, is what you would get with a 1 kHz LR4 crossover.

Hi John, nice to see you posting here and thanks for your interest! It is a bit odd that the case with the highest peak group delay was not also significant, but that is probably due to the limited number of submissions, errors in submissions due to lack of familiarity with the test, etc. The case that is significant is not so by much, and the higher GD case is not too far behind but does not meet the 95% confidence level. Given 50 responses there must be 33 to reach the 95% confidence level.

Good observation about the 2nd case. It is very close to the GD response you get from an LR4 but it is much steeper in the transition band. It's a new crossover that I have developed (one of around 30). I wrote a paper on these that just came out in audioXpress (September 2024 issue). Have you read it by any chance? I think you would enjoy it.
 
Was the crest factor in the different tests very different?

Crest factor is a likely clue. Volume envelopes may change significantly with selective phase shift changes.
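
To make the crest factor point concrete, here is a minimal Python sketch (an illustration only, not the signals or filters used in the test). It builds a band-limited square wave from odd harmonics and then shifts every harmonic by the same phase. The harmonic count, the sample count, and the constant 90 degree shift are all assumptions chosen to make the effect easy to see; a real crossover's phase response varies with frequency.

```python
import math

def synth_period(phase_shift, n_harmonics=20, n_samples=4096):
    """One period of a band-limited square wave whose harmonics are all
    shifted by `phase_shift` radians (hypothetical, frequency-independent)."""
    odd = [2 * i + 1 for i in range(n_harmonics)]  # 1, 3, 5, ...
    return [
        sum(math.sin(2 * math.pi * k * n / n_samples + phase_shift) / k for k in odd)
        for n in range(n_samples)
    ]

def rms(x):
    """Root-mean-square level of the samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def crest_factor_db(x):
    """Peak-to-RMS ratio in dB."""
    return 20 * math.log10(max(abs(v) for v in x) / rms(x))

original = synth_period(0.0)          # ordinary square wave
shifted = synth_period(math.pi / 2)   # same harmonics, phases rotated by 90 degrees

print(f"RMS original {rms(original):.4f}, shifted {rms(shifted):.4f}")
print(f"Crest factor original {crest_factor_db(original):.1f} dB, "
      f"shifted {crest_factor_db(shifted):.1f} dB")
```

The two RMS values come out equal because only the phases changed, while the peak (and so the crest factor) of the shifted version is clearly higher. A signal that was normalized to full scale before this kind of processing would no longer fit within full scale afterwards.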

Yes, my systems do not play the "Rectangular Impulse" or "Square Impulse" (it would be nice to have consistent naming).
The "Play Reference Audio" signal has an extreme crest factor. Neither my headphones nor my loudspeaker system can reproduce the extreme peak.
The 6 "play options" have much lower crest factors.

Keep that in mind when turning up the volume to hear a difference between the "reference" and any "play option". When I raise the overall system volume, the "reference" hits the finite headroom of my systems far sooner than any of the "play options".

Headphones and loudspeaker system alike!


And by the way:
the first thing you need to do is establish whether your test subjects have "normal" hearing.
Do NOT play the "certified listener" card before a test is technically valid.

Best regards
Bernd
 
@Hörnli
I'm not quite sure what you mean when you said, above:
"My systems do not play the..." - by this do you mean you hear nothing? It's a WAV file...

BUT! You bring up (tangentially with your mention of crest factor) a very good point, one that I did not consider: clipping.
Explanation (maybe you know this, but not all do): when you apply some weird group delay response to a square wave, the resulting waveform may have increased peak amplitude. The overall sound level should remain exactly the same, but the re-arrangement of the waveform may change the crest factor. If the square wave was already spanning the maximum signal range and then the processing introduces some peaking in the waveform, BLAM, you have clipping. Clipping can be audible and may be a confounding factor in the current tests. I did not check the waveforms carefully for these sorts of time domain problems, but it is yet another thing I am learning as I go through this process, and it will be on the list of things to improve for the next round of testing. OTOH, if the processing that adds group delay changes the crest factor, there is nothing I can do about it and I will not attempt to normalize that out.

If the above paragraph makes no sense, you can watch this favorite video of mine:
 
The video attached to #83 is one that people use to make you think humans can't hear effects of phase shifts. Of course it's also possible to disprove that idea and show the opposite, that humans can hear phase changes as amplitude modulation or even frequency modulation. https://ptt.purifi-audio.com/blog/tech-notes-1/doppler-distortion-vs-imd-7

So, anyway, it turns out details matter. Sometimes phase changes can plainly be audible, other times not. In some cases it turns out (at least some) people can learn to notice some difference in the sound.
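
A small sketch of how a pure phase change can show up as amplitude modulation, under assumptions of my own (a carrier with one pair of sidebands and a modulation index of 0.5; none of this is taken from the linked article or from the test signals). With the sidebands in phase with the carrier you get classic AM. Rotating both sidebands by 90 degrees leaves the magnitude spectrum unchanged but gives something close to phase modulation, with an almost flat envelope.

```python
import cmath
import math

def envelope_range(sideband_phase, m=0.5, n_points=1000):
    """Min and max of the envelope of a carrier plus two sidebands.

    The envelope is the magnitude of the complex baseband signal
    1 + (m/2)*exp(j*(wt + phase)) + (m/2)*exp(j*(-wt + phase)),
    evaluated over one modulation period.
    """
    values = []
    for i in range(n_points):
        wt = 2 * math.pi * i / n_points
        baseband = (1
                    + (m / 2) * cmath.exp(1j * (wt + sideband_phase))
                    + (m / 2) * cmath.exp(1j * (-wt + sideband_phase)))
        values.append(abs(baseband))
    return min(values), max(values)

am_min, am_max = envelope_range(0.0)           # sidebands in phase: AM
pm_min, pm_max = envelope_range(math.pi / 2)   # sidebands in quadrature

print(f"in-phase sidebands:   envelope {am_min:.3f} .. {am_max:.3f}")
print(f"quadrature sidebands: envelope {pm_min:.3f} .. {pm_max:.3f}")
```

The envelope swings from 0.5 to 1.5 in the first case and only from 1.0 to about 1.118 in the second, even though the three components have identical amplitudes in both.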
 
It's like many other types of corruptions of audio signals (analog and digital forms) - there is usually a threshold below which humans are not sensitive to the corruption, a range just above the threshold where perceiving it is difficult and may require special conditions to do so, and finally an even higher level that is plainly audible. So, yes, it depends on the details just like many other things.

What I like about the video is that it demonstrates that when there are clearly very large amounts of phase "distortion" that make the waveform look very, very different, the "sound" is still not all that different from the original, at least for casual listening. I think this is a very surprising result for many viewers.

From the scientific literature it is clear that when you design carefully controlled listening tests with very particular and usually completely synthetic (non-music) signals, the threshold for audibility can be established and is more or less reproduced from test to test despite differences in the protocol used for each.
 
When we talk about thresholds of audibility for one person, it means the level at which that person can hear a given stimulus 50% of the time.

When we talk about thresholds of audibility for all humans on earth, there is no way to measure all humans on earth. The best we can do is to make an estimate, which is the average value at which (usually for people with normal hearing) someone is expected to be able to hear a given stimulus 50% of the time.

The other type of auditory measurement is the Absolute Threshold of Hearing, which is defined as above, except that it applies to single tones only. That's because when tones are combined they interact in the auditory system in such a way as to give different numbers compared to single-tone-at-a-time measurements.
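
The 50% definition above can be sketched in a few lines of Python. This is only an illustration: the function name, the stimulus levels, and the detection rates are all made up, and real studies usually fit a psychometric curve rather than interpolating linearly between points.

```python
def threshold_50(levels, detect_rates):
    """Estimate the stimulus level detected 50% of the time by linear
    interpolation between measured (level, detection rate) points."""
    pairs = sorted(zip(levels, detect_rates))
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if p0 == 0.5:
            return x0
        if p0 < 0.5 <= p1:
            # Straight line between the two points that bracket 50%.
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    # Only the last point can still sit exactly on 50%.
    if pairs and pairs[-1][1] == 0.5:
        return pairs[-1][0]
    return None  # detection rate never crosses 50% in the measured range

# Hypothetical data for one listener: stimulus level vs. fraction detected.
levels = [0.5, 1.0, 2.0, 3.0]
rates = [0.1, 0.3, 0.7, 0.9]
print(threshold_50(levels, rates))
```

With these invented numbers the 50% point falls halfway between the second and third levels.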



"The hearing threshold is the median value for otologically normal young adults. Consequently, 50% of subjects have a threshold which is more sensitive than the median and 50% have a less sensitive threshold"
https://www.sciencedirect.com/topics/biochemistry-genetics-and-molecular-biology/auditory-threshold#:~:text=The hearing threshold is the,dB below the standard threshold.




1725097958224.png


1725097987477.png


Images above taken from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6856372/


EDIT: Another thing to consider is that the above type of tests are usually intended to measure limitations of the physical ear, and not the brain processing of certain sounds (which may normally be ignored by the brain as "noise.") In the case where brain processing limits signal detection, and a test subject can be trained to notice lower level test signals, then that's not necessarily a limitation of the ear itself.
 
A brief excerpt on thresholds from a well-known reference on sensory evaluation is attached. It may be noted that the old intuitive idea of a threshold being a sharp cutoff point is often not the best fit to reality, and is for various reasons problematic to define. Modern scientific definitions should now be holding sway.
 

Hmmm, that's strange. I just tried it and it works for me just fine.

In any case, I will be shutting down the tests soon. I will put up something similar using instant switching, but will not be collecting statistics on who can hear what. There are just too many challenges to getting any meaningful information out of these sorts of tests without controlling variables of the playback chain and environment, which are completely out of my control when the test subjects are random people from around the world. I think there is a lot of useful feedback from doing these sorts of tests, so I hope to make them more of an informative test suite and will (hopefully) be able to provide feedback to each test taker. I still have to figure out how to do that programmatically.
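
One possible way to give each test taker feedback programmatically is an exact one-sided binomial test on their own score. This is only a sketch of the idea, assuming each trial is a two-choice forced guess with a 50% chance rate; the function name and the 16-trial example are hypothetical and not part of the existing test pages.

```python
from math import comb

def guessing_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (chance rate 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example feedback for a hypothetical 16-trial session.
for score in (9, 12, 14):
    p = guessing_p_value(score, 16)
    verdict = "unlikely to be guessing" if p < 0.05 else "consistent with guessing"
    print(f"{score}/16 correct: p = {p:.4f} ({verdict})")
```

The 0.05 cutoff mirrors the 95% confidence level used for the pooled results; a stricter cutoff may be more appropriate when many listeners are each tested individually.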
 
Indeed, that's strange. Maybe I was too fast / impatient during the test though.

With respect to my listening chain: Computer - RME Fireface UFX+ - SPL 2control headphone amp - Hifiman Sundara. You might ask listeners to briefly describe their listening equipment.

Did you check the Klippel listening test with respect to distortion - https://www.klippel.de/listeningtest/ - and how the setup of this test has been done:
https://www.klippel.de/listeningtest/index.php?page=how

It may be of some help for your follow-up setup.

Thanks again for your nice work. I like these kinds of tests very much - to me they are very educational for learning what I can hear and for identifying the really important topics in speaker development.
 
THE FIRST ANALYSIS OF THE TESTING IS HERE !

I finally got a total of 50 returns for TEST # 1, the square wave.

I analyzed the returns using Pearson's chi-square test, with 1 degree of freedom, at a confidence level of 95%, to obtain the statistical significance for each modification of the original audio. You can read more about this statistic and how to do the analysis at the following links:
https://en.wikipedia.org/wiki/Pearson's_chi-squared_test
https://longform.asmartbear.com/ab-testing-statistics/
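
For anyone who wants to reproduce the arithmetic, here is a minimal sketch of a 1-degree-of-freedom chi-square test against a 50/50 split, in plain Python. It is an assumption on my part that the counts are compared against an even split, and it is not stated whether a continuity (Yates) correction was applied, so the function offers both. For 1 degree of freedom the p-value can be written with `math.erfc`, which avoids needing SciPy.

```python
import math

def chi2_stat(count, total, yates=False):
    """Pearson chi-square statistic for `count` of `total` responses in one
    category, tested against an expected 50/50 split (1 degree of freedom)."""
    expected = total / 2.0
    deviation = abs(count - expected)
    if yates:
        # Optional continuity correction (an assumption, see the text above).
        deviation = max(deviation - 0.5, 0.0)
    # Both categories deviate by the same amount, hence the factor of 2.
    return 2.0 * deviation ** 2 / expected

def chi2_p_value_1dof(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def min_count_for_significance(total, alpha=0.05, yates=False):
    """Smallest majority count whose p-value falls below `alpha`."""
    for count in range(math.ceil(total / 2), total + 1):
        if chi2_p_value_1dof(chi2_stat(count, total, yates)) < alpha:
            return count
    return None

print(min_count_for_significance(50))              # uncorrected
print(min_count_for_significance(50, yates=True))  # with continuity correction
```

With 50 returns this gives a minimum of 32 without the correction and 33 with it. The latter matches the 33-of-50 figure quoted elsewhere in the thread, though that alone does not show which variant was actually used.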

The results are shown in the image below:
View attachment 1350218
Only one of the modifications is statistically audible using this test method.

The group delay response vs frequency for that modification looks like this:
View attachment 1350224

Based on the published literature, this should be audible, so the result is not a surprise. In any case it is great to have some data at last!

Thanks for participating. The tests are still open and ongoing. I will post if/when I get enough returns for the other tests, i.e. when they reach (hopefully) 50 returns. Currently all have around 35 returns each.



You could write a scientific paper on this.

I tried it on the laptop; on a cursory listen I didn't hear any differences at all.
What I notice is that it is a rather difficult/strenuous job for the analytical ear.
 
I sent him the processed audio files for the rectangular impulse test so that he could play with them in Audacity. There is a description of each plot in a couple of words of text, just below the plot at the bottom left corner.

For some reason he added 20dB of gain to each one, so there is a lot of clipping. These are NOT the original signals from the test.