When is low HD low enough?

There are plenty of self-proclaimed listening experts claiming to hear differences far beyond the thresholds of audibility.

You have no idea how many of them there are, roaming around here and there 😳. Strangely, they have a great fear of measurements, maybe because measurements are not biased...

@jan, those who can hear below-average thresholds are not 60+ years old, which excludes most of us 😎

I didn't see data on age versus group delay audibility at that link, especially for GD at LF.

Moving along, then: in the case of nonlinear distortion, please consider that age-related hearing loss is mostly at HF, which means the volume level must be turned up to hear HF content. Once it's turned up, HF may still be pretty audible.

So let's take the case of nonlinear distortion where, say, the distortion products fall in the low and midrange frequencies, which are less affected by the age-related reduction in HF sensitivity.

Any data to show a loss of ability to hear distortion products at lower frequencies?
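In the meantime, one can generate test stimuli and check by ear. A minimal sketch, assuming numpy and soundfile are installed; the 200 Hz fundamental and the -60 dB (0.1%) second harmonic are arbitrary choices:

```python
# Generate a clean tone and the same tone with a single low-order
# harmonic added at a known level, for by-ear comparison of HD
# audibility at low/mid frequencies. (Illustrative sketch only;
# frequency, level, and file names are arbitrary choices.)
import numpy as np
import soundfile as sf

FS = 48000        # sample rate, Hz
DUR = 3.0         # duration, seconds
F0 = 200.0        # fundamental, Hz (low/mid range)
HD2_DB = -60.0    # 2nd harmonic re: fundamental (-60 dB = 0.1%)

t = np.arange(int(FS * DUR)) / FS
fundamental = np.sin(2 * np.pi * F0 * t)
harmonic = 10 ** (HD2_DB / 20) * np.sin(2 * np.pi * 2 * F0 * t)

clean = 0.5 * fundamental
distorted = 0.5 * (fundamental + harmonic)

# 10 ms fades to avoid clicks at the ends.
fade = np.linspace(0.0, 1.0, int(0.01 * FS))
for x in (clean, distorted):
    x[:fade.size] *= fade
    x[-fade.size:] *= fade[::-1]

sf.write("tone_clean.wav", clean, FS)
sf.write("tone_hd2.wav", distorted, FS)
```

Raise or lower HD2_DB until the two files become distinguishable; that bracketing gives a rough personal threshold.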
 
It's good, of course.
But remember what I was told when I first met the pro audio world: "You can only judge equipment with recordings of acoustic instruments. You have no way of knowing what studio equipment should sound like."
Even if you are highly familiar with the sound of a real violin, do all violins sound the same?
How about the placement of the microphone, the sound of the microphone, mic preamp, and all the other electronics in the recording chain?
 
The results of such casual activities are often given as gospel.
That's certainly a problem, especially when the people who relay the gospel refuse to acknowledge or even consider some of the alternate explanations for their "findings", including psychological effects.

But when someone asks them to do it systematically, as in an ABX test where we can get an idea of the actual audibility, all of a sudden people get stressed and fatigued by being 'forced' to listen carefully, or 'their brain gets scrambled'. Didn't they listen carefully during the casual listening where they identified the differences?
It all sounds very much like a cop-out to me.
Yes and no. Forced choice is a known issue in some surveys, and ABX is a forced-choice type of test. Test participants tire and will start to select answers at random, which leads to Type II errors.

I would argue, however, that if the participants tire during the test, then the test should be restructured. Maybe present seven sound pairs, then a break, then another seven, and so on, until 21 sound sample pairs have been presented (enough for p = 0.05).
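For reference, the binomial arithmetic behind that trial count; a minimal sketch, assuming scipy is available:

```python
# One-sided binomial test for a forced-choice ABX run: with 21 trials
# at 50% chance, 15 or more correct is significant at the 5% level.
from scipy.stats import binomtest

n_trials = 21
for n_correct in range(13, 17):
    p = binomtest(n_correct, n_trials, p=0.5, alternative="greater").pvalue
    print(f"{n_correct}/{n_trials} correct: p = {p:.3f}")
# 13/21: p = 0.192
# 14/21: p = 0.095
# 15/21: p = 0.039  <- smallest score below p = 0.05
# 16/21: p = 0.013
```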

I would also argue that if we're even debating these things, then maybe the effect size isn't all that big. That is, maybe there is a perceptible difference between A and B, but it's so small that a large group of participants can't reliably detect or identify it.
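To put a rough number on that, a minimal power sketch (again assuming scipy; the 'true' hit rates are hypothetical):

```python
# Probability that a 21-trial test at the 15/21 criterion detects a
# listener whose true per-trial hit rate is p_true (statistical power).
from scipy.stats import binom

n_trials, threshold = 21, 15  # 15/21 is the p < 0.05 criterion above
for p_true in (0.5, 0.6, 0.65, 0.7, 0.8):
    power = binom.sf(threshold - 1, n_trials, p_true)  # P(X >= 15)
    print(f"true hit rate {p_true:.2f}: detection probability = {power:.2f}")
# A listener who truly hears the difference on 65% of trials passes
# only about a third of the time (power ~ 0.36); small audible
# differences mostly produce negative results.
```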

At least I'm not willing to throw ABX testing out with the bathwater just yet.

Tom
 
"There is considerable debate regarding preferred methodologies for high resolution audio perceptual evaluation. Authors have noted that ABX tests have a high cognitive load [11], which might lead to false negatives (Type II errors)."
https://www.researchgate.net/public...f_High_Resolution_Audio_Perceptual_Evaluation
I suggest that you (re-)read reference [11]. I'll save you the trouble of looking it up:
Jackson, Helen M.; Capp, Michael D.; Stuart, J. Robert; 2014; The Audibility of Typical Digital Audio Filters in a High-Fidelity Playback System [PDF]; Meridian Audio Ltd., Huntingdon, UK; Paper 9174; Available from: https://aes2.org/publications/elibrary-page/?id=17497

You'll find: "An ABX test requires that a listener retains all three sounds in working memory, and that they perform a minimum of two pair-wise comparisons (A with X and B with X), after which the correct response must be given; this results in the cognitive load for an ABX test being high." To me, this is pure speculation. They don't actually measure the cognitive load (which could be done) or even ask their test participants whether they feel more exhausted after the test than they did before.
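For context, the procedure itself is mechanically simple; here's a minimal sketch of a self-administered ABX trial runner (my own illustration, assuming soundfile and sounddevice are installed; "a.wav" and "b.wav" are placeholder stimuli):

```python
# Bare-bones ABX runner: per trial, play A, B, and a randomly chosen X,
# then ask which one X was. (Illustrative only; a serious test would
# allow replays, log trial data, and level-match the stimuli.)
import random
import soundfile as sf
import sounddevice as sd

def play(path):
    data, fs = sf.read(path)   # load stimulus
    sd.play(data, fs)          # start playback
    sd.wait()                  # block until finished

def abx(path_a, path_b, n_trials=21):
    correct = 0
    for trial in range(1, n_trials + 1):
        x = random.choice(["A", "B"])  # hidden identity of X
        for label, path in (("A", path_a), ("B", path_b),
                            ("X", path_a if x == "A" else path_b)):
            input(f"Trial {trial}: press Enter to hear {label} ")
            play(path)
        answer = ""
        while answer not in ("A", "B"):
            answer = input("X was A or B? ").strip().upper()
        correct += answer == x
    print(f"{correct}/{n_trials} correct")

abx("a.wav", "b.wav")
```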

The bit you quoted from Reiss (2016) says, "Authors have noted that ABX tests have a high cognitive load [11], which might lead to false negatives (Type II errors)." (emphasis mine).
Further, you quoted from Wikipedia, "...forced-choice tests such as ABX tend to favor negative outcomes when differences are small if proper protocols are not used to guard against this problem." (again, emphasis mine).

So I don't think you have enough data to dismiss ABX testing.

Tom
 
The idea that "you need to be trained to do ABX" reminds me of this:

https://goldenearsaudio.com/
The idea that being a trained listener would somehow be an advantage reminds me of this:

Olive, Sean E.; 2003; Differences in Performance and Preference of Trained versus Untrained Listeners in Loudspeaker Tests: A Case Study [PDF]; Research & Development Group, Harman International Industries, Inc., Northridge, CA; Available from: https://aes2.org/publications/elibrary-page/?id=12206

Specifically:
  • The loudspeaker preferences of trained listeners were generally the same as those measured using a group of nominally untrained listeners composed of audio retailers, marketing and sales people, audio reviewers, and college students.
  • Different groups of listeners use different parts of the preference scale. Trained listeners use the lowest part of the preference scale, indicating they may be more critical and harder to please.
  • There were clear correlations between listeners’ loudspeaker preferences and a set of acoustic anechoic measurements. The most preferred loudspeakers had the smoothest, flattest, and most extended frequency responses maintained uniformly off axis.

In other words, there's no advantage to being a trained listener. Trained listeners have the same preferences as anyone pulled off the street at random. All listener groups preferred the loudspeaker that performed best in objective tests.

Tom
 
An ABX test requires that a listener retains all three sounds in working memory...
Exactly! When the differences are very small, and/or if they are very peculiar, unnatural sounds, memorizing them is required. It's hard. I know because I have done it. It gets harder and harder to keep it clear in my head with more and more trials. My brain starts to listen for what is the same about A and B, not what is different.

Have you ever tried this with something hard you could barely memorize?

The process of willful memorization amounts to one of self-training, IMHO.

Regarding training in the literature, there is evidence that the false-negative bias of ABX tends to improve with practice. Whether or not that was in the reference I gave, it is in other references. Perhaps, with enough comfort and familiarity, System 1 processes can be relied upon more for consistent results.
 
It's all in good fun, @Nico Ras .

I occasionally cherry-pick a few statements to dismantle, but it would be a full-time job to do more, and I'm constantly accused of misunderstanding something or having entirely the wrong idea.

I have mentioned "failure of imagination" before, and it seems to apply here, too. If the limits of hearing are being tested, there is no point deciding early on that variable X, say, gold plating on speaker cable terminals, is obviously snake oil so it can be safely ignored when switching between amplifiers and speakers with a box of relays. I'm no fan of speaker cable lore, but illogical, circular reasoning also irks me.

"We already know X Y Z is far below the threshold of hearing, so it couldn't possibly mess up A B or C in our test to find out what the threshold of hearing actually is for just the real stuff."
 
Many tests are negative or inconclusive for any number of reasons including, but not limited to,

1. The ability of the listener to distinguish between two different sounds is overrated.
2. The system has one or more errors.
3. The system performance is not up to the task: resolution, dynamics, musicality.
4. The tester didn’t follow instructions.
5. The recording or the system used for testing is in reverse polarity.
 
A favourite / pet peeve:
4.1: Everybody blindly follows the instructions, and nobody double-checks whether the instructions themselves are wrong.

This can be extended to cargo cultism, or to the 'scientific method', which is profoundly unscientific and misleading right down to the name. One would think that something with the word "scientific" in its name would actually be scientific, right? Except it's not. Maybe the intention was good, but I'm not even certain of that. There are a few of those opposite names floating around, and not just in Orwell's books.

That's not to poke holes in your list, just touching on a related point: you could have really limited resources and do very 'poor' testing, but still be able to come up with some thoughtful conclusions. What I find is that, over time, I memorise lots of one-off experiments, not at all dressed up in a white lab coat (expensive equipment, colourful graphs, and the latest software), and come up with an overarching meta-analysis. Obviously it's not rock solid, but I already know it's not.
 
With all due respect, that is a very confusing post.
What is the value of 'thoughtful conclusions' based on 'very poor testing'?
Then you say:
memorise lots of one-off experiments, not at all dressed up in a white lab coat (expensive equipment, colourful graphs, and the latest software), and come up with an overarching meta-analysis. Obviously it's not rock solid, but I already know it's not.
If it is not rock solid (and I agree it isn't), what is its value?
It's totally untrustworthy because the meta-analysis is based on untrustworthy data.
Garbage in - garbage out comes to mind.

Jan
 
Many tests are negative or inconclusive for any number of reasons including, but not limited to,

1. The ability of the listener to distinguish between two different sounds is overrated.
2. The system has one or more errors.
3. The system performance is not up to the task: resolution, dynamics, musicality.
4. The tester didn’t follow instructions.
5. The recording or the system used for testing is in reverse polarity.
6. There is no discernible difference between the two stimuli.

Tom
 