What kind of evidence do you consider as sufficient?

Status
Not open for further replies.
Whenever controversial audio effects (or rather the audibility of those effects) are discussed, sooner or later someone demands evidence, usually in the form of some sort of "blind test".

Let's assume that level matching is given. What else (with respect to test conditions) is needed for you to consider the evidence sufficient, even though you were previously convinced that no difference can be heard?
 
First off, a claim must be made. That means the testing starts with someone who claims to be able to hear something in sighted listening with a particular system.

If we're speaking "evidence", then the factor under test has to be isolated as much as possible. It has to mean blind testing, keeping the system under test as similar as possible to the one that allowed a claim to be made.

I'm not very strict on how "blindness" is achieved. But if the test has to convince people, then it must involve a fair third party (with no particular interest in the outcome) to control the process.

And finally, the test has to be documented, so it can be reproduced.

It has been made clear that blind testing is stressful but I don't see how to avoid its use. Restricting the test to a very particular claim (allowing for training if needed) and allowing the person taking the test as much familiarity with the test setup as he wants might help.
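One way to make the "documented, so it can be reproduced" part concrete is to have the third party generate the trial order from a recorded random seed. The following is only a minimal sketch under assumptions of mine: the function name `make_schedule`, the labels "A" and "B", and the choice of a balanced schedule are illustrative, not something prescribed in this thread.

```python
import random


def make_schedule(trials: int, seed: int) -> list:
    """Return a shuffled, balanced list of "A"/"B" presentations.

    Hypothetical helper: the seed is written into the test protocol so
    that anyone can regenerate exactly the same order afterwards.
    """
    if trials <= 0 or trials % 2 != 0:
        raise ValueError("trials must be a positive even number")
    schedule = ["A"] * (trials // 2) + ["B"] * (trials // 2)
    # A private Random instance keeps the result independent of any
    # other use of the global random module.
    random.Random(seed).shuffle(schedule)
    return schedule


if __name__ == "__main__":
    # The third party keeps this list hidden from listener and operator.
    print(make_schedule(16, seed=2024))
```

A balanced schedule is a design choice, not a requirement. If the listener gets feedback after each trial, independent coin flips per trial may be preferable, because a balanced list becomes partly predictable towards the end.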
 
The science of measuring human response is well developed, so no real inventions are required. It begins with a hypothesis, typically a null hypothesis such as:

-The audible signature of Cable A cannot be distinguished from Cable B

An experimental apparatus is constructed to allow a listening panel to test the hypothesis. It is critical that neither the listening panel nor the experimenter conducting the trial can detect which experimental variant (Cable A or B) is currently in circuit. The panel would then be exposed to multiple randomized sessions in which each cable is auditioned, and the listeners would signal a preference at designated intervals. Once the data is collected, the null hypothesis is tested to determine the outcome of the experiment. An outcome consistent with the null hypothesis would show preference scores of around 50%, indicating that the panel chose between the cables at random. Statistical techniques such as Student's t-test would assess how probable scores at least as extreme as those observed would be if the null hypothesis were true; if that probability is small enough, the null hypothesis is rejected and the panel is taken to have detected a difference in the audible signature. In the event the null hypothesis is rejected, follow-on tests could be designed to characterize more narrowly the differences detected by the panel.
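To illustrate the last step, here is a minimal sketch of how the preference counts could be checked against the 50% null hypothesis. It uses an exact binomial test rather than the t-test mentioned above, since each trial is a binary choice; that substitution, the function names, and the example counts are my assumptions, not part of the described procedure.

```python
from math import comb


def binomial_tail(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5), i.e. pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def preference_p_value(prefers_a: int, trials: int) -> float:
    """Two-sided exact p-value under the null 'A and B are chosen 50/50'.

    With a symmetric null, the two-sided value is twice the tail
    probability of the more extreme count, capped at 1.
    """
    if not 0 <= prefers_a <= trials:
        raise ValueError("prefers_a must lie between 0 and trials")
    extreme = max(prefers_a, trials - prefers_a)
    return min(1.0, 2 * binomial_tail(extreme, trials))


if __name__ == "__main__":
    # Hypothetical panel result: Cable A preferred in 12 of 16 trials.
    print(preference_p_value(12, 16))
```

A small p-value only says the data would be unusual under guessing. It does not by itself say how large or how important the difference is.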
 
Wrong question, to start with. You need to define a hypothesis before looking into a) what kind of test would fit and b) what would be considered proof (or disproof) of the hypothesis. There is no one-size-fits-all approach.
 
What kind of evidence do you consider as sufficient? It's the one that sellers / shills object the most to.
 
It has been made clear that blind testing is stressful
By whom, where and when? Was it by those who sell / shill for high-price audio gear?

but I don't see how to avoid its use. Restricting the test to a very particular claim (allowing for training if needed) and allowing the person taking the test as much familiarity with the test setup as he wants might help.
That's basically a standard.
 
Jacob, I wish I could give you a straight answer that didn't have conditionals. Obviously I agree on the need for a clear hypothesis. "It depends" would be my most honest answer. Preregistration of test protocols (DBT with positive and negative controls, or prior research into the capability of the testers under similar conditions) and training results showing that variability is heading towards its asymptote are a good start. Consult with a statistician and have a pre-trial analysis workflow established.
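As one example of what a pre-trial analysis could contain, the sketch below estimates how many trials a single listener would need before the test has a reasonable chance of detecting a real ability. Everything in it is an assumption for illustration: a one-sided exact binomial test, a significance level of 0.05, a target power of 0.8, and an assumed true hit rate `p_true` that would have to be justified in the preregistration.

```python
from math import comb


def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


def critical_count(n: int, alpha: float = 0.05) -> int:
    """Smallest number of correct answers that is significant under guessing.

    Returns n + 1 when no achievable score reaches significance.
    """
    for k in range(n + 2):
        if upper_tail(k, n, 0.5) <= alpha:
            return k
    return n + 1


def power(n: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance of a significant result if the true hit rate is p_true."""
    return upper_tail(critical_count(n, alpha), n, p_true)


def trials_needed(p_true, alpha=0.05, target_power=0.8, max_trials=200):
    """First trial count whose power reaches the target, or None."""
    for n in range(1, max_trials + 1):
        if power(n, p_true, alpha) >= target_power:
            return n
    return None


if __name__ == "__main__":
    # Assumed hit rates; weaker effects need many more trials.
    for p in (0.9, 0.75, 0.65):
        print(p, trials_needed(p))
```

Because the exact test is discrete, power does not rise smoothly with the number of trials, so the first count that reaches the target is a rough guide rather than a guarantee for nearby counts. This is the kind of detail worth settling with a statistician beforehand.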

This conveys my sentiment better than I can write here:
“P < 0.05” Might Not Mean What You Think: American Statistical Association Clarifies P Values | JNCI: Journal of the National Cancer Institute | Oxford Academic
 
And what does the literature say about mental burden during preference testing and its effect on test results?
About DBT of audio gear in the comfort of the listener's own setting, at their own pace and for as long as they like? Nothing detrimental to the listener's ability to hear. It only becomes a problem after the results don't agree with the preexisting narrative of a certain group of people, not beforehand.
 