DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Time isn't the only issue. Not knowing upends the playing field as badly as knowing. It's a perception test: you can't muck about with the perceptive state of the listener without prejudicing it one way or the other. It's inherent.

I'm not advocating a viewpoint here, just pitching for a clear understanding of the differences between blind drug trials and blind listening. The obfuscation causes a lot of unnecessary, and ill-founded, argument.

It would be possible, I think, to cast a light on central issues in play, but people generally seem to enjoy digging into a preconceived position on one side of the 'pro/anti-audiophile' war and lobbing bombs, instead of trying to answer the question.
 
But on the other hand, and going back to your previous post - familiarity is important. You knew the sound of your keys from years of hearing them. I might ask you while driving "what's that funny noise?" but you don't hear it because you haven't spent 100s of hours driving my car. Well, you might hear it, but not be able to pick it out or be conscious of it.

We don't usually get that in formal audio tests.
 
'Familiarity' is another way of saying 'well-modeled'. You know the sound of your engine when it's healthy because you've spent many hours listening to it. You know the imprint of your listening room's acoustic well enough to deduct it from what you hear in it - or at least make allowances. I find people are more acute listeners in their own environments. In time, you model the bad impression your phone speaker makes of a saxophone, and interpret it as a saxophone.

Although it can be surprisingly adept, hearing is a relatively weak sense: we don't have much grey matter assigned to it, and much of the processing happens subconsciously. There's a real reporting issue here, too: there's so little language specifically assigned to auditory phenomena (we borrow most of it from sight and touch). Very often we have a vague, nagging sensation that something 'isn't quite right' about a component, or that there's some ineffable little quality about it we can't describe. It's a problem. The criticism 'it's all in the mind' is almost a truism, although not everything we 'hear' comes through our ears.

I have a friend who auditions equipment by NOT listening to it - he distracts himself by reading, for instance. The act of 'formally' concentrating with puckered brows might not even be the best way to harness mental horsepower.

As the tone of conversation often reveals, too little forum activity focuses on how to be a better listener.
 
So, as is being asked here - what is the value of sighted listening vs ABX blind testing? Maybe some aspects of our auditory perception are not instant A/B differences but rather more subtle differences which are teased out over time. Being able to A/B these subtle differences may well require training/experience to differentiate them.

Retiring to my bunker now with hard-hat in place :D

Your post contains some useful points. Not that it would be a particularly useful skill, but I wonder whether it would be possible to train someone specifically to pass blind tests. The bulk of the training would lie in rapid mental modeling and stress management, not merely 'goldenear college'. It would be Zen schnizzle.

But already that treats the wrong question too seriously!
 
The only valid conclusion to be drawn from blind testing audio equipment is that it's the wrong methodology to tell us anything useful about audio equipment. It does, however, tell us something interesting about perception: differences vanish when 'blind', and reappear when sighted: rather like holding your nose when tasting cinnamon.

I find the analogy weak and the conclusion flawed. What exactly is "something useful" about audio equipment? Why would seeing the brand of amplifier or speaker have anything to do with the process except through bias? Listening in the dark vs. in an environment with lots of visual stimuli is a totally different issue.

Smell and taste are connected in a very profound way; a better analogy would be sighted and unsighted tasting. The average person, when blind, often cannot tell the difference between pork and veal, but a trained food professional does much better. Tasting while holding your nose is like listening while wearing earplugs.
 
Your post contains some useful points. Not that it would be a particularly useful skill, but I wonder whether it would be possible to train someone specifically to pass blind tests. The bulk of the training would lie in rapid mental modeling and stress management, not merely 'goldenear college'. It would be Zen schnizzle.

But already that treats the wrong question too seriously!
Yes, that's exactly my point - ABX testing (the usual sort of blind testing quoted on forums) is really just a party trick (or trap).

Can quick A/B listening reveal some differences? Yes, it can. Does it reveal all the sonic differences between two devices? No, I don't believe it does, due to lots of factors that you & others have already mentioned. One of the greatest factors (for non-trained individuals) is that the usual approach to ABX blind listening is a concerted focus on finding some specific part of the audio excerpts which nails the difference conclusively, so it is repeatably identifiable. This concretely identifiable difference is seldom found. As you quite rightly state, this is a change in how we normally listen to our system & is the trap that most fall into. The next big trap is being able to repeat, a statistically significant number of times, the often subtle difference we think/feel we have identified in a particular excerpt or spot in the audio playback. These two traps in themselves are enough to catch the unwary & only by training can they be mitigated.

It's the instigation & title of this thread that shows exactly the party tricks being used & the traps that many fall for
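The "statistically significant number of times" trap is easy to quantify. Under the null hypothesis that the listener is guessing, each ABX trial is a fair coin flip, so the one-sided binomial tail gives the p-value. A minimal sketch (the 12-of-16 figure is a commonly quoted benchmark, used here purely as an illustration):

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: the chance of scoring at least `correct`
    out of `trials` purely by guessing (p = 0.5 per ABX trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

p_pass = abx_p_value(12, 16)  # ~0.038: conventionally 'significant'
p_fail = abx_p_value(11, 16)  # ~0.105: consistent with guessing
```

Note how unforgiving the step is: a single extra miss moves the run from 'significant' to 'indistinguishable from guessing', which is one reason short, casual ABX sessions rarely settle anything.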
 
I find the analogy weak and the conclusion flawed, what exactly is "something useful" about audio equipment?
What hubsand is saying is that the typical ABX test is just a test of how well we can overcome the perceptual traps inherent in the test - it tells us nothing useful about our long-term listening to our audio system.
Why would seeing the brand of amplifier or speaker have anything to do with the process except through bias. Listening in the dark vs. in an environment with lots of visual stimuli is a totally different issue.
You miss the point - the test introduces lots of other perceptual traps (effectively biases) while assiduously claiming that its goal is to eliminate bias. Only when people trained in what these perceptual traps are & how to overcome them do the test does it approach the claim made for it of 'eliminating bias' - most of the time, for most people, it introduces new biases.

Smell and taste are connected in a very profound way; a better analogy would be sighted and unsighted tasting. The average person, when blind, often cannot tell the difference between pork and veal, but a trained food professional does much better. Tasting while holding your nose is like listening while wearing earplugs.
There you go - you make my point in your text I bolded above.
 
What hubsand is saying is that the typical ABX test is just a test of how well we can overcome the perceptual traps inherent in the test

Sorry, he said blind testing, period - ABX is a red herring here; I don't subscribe to any protocol. Did you get the difference between holding your nose and covering your eyes while tasting? The analogy is so flawed that there is no point in discussing it.
 
Yes, that's exactly my point - ABX testing (the usual sort of blind testing quoted on forums) is really just a party trick (or trap).

Can quick A/B listening reveal some differences? Yes, it can. Does it reveal all the sonic differences between two devices? No, I don't believe it does, due to lots of factors that you & others have already mentioned. One of the greatest factors (for non-trained individuals) is that the usual approach to ABX blind listening is a concerted focus on finding some specific part of the audio excerpts which nails the difference conclusively, so it is repeatably identifiable. This concretely identifiable difference is seldom found. As you quite rightly state, this is a change in how we normally listen to our system & is the trap that most fall into. The next big trap is being able to repeat, a statistically significant number of times, the often subtle difference we think/feel we have identified in a particular excerpt or spot in the audio playback. These two traps in themselves are enough to catch the unwary & only by training can they be mitigated.

It's the instigation & title of this thread that shows exactly the party tricks being used & the traps that many fall for

Bring data for such strong statements that ABX is useless. There's been plenty showing its limitations, but the same goes for just about every testing protocol. And even more so for the test as a whole - replications, controls, etc.

A/B is rife with problems, too, mind you. So are triangle, forced pairing, ...

It's fashionable to blast ABX, but it's hard not to see most of the objections as more distraction than an earnest attempt to improve the problem.
 
Sorry he said blind testing period ABX is a red herring here, I don't subscribe to any protocol.
Well, understanding blind testing & the various strengths/weaknesses of each protocol happens to be very important. Ignoring this seems willfully obtuse.
Did you get the difference between holding your nose and covering your eyes while tasting? The analogy is so flawed that there is no point in discussing it.
In your post you made my point for me: "The average person when blind often can not tell the difference between pork and veal but a trained food professional does much better." This doesn't change a person's enjoyment of these differences when they encounter them outside of blind testing.

This is true of most perceptual testing - for Joe Average, it's not a test of real differences; it's a test of how perceptions are altered & conclusions made less assured by eliminating one perceptual mode - in this case, sight. In other words, it tells us nothing about the real differences between veal & pork in our usual consumption of these meats.
 
Bring data for such strong statements that ABX is useless.
I didn't say that.
There's been plenty showing its limitations,
Can you summarise these?
but the same goes for just about every testing protocol. And even more so for the test as a whole - replications, controls, etc.

A/B is rife with problems, too, mind you. So are triangle, forced pairing, ...

It's fashionable to blast ABX, but it's hard not to see most of the objections as more distraction than an earnest attempt to improve the problem.
Blind tests done by people who are trained in how to do them, & who recognise the traps/pitfalls & issues in each protocol, are useful.

That is not what is being represented in this thread, or the reason for its instigation. The title alone is enough evidence to prove the case of the abuse of such testing.
 
It's not bad to start here: Sensory Discrimination Tests and Measurements: Statistical Principles ... - Jian Bi - Google Books

(As much as one can, at least.) Remember that ABX is a reversed duo-trio.

I'd argue that any sort of ad-hoc, thrown-together method of testing has only as much merit as its experimental design. So this isn't a case of ABX being bad; it's a case of the hacked-together test having little merit. It could have been forced pairing (tetrad) and still been the same level of useless (never mind the mental load a tetrad puts on the listener versus a 3-sample test). We come to the same conclusion about this thread, however: nothing really was shown.

Testing at home, by and large, is an exercise for oneself, not really definitive in any way. No one's going to worry about characterizing their controls and test bounds (and replicates/resampling/etc).
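One concrete consequence of protocol choice: the guessing baselines differ (chance is 1/2 for ABX/duo-trio, 1/3 for triangle and 3-AFC), so the score needed for statistical significance differs too. A rough sketch of that arithmetic, with a 16-trial run and alpha = 0.05 chosen purely for illustration:

```python
from math import comb

def min_correct(trials, p_chance, alpha=0.05):
    """Smallest number of correct answers whose one-sided binomial
    p-value (probability of doing at least that well by pure guessing)
    falls below alpha."""
    def tail(c):
        return sum(comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
                   for k in range(c, trials + 1))
    return next(c for c in range(trials + 1) if tail(c) < alpha)

need_abx = min_correct(16, 1/2)       # ABX/duo-trio: 12 of 16
need_triangle = min_correct(16, 1/3)  # triangle or 3-AFC: 9 of 16
```

The lower chance rate of the 1/3-baseline protocols makes a 'significant' score arithmetically easier to reach, which is separate from - and can mask - any difference in how hard the task feels to the listener.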
 
It's not bad to start here: Sensory Discrimination Tests and Measurements: Statistical Principles ... - Jian Bi - Google Books

(As much as one can at least); remember that ABX is a reversed duo-trio.

I'd argue that any sort of ad-hoc, thrown-together method of testing has only as much merit as its experimental design. So this isn't a case of ABX being bad; it's a case of the hacked-together test having little merit. It could have been forced pairing (tetrad) and still been the same level of useless (never mind the mental load a tetrad puts on the listener versus a 3-sample test). We come to the same conclusion about this thread, however: nothing really was shown.
I agree with all you say

Testing at home, by and large, is an exercise for oneself, not really definitive in any way. No one's going to worry about characterizing their controls and test bounds (and replicates/resampling/etc).
Yes, again I agree, and yet we find threads like this being started, with people believing that there is some value to such a test & discussing/believing the unsupported conclusions stated.
 
Bring data for such strong statements that ABX is useless. There's been plenty showing its limitations, but the same goes for just about every testing protocol. And even more so for the test as a whole - replications, controls, etc.

A/B is rife with problems, too, mind you. So are triangle, forced pairing, ...

It's fashionable to blast ABX, but it's hard not to see most of the objections as more distraction than an earnest attempt to improve the problem.

You, imo, simply have to see the "ABX" tag in the context of the origins of the "great debate" and its continuation in forum discussions.

Btw, along these discussions there is a lot to learn about cult-like belief frameworks, pseudo-objectivism and the neglect of nearly one hundred years of sensory evaluation, Signal Detection Theory and cognitive psychology. :cool:

That said, "ABX" is so often blasted because the demand is usually to "do an ABX" - when it should be "do a controlled (blind) listening test". Right from the beginning (and even then for unclear/unjustified/unresearched reasons) the focus was on the invention of an ABX comparator device, and today it is still mainly the popular "ABX tools" that are recommended and used.

And as stated in posts before, there is quite clear evidence (again going back to the origins of ABX in general, i.e. to the 1950s) that an ABX test gives inferior results when compared to A/B tests, even in a quite simple, one-dimensional and directional experiment (i.e. two tones, pitch of the second higher or lower).

And already then experts were relating the difference to the (assumed) higher internal "stress/distraction/whatever" evoked in the test participants.
I have cited some more comparisons between ABX and various other protocols coming from food-evaluation experiments, which showed that the proportion of correct answers was lower in the ABX tests than in 3-AFC (for example).

As you've said, there are drawbacks in other protocols as well; see for example the attached comparison (pooled from various different publications) between 3-AFC and triangle. There is a profound advantage in the proportion of correct answers for the 3-AFC.

I'm sure we all remember those posts about baseless "excuses from audiophiles", while in reality there is a lot of material available about the impacts evoked by the simple choice of test protocol.
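The 3-AFC vs triangle gap in the attached comparison has a standard Thurstonian explanation: at the same underlying sensory difference d', the two tasks' optimal decision rules yield different proportions correct. A small Monte Carlo sketch of that model, assuming unit-variance Gaussian percepts (the d' = 1 value and trial count are arbitrary illustrations):

```python
import random

def simulate(d_prime, trials, protocol, rng):
    """Monte Carlo of the Thurstonian model: each presentation is a
    unit-variance Gaussian draw; the odd sample's mean is shifted by d_prime."""
    correct = 0
    for _ in range(trials):
        odd = rng.gauss(d_prime, 1)                # the 'different' sample
        same = [rng.gauss(0, 1), rng.gauss(0, 1)]  # the two identical samples
        if protocol == "3afc":
            # 3-AFC: the assessor knows which attribute is stronger - pick the largest
            correct += odd > max(same)
        elif protocol == "triangle":
            # Triangle: pick the sample that stands apart from the closest pair
            xs = [("odd", odd), ("same", same[0]), ("same", same[1])]
            def apartness(i):
                return sum(abs(xs[i][1] - xs[j][1]) for j in range(3) if j != i)
            pick = max(range(3), key=apartness)
            correct += xs[pick][0] == "odd"
    return correct / trials

rng = random.Random(1)
p_3afc = simulate(1.0, 20_000, "3afc", rng)        # theory: ~0.63 at d' = 1
p_triangle = simulate(1.0, 20_000, "triangle", rng)  # theory: ~0.42 at d' = 1
```

Same stimuli, same sensory difference, yet markedly fewer correct answers under the triangle rule - which is why comparing raw percent-correct across protocols, without this kind of model, misleads.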
 

Attachments

  • ComparisonTriangle_3AFC.gif (21 KB)
Interesting graph, Jackob - thanks for posting - do you have a link to the source, please?

One question - is this graph representing percentage of correct answers for individual trials within tests or overall test results?

It's from:
Virginie Jesionka, Benoît Rousseau, John M. Ennis. Transitioning from proportion of discriminators to a more meaningful measure of sensory difference. Food Quality and Preference, 32 (2014), 77–82

In this graph it is mainly calculated for overall test results. For example, in the case of "Tedja et al." there was 1 main test subject, with the other two subjects used for confirmation purposes. Participant 1 got 50.4% correct in Triangle and 75.1% in 3-AFC, over 720 triangle and 718 3-AFC trials, while subjects 2 and 3 did 240 triangle and 240 3-AFC trials each.

In the actual articles the results are often more refined, even calculated for the different triplets presented.
 