DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
frugal-phile™
Joined 2001
Paid Member
You need to avoid the ''placebo effect'' in order to prove the efficacy of a drug.

The placebo effect is proven to be real (same as the hifi boutique salesperson's pitch), so the way to deal with it is to split the test subjects into two groups: one takes the real drug and the other takes a sugar pill that has no effect (besides placebo).

At the end, most of the time both groups show positive effects, and the real drug must prove itself more effective than the sugar pill (placebo) by a certain margin.

That is not an ABX test.

I don’t have issues with blind tests, just when the results are not interpreted correctly. An ABX test cannot be used to determine that 2 DUTs show no difference.

dave
 



In essence, the whole idea is to be able to tell one thing from another (A from B).

Couldn't be simpler than that: two things.

''Are you able to tell A from B?''

There is no interpretation possible, it's a clear-cut case of YES or NO. That's the beauty of it.

Then, the logic that comes along with that is: if you CAN'T tell A from B, how can you possibly prefer one over the other?

That's the spirit.
 
I wish I could enjoy my music on cheap gear. I have never understood the concept of comparing one product to another to see which one is 'best'. Whose best? It's a bit of nonsense really, as the gear I prefer to listen to, another person may dislike intensely.

Plus - My 'best' dac (highest specced, that is) is the Ayre but I spend most of my time listening to the modded Line Magnetic and occasionally, a NOS 1541A - it's all about the (listening to) music for me

I've taken part in numerous ABX double-blind tests over the years and find them mostly inconclusive as nearly all are short-term tests, there's little time to concentrate or focus your listening and the written evaluations are completely open to interpretation - you end up listening to differences, not audio reproduction.
 
And, yes Dave, you are right. Tests in pharmaceutical contexts are not ABX. They're not because it's impossible to work that way.

The participants (testees) have no way to try A, then B, then X, numerous times... for obvious reasons. Drugs cannot be tested the same way as audio components. It's not a matter of A/B within seconds or minutes, but over days or weeks, with a lot of biases.

So the way they do it is similar to ABX in essence, but with an A group and a B group. The placebo group becomes the baseline. A positive result would then be something like +5%, or whatever percentage they consider valid, for the non-placebo group.
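To make the two-group idea concrete, here is a minimal sketch of one common way to compare a drug group against a placebo group: a pooled two-proportion z-test. The function name and the numbers are made up for illustration, and real trials set their own criteria, so treat this as an assumption-laden sketch rather than how any particular study works.

```python
from math import erfc, sqrt

def two_proportion_p_value(success_a, n_a, success_b, n_b):
    """One-sided p-value for 'group A responds more often than group B'.

    Pooled two-proportion z-test (normal approximation).
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Upper tail of the standard normal distribution.
    return 0.5 * erfc(z / sqrt(2))

# Hypothetical numbers, NOT from any real trial:
# 60 of 100 improve on the drug, 45 of 100 improve on the sugar pill.
p_value = two_proportion_p_value(60, 100, 45, 100)
print(f"one-sided p = {p_value:.4f}")
```

The point of the sketch is that the deciding factor is how unlikely the gap is under pure chance given the group sizes, not a fixed percentage margin.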


A valid ABX differentiation is usually considered to be 17/20 or better.
For pharma tests, a valid differentiation is probably north of 5% over the placebo group. I don't know... Interesting, though.
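For the 17/20 figure, a quick sketch of what it means statistically: the probability of scoring at least that well by flipping a coin on every trial. The function name `abx_p_value` is just an illustrative choice.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

p_17_of_20 = abx_p_value(17, 20)
print(f"17/20 by guessing: p = {p_17_of_20:.5f}")
```

That works out to 1351 chances in 1,048,576, or roughly 0.13%, so a 17/20 score is very hard to reach by luck alone.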

here is a start:

Placebo-controlled study - Wikipedia
 
frugal-phile™
Joined 2001
Paid Member
Please tell me more about that.

Read the entire Wikipedia article you cited.

But here is the 1st paragraph (italics mine):

An ABX test is a method of comparing two choices of sensory stimuli to identify detectable differences between them. A subject is presented with two known samples (sample A, the first reference, and sample B, the second reference) followed by one unknown sample X that is randomly selected from either A or B. The subject is then required to identify X as either A or B. If X cannot be identified reliably with a low p-value in a predetermined number of trials, then the null hypothesis cannot be rejected and it cannot be proven that there is a perceptible difference between A and B.
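As a rough sketch of the decision rule in that paragraph: the number of trials and the cutoff are fixed in advance, and the outcome is either "difference detected" or "inconclusive", never "proven identical". The 0.05 cutoff and the function names are assumptions for illustration; the quoted text only says "a low p-value".

```python
from math import comb

def guess_tail(correct, trials):
    """Probability of at least `correct` hits in `trials` coin-flip guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def min_correct_needed(trials, alpha=0.05):
    """Smallest score whose guessing probability is at or below alpha."""
    for correct in range(trials + 1):
        if guess_tail(correct, trials) <= alpha:
            return correct
    return None  # test too short: even a perfect score is not rare enough

def verdict(correct, trials, alpha=0.05):
    """Reject the null hypothesis, or fail to reject it. There is no third outcome."""
    if guess_tail(correct, trials) <= alpha:
        return "difference detected"
    return "inconclusive"

print(min_correct_needed(20))        # score needed at alpha = 0.05
print(min_correct_needed(20, 0.01))  # score needed at alpha = 0.01
print(verdict(12, 20))
```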
 
find them mostly inconclusive as nearly all are short-term tests, there's little time to concentrate or focus your listening and the written evaluations are completely open to interpretation - you end up listening to differences, not audio reproduction.


I hear that a lot, but I found that, on the contrary, the longer the excerpt, the less likely your brain is to spot a difference.

The best music excerpt length is probably somewhere between 5 and 25 seconds.

Audio memory is very, very short.

The thing to remember is that NOT all blind tests fall into the ''everybody fails'' pit. There is always a threshold, and these thresholds show that the ABX test is a valid method. At the very least, it shows that some things produce bigger differences than others, which fall into the more... subtle category. If any.

I remember the first serious blind test I organized back in 2010... MP3 vs. AAC vs. CD vs. HD 24/96.

I had to lower the quality down to 64 kbps (!) MP3 files to find the threshold where MOST people (not all!) could spot it. That was a shock. I was able to do it, and so were my audiophile buddies... but a few participants were not. To my big surprise.

At as ''low'' as 192 kbps... no one could spot any of the files. So the threshold was somewhere between 96 and 128 kbps. MP3 only; AAC was impossible to spot as well.

And I'm not even talking about the 24/96 vs. HD or the AAC 256 kbps... No one was even close. The MP3s were challenging enough.

So, YES, thresholds are the key here. ABX shouldn't be discarded because a threshold has not yet been found.

I'm pretty sure if you ABX a Pepsi and a glass of Vodka, you'll find it. :D
 
If X cannot be identified reliably with a low p-value in a predetermined number of trials, then the null hypothesis cannot be rejected and it cannot be proven that there is a perceptible difference between A and B.

Ok ?

Today we had, numerous times, participants who WERE NOT sure of the answer once ''forced'' to give it.

From a scientific point of view, is that a problem? No.

ABX test is meant to demonstrate that you can identify A from B. Therefore, if you're NOT able to provide the answer for a round, then that round says ''NO, you cannot identify A from B''. It's the equivalent of a negative (wrong) answer.

If X is A...
...and you say B, it's a negative.
...and you say ''I will not be forced to answer / I don't know'', it's also a negative.

You cannot cheat an ABX test, unless you can tell them apart but say the opposite... Why would anyone do that?

You have to prove your capacity to identify. Simple as that.
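How ''I don't know'' rounds are scored does change the numbers. Below is a small sketch with a made-up 10-round session comparing two bookkeeping policies: counting an unsure answer as wrong (the rule described above) versus dropping that round. The session data and the policy names are hypothetical, and neither policy is presented here as the correct one; classic forced-choice designs sidestep the question by requiring a guess on every round.

```python
from math import comb

def guess_tail(correct, trials):
    """Probability of at least `correct` hits in `trials` coin-flip guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def score(truth, responses, unsure_policy):
    """Return (correct, trials). A response of None means 'I don't know'."""
    correct = sum(1 for t, r in zip(truth, responses) if r == t)
    if unsure_policy == "count_as_wrong":
        trials = len(truth)
    elif unsure_policy == "drop_trial":
        trials = sum(1 for r in responses if r is not None)
    else:
        raise ValueError(f"unknown policy: {unsure_policy}")
    return correct, trials

# Hypothetical session: what X really was, and what the listener answered.
truth = list("ABBABAABAB")
responses = ["A", "B", None, "A", "B", None, "A", "B", "B", "B"]

for policy in ("count_as_wrong", "drop_trial"):
    correct, trials = score(truth, responses, policy)
    print(policy, f"{correct}/{trials}", f"p = {guess_tail(correct, trials):.4f}")
```

In this made-up session the same listener scores 7/10 under one policy and 7/8 under the other, which lands on opposite sides of a 0.05 cutoff, so the policy needs to be fixed before the test starts.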
 
frugal-phile™
Joined 2001
Paid Member
ABX test is meant to demonstrate that you can identify A from B. Therefore, if you're NOT able to provide the answer for a round, then that round says ''NO, you cannot identify A from B''. It's the equivalent of a negative (wrong) answer.

That is not so. The conclusion may seem valid, but it is statistically invalid.

dave
 
17/20 is what's considered valid.

We sure as hell can consider a 501/499 result as ''no one can tell the difference'' :D
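For what a 501/499 split would look like in numbers, here is a sketch that computes the guessing p-value and a rough 95% interval for the underlying hit rate. The interval uses the plain normal approximation, which is an assumption chosen for brevity; the function names are illustrative.

```python
from math import comb, sqrt

def guess_tail(correct, trials):
    """Probability of at least `correct` hits in `trials` coin-flip guesses."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def approx_interval(correct, trials, z=1.96):
    """Rough 95% interval for the true hit rate (normal approximation)."""
    rate = correct / trials
    half_width = z * sqrt(rate * (1 - rate) / trials)
    return rate - half_width, rate + half_width

p_value = guess_tail(501, 1000)
low, high = approx_interval(501, 1000)
print(f"p = {p_value:.3f}, hit rate between {low:.3f} and {high:.3f}")
```

With 1000 trials the interval is narrow (roughly 47% to 53%), so a large test can at least bound how big any audible effect could be, even though it still cannot show the effect is exactly zero.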

As many participants as possible, for as many trials as possible, in the most stable test environment possible. That's the key.

We were only 4 today. Granted: that's not much. On a ''scientific/statistical'' basis, probably not valid.
That being said, I see no hint whatsoever that the outcome would change if the test were run with 20 or 1000 participants.

And, frankly, I was expecting some night-and-day differences between a $30 DAC and one that is 100x the price...

EVEN IF 1 participant out of 10 could spot it... that would be a problem, IMO. But that was not even the case.
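A rough sketch of why panel size and trial count matter for a scenario like ''1 out of 10 can spot it''. It assumes a hypothetical 17/20 pass mark and made-up true hit rates; none of these numbers come from the test described here.

```python
from math import comb

def pass_probability(true_hit_rate, needed=17, trials=20):
    """Chance that a listener with the given true hit rate reaches the pass mark."""
    return sum(
        comb(trials, k) * true_hit_rate ** k * (1 - true_hit_rate) ** (trials - k)
        for k in range(needed, trials + 1)
    )

def chance_panel_has_detector(panel_size, detector_share):
    """Chance that a random panel contains at least one listener who can hear it."""
    return 1 - (1 - detector_share) ** panel_size

# Hypothetical: 1 listener in 10 can hear the difference.
print(f"panel of 4 includes one:  {chance_panel_has_detector(4, 0.1):.3f}")
print(f"panel of 20 includes one: {chance_panel_has_detector(20, 0.1):.3f}")

# Hypothetical: that listener is right 70% or 90% of the time.
print(f"70% listener passes 17/20: {pass_probability(0.7):.3f}")
print(f"90% listener passes 17/20: {pass_probability(0.9):.3f}")
```

Under these assumptions a panel of 4 misses the rare listener about two times out of three, and even a listener who is right 70% of the time would usually fall short of 17/20.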
 
An ABX test is used to prove 2 DUT are different, it cannot be used to prove 2 DUT are the same.

dave


That's where I don't agree with that ''logic''. It doesn't make any sense.

As I mentioned earlier, it's about thresholds. The line where things become possible to identify (for some).

In the music files test, it was low-bitrate MP3. Not HD 24/96, not even lossy AAC, not CD, but MP3 below 192 kbps...

Now, midrange drivers... You have to remove 1/2 octave to spot it.

Same with SPL. Educated guess here: some will detect a 0.5 dB difference, but to get 99%+ correct answers you'll probably need a 1.5 dB difference, while anything less than 0.2 dB would show that no one is able to spot it.

Thresholds.

Pepsi/Vodka
Pepsi served @ 4.12 deg C v.s. Pepsi served @ 4.13 deg C.

Maybe 1 human out of 1,000,000,000 won't be able to spot the Pepsi/Vodka difference (probably drunk), and maybe 1 human out of 1,000,000,000 WILL be able to spot the 0.01 deg temperature difference.

But if the ABX test were run on every human on the planet, we might find that 1.5 ml of Vodka in 250 ml of Pepsi is the threshold for 50.1% of the population, and that a 1.4 deg C temperature difference is the threshold for 50.1% of the population.
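The threshold idea can be sketched as follows: run the same ABX test at several difference sizes, record the share of listeners who pass at each size, and interpolate the size at which half of them pass. The data points below are invented for illustration only, and straight-line interpolation is a simplification; real studies usually fit a smooth curve instead.

```python
def threshold_level(levels, share_passing, target=0.5):
    """Difference size at which the passing share crosses `target`.

    Linear interpolation between neighbouring measurements.
    Returns None if the data never crosses the target.
    """
    points = sorted(zip(levels, share_passing))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < target <= y1:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return None

# Hypothetical data: level difference in dB vs. share of listeners who pass.
level_db = [0.2, 0.5, 1.0, 1.5]
share = [0.05, 0.30, 0.70, 0.99]

estimate = threshold_level(level_db, share)
print(f"estimated 50% threshold: {estimate:.2f} dB")
```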
 
If 1 audiophile out of 2 were able to spot a difference between component A and component B... would you be more interested in that than in another set of components that gave 0 out of 100?

I would.

That's thresholds. That's having perspective on what does what. That's knowing the true audible impact of each component.
 