DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Is it correct that there were 4 people involved in listening?
Can you tell us exactly how the ABX testing was conducted, do you have some kind of ABX test system with relays or some other type of switching? Was switching conducted at line level?
What other equipment was used besides the DACs?
If using speakers in a room, how far were the participants from the speakers?
Did the participants discuss how anything sounded during the test?
Did people have as much time as they wanted to compare DACs?
What source material was used? CD, Hi-Res?

Seems to me there is a claim of an unlikely result with no supporting information. It's similar to many claims some people would describe as implausible.

Recently PMA put on a listening test in another thread here in the forum, and used a DACMagic+, and in one case, IIRC, he remarked that it was easier to hear a difference between 2 files using that DAC in a successful ABX comparison of files. From your experiment or whatever the process was, can you conclude that PMA is imagining that the DACMagic+ is more revealing than some other DAC?
 
I’ve come to the conclusion that differences between properly implemented modern DACs are often very small, if present at all. I do think my Buffalo IIISE/Ivy is better than a decent £100-200 DAC, but I haven’t done an ABX... How much does knowledge of all those low-noise regs, discrete output stage etc. colour my judgement? Hard to say, but I’m sure expectation is a significant factor.

At the other end of the chain, differences between loudspeakers are much more obvious to me. Amps probably sit somewhere in between. I’ve pretty much decided that upgrading speakers (or at least finding speakers that work well with my music, at my levels, in my room) is where the money should be spent.
 
planet10 said:
ABX is statistically incapable of determining that 2 DUTs are the same.
I'm not sure if you are playing maths games or word games. If word games then of course ABX does not prove they are the same: it shows that they are indistinguishable.

If maths games then of course ABX does not prove anything but it gives the statistical likelihood of a statement being true. ABX cannot of course prove that two things can be distinguished, for the same reason that it cannot prove that two things cannot be distinguished. However, if a $30 DAC and a $3000 DAC were shown by ABX to be probably distinguishable then I suspect that all the ABX objectors would suddenly go silent (on the weaknesses of ABX) and very vocal on the differences between A and B.
 
It is true that an ABX test can only prove that two DUTs are different, and it cannot prove that they are the same. The reason has to do with the math of statistics. Statistical tests can either "reject" a null hypothesis or "fail to reject" a null hypothesis. They cannot prove a null hypothesis. In this case, the null hypothesis would be that the DUTs are the same. So the ABX test can show that they are not the same, or fail to show that they are not the same. This second conclusion is not the same as showing they are the same. Sounds like double-talk, but it's not. A common explanation used is this: a jury either finds a defendant Guilty or Not Guilty. Not Guilty is different from Innocent. Not Guilty means you couldn't find enough evidence to support Guilty. But it doesn't conclude innocence. Really it leaves the question sort of undetermined.
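To put numbers on the "fail to reject" logic (an illustrative sketch, not part of the OP's test): the standard analysis treats each ABX trial as a coin flip under the null hypothesis of guessing, and computes the one-sided binomial probability of scoring at least that many hits by chance.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided p-value: the probability of scoring at least `correct`
    hits out of `trials` if the listener is purely guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# A hypothetical run of 12 correct out of 16 trials:
print(round(abx_p_value(12, 16), 4))  # → 0.0384, below the usual 0.05 cut-off
```

A small p-value lets you reject "guessing"; a large one only means the test failed to reject it, which is exactly the Not Guilty verdict described above.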


So how do you apply this to ABX testing of DACs? The OP's ABX test did not (and cannot) show that the DUTs are the same. But it did fail to show they are different. That might mean, for example, that there could be a difference, and if you improved the test somehow (different tracks, different lengths, who knows), then you still might have a chance of finding it. But even as the test stands, failing to show they are different, in a scenario where we were expecting to easily find a difference, is still quite a valuable outcome.
 
+1@DF96, +1@DC3

@JonBocani, thanks for posting the comparison. The observation does not surprise me, given the maturity of DAC designs. It would be interesting to instrument test both DACs to see what the measurable differences were.

Test semantics aside, the small sample of 4 people may not allow statistical confidence (i.e., significance) to be established in differentiating the products. IMO I still think your test says plenty about the diminishing returns (futility?) of spending more in the "belief" you will get better sound.
 
But even as the test stands, failing to show they are different, in a scenario where we were expecting to easily find a difference, is still quite a valuable outcome.

Not necessarily valuable if there is some human tendency to jump to conclusions about what it means. It might be valuable in a similar way to positive snake oil results being valuable. Maybe it tells us something about needing to apply the scientific method very carefully lest we succumb to certain very common and problematic human biases.
 
If it were snake oil, or everyday marketing for that matter, you would see only a selective subset of the positive responses gushing about the product. You would never be allowed to see the total response set. The deficiency in this test is the small sample size.
 
Is it correct that there were 4 people involved in listening?
Can you tell us exactly how the ABX testing was conducted, do you have some kind of ABX test system with relays or some other type of switching? Was switching conducted at line level?
What other equipment was used besides the DACs?
If using speakers in a room, how far were the participants from the speakers?
Did the participants discuss how anything sounded during the test?
Did people have as much time as they wanted to compare DACs?
What source material was used? CD, Hi-Res?

Seems to me there is a claim of an unlikely result with no supporting information. It's similar to many claims some people would describe as implausible.

Sorry about the lack of details, I didn't have the time yesterday and I was too stunned by the results. :eek:

Basically, we used this:

Lossless and uncompressed files (16/44-48)
iTunes, latest update
Mac Mini, latest gen, via the mini-Toslink output
Feeding a miniDSP nanoDigi
Then, SPDIF out 1 to DAC (A)
SPDIF out 2 to DAC (B)
Identical EQ for both, except the gain
Gain matched within 0.1-0.3 dB and double-checked by ear
ICEpower 50asx2 amplifier
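A side note on the gain-matching step (a sketch with hypothetical helpers, not the OP's actual procedure): level mismatches of even a few tenths of a dB can bias an ABX vote toward "louder sounds better", so it is worth verifying the match numerically rather than only by ear. Given two equal-length captures of the same passage through each DAC:

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def gain_diff_db(a, b):
    """Level difference between two captures in dB (positive: a is louder)."""
    return 20 * math.log10(rms(a) / rms(b))

# Synthetic example: capture b is attenuated by 2% relative to a.
a = [math.sin(0.01 * n) for n in range(10000)]
b = [0.98 * s for s in a]
print(round(gain_diff_db(a, b), 3))  # → 0.175 dB
```

A 2% amplitude difference already corresponds to roughly 0.18 dB, so the 0.1-0.3 dB window quoted above is about as tight as by-ear matching gets.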


At first, a switch box (A/V type) was used to dispatch the signals to the amplifier. Then we began to doubt the switch box and made the switch by hand. Same results.

Also, we tried 21-litre BR enclosures with FR151 drivers and later on a pair of B&W CM9.
Same results.

We tried several tunes, numerous times (we spent about 3 hours non-stop on the test). Controlled room, low noise floor, constant distance/position from the listeners, etc.

Will post pictures later if that can help.
 
Does the Mac Mini work like Windows, performing any SRC (sample-rate conversion) for a shared sound device?

What was the distance between the speakers and the listeners?

Did people get to listen in silence, and take as much time as they wanted to vote?

Just asking.
Something seems quite odd about the result, although it certainly is statistically possible. Also, the result seems likely to be at odds with other blind ABX DAC testing. If multiple groups are using ABX, and for example Benchmark Media is known to use ABX, then the finding in this case seems inconsistent with my general impression of what has previously been found to be audible. Intersample overs might be one area where some of the DACs might be expected to differ, but maybe not.
 
+1@DF96, +1@DC3

@JonBocani, thanks for posting the comparison. The observation does not surprise me, given the maturity of DAC designs. It would be interesting to instrument test both DACs to see what the measurable differences were.

Test semantics aside, the small sample of 4 people may not allow statistical confidence (i.e., significance) to be established in differentiating the products. IMO I still think your test says plenty about the diminishing returns (futility?) of spending more in the "belief" you will get better sound.


Usually I conduct my tests on a larger scale, but this time I see no indication whatsoever that different or more people would change the outcome. I don't question the methodology either (except the switch box)... I would rather change the equipment first. My previous (subjective) experience was that DAC differences were audible in the highest frequencies, with drivers very capable in that regard, such as the RAAL ribbon 210-10 or 140-15. That might be worth a shot.

Problem is: even if we (or ONE person) can spot a DAC difference using such a component, that still makes the 30-buck converter, versus the 3,000-buck high-end one, the greatest of "deals" for 99.99% of audiophiles.

We SHOULD'VE seen a glimpse of difference, yesterday. We should've. But we didn't. And we didn't feel confident enough to "practice" more, either.

It was quite humbling, to be honest.
 
Your test reinforces what I already believe: spending more ≠ improvement. I also believe the testers and the test were sincere, so no problem there. I'm already in the $30 camp, and need no convincing.

However, from a statistics perspective you are taking 4 unbaselined subjects, and this may not extrapolate to the entire population. That is why I say more samples would be better. Independent, unbiased, double-blind samples, better still.
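To make the sample-size point concrete (an illustrative power calculation, with an assumed true detection rate rather than data from this test): suppose listeners could genuinely pick the odd one out 70% of the time. The chance that a run of n trials reaches the conventional p < 0.05 threshold stays modest until n gets fairly large, so a null result from a short session is weak evidence either way.

```python
from math import comb

def power(n: int, p_true: float, alpha: float = 0.05) -> float:
    """Probability that n ABX trials reject the guessing hypothesis
    (p = 0.5) at significance alpha, when the true per-trial success
    probability is p_true."""
    def p_value(k: int) -> float:
        # Chance of >= k hits by pure guessing.
        return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

    # Smallest hit count that would be declared significant:
    k_crit = next((k for k in range(n + 1) if p_value(k) <= alpha), n + 1)
    # Probability of reaching that count when the effect is real:
    return sum(comb(n, k) * p_true ** k * (1 - p_true) ** (n - k)
               for k in range(k_crit, n + 1))

for n in (10, 20, 40):
    print(n, "trials:", round(power(n, 0.7), 2))
```

Under these assumptions, 10 trials detect the effect only about 15% of the time, and even 20 trials succeed less than half the time.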
 
Markw4 said:
May I ask how would you define the term, ABX objector?
Someone who routinely objects to ABX, claiming all sorts of reasons (including some he perhaps doesn't understand but read elsewhere) when the real reason is simply that ABX finds results which are a poor match for his own beliefs.

dc3 said:
Statistical tests can either "reject" a null hypothesis or "fail to reject" a null hypothesis. They cannot prove a null hypothesis.
Surely they cannot disprove a null hypothesis either, unless the hypothesis is very carefully constructed? Stats cannot prove anything. Even 100% findings (same or different) prove nothing, as there is a finite chance this could be random. We need to drop words like "prove" when speaking of statistical results.
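That point is easy to quantify (a small illustrative calculation, not from the thread): even a flawless run only makes the guessing explanation geometrically unlikely, never impossible.

```python
def perfect_score_chance(trials: int) -> float:
    """Probability of a flawless ABX run by pure guessing (p = 0.5 per trial)."""
    return 0.5 ** trials

for n in (10, 16, 20):
    print(f"{n}/{n} correct by chance: {perfect_score_chance(n):.2e}")
```

A perfect 16/16 score still has a 1-in-65536 chance of arising by pure guessing: small, but finite, which is why "prove" is the wrong word.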

Markw4 said:
Not necessarily valuable if there is some human tendency to jump to conclusions about what it means.
There is a human tendency to regard as valuable any evidence which supports preconceptions, and dismiss as worthless any evidence which undermines preconceptions. However, the OP reports that his preconception was that the units would be easily distinguishable so his findings that they were not (within that test environment, with those listeners) perhaps carries more weight.
 
I think I'll dig deeper. I'm somewhat unsatisfied... :rolleyes:

So I just bought high-quality passive crossover components for the next DIY speakers that will be used in the test:

- Faital Pro 18FH500
- RAAL 140-15D
- Open baffle 2-way
- 1700 Hz crossover point
- DSP nanoDIGI, as before, for a flat response and gain adjustment.

The main idea is to test with the 140-15D, probably also near-field (we were at about 3 m distance; now we might try 1.8-2.2 m).

Also, new (and more) participants if possible. You're welcome to contact me if you want to try the test yourself; I'm just north of Montreal. PM if interested. Set-up will be ready starting tomorrow after 5pm.
 
As far as I'm concerned, if I can't tell those converters apart with RAAL ribbon tweeters, that's the end of it. I'll have run out of options.

For those who asked: I cannot remember the Fiio model. I'll check when I'm back, but it's a tiny, USB-powered, very low-cost model, about 3 years old... Bought it on Amazon.
BTW: this is by no means advertising to promote the Fiio; I don't feel it's special in any way. I feel ANY functional modern DAC will do the job, at this very moment.

That being said, there is one thing that can help tell DACs apart: the noise. The Forssell (and the Weiss DAC1 mkIII, Gustard X20, etc. I tried a few years ago) were all dead silent, which helps with very high-efficiency drivers. Will that lead to successful ABX testing with HE drivers and/or an ultra-low-noise-floor room and/or near-field listening? Probably. I don't know. Between tunes, I think so. Once the music has started, not so sure. Maybe in the quiet portions.

The RAAL 140-15D is pretty good at unveiling noise/artefacts, if any, and that ought to be interesting. I also have a pair of Radian 760NEO that could be helpful in that regard, if needed.
 
I'm not sure if you are playing maths games or word games. If word games then of course ABX does not prove they are the same: it shows that they are indistinguishable.

I am not playing word games. The way an ABX test is designed, it is strong when it comes to saying two DUTs are different, but so weak at saying they are the same that no conclusions can be made. Any case for "indistinguishable" only applies to the particular test/test day/participant set. The results cannot be used for a general statement.

dave
 