DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
The results of a blind test alone don't say much, because everything depends on the actual test: it is possible to design a blind test that "proves" a $10 device sounds the same as a $10,000 one, then run another test that reverses the result by exposing all of the cheap device's weaknesses.

This is a lesson I've learned after designing several image/video tests: in order to uncover subtle differences, you need to drive the subjects to their limits. That applies to everything, from car engines to electronics to humans, because at idle almost everything performs the same.
So the tests must be designed to expose each device's weaknesses in a way that a person can confidently perceive.

In other words, random music is not enough. You need either a collage of carefully selected music passages or, for ultimate control, synthesized sounds designed specifically to drive the DACs to their limits. Someone skilled at creating custom synth sounds from scratch, working with someone knowledgeable about acoustics and the technology of both DACs, could build the best tests possible.
Once the blind tests with artificial sounds succeed, the criteria for selecting the right music passages will be crystal clear, and you could then repeat the tests with music alone.
Also, the actual process should be well designed and completely emotionally neutral.
The sort of blind tests reported on forums are usually ABX style, uncontrolled, and invariably produce null results.

In other words, these sorts of tests tell us far more about the participants than about the DUTs.
 
Hello,

It's almost funny how those who say, "Or you can take into account the probability, yes the PROBABILITY, that some of us can discern the differences,"

never seem to actually set up a blind test. If you set one up, you can control the listening time and other variables, but they never will, because they don't really expect to get the results they want.

It's more fun worshipping at the religion of "mine is better, costs more, and is prettier."

But hey, it's their money, and they're only fooling themselves.

Flame suit on....
 
The results of a blind test alone don't say much, because everything depends on the actual test: it is possible to design a blind test that "proves" a $10 device sounds the same as a $10,000 one, then run another test that reverses the result by exposing all of the cheap device's weaknesses.

This is a lesson I've learned after designing several image/video tests: in order to uncover subtle differences, you need to drive the subjects to their limits. That applies to everything, from car engines to electronics to humans, because at idle almost everything performs the same.
So the tests must be designed to expose each device's weaknesses in a way that a person can confidently perceive.

In other words, random music is not enough. You need either a collage of carefully selected music passages or, for ultimate control, synthesized sounds designed specifically to drive the DACs to their limits. Someone skilled at creating custom synth sounds from scratch, working with someone knowledgeable about acoustics and the technology of both DACs, could build the best tests possible.
Once the blind tests with artificial sounds succeed, the criteria for selecting the right music passages will be crystal clear, and you could then repeat the tests with music alone.
Also, the actual process should be well designed and completely emotionally neutral.

No.

You got it all wrong.

The blind tests we have done so far were conducted in contexts where participants are far more focused than normal. Everything is arranged to help them find differences, if any. It's already unnatural, because that focused, active listening mode goes way beyond what people do at home. So if it doesn't work in that kind of context, you can bet your shirt it won't work any better in a normal, casual one.
Even in active listening mode, you'll usually focus on the music rather than on telling A from B...

Also, in each test I carefully select the music excerpts that will help the most with a potential identification. Obviously.
Most of the time, participants can choose which tunes they prefer for the test from a rather large selection, and also the excerpt length. Once they have a little experience with ABX testing, they choose the shortest length in order to make quick back-to-back comparisons... because that helps.

Bottom line: the test environment is designed to help a potential identification, not the other way around. At least in the tests I'm organizing. Our goal is not to ''prove'' that cheap components are better than expensive ones, not at all. The results are the results; if the expensive component had been identified from the cheap one, I'd have said so as well.

In fact, it stopped at the $19.99 vs. $3,000 duel, but that was not planned. We were supposed to spot the differences easily, as a warm-up, and then move on to bigger converters...

But it didn't happen exactly that way.
 
The sort of blind tests reported on forums are usually ABX style, uncontrolled, and invariably produce null results.

In other words, these sorts of tests tell us far more about the participants than about the DUTs.


Like I said in previous pages: INDEED, it tells more about the participants than about the tested components.

But by ''participants'', you need to understand ''humans''.

;)


The only way to refute that statement is to demonstrate that said participants are not representative samples of the human species. Or, more precisely, representative samples of the audiophile population, which I believe consists 99% of males between 25 and 70 years old?

We were four males with an average age of about 45, so it's a good start.

:cool:
 
The result of a blind test can't be measured on a one-dimensional scale (worse/better) but rather on multiple, quite subjective scales. How do you measure terms like punch, slam, spaciousness, feeling of presence, or fatigue after extended listening?
Also a DAC is a combination of digital and analog circuitry; the digital part could be excellent and the analog part poor, or vice versa.


It's an ABX test in its purest form: IDENTIFICATION.

(A) is presented with XYZ music excerpt

then

(B) is presented with the same XYZ music excerpt

then

(X) is played without saying if it's A or B.

then

Participant must tell if X is either A or B.

That's it. There is nothing else. Identification only; nothing about preferences or anything else at that stage. The ABX identification test is the sine qua non condition for going any further into the ''does it sound better'' kind of questioning. First, you need to prove the devices are distinguishable.

WHEN and IF you can prove you're able to tell A from B, then you're ''allowed'' to give your subjective feedback about it. That is very simple logic.
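The protocol described above can be sketched in a few lines. This is not from the thread; it's a minimal simulation, assuming a `listener` callable that returns its guess each trial (all names are hypothetical):

```python
import random

def run_abx_trials(listener, n_trials=16, seed=None):
    """Simulate an ABX session: each trial, X is secretly assigned to A or B,
    and the listener must say which one X is. Returns the number correct."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        x_is_a = rng.random() < 0.5      # hidden assignment of X
        guess_is_a = listener(x_is_a)    # listener's verdict: "X is A"?
        if guess_is_a == x_is_a:
            correct += 1
    return correct

# A listener who hears no difference can only guess at random:
coin_flip = lambda _x_is_a: random.random() < 0.5
score = run_abx_trials(coin_flip, n_trials=16)
print(f"{score}/16 correct")  # hovers around 8/16 over many sessions
```

A score near 8/16 is what pure guessing produces; the interesting question is how far above chance a score must be before it means anything.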
 
Which amp and which cabling?

Also, what did the four participants have to verify? Or did they just answer the question "do you hear a difference?"?

Roberto


Common amplifier for all DACs: ICEpower 50asx2

You can see the measurements here:

http://www.icepower.bang-olufsen.com/files/solutions/icepower50asx2_datasheet_1_5_20150709.pdf

I have total confidence in that amp.


Cabling was nothing exotic, as previous comparisons showed that cables do not make any audible difference whatsoever (unless defective, of course).
 
As I said, the test is not "same" or "different" - it is "probably indistinguishable" or "probably distinguishable" on that day using that auxiliary equipment in that electromagnetic environment with those listeners. Results either way cannot be used for a general statement. However, if repeated tests show similar results and those results match what might be expected from known physics and psychoacoustics then we may approach consensus among those who actually seek knowledge rather than affirmation.

It's well put, in a very neutral, objective way, DF96.


Basically, that very test is a hypothesis moving toward theory status.

Hypothesis
A suggested explanation for an observable phenomenon or prediction of a possible causal correlation among multiple phenomena.
Based on: Suggestion, possibility, projection or prediction, but the result is uncertain.

Theory
In science, a theory is a well-substantiated, unifying explanation for a set of verified, proven hypotheses.
Based on: Evidence, verification, repeated testing, wide scientific consensus


Hypothesis vs Theory - Difference and Comparison | Diffen


If the idea is ''There are no audible differences between digital-to-analog converters'', then it's a hypothesis, unless other tests were made before.

On the other hand, if the idea is ''The human auditory system can be biased'', then it's a theory. A lot of research has demonstrated that our brains can play tricks on us (psychoacoustics).

In any case, a hypothesis such as the one in this ABX test with DACs can only be invalidated by other credible test(s) with very different or opposite results. Which, to my knowledge, do not exist as of yet.

So, it's not so much a general statement as a strong invitation to prove the test wrong.

Meanwhile, I'll try to prove it wrong myself, starting tomorrow with a new set-up. :)
Again: if you're willing to visit us near Montreal, you are most welcome. Free cappuccino: you need to be focused!
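For context on what ''proving the test wrong'' would take statistically, here is a sketch (not from the thread) of the one-sided binomial test usually applied to ABX scores. It gives the probability of getting at least `correct` answers out of `trials` by guessing alone; the function name and the 12/16 example are illustrative, not the thread's actual numbers:

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial test: probability of getting at least
    `correct` answers out of `trials` by pure guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example: 12 correct out of 16 trials
print(round(abx_p_value(12, 16), 4))  # → 0.0384
```

That lands below the conventional 0.05 threshold, so 12/16 would count as evidence the devices are distinguishable, while 9/16 (p ≈ 0.40) would not.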
 
Big changes for the next set-up:

.: A&K 300 as the source (instead of iTunes & MacMini) using the toslink output

.: Uncompressed high definition music excerpts

.: Faital Pro 18FH500 in Open Baffle + Ribbon RAAL tweeters 140-15D

.: Solen Film & foil caps for the RAAL (passive 12db/octave xover)

.: Reduced distance (nearfield listening)

(same amplifier; the nanoDigi will also still be used for EQ and gain adjustments)

I believe that should provide more chances to spot differences, if any. It's in fact a bit extreme compared to the vast majority of audio systems... but I think it's a good idea to dig deeper.
 
Like I said in previous pages: INDEED, it tells more about the participants than about the tested components.

But by ''participants'', you need to understand ''humans''.

;)


The only way to refute that statement is to demonstrate that said participants are not representative samples of the human species. Or, more precisely, representative samples of the audiophile population, which I believe consists 99% of males between 25 and 70 years old?

We were four males with an average age of about 45, so it's a good start.

:cool:

What makes your 4 'participants' any more 'accurate' (whatever that means) samples of the human species than the more than 20 people who participated in the blind tests that I linked to?
These people also thought that there was no difference between the DACs they listened to over the course of a year until, that is, they were pointed to some differences, and then they could all hear differences. Why would you consider that you and your listeners have reached a definitive conclusion?
Did you read the link I gave and the comments made by those participants? What did you glean from those comments?
 
No.

You got it all wrong.

The blind tests we have done so far were conducted in contexts where participants are far more focused than normal. Everything is arranged to help them find differences, if any. It's already unnatural, because that focused, active listening mode goes way beyond what people do at home. So if it doesn't work in that kind of context, you can bet your shirt it won't work any better in a normal, casual one.
Even in active listening mode, you'll usually focus on the music rather than on telling A from B...

Also, in each test I carefully select the music excerpts that will help the most with a potential identification. Obviously.
Most of the time, participants can choose which tunes they prefer for the test from a rather large selection, and also the excerpt length. Once they have a little experience with ABX testing, they choose the shortest length in order to make quick back-to-back comparisons... because that helps.

Bottom line: the test environment is designed to help a potential identification, not the other way around. At least in the tests I'm organizing. Our goal is not to ''prove'' that cheap components are better than expensive ones, not at all. The results are the results; if the expensive component had been identified from the cheap one, I'd have said so as well.

In fact, it stopped at the $19.99 vs. $3,000 duel, but that was not planned. We were supposed to spot the differences easily, as a warm-up, and then move on to bigger converters...

But it didn't happen exactly that way.
"Much more than normal" focus is required in all kinds of tests, so that's a common denominator for both approaches and thus we can remove it for a moment.

What remains is a scientific approach, with criteria based on the specific technological weaknesses and advantages of two devices of different quality classes (I'm not referring to cost) and on human psychoacoustics, versus an honest, organized, yet arbitrary effort whose criteria and method are based on instinct and/or best guess.

But science beats arbitrary methods any day, hands down, no matter how honest.
In other words, you need more science to compare devices based on science.
You need to KNOW their technological differences first, and their effect on human perception, in order to design the best tests that will reveal them.

As for it not being "natural" to focus that much: when you listen to music normally on your couch, cool and relaxed, you might not realize it, but you still perceive flaws in the reproduction subconsciously, through direct, subtle mood changes, emotional responses, and so on. This happens especially in difficult passages where many instruments overlap and the demands for faithful, transparent reproduction skyrocket; think of the emotional difference or uplift of listening on a decent speaker/headphone system versus a crappy one, only far more subtle. So focus in an "abnormal" procedure is required in order to consciously notice and identify the differences in a scientific comparison.
 
If differences are so small that only a minority can hear them under conditions of high concentration then they are of little consequence.

The differences may be small enough that high concentration is necessary for short-term ABX testing. That doesn't mean they are small for long term ABX, or non-ABX.

It could be that only a minority of listeners have suitable brain DSP to recognize the differences and communicate them to conscious awareness. For that processing to work, it may or may not take a lot of concentration. If concentration is needed, it's not the same type of wrinkled-forehead concentration needed for a weightlifting competition.

For the smallest differences detectable, maybe more like the concentration needed for an expert musician to play a difficult piece of music exceptionally well. Something that takes practice, focus, freedom from extraneous distraction, and not excess fatigue. Sometimes it may come easily, and other times the best performance may not be forthcoming no matter how much it is desired.
 
Did you expect any other result?

If the DAC chip is not the cheapest crap, it will have a datasheet THD+N of better than -90 dB and a flat frequency response up to 20 kHz.
That is more dynamic range than the typical listening room offers, which is about 90 dB - 30 dB = 60 dB.
And more than the typical CD source.

In fact, any audible difference will be due to second-order effects of your loudspeakers or amplifiers, or to imagination (like intermodulation at nonlinearities of the speakers).

Maybe you could try a very quiet sample near zero, where the DAC has some noise modulation or idle tones?
I bet different DACs have different noise levels, or crosstalk from digital parts.

You could try a very sharp pulse train, but you will probably end up testing the limits of your amplifier instead.
Or maybe the DAC has an output stage that saturates during loud sections?

This is a difficult task, even with very clean files (preferably 24-bit) and in an anechoic room.
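The ''very quiet sample near zero'' suggestion can be sketched as follows. This is not from the thread: a minimal, undithered example assuming a -90 dBFS sine and a 16-bit quantizer (all parameter values are illustrative):

```python
import math

def quantized_sine(freq_hz=1000.0, amp_dbfs=-90.0, fs=48000, n=4800, bits=16):
    """Generate a very quiet sine, quantize it to `bits` without dither,
    and return (quantized samples, peak quantization error in dBFS)."""
    full_scale = 2 ** (bits - 1) - 1
    amp = 10 ** (amp_dbfs / 20.0)
    samples, worst_err = [], 0.0
    for i in range(n):
        x = amp * math.sin(2 * math.pi * freq_hz * i / fs)
        q = round(x * full_scale) / full_scale   # mid-tread quantizer, no dither
        samples.append(q)
        worst_err = max(worst_err, abs(q - x))
    peak_err_dbfs = 20 * math.log10(worst_err) if worst_err else float("-inf")
    return samples, peak_err_dbfs

samples, peak_err_dbfs = quantized_sine()
# At -90 dBFS a 16-bit quantizer leaves only the levels -1, 0 and +1 LSB,
# so the "sine" comes out grossly distorted -- the stress region suggested above.
```

At such low levels the quantization error is a large fraction of the signal itself, which is why undithered near-zero material is a plausible place to look for DAC-to-DAC differences in noise modulation and idle tones.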
 
"Much more than normal" focus is required in all kinds of tests, so that's a common denominator for both approaches and thus we can remove it for a moment.

No, I cannot remove that.

The ultimate goal is to make decisions about audio reproduction equipment. Therefore, the test needs to suit ''normal'' use.

For example: you don't test and compare winter tires on F1 cars or at -70°C, because that's not ''normal'' and the results would be useless for real-life application.

I'm not on a quest for absolute answers; I just want the thick fog over Audiophilia to clear as much as possible. I feel it's thick enough to cause accidents, as of now... ;)
 
Why EQ? I would try to minimize any unnecessary processing.


For obvious reasons, SPL matching is mandatory to begin with.

Then, the DIY speakers we're working with here need EQing, especially if we're after a full frequency response (9.5 octaves or better).

That being said, the pair of B&W CM9 goes through no EQ processing and no electronic crossover/subsonic filter. It's straight through; only the gain is adjusted, and it's all kept in the digital domain.
 