DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Concerning DACs, I have my personal theories. There should be a point where the reproduction leaving the DAC's pins is perfect. If this is achieved, any difference between the DAC chips is just a matter of the parts that drive the signal to the line output.
The digital input signal has to be perfect, of course. From this side, the “better” DAC would be the one that can reconstruct a perfect output signal from a digital signal that has some kind of imperfections.

On the other side, there could be colorations in the sound that lead to different-sounding analog signals, without being better or worse. Loudspeakers and amps together with wiring always have a coloration, which also depends on the listening room. There are no technically perfect audio systems, just as a trumpet or piano does not sound identical in different rooms, but a defective instrument sounds wrong in any room.

So there can be some tolerance for different “DAC sounds”, as long as this is not distortion or another annoying flaw. But where is the borderline?

If you are a “if it measures the same, it sounds the same” person, this sure is blasphemy.
In my personal life, living with hi-fi for about 50 years, the measuring technician has always been wrong, because his belief in his current equipment was wrong. He never measured everything that is possible. He only saw what the analyzer he had at the time showed him. He could not imagine that there were audible differences he could not measure.
Then, a little later, a new technique was invented and suddenly the audible was measurable. Like simple distortion measurements, which proved to be mostly irrelevant (Matti Otala / feedback), or Klippel measurements, showing loudspeakers in a new light.

Is there no way to discuss in a non aggressive way?
 
We should emphasize again one of the most important points of proper experimental design, namely to define a research question/hypothesis that the forthcoming experiment should address.

For instance, looking for parameters of the population is something very different from testing if a specific listener is able to perceive a difference between two DUTs.

Unfortunately, sensory testing is a surprisingly complex task (or maybe not so surprising, as we have to deal with humans, who are very complex systems), and for good reasons there exists a plethora of scientific literature about doing this sort of test in a proper way.

As Mark4 already pointed out again, humans are most likely always biased (as said in earlier posts, there most probably never is an "ears only" test, and everybody should refrain from using such misleading terms), and unfortunately this bias is reflected in stating hypotheses as facts.

A difference not spotted by a person in a "blind test" can't be important for normal everyday listening? Might sound reasonable, but since when is plausibility a sufficient condition for acceptance as correct?

A difference can't be important if people are so easily distracted that they do not detect it? Sounds reasonable too, but look at the experiments on inattentional blindness and inattentional deafness.

[Image: Phenom-Gorilla-spotting-520.jpg]


You might dispute the relevance of the difference, but obviously participants are quite easily distracted, so that they do not detect the "gorilla" (human in gorilla mask, for all Deschamps fans :) ).

So please provide evidence for all these assumptions before stating them as facts. Demanding evidence from "audiophiles" (whatever the definition of an audiophile might be) is a valid approach; you shouldn't demand less from yourself.

@ Mark4,

I'd still strongly dispute the so-called weak auditory memory theory.
The pros and cons of very short music excerpts were already discussed in this thread, and IMO it obviously depends on the research question / variables under examination.

If we are looking for practical relevance for normal listening, obviously everything that can't be remembered after a couple of seconds wouldn't be of relevance, because you would not remember it.

That working with short samples might be suitable for efficiency reasons or certain research topics is a given.

@ DrDyna,

Yeah, fooling audiophile listeners is quite easy - it's quite easy to fool professional listeners as well, just remember the often-told story of the bypassed EQ - but that mere fact should be seen as a strong warning not to underestimate the difficulties of obtaining _correct_ results.

Let me once again point to the "gorilla" in our midst.

Stating that fooling people is easy and then assuming that they will not be fooled if only put in a "blind" test is a contradictio in ratio.
 
@ Mark4,

I'd still strongly dispute the so-called weak auditory memory theory.
The pros and cons of very short music excerpts were already discussed in this thread, and IMO it obviously depends on the research question / variables under examination.

If we are looking for practical relevance for normal listening, obviously everything that can't be remembered after a couple of seconds wouldn't be of relevance, because you would not remember it.

That working with short samples might be suitable for efficiency reasons or certain research topics is a given.

Regarding short auditory memory, I believe there is a range of time, depending. From personal experience in one of PMA's most difficult listening tests, in that case my particular auditory memory was quite short. So I will stick with, "it can be very short, depending."

On the other hand, some types of auditory memory seem to persist for a long time, years in some cases. An audiographic memory of a well-known song in the mind of a singer might last decades and may be played back mentally at will.

I think it probably depends on the complexity of memory involved, and the familiarity with the sounds. The exact spoken sound of a word may persist longer if in a language with which one is highly fluent. Otherwise, if gibberish noise, it may fade from memory rather quickly assuming all of the details were ever captured at all.

Regarding relevance, memory duration matters little for musical enjoyment. Musical enjoyment occurs in the moment, and also perhaps as savored from memory. A memory that one had a pleasurable or unpleasurable experience is enough to affect how happy one may be for a more prolonged time. To me, that makes it relevant.
 
Auditory perception is temporally bound. So is visual perception.

Imagine being shown an image for 1 second (man sitting on a bench at a bus station with a newspaper in hand, several people in frame). Can you recall what the headline in the newspaper said? Okay, now imagine being shown 4 different images for 1 second each. Can you remember what the headline in image 2 said? Okay, now imagine you can loop this sequence of images. Can you now easily recall what the newspaper headline says in image 2? After several loops?

There are so many studies *proving* this but we don't need to reference them. Our own experience bears this out every day. More time to focus increases recall of detail.

When we're trying to quantify the quality of a DAC:
- there are listeners who can hear the difference
- there are listeners who cannot
- there are listeners who think they can hear a difference
- there are listeners who think they cannot hear a difference

Everyone's experience is valid but perceptions can be formed regardless of signal reality. So what can we know?

- Some people have amazing pitch correlation
- Some people have amazing transient/phase awareness
- Some people have amazing amplitude acuity
- Some people have great acquisition fidelity across the entire bandwidth
- Some people have narrow acquisition fidelity with increased resolution at particular frequencies.

Does a DAC's performance matter when > 50% of the public can't agree? Well, I know a lot of women (seems few contribute around here) who can hear more details than their male counterparts (including me). If an improved DAC is just for the women in your life, or your kids, then that is probably a good enough reason to embrace a higher precision offering (if said DAC manufacturer is not a woo-spewing parasite leeching off the music-loving community).

& then there are a lot of cool DIY options, for example the SOEKRIS DAM 1021.

As a friend of mine used to say,
"After a certain number of bits no one cares anymore."

I guess the real question is, how hard does one have to focus or train in order to perceive a difference, and does that relate in any way to subconscious awareness of sound quality vs conscious awareness of sound quality? IOW, do people already perceive quality sound but cannot pass an ABX test because they have no listening skill?

I have no idea nor do I intuit anything here. So for me, pursuing the ability to differentiate sources on the premise that my subconscious is already perceiving those differences is an unnecessary line of questioning. OTOH, I don't think it's a bad thing for audiophiles to do some light ear training as the rewards of being able to listen with more acuity are justification alone.

Learning that we can notice more detail by looping sources, instantly switching between them and adhering to head-in-vice type protocols is valuable to the art of signal reproduction. I can get behind anything that increases our ability to communicate and listen.
 
As a friend of mine used to say,
"After a certain number of bits no one cares anymore."

Many good points, thank you.

If I may, I might restate your friend's observation as something more like, "After a certain number of bits, sufficiently accurate in time and amplitude, including enough bits beneath the noise floor, then no one cares about bits anymore."
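To put rough numbers on "a certain number of bits", here is a minimal sketch using the textbook figure for an ideal N-bit quantizer driven by a full-scale sine (about 6.02 dB per bit plus 1.76 dB). The function name is made up for illustration, and the result says nothing about what a real DAC achieves in time or amplitude accuracy; it is only the theoretical ceiling.

```python
from math import log10


def ideal_snr_db(bits: int) -> float:
    """Theoretical SNR in dB of an ideal quantizer with a full-scale sine input."""
    # 20*log10(2**bits) is ~6.02 dB per bit; 10*log10(1.5) is the ~1.76 dB sine term.
    return 20 * log10(2**bits) + 10 * log10(1.5)


for bits in (16, 20, 24):
    print(f"{bits} bits: about {ideal_snr_db(bits):.1f} dB")
```

For 16 bits this gives about 98 dB, and for 24 bits about 146 dB, which is well below the analog noise floor of real hardware.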
 
Regarding short auditory memory, I believe there is a range of time, depending. From personal experience in one of PMA's most difficult listening tests, in that case my particular auditory memory was quite short. So I will stick with, "it can be very short, depending."

On that point I totally agree.

On the other hand, some types of auditory memory seem to persist for a long time, years in some cases. An audiographic memory of a well-known song in the mind of a singer might last decades and may be played back mentally at will.

Or see for example the sound of a specific instrument (or instrument class), which can quite easily be transferred to long-term storage and remembered for very long time spans.
AFAIK, how we parse and store this sort of information is still unknown.
It seems that there exists a direct path somewhere that does not need conscious processing during the transfer.

I think it probably depends on the complexity of memory involved, and the familiarity with the sounds. The exact spoken sound of a word may persist longer if in a language with which one is highly fluent. Otherwise, if gibberish noise, it may fade from memory rather quickly assuming all of the details were ever captured at all.

Of course, maybe I wasn't precise enough with "weak auditory memory theory". We know about the (although varying) time spans of echoic memory and working memory, and the consistent part of every memory theory I'm aware of is that categorization takes place when transferring something to long-term storage (although, see above, storage of sounds seems to be at least partly an autonomous process), and that this often works better the more different parts of our brain are involved.

Regarding relevance, memory duration matters little for musical enjoyment. Musical enjoyment occurs in the moment, and also perhaps as savored from memory. A memory that one had a pleasurable or unpleasurable experience is enough to affect how happy one may be for a more prolonged time. To me, that makes it relevant.

Which is what I meant; the usual "weak auditory memory theory" works along the line that music samples have to be short because the auditory memory is weak and listeners would not be able to remember after a short time span.
That the quite diverging time spans for the various memory mechanisms (according to the literature) present a counterargument is often neglected in the discussion about test protocols.

But, more importantly, it ignores that the emotional response might only occur after a longer time span, and that listening to short samples tends to favour the analytical approach to the music while preventing appropriate consideration of the emotional response.

The enjoyment while listening to music (or, more specifically, the degree of enjoyment while listening) is certainly something that can be remembered too.

So, it's a valuable approach to listen to longer music samples to get a feeling for the impact of the presentation, while later zeroing in by using short samples to work out what the physical differences really might be.
 
Sorry, I got very busy over the holidays.

Never mind, as you see I'm not able to follow up in time either... :)

My issue isn't that Clark and Frindle were lacking enormous details. My issue was that you portrayed them, IMO, as a bit more definitive. And then upon reading the actual papers, there was no meat there.

Which goes back to my original point: The science here is poor (something you didn't really refute).

I can't refute it, because a lot of data is often missing. Furthermore, I've often encouraged people (who believe in hearing thresholds as hard facts valid for every listener) to read the actual publications to find out what the limits are.

But as said before, I was wondering why you seem to dislike accepting Clark's or Frindle's results due to the missing detailed description of the experimental procedure (and therefore not being able to evaluate scientific rigour), while proposing a different approach which would not provide that sort of information either.
Or did I miss something?

But peer review wouldn't accept a statement such as Clark's without supporting evidence. Clark's statement, when he made it, wasn't widely known and didn't cover the methodology. And it's still suspect today, nor has it been replicated. Thus, how did it get through peer review? Peer review, by design, would reject new and novel statements of fact without supporting data.

We should not expect more from the peer review process than it can deliver.
As there exists (most likely) no perfect experiment there exists (most probably) no perfect documentation either. To a certain degree someone (reviewer, editor or reader) has to trust in the honesty and ability of the experimenters.
Neither Clark nor Frindle claimed to have found a good estimator for the mean of the underlying population; they presented results that they found during experimentation under certain constraints and with a limited number of participants.

Usually we would give them credit in believing that they were able to get their instrumentation right, measure the right things and avoid the basic errors in experimental setup and execution.
 
I listened to about 6 DACs in the last year. And, believe me, each of them was clearly audibly different. Entry level and high end. However, what's most important is that you have to pay a tremendous amount of money for a slight improvement. So anything from the built-in Auralic Aries Mini up to the Exogal Comet Plus, Mytek Brooklyn and Arcam D33.

I have not heard an ES9038 DAC, and I am thinking of going to a cheaper DAC than my Comet, for various reasons. However, all PCM and ESS solutions I heard were quite open and bright, especially PCM; ESS-based DACs, for example, have a great soundstage without the harshness that I recognize in Arcam's PCM-based solutions.
 
It is interesting to note all the reasons put forward as to why we should ignore any result which shows that two devices are not distinguishable:
- it's the wrong question
- it's the wrong people
- it's the wrong day
- it's the wrong auxiliary equipment
- the people were untrained/stressed/biased/prejudiced
- the music samples were too long/short/simple/complex
- the statistics don't prove anything

Posts like this always make me wonder, as we had already discussed some of these points, sometimes even quite "excessively", with cited scientific evidence and the theory of statistical significance testing.

My concerns start with the usage of "not distinguishable", as we simply don't know about that (in the standard tests usually done).
As explained before, we analyse the data under the premise that the null hypothesis is true (in our cases the null hypothesis usually means "the data could have been produced by random guessing") and decide about the compatibility of the _observed_ data with this assumption.

We make this decision by applying a statistical test (in our case quite often the exact binomial test) and using a predefined decision criterion (i.e. the significance level), and as a result we either reject the null hypothesis or we do not reject the null hypothesis.

In the latter case we do not know the reasons for the observed data, because we haven't researched them. Therefore we can't conclude that the DUTs were "not distinguishable"; maybe the DUTs were distinguishable but something odd happened.
All we do know is that we can't reject the null hypothesis, as the probability of getting the observed data (or more extreme data) was above our predefined decision threshold.
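As a rough illustration of the decision procedure described above, here is a minimal sketch of a one-sided exact binomial test for an ABX-style run. The 16 trials, the 0.05 significance level and the function name are assumptions made for the example, not values taken from any test discussed in this thread.

```python
from math import comb


def abx_p_value(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if every answer is a guess."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(
        comb(trials, k) * p_guess**k * (1.0 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


ALPHA = 0.05  # predefined significance level (assumed for this example)
TRIALS = 16   # number of ABX trials (assumed for this example)

for correct in (8, 11, 12, 14):
    p = abx_p_value(correct, TRIALS)
    verdict = "reject H0" if p <= ALPHA else "cannot reject H0"
    print(f"{correct}/{TRIALS} correct: p = {p:.4f} -> {verdict}")
```

Note that "cannot reject H0" is the only thing the second outcome says; it is not a finding that the DUTs are indistinguishable.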

What about the ironically listed other factors (I assume that I understand the poster's disbelief in the importance of all that correctly)?

Let's compare it to a list of posted influencing variables in sighted listening:

- knowledge about brands
- different colors
- other people's opinions about the DUTs
- expectation about a difference in general
- fear about expressing no difference
- not enough listening experience

and surely there are others as well.
In addition one has to maintain the argument that listeners are unable to compensate the impact of these bias factors in sighted listening tests but are totally shielded against any bias factor if only they take part in a "blind" listening test.
And of course you have to stand by this argument even if scientific knowledge implies that it is wrong.

Curious then that a test finding of 'distinguishable' is accepted almost without comment, even when the electrical difference combined with psychoacoustics means that the DUT ought (probably) to be indistinguishable.

Who decides whether something not commented on is accepted?
Mind-reading abilities, anybody?

Maybe people take anecdotal results as exactly that, in the sense of "might be correct or might not be"; even if correct, it might be relevant for someone else or might be irrelevant.

I have dipped into this thread again. As things have not actually moved on I will dip out again.

Don't want to be (too) offensive, but moving on starts with acceptance of long-established scientific evidence of the bias effects in controlled listening tests (including the double-blind feature).
 
I'm surprised this thread is still going.

Hi Jakob2: You clearly know a lot about stats and test protocol. You've talked about the gorilla experiment, where he goes unnoticed, a couple of times. Your point being that real things can escape our attention (Beta error). What is your view of the Alpha error risk (false positive) in audio tests - especially the less structured ones in the home or at the audio dealer? People claim all the time to see or hear things that aren't there. I just watched David Copperfield appear out of nowhere then shrink himself to 2'. I saw this with my own eyes! ;-)

I would say the preponderance of data shows the Alpha risk to be at least as high as the Beta risk.
 
<snip> You've talked about the gorilla experiment where he goes unnoticed a couple times. Your point being real things can escape our attention (Beta error).

Not only "real things" but even quite big real differences can remain undetected.

What is your view of the Alpha error risk (false positive) in audio tests - especially the less structured ones in the home or at the audio dealer? People claim all the time to see or hear things that aren't there?

It's hard to say, as it still depends on so many variables. Of course chances are high for false positives if some basic requirements (for example, no apparent level differences) are not met, and of course evaluation of audio equipment (and proper description of differences, if there are any) often needs practice.

OTOH, up to now I was always able to get correct results in controlled (even blind ;) ) listening tests when trying to find corroboration for something that I'd previously noticed in sighted listening conditions.

I would say the preponderance of data shows the Alpha risk to be at least as high as the Beta risk.

Might be so, but IMO it misses the point in a discussion like the one in this thread, as we are discussing the results of controlled listening tests and the possibility of drawing further conclusions about the reasons for these results.

People don't trust sighted listening tests and use some "blind" tests instead, and it doesn't make any sense to do so if incorrect results are as likely in the "new" tests as they were in the "old" ones.
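The Alpha and Beta risks mentioned above can be made concrete for a forced-choice test. The following is only a sketch under stated assumptions: a listener who is correct with a fixed probability `p_true` on every trial (0.7 here is an invented figure, not measured data), a 0.05 significance level, and helper names that are made up for illustration.

```python
from math import comb


def tail_prob(k_min: int, trials: int, p: float) -> float:
    """P(X >= k_min) for a binomial variable with `trials` trials and success rate p."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(k_min, trials + 1)
    )


def critical_count(trials: int, alpha: float):
    """Smallest number of correct answers whose guessing p-value is <= alpha, else None."""
    for k in range(trials + 1):
        if tail_prob(k, trials, 0.5) <= alpha:
            return k
    return None  # too few trials: no outcome can reach significance


def beta_risk(trials: int, alpha: float, p_true: float) -> float:
    """Probability of NOT rejecting H0 although the listener really scores p_true."""
    k_crit = critical_count(trials, alpha)
    if k_crit is None:
        return 1.0
    return 1.0 - tail_prob(k_crit, trials, p_true)


ALPHA = 0.05   # assumed significance level
P_TRUE = 0.7   # assumed true per-trial hit rate of a hypothetical listener

for trials in (10, 16, 40):
    print(f"{trials} trials: need {critical_count(trials, ALPHA)} correct, "
          f"beta risk = {beta_risk(trials, ALPHA, P_TRUE):.2f}")
```

Under these assumptions, with 16 trials the listener needs 12 correct answers and the Beta risk works out to roughly 0.55, so a real but modest ability would be missed more often than not while the Alpha risk stays below 0.05.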
 
Considering that ABX testing "proves" there is no audible difference between CD redbook and compressed MP3, why would anyone want to use such a blunt test tool ... Trolls are rampant.

Can you prove that you can really hear the difference, or do we just have to take your word for it!
You are the one making the claim so the onus is on you to prove it.
If you don't like ABX testing then come up with a better method. Until you do, your comments are just hearsay, and have no scientific validity.

Of course you could always just keep on trolling!

Edit
The point of ABX testing is not to prove but to disprove the "Null Hypothesis".
Proving a "null hypothesis" is like trying to prove the non-existence of God - it can't be done.
 
Can you prove that you can really hear the difference, or do we just have to take your word for it!
You are the one making the claim so the onus is on you to prove it.
If you don't like ABX testing then come up with a better method. Until you do, your comments are just hearsay, and have no scientific validity.

Of course you could always just keep on trolling!

Edit
The point of ABX testing is not to prove but to disprove the "Null Hypothesis".
Proving a "null hypothesis" is like trying to prove the non-existence of God - it can't be done.

:D good reply, completely missing logic.
 