Blind DAC Public Listening Test Results

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Calling RS's idea a stroke of brilliance was what prompted me to respond :p I've used more technical terms than you and broken down what you outlined into individual issues, so that RS can address them individually (if he so wishes) :D

:) Well, the brilliant part comes from a study design standpoint. "Internet" experiments typically suffer from three major problems:
#1 - the reader doesn't know who the listeners were, or even if they existed
#2 - the reader can't be completely sure that the results weren't actually fabricated
#3 - even if the reader believes the listeners existed, they don't know if they can be trusted to identify differences that existed (or were motivated to complete the tests)
So, uploading files and having known forum users take the test solves #1-#3. Doing so requires generation of a file that captures the output of the DAC, which necessarily requires an ADC. So yes, it introduces an additional ADC-DAC step, but it solves 3 important concerns that many people have. So I think that is pretty smart, if you ask me.

I thought about the DAC part of it, but that will be different for each user, while the same DAC will be used in comparisons. If the DAC used sucks then it might well mask differences, but the onus is on anyone listening to upgrade their DAC :) No-one has a chance to change the ADC (except RS), so it's got to be blameless for these tests to carry any weight.

That's only half true. If his testing shows that X% of listeners can identify differences Y% of the time, and it passes statistical tests showing that the numbers are unlikely to be due to chance, then it is irrelevant that an ADC-DAC step was added (since the same ADC-DAC step was added to both DACs). The only way around this is to argue that the ADC-DAC step was somehow applied differently to each DAC, but that's a stretch. After all, if 2 DACs have identical outputs, why would the addition of the same ADC-DAC step cause the outputs to have audible differences? OTOH, if listeners cannot distinguish between the 2 DACs, then one could argue that the ADC-DAC step messed up the signal so badly that small, potentially audible differences between DACs were masked (extreme example: the ADC-DAC step added pink noise at a -6dB level, you could barely hear the sound clip anymore, and therefore both DACs sounded the same). If he encounters "no difference" between DACs, then you need what's called a positive control to establish that your listener group is CAPABLE of hearing small differences. One way to do this is to take 2 test files that have subtle differences and have your listener group take a blind same/different test to show that they can distinguish [defined] differences in audio clips.
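To make the "unlikely to be due to chance" part concrete, here is a minimal sketch of an exact one-sided binomial test for a same/different score. Plain Python, no particular test software assumed; the scores are purely illustrative:

```python
from math import comb

def binom_p_value(correct, trials, chance=0.5):
    """One-sided exact binomial test: probability of scoring at least
    `correct` out of `trials` same/different judgements by lucky
    guessing alone (chance rate 0.5)."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

# A listener who scores 14/16 is very unlikely to be guessing:
print(f"14/16: p = {binom_p_value(14, 16):.4f}")   # p ≈ 0.0021
# A listener who scores 9/16 could easily be guessing:
print(f" 9/16: p = {binom_p_value(9, 16):.4f}")
```

A p-value below the usual 0.05 threshold is what lets you say "this listener heard something", which is exactly the positive-control logic described above.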
 
So yes, it introduces an additional ADC-DAC step, but it solves 3 important concerns that many people have. So I think that is pretty smart, if you ask me.

Well you're completely entitled to your view :) Mine is that a chain is only as strong as its weakest link and if the issues I've mentioned aren't addressed, that link is made of liquorice :D

That's only half true. If his testing shows that X% of listeners can identify differences Y% of the time, and it passes statistical tests showing that the numbers are unlikely to be due to chance, then it is irrelevant that an ADC-DAC step was added (since the same ADC-DAC step was added to both DACs).

Yes - but it's also quite likely that no difference would be heard. Then what? When the issues I've raised aren't addressed, the differences in practice will be reduced, so X and Y get smaller. Which is pretty much RS's position on DACs at present (barring a Damascus Road conversion since we last interacted) - so his experiment only confirms what he already 'knows'. Confirmation bias is jolly hard to avoid even for 'objectivists', so it's how I'd expect him to design the experiment.
 
Well you're completely entitled to your view :) Mine is that a chain is only as strong as its weakest link and if the issues I've mentioned aren't addressed, that link is made of liquorice :D

Exactly. It's like trying to measure the difference in quality between two caesium clocks using an alarm clock as the measuring tool.
When you try to discern differences that are at -95...-100dB level with a DAC/speaker system that has an accuracy of -65...-70dB it is obvious you will not objectively hear the "differences", but only what your brain wants to hear.

I have an ADC with -110dB THD+N. So I will be able to make a "test" of DACs that are at the -96..-100dB level, at least 8-10 dB below my ADC's capability, in order not to completely mask the DAC "sound signature".
The result will have to be HEARD on a DAC that has better THD+N than the ADC in order to be valid. Actually it would have to be again some 8-10dB better to be transparent, at -118..-120dB THD+N. Hands up, who truly has that?
And how will the DACs, and especially the amplifiers/speakers/headphones used on the listening end, be controlled? Motherboard audio? Powered PC speakers? Skype-ready headsets?
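The masking argument is easy to put numbers on: uncorrelated noise/distortion floors add as powers, so an ADC floor 8-10dB below the DAC's barely moves the combined result, while a mediocre playback chain swamps it completely. A quick sketch (the specific levels are just the ones quoted above):

```python
from math import log10

def combine_db(*levels_db):
    """Combined level of uncorrelated noise/distortion floors,
    each given in dB relative to full scale (power sum)."""
    return 10 * log10(sum(10 ** (db / 10) for db in levels_db))

# A -96dB DAC floor captured through a -110dB ADC:
print(round(combine_db(-96, -110), 2))   # ≈ -95.83: the ADC costs only ~0.17dB
# The same -96dB DAC floor heard through a -65dB playback chain:
print(round(combine_db(-96, -65), 2))    # ≈ -65.0: the playback chain dominates
```

In other words, the capture side of the proposed test is the easy part; the uncontrolled playback side is where the masking concern really lives.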
 
When you try to discern differences that are at -95...-100dB level with a DAC/speaker system that has an accuracy of -65...-70dB it is obvious you will not objectively hear the "differences", but only what your brain wants to hear.

Ah we must be talking at crossed purposes. I don't attempt to hear differences down around -95dB - I'm almost completely sure they'd be totally inaudible.

I have an ADC with -110dB THD+N. So I will be able to make a "test" of DACs that are at the -96..-100dB level, at least 8-10 dB below my ADC's capability, in order not to completely mask the DAC "sound signature".

Have you ascertained that your set-up won't transgress the four points I've already outlined? For example, how do you know that your ADC is transparent - solely from the THD+N spec?
 
I don't know, but I have an educated guess :)
If my ADC (AK5394) is not "transparent", then none of the digital recordings on the market are, because it is the chip used in studio mastering gear (Digidesign's flagship ProTools HD 192 I/O interface).
Yes, spec numbers tell a story for me. It might not be the whole story, but the part that is missing doesn't have as much influence in the end as some people think.
 
A lot of the objections are pretty easily addressed from an experiment design standpoint rather than from a measurement or engineering standpoint. It is very important to have statistics and study design knowledge when performing human subject research (which is actually what you're doing) rather than applying an exclusively engineering perspective.

Assuming no listener can distinguish between DacA and DacB using an ADC-generated audio file played back on their own equipment, people will object that the added ADC step caused so much disruption of the signal that small differences were masked.

EASY. In this case, recruit a pool of motivated listeners - i.e. those listeners that insist that the ADC changes the sound. Then have these listeners do blind ABX same/different tests on the original audio file vs. the file after DAC and ADC and see if they can reliably determine same/different. If they cannot, then you're done - the ADC did not introduce any detectable differences to the sound.
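For the blinded same/different protocol itself, the trial order must be randomized and balanced so a listener can't beat chance by pattern-guessing. A minimal sketch of how the administrator's side might generate the schedule (the 16-trial count and function name are just illustrative, not any specific test software):

```python
import random

def make_trial_schedule(n_trials=16, seed=None):
    """Balanced, shuffled schedule of 'same' and 'different' trials
    for a blind same/different test. Only the administrator (or the
    test software) sees this; the listener just answers each trial."""
    rng = random.Random(seed)
    half = n_trials // 2
    schedule = ["same"] * half + ["different"] * (n_trials - half)
    rng.shuffle(schedule)
    return schedule

schedule = make_trial_schedule(16, seed=1)
print(schedule.count("same"), schedule.count("different"))  # 8 8
```

Keeping a fixed seed per session also makes the schedule reproducible, so results can be audited afterwards.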

If they can, then you need to do more work: the most elegant step would be to run the test without the ADC. Simply invite motivated listeners (those that are adamant that DacA sounds different from DacB) and have them perform a direct blinded A/B listening test with the actual equipment (without the ADC) to see if they can distinguish between the two. If they cannot, the totality of the experiment would likely convince most reasonable people that there is probably no audible difference. Applying appropriate statistics would also be very important to show that the tests were sufficiently powered (statistical power) and were unlikely to have been due to chance (guessing).

Honestly, you can debate all day about how differences in THD, FR, and SNR measurements may potentially be audible when an ADC step is introduced into the chain, but at the end of the day, measurements are just measurements; the only way to prove AUDIBILITY of measurement differences is to conduct a controlled LISTENING test on an appropriate listener group.
 
I'll leave the debate to others here. Suffice it to say that if I went on record that the Earth is spherical in shape, Abraxalito would argue it may well be flat. Perhaps he wants to believe in Santa Claus and is upset I keep pulling Santa's beard off. I'm trying to focus on the bigger picture these days.

I made some comments today in the blind DAC blog article. The blind DAC test has several flaws that I admitted from the start. It could be improved in many ways and I might take another pass sometime once I get some other things out of the way first.

I've done live blind testing between the ODAC and the Benchmark DAC1 and so far it has not revealed any audible differences. I plan to expand on that in the future.

I'm also planning a blind test with something like an iPod and O2 amp up against $10,000 or more of source, DAC, and headphone amp--sort of in the spirit of the Matrix Audio blind test, but with headphones. That should be fun forum fodder. :)
 
Assuming no listener can distinguish between DacA and DacB using an ADC-generated audio file played back on their own equipment, people will object that the added ADC step caused so much disruption of the signal that small differences were masked.

Yes I agree - many subjectivists will probably argue that after some listening results come in. Which is why it's important to tidy up the details before the listening is even begun. This way the objections can easily be dismissed should the listening results show no differences.

I see from RS's post that he's not interested in taking my points on from an engineering perspective but is enjoying his traditional trolling :D Very pleased to hear his DAC sounds indistinguishable from the Benchmark, it means less competition for mine when I eventually get around to designing it.
 
Yes I agree - many subjectivists will probably argue that after some listening results come in. Which is why it's important to tidy up the details before the listening is even begun.

You suggest the impossible. The moving of goal posts, post hoc rationalization, and subsequent trolling is inevitable no matter what you do. There's always the faith-based, the armchair critics and pseudo-philosophers, the trolls, and those with something to peddle, who in aggregate will be infinitely creative with excuses and rationalizations. But no actual data, mind you.

RS is well aware that one can't prove a negative. But I suspect he's also aware that ten thousand Internet postings of ill-informed sneering and excuse-making don't measure up to a paragraph of real data.
 
Yes, I understand. My point was, and is, that no experimental design will prevent specious objections from the agenda-laden.

This is very true. The most common objection I see to blind listening tests comes from individuals who already "know" something to be true (ex: DacA sounds better than DacB) and therefore believe that any experiment that concludes otherwise (i.e. DacA/B sound the same) must be improperly done or problematic.

An analogous example would be a cigarette smoker who does not believe that smoking causes heart attacks - after all, all his friends smoke and none of them have dropped dead yet, so all the medical studies must be falsified or part of a conspiracy. No scientific experiment will convince this smoker that smoking is harmful. First, this particular smoker doesn't read medical journals, he couldn't understand the study methods anyway, he certainly couldn't follow the statistical analysis, and he doesn't trust doctors. In fact, he uses his common sense and reasons that if none of his friends have had heart attacks, then obviously all those studies and doctors are wrong!

It's the same thing with many aspects of hi-fi. "Well, OBVIOUSLY, DacA must sound better than DacB! I read it in Stereophile. DacA costs 5x as much. And everybody on XYZ forum knows this to be true too." "Therefore, if somebody does a blind listening test with 100 expert listeners, each performing 50 A/B comparisons, and NONE of them could reliably distinguish between the DACs, then the blind listening test was defective because it couldn't show an OBVIOUS difference that clearly exists!" And this individual will start hemming and hawing. It's a complete waste of your time to engage these particular individuals because they don't have the background or capacity to discuss a scientific study. It's like arguing with an 8-year-old that a Porsche 911 is faster than his daddy's Toyota Camry. Sorry, but daddy's Toyota Camry will always be faster than any car you pick.

Which brings me back to the work by NwAvGuy. I'm not sure what he'll find when he does public A/B blind listening tests with his new DAC vs. a Benchmark DAC1. Needless to say, if it doesn't support the widespread "internet consensus," then the "common sense" people will attack his experiment. But my point is: who cares? You will never convince the people who do not understand science, the scientific method, experimental design, and statistical analysis - and it's a waste of time.

However, NwAvGuy DOES have an opportunity to convince the audience that can appreciate his work. All he has to do is properly design his listening test so that it is controlled and analyzed properly. This involves finding motivated listeners (i.e. those individuals who absolutely think there's a difference and will go through great pains and effort to correctly identify same/different in a randomized blinded A/B test). It helps if the motivated listeners are known to the intended audience (such as actual forum members that have been posting for many years already). He will need to have enough listeners AND enough listening trials for each listener so that his study has sufficient statistical power (i.e. his testing will show that there is a difference when there truly is a difference). He will also need to run statistics to show that any differences identified (example: listener was 70% correct in identifying same/different) are either significant (i.e. unlikely to have been a result of "lucky guessing") or not significant (i.e. could simply have been the result of "lucky guessing"). And finally, after getting some meaningful results, he should address important objections (e.g. that an ADC step masked small differences that would otherwise have been identifiable) in a so-called "peer review" process to be even more convincing.
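The "sufficient statistical power" point can be made concrete: pick the score a pure guesser would rarely reach, then ask how often a listener with a given true hit rate (say the 70% mentioned above) would actually reach it. A quick sketch in plain Python; the trial counts are purely illustrative:

```python
from math import comb

def critical_score(trials, alpha=0.05):
    """Smallest number of correct answers whose one-sided chance
    probability (guessing, p=0.5) is at most alpha."""
    for k in range(trials + 1):
        tail = sum(comb(trials, j) for j in range(k, trials + 1)) / 2**trials
        if tail <= alpha:
            return k
    return trials + 1

def power(trials, p_true, alpha=0.05):
    """Probability that a listener with true hit rate p_true reaches
    the critical score, i.e. that a real difference gets detected."""
    k = critical_score(trials, alpha)
    return sum(comb(trials, j) * p_true**j * (1 - p_true)**(trials - j)
               for j in range(k, trials + 1))

# A 70%-accurate listener is often missed with too few trials:
for n in (16, 30, 50):
    print(n, "trials -> power", round(power(n, 0.7), 2))
```

The takeaway: with only a handful of trials per listener, a genuinely 70%-accurate listener will frequently fail to reach significance, which is exactly the underpowered-study trap described above.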

So far, NwAvGuy's work is not rigorous enough in its methodology or analysis to be able to draw meaningful conclusions. I don't think it was intended to be either. But hopefully with all the time/energy he's putting into designing the ODAC, he can spend an additional amount designing an appropriate blind listening test that can provide scientifically convincing results.
 
However, NwAvGuy DOES have an opportunity to convince the audience that can appreciate his work. All he has to do is properly design his listening test so that it is controlled and analyzed properly.

So far, no evidence that he's seriously interested in controls. Controls are hard work, require rigorous experimental design.

This involves finding motivated listeners (i.e. those individuals who absolutely think there's a difference and will go through great pains and effort to correctly identify same/different in a randomized blinded A/B test).

Yes, I see this as a problem for the experimental design. People like myself who notice differences between DACs won't likely be motivated to take part - why would they be? There's no benefit for them in doing so. So the only people who will likely take part will be those who hear no differences anyway, and those who aren't confident about hearing differences.

So far, NwAvGuy's work is not rigorous enough in its methodology or analysis to be able to draw meaningful conclusions. I don't think it was intended to be either.

I concur - it's not an inability to be rigorous, rather an unwillingness. I don't see any benefit to him in being rigorous.

But hopefully with all the time/energy he's putting into designing the ODAC, he can spend an additional amount designing an appropriate blind listening test that can provide scientifically convincing results.

But he'd need to be motivated - and I don't see it happening.
 
So far, no evidence that he's seriously interested in controls. Controls are hard work, require rigorous experimental design.

For the hypothesis that "DacA and DacB have audible differences" with the null hypothesis being that "DacA and DacB do not have audible differences," DacA can serve as the control for DacB if your test methodology is repetitive trials of blinded "same/different."

Yes, I see this as a problem for the experimental design. People like myself who notice differences between DACs won't likely be motivated to take part - why would they be? There's no benefit for them in doing so.

Individuals who genuinely believe that there are differences in DAC's would be the ones who would take time/energy to listen critically during a series of blinded same/different tests...because if they can consistently identify when they are hearing the same 2 DAC's vs. 2 different DAC's, and they can do this more often than could be attributed to lucky guessing, then they have effectively demonstrated that there ARE differences in DAC's. (Whereas people who don't think there's a difference won't have a vested interest in listening critically to correctly identify same vs. different, because doing a half-a$$ed job will still support their own beliefs.) Let me know if this still doesn't make sense, because it's a very basic principle for study subject recruitment for these types of tests.

I concur - it's not an inability to be rigorous, rather an unwillingness. I don't see any benefit to him in being rigorous.

The benefit to him is that performing a rigorous experiment will allow him to successfully defend against many of the common objections to his results/analysis/conclusions. I understand that you believe DACs sound different. A blinded listening test can easily show this as well!
 
Individuals who genuinely believe that there are differences in DAC's would be the ones who would take time/energy to listen critically during a series of blinded same/different tests...

Perhaps they would, I'm not one of them. Rather I'm someone who hears that DACs don't sound identical. I have preferences, I enjoy some DACs more than others but I have no beliefs about DACs. Why would I need beliefs? Sure I form hypotheses when I'm listening to some new design, but I don't call hypotheses beliefs because they're not permanent.

...because if they can consistently identify when they are hearing the same 2 DAC's vs. 2 different DAC's, and they can do this more often than could be attributed to lucky guessing, then they have effectively demonstrated that there ARE differences in DAC's.

Indeed for those people that might well be a step forward. For me I already hear differences - that does demonstrate that there are differences for me. I'm not particularly interested in demonstrating those differences to others - where's the benefit? They can (and sometimes do) think up other reasons for people getting statistically improbable results.

(Whereas people who don't think there's a difference won't have a vested interest in listening critically to correctly identify same vs. different, because doing a half-a$$ed job will still support their own beliefs.)

Correct - RS is one individual who holds such a belief - as I've already pointed out, there's no incentive for him to design a well-controlled experiment when it's more likely to produce a result contrary to his existing beliefs.

Let me know if this still doesn't make sense, because it's a very basic principle for study subject recruitment for these types of tests.

Have done so :)

The benefit to him is that performing a rigorous experiment will allow him to successfully defend against many of the common objections to his results/analysis/conclusions.

In his mind he already has a successful enough defense - that's measurements. No need for reinforcements, hence no incentive.

I understand that you believe DAC's sound different.

No, that's a misunderstanding. What use are beliefs when you have experience? If I have a belief that I'm 1.88m tall, how will that help me if I'm in fact 1.6m? Or if my belief about my height does accurately reflect my measured height, is it a useful belief?
 
Yes, I see this as a problem for the experimental design. People like myself who notice differences between DACs won't likely be motivated to take part - why would they be?

I know this seems crazy, but... there are people who have beliefs but are actually curious about reality and want to find out what's real and what's illusion. Of course, they're not planning to enter this business so have nothing to lose but real knowledge to gain.

As an example, I'll single out my buddy Michael, who is certain that he heard a difference between 24/96 files and 16/44 versions. Yet he spent the time and effort to do ABX testing to see if those differences were real to his ears. It was worthwhile to him (though perhaps not to others) to know whether to put his energies toward getting increased resolution and bandwidth or whether that was a ghost and to worry about other things. Of course, he was in the nice position of having nothing to sell. That was also my approach to data-compression formats- I was very curious about whether the differences were in my ears or in my brain. Pure curiosity. So it was DBT time and a willingness to admit that my perception was not reality, if that's what the data showed. Not everyone has that willingness.
 
Add me to those people too :)
Actually that is the definition of SCIENCE - the curiosity to test new theories.
There are faith-based pseudo-scientists, but they have just hijacked the name of what they believe in.
I see "believers" all the time who have no grounding to even understand what they are hearing, drawing wrong conclusions based on faulty sensory input and faulty logic, like "measurements don't tell the whole story - therefore all measurements are useless". Forgetting that the "story" told by adequate measurements and analysis is some 98% of the whole real "story". And instead of trying to improve that percentage, they just dismiss whatever does not fit their belief system.
It's throwing the baby out with the bathwater.
 