DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
AX tech editor
Joined 2002
Paid Member
The blurb on the Diffmaker download page leads me to believe it compensates for frequency response anomalies.

Under the heading 'What can Audio Diffmaker do?' it says, inter alia:

Measure the frequency response of the equipment being tested and apply it so the effects of linear frequency response can be removed from the testing.

Yes, but that is a separate set-up test. If you 'just' compare two files, DiffMaker has no way of knowing what the equipment freq response is and any freq response differences between the files will be noted as differences.

On the time alignment, IIRC it time-aligns the start of the files, not dynamically. But I am not 100% sure.
Bill Waslo is a member here (bwaslo), could ask him.

Edit: DiffMaker site: Precisely align two similar audio tracks to the same gain levels and timing

At Gearslutz, someone says: The "magic" of course is in the algorithm used to align the samples in terms of time (including sample rate drift), and signal amplitude (bold mine).
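For what it's worth, a one-shot "align the start, then leave it" alignment is easy to sketch. Whether DiffMaker's actual algorithm works this way is exactly the open question here, so the following is only an illustration of the simple case (the function name is invented), not DiffMaker's implementation:

```python
import numpy as np

def align_once(ref, test):
    # One-shot alignment: estimate a single constant lag between the two
    # tracks from the peak of their cross-correlation, then trim so both
    # start at the same sample.  This cannot follow clock drift that
    # accumulates during the track -- it only fixes a start-time offset.
    lag = int(np.argmax(np.correlate(test, ref, mode="full"))) - (len(ref) - 1)
    if lag >= 0:            # 'test' starts late: drop its leading samples
        test = test[lag:]
    else:                   # 'ref' starts late
        ref = ref[-lag:]
    n = min(len(ref), len(test))
    return ref[:n], test[:n]
```

If the two clocks drift apart during the track, a single lag like this leaves a growing residual toward the end of the file, which is why synchronous clocking keeps coming up in this thread.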

Jan
 
abraxalito: It says to make a reference recording first, then change something in the system for which you want to measure the difference. That's not the way diffmaker is being used in the GS tests. The only recording they make is essentially the reference recording. The difference between the reference recording and the source file wouldn't be frequency compensated, that's how it looks to me at this point.
 
Yes, but that is a separate set-up test. If you 'just' compare two files, DiffMaker has no way of knowing what the equipment freq response is and any freq response differences between the files will be noted as differences.

So then it's entirely possible that those Gearslutz results omit the FR compensation aspect, and this would indeed be a reasonable explanation for the DACs posting nulls poorer than 40 dB. A 40 dB null would be commensurate with a ~0.1 dB FR deviation - cheap DAC chips easily have this magnitude of ripple or droop in their FRs.
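The ~0.1 dB figure can be sanity-checked: if two otherwise identical signals x and g·x differ only by a flat gain error, the residual after subtraction is (g−1)·x, so the null depth below the signal is −20·log10|g−1|. A quick sketch (function name invented for illustration):

```python
import math

def null_depth_db(level_error_db):
    """Null depth (dB below signal) left after subtracting two otherwise
    identical signals that differ only by a flat gain/response error."""
    g = 10 ** (level_error_db / 20)      # linear gain for the dB error
    return -20 * math.log10(abs(g - 1))  # residual is (g-1)*x

# A 0.1 dB response error limits the achievable null to roughly 39 dB:
print(round(null_depth_db(0.1), 1))  # → 38.7
```

So any uncompensated frequency-response ripple of about 0.1 dB is indeed enough to cap the null near 40 dB, regardless of how clean the DAC otherwise is.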
 
It time-aligns as best it can - is this based on some chosen sample period of analysis (start of track?) between the two tracks? If the tracks drift in time during the track being analyzed, does it dynamically align them? That would be wrong IMO, as it could also hide other timing differences that are not due to clock drift.
I reckon it's probably best to use synchronous clocking as Scott says, and turn off timing compensation in Diffmaker

I need to correct this, but the edit window has expired, so the correction is highlighted above
Edit: Sorry, I see the thread has moved on

@Jan "On the time alignment, IIRC it time-aligns the start of the files, not dynamically. But I am not 100% sure.
Bill Waslo is a member here (bwaslo), could ask him."

Yes, it would be good to get an answer on this, as we are not the only ones confused. I see Archimago includes 1 kHz tones at intervals throughout his Diffmaker test track; he says it's to allow Diffmaker to do time-alignment: Archimago's Musings: PROTOCOL: [UPDATED] The DiffMaker Audio Composite (DMAC) Test.
Interspersed between each track are dual bursts of 0.1s 1kHz tone at -4dBFS interspersed with 0.1s silence. This serves as a "beacon" for DiffMaker's alignment algorithm. The trickiest part of this test is temporal alignment and doing this has significantly improved the consistency of the results for me.
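That beacon is straightforward to reproduce from the description. The sample rate below is an assumption (Archimago's published files may use another), and this is only a sketch of the described signal, not his actual generator:

```python
import numpy as np

FS = 44100  # assumed sample rate; the DMAC files may differ

def beacon(fs=FS):
    """Dual 0.1 s bursts of 1 kHz at -4 dBFS, interspersed with 0.1 s of
    silence, per the DMAC description -- a sharp-onset marker that gives
    a time-alignment algorithm something unambiguous to lock on to."""
    n = int(0.1 * fs)                    # samples per 0.1 s segment
    t = np.arange(n) / fs
    amp = 10 ** (-4 / 20)                # -4 dBFS as linear amplitude
    burst = amp * np.sin(2 * np.pi * 1000 * t)
    gap = np.zeros(n)
    return np.concatenate([burst, gap, burst, gap])
```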
 
abraxalito: It says to make a reference recording first, then change something in the system for which you want to measure the difference. That's not the way diffmaker is being used in the GS tests. The only recording they make is essentially the reference recording. The difference between the reference recording and the source file wouldn't be frequency compensated, that's how it looks to me at this point.

I don't see anywhere that you can feed in the frequency compensation into Diffmaker from doing a pre-test reference recording but maybe I'm missing something?
 
I don't see anywhere that you can feed in the frequency compensation into Diffmaker from doing a pre-test reference recording but maybe I'm missing something?

It works automatically. If you take a source music file and loop it through the data converters, the resulting file is called the reference file and it has whatever frequency response it acquired from running through the data converters. Then after modifying the system to do some test we compare the result file from that with the previously created reference file (not the source file). By that means the FR of the system is subtracted out.
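Once alignment and level are handled, the comparison itself reduces to a best-fit-gain subtraction. A minimal sketch of that final step, assuming the two tracks are already sample-aligned (the function name is invented; this is not DiffMaker's actual code):

```python
import numpy as np

def residual_db(ref, test):
    """Null depth between two already-aligned tracks: scale 'test' by the
    least-squares gain that best matches 'ref', subtract, and report the
    RMS of the residual relative to the RMS of 'ref', in dB (more
    negative = deeper null)."""
    g = np.dot(ref, test) / np.dot(test, test)   # best-fit gain
    diff = ref - g * test
    return 20 * np.log10(np.sqrt(np.mean(diff ** 2)) /
                         np.sqrt(np.mean(ref ** 2)))
```

Note that a flat gain error nulls out perfectly here; it is frequency-dependent response differences (and nonlinearities) that survive into the residual.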
 
I don't have the time right now, but Jan should have a bunch of multitone signals up on Linear Audio from my article. I think I even generated some for folks that only have power-of-2 FFTs. Run that through the Diffmaker process first, using a setup that won't have sample rate drift. If the frequency response anomalies are totally removed, the 30 tones will be completely absent from the residual.
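The reason such tones can null completely is that each one sits on an exact FFT bin, so a power-of-2 FFT sees no leakage. A sketch of a bin-exact multitone (the bin spacing, phases, and function name here are illustrative, not the actual Linear Audio set):

```python
import numpy as np

def multitone(n_samples=1 << 15, n_tones=30, fs=48000, seed=1):
    """Multitone whose frequencies sit on exact FFT bins of an
    n_samples-long power-of-2 FFT, so a leakage-free analysis sees
    energy only in the chosen bins."""
    rng = np.random.default_rng(seed)
    # roughly log-spaced bins from ~20 Hz to ~20 kHz, rounded to integers
    bins = np.unique(
        np.round(np.geomspace(20, 20000, n_tones) * n_samples / fs)
    ).astype(int)
    t = np.arange(n_samples)
    x = np.zeros(n_samples)
    for k in bins:
        # random phase per tone to keep the crest factor reasonable
        x += np.cos(2 * np.pi * k * t / n_samples + rng.uniform(0, 2 * np.pi))
    return x / np.max(np.abs(x)), bins   # normalise to full scale

x, bins = multitone()
spec = np.abs(np.fft.rfft(x))
# All energy is confined to the chosen bins; every other bin is
# numerically zero, so surviving tones in a residual are easy to spot.
```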

Another suggestion, take one of the reference files and put "holes" in the spectrum with an FFT filter. The residual in the holes should contain only non-linear distortions for any settings. The residual should be the same across the spectrum, again an easy visual check on the process.

BTW any combo that has significant sample rate drift rather than a simple offset would be a very poor implementation.
 
mmerrill99 said:
What I can't find are the settings used for the Gearslutz table of results. If the settings aren't defined & adhered to by everyone submitting results then it's a useless table of results.
I'm sure the truth will emerge in due course, but it could turn out that these results are yet another example of carefully obtained numbers which contain no significant figures. Modern test equipment and software make it easy to obtain numbers; doing measurements is somewhat harder.
 
I'm sure the truth will emerge in due course, but it could turn out that these results are yet another example of carefully obtained numbers which contain no significant figures. Modern test equipment and software make it easy to obtain numbers; doing measurements is somewhat harder.

Yes, what you say is wise but I wonder why you don't apply the same logic to the ABX test here?
 
<snip>

Similarly, if I know from synthetic testing that my ability to detect distortion requires the distortion to be greater than -40 dB, and my amp can deliver -100 dB, then I don't need to worry about distortion in my amp at all. I know my threshold, and I know the weakest link much much better than what I can personally hear. At that point, I know I'm hearing things precisely as the producer heard the track when he printed it.

The premise of this is the assertion that knowing the threshold for every single parameter lets you conclude what every combination of those parameters would produce.
Maybe I'm mistaken on this point, but that seems to be exactly the same theoretical approach we have been struggling with since the invention of "high quality playback" and measurements.

Whether this premise is true is questionable.

Btw, I've read your blog post and noticed that you asserted that the ABX test does not rely on memory, which is obviously incorrect.
Comparing two samples of music needs memory; otherwise, by the time the first sample is gone, nothing is left to compare the second against.

ABX lets you find that threshold. And once you know the threshold, everything else falls into place.

As implied above, if the premise isn't true, nothing falls into place. :)

And there is some evidence that listeners need more training (or accommodation time) to get used to the ABX protocol than to A/B (forced-choice) methods.
If you don't inform the users that they most likely will not find their real thresholds (meaning thresholds under real-life conditions) on the first attempts, a lot of incorrect results will follow.
(Whether it is even possible to get those real-life thresholds with quantitative data gathering under artificial test conditions is an exciting question in itself...)
 
<snip> In particular, human cognition is not nearly as linear as your theory seems to suppose and require.

That's for sure....

<snip>

In addition, it's not clear that ABX is reliable for finding low-level limits. It's not the only way to do blind testing, and I haven't seen any research showing it to be the most sensitive, or no less sensitive than any of the others. That being the case, nobody really knows how sensitive it is.

Harris /1/ asserted as early as 1952, in a letter to JASA, that in their experiments an A/B test was more sensitive than ABX, with further corroboration from other people doing similar experiments.

In addition they found that subjectively the ABX test task was more difficult.
"In this laboratory we have made some comparisons among DLs for pitch as measured by the ABX technique and by a two category forced-choice judgment variation of the constants method (tones A B, subject forced to guess B "higher" or "lower"). Judgments were subjectively somewhat easier to make with the AB than with the ABX method, but a greater difference appeared in that the DLs were uniformly smaller by AB than by ABX. On a recent visit to this laboratory, Professor W. A. Rosenblith and Dr. Stevens collected some DLs by the AB method with similar results.
The case seems to be that ABX is too complicated to yield the finest measures of sensitivity."

Huang/Lawless /2/ did experiments in 1997 comparing ABX with other protocols (paired comparison, 3-AFC, duo-trio and triangle), and their data showed that, although all tests delivered significant results, the proportion of correct responses was higher for paired comparison and 3-AFC.

Macmillan/Creelman /3/ predicted, based on the models they used, that 2AFC and (especially) 3AFC tests would show a greater proportion of correct responses than ABX except when the differences were really large (though ABX should be more sensitive than same/different tests).
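Those predictions can be illustrated numerically with the standard detection-theory formulas: P(2AFC) = Φ(d′/√2), and for ABX under the independent-observation model (as discussed in /3/) P(ABX) = Φ(d′/√2)·Φ(d′/2) + Φ(−d′/√2)·Φ(−d′/2). A quick sketch:

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

def p_2afc(d):
    # Proportion correct for 2AFC given sensitivity d'
    return phi(d / sqrt(2))

def p_abx(d):
    # Independent-observation ABX model (cf. Macmillan/Creelman /3/)
    return phi(d / sqrt(2)) * phi(d / 2) + phi(-d / sqrt(2)) * phi(-d / 2)

for d in (0.5, 1.0, 2.0):
    print(f"d'={d}: 2AFC {p_2afc(d):.2f}  ABX {p_abx(d):.2f}")
```

For any moderate d′ the 2AFC proportion correct comes out well above the ABX one (e.g. roughly 76% vs 60% at d′ = 1), which is exactly the pattern the experiments above report.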

But all of these were done with tests where the DUTs differed in only one dimension, so it might be different with multidimensional stimuli. Still, I've experienced the same feeling of difficulty when trying ABX myself, and observed it too when asking two other people to try it, and therefore decided not to use it.

Otoh, people like Paul Frindle /4/ reported quite impressive sensitivity, considering his list of differences detected in ABX tests:

-) Absolute and stereo differential gain anomalies of less than 0.1 dB
-) Differential stereo delays of 1 µs
-) Frequency response variations of 0.1 dB from "flat" 20 Hz - 20 kHz
-) Harmonic distortion components at 80 dB below signal level, even when they are more than 10 dB below the noise floor
-) Limit cycle behaviour of delta-sigma DAC converters at 100 dB below max signal level
-) ......

So I think it's important, as Mark4 said, to find a controlled test protocol that suits your personal habits/abilities, do some training, and be patient until getting used to it.

/1/ J. Donald Harris, "Remarks on the Determination of a Differential Threshold by the So-Called ABX Technique," J. Acoust. Soc. Am. 24, 417 (1952).
/2/ Yu-Ting Huang, Harry Lawless, "Sensitivity of the ABX Discrimination Test," Journal of Sensory Studies 13 (1998), 229-239.
/3/ Neil A. Macmillan, C. Douglas Creelman, Detection Theory: A User's Guide, 2nd edition, Lawrence Erlbaum Associates, Inc., 2005, p. 253.
/4/ Paul Frindle, "Are We Measuring the Right Things? Artefact Audibility Versus Measurement," AES UK 12th Conference: The Measure of Audio (MOA) (April 1997), paper MOA-05.
 
I suspect trying to hear a difference in DAC chips is useless, as they are pretty good now. The differences, or lack thereof (audibly noticeable), would come from the different implementations of the output circuits. These different implementations also make comparing chip A to chip B impossible. I don't want to listen to a DAC anyway, I would rather listen to music. Maybe those people who are happy with a cheap table-top radio have it right!
 