Then you are left with the difficult reconciliation of how it is that some people claim to hear well above 20 kHz in music, or vanishingly small amounts of distortion in a DAC, when they can't hear those frequencies or artifacts on a standard hearing test or ABX test. Hmm...
Sean Olive said young children can hear up to 30 kHz or maybe higher, but that by age 24, that's down to about 16 kHz on average.
Doesn't matter though. Ultrasonic hearing is not what is so interesting, as it turns out. What is probably more interesting is the confluence of cognitive psychology, neuroscience, and Auditory Brainstem Response research. What it boils down to is that brains are very plastic in how they process whatever vibrations the ears pick up, and everybody hears differently. Apparently groups of neurons are able to phase-lock with various aspects of sounds: amplitude, frequency, and perhaps any other pattern that can be detected.
As a consequence, some people already hear differences between DACs blind. Other people can learn to hear some of the differences with coaching and practice until they can also discriminate blind. It may be that some people will never be able to do that, or at least will not be able to learn it easily or quickly.
Of course, with respect to any claim of being able to hear something, or just to practice learning how to hear something, some kind of blinding is necessary at times. What is not necessary is that it always be foobar ABX. I have my own ways of practicing and testing myself blind that I have been using for many years and that work for me. I find them helpful rather than distracting. There is no apparent reason not to try other blind applications or methodologies to find out what works best for different people.
For people who claim they can only perform sighted, I would suggest they try to find some blind method they like and that works for them. Everybody fools themselves at times, and the only way to be fully honest with oneself is to do some blind testing and practice, at least now and then.
So, yes, it seems we agree - I'm not sure what strawmen you might be talking about?
I misread your comment more as a hack and slash at anyone who wants DBT (in any capacity) versus trusting word of mouth. My apologies.
Fair enough. No problem!
Well, if there are amplitude differences, or the tracks are EQ'ed differently, then the test is flawed anyway.
No, the EQ differences could readily be inherent in the amp and a serious flaw in the engineering. But my point is that it's very possible that, given an AB test (a "which sounds better" test), people would opt for the enhanced bass and crisper treble every single time, even though that might be miles removed from what the producer heard when he created the track.
So, to me, the goal is ultimate fidelity. You want to know that you are hearing precisely what the folks who made the record heard. You want their tastes to imprint on you. If you like the sound of a cello so bright that a cellist would say "Oh man, something is very wrong here," then it's a problem.
Isn't that circular logic - if you can tell a difference, then the test is not a challenge?
I don't see why it would be circular. If you have mastered something like calculus, then looking at integrals isn't taxing at all. If you have barely mastered calculus, then looking at an integral can cause you anxiety.
Similarly, if I know from synthetic testing that my ability to detect distortion requires the distortion to be greater than -40 dB, and my amp can deliver -100 dB, then I don't need to worry about distortion in my amp at all. I know my threshold, and I know the weakest link performs much, much better than anything I can personally hear. At that point, I know I'm hearing things precisely as the producer heard the track when he printed it.
ABX lets you find that threshold. And once you know the threshold, everything else falls into place.
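As a concrete illustration of that margin logic, here is a minimal sketch (my own; the -40 dB and -100 dB figures are just the ones from the post above), assuming distortion is expressed in dB relative to the fundamental:

```python
# Margin between a personal detection threshold and a device's distortion
# floor, both in dB relative to the fundamental (more negative = quieter).
def audibility_margin_db(personal_threshold_db, device_distortion_db):
    """Positive result: the device's distortion sits below what you can detect."""
    return personal_threshold_db - device_distortion_db

# Threshold found via blind testing, amp figure from measurement.
print(audibility_margin_db(-40.0, -100.0))  # 60 dB of headroom
```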
Similarly, if I know from synthetic testing that my ability to detect distortion requires the distortion to be greater than -40 dB, and my amp can deliver -100 dB, then I don't need to worry about distortion in my amp at all. I know my threshold, and I know the weakest link performs much, much better than anything I can personally hear. At that point, I know I'm hearing things precisely as the producer heard the track when he printed it.
ABX lets you find that threshold. And once you know the threshold, everything else falls into place.
It looks like you are describing a theoretical point of view about how some of human cognition works. As a theory, it seems significantly at odds with much of cognitive psychology and neuroscience research, perhaps fatally so. In particular, human cognition is not nearly as linear as your theory seems to suppose and require.
Beyond that, there are other problems with some of the claims in the above quote. For one example, even if you were in the same room with the producer when he printed it, you almost certainly wouldn't hear it the exact same way the producer does. Maybe he has perfect pitch, or other ways of hearing that you don't share. That is, being exposed to the same sound waves and hearing the same thing as someone else are two different things.
Also, it's not clear what distortion less than -40 dB means. THD? Any kind of THD? What kind of source material? Any kind of source material?
In addition, it's not clear that ABX is reliable for finding low level limits. It's not the only way to do blind testing, and I haven't seen any research showing it to be the most sensitive or no less sensitive than any of the others. That being the case, nobody really knows how sensitive it is. Of course, that doesn't stop people from thinking up reasons in support of the proposition. But, that type of reasoning is often a poor substitute for careful application of the scientific method, particularly so in the area of medical research, which hearing research is.
In short, there are multiple problems with the theory as stated. Probably more than one of the problems is fatal to the overall point being expressed, IMHO.
But how does "spot-the-difference" ABX testing tell you this? (presuming you actually do "spot-the-difference" in ABX testing?). If an ABX test results show that you can differentiate one device from another - what does this tell you?No, the EQ differences could readily be inherent in the amp and a serious flaw in the enginering. But my point is that its very possible that given an AB test ("which sounds better" test) people would opt for the enhanced bass and crisper treble every single time. Even though that might be miles removed from what the producer heard when he created the track.
You make assumptions which are dubious. I agree with Markw4, but you are also assuming that you will always be drawn towards the higher-distortion device. This is not what the Sean Olive/Harman tests show when blind preference testing was used to evaluate a range of speakers - the ones with the lowest on- & off-axis distortion & the smoothest amplitude response are the preferred speakers (in the context of their test).
Why wouldn't one prefer the lowest distortion, more accurate device?
So, to me, the goal is ultimate fidelity. You want to know that you are hearing precisely what the folks who made the record heard. You want their tastes to imprint on you. If you like the sound of a cello so bright that a cellist would say "Oh man, something is very wrong here," then it's a problem.
You've built a strawman here.
I don't see why it would be circular. If you have mastered something like calculus, then looking at integrals isn't taxing at all. If you have barely mastered calculus, then looking at an integral can cause you anxiety.
Similarly, if I know from synthetic testing that my ability to detect distortion requires the distortion to be greater than -40 dB, and my amp can deliver -100 dB, then I don't need to worry about distortion in my amp at all. I know my threshold, and I know the weakest link performs much, much better than anything I can personally hear. At that point, I know I'm hearing things precisely as the producer heard the track when he printed it.
ABX lets you find that threshold. And once you know the threshold, everything else falls into place.
From the above, it seems to me that you don't actually use a blind test for evaluation - you are using it to establish your thresholds. Thereafter you simply look at specifications & decide what is below your threshold & is therefore transparent to you. I have the same questions as Markw4 - thresholds for what aspect of sound?
This is the view of many, so why would they be interested in ABX testing for evaluation purposes when it is simply being used to establish personal thresholds?
If the assumption is that published specs fully characterise a device, then how is it that the results of Diffmaker (which compares the difference between input signal & output signal), posted on Gearslutz for an A/D to D/A loop, show that the output doesn't match the input to a significant degree? The results mostly show a correlation depth (the degree to which output matched input) no better than about -60 dB.
How does this square with the published specs for these A/D & D/A devices?
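For anyone unfamiliar with the tool: Diffmaker time-aligns and level-matches the two files, subtracts them, and reports how far down the residual sits. Here is a rough numpy sketch of that idea (my own illustration, not Diffmaker's actual algorithm; alignment is just an integer-sample cross-correlation plus a least-squares gain fit):

```python
import numpy as np

def correlation_depth_db(reference, output):
    """Rough 'null depth': time-align and gain-match output to reference,
    subtract, and report residual energy relative to the reference."""
    # Integer-sample time alignment via cross-correlation.
    lag = np.argmax(np.correlate(output, reference, mode="full")) - (len(reference) - 1)
    aligned = np.roll(output, -lag)
    # Least-squares gain match.
    gain = np.dot(aligned, reference) / np.dot(aligned, aligned)
    residual = reference - gain * aligned
    return 10 * np.log10(np.sum(residual**2) / np.sum(reference**2))

# Toy check: a 1 kHz tone with a 3rd harmonic at -60 dB, delayed 5 samples,
# should null back to about -60 dB once aligned and gain-matched.
t = np.arange(4800) / 48000
ref = np.sin(2 * np.pi * 1000 * t)
out = np.roll(ref + 1e-3 * np.sin(2 * np.pi * 3000 * t), 5)
print(correlation_depth_db(ref, out))  # ~ -60
```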
As a consequence, some people already hear differences between DACs blind. Other people can learn to hear some of the differences with coaching and practice until they can also discriminate blind.
Does learning to hear what are, at best, tiny and very hard to measure differences result in an improved listening experience?
If learning this causes a loss of perceived pleasure, what is the point, except as a technical exercise?
I have some similar experience - working with visual image compression, I learned to spot things in images that pretty much no one else could see. While it enabled me to complete the job in hand, it definitely reduced my appreciation of some images - it distracted from the aesthetic, emotional enjoyment.
I subsequently had to learn to ignore it! (with some help from my neuroscientist friends!)
I had a similar problem when I did an art-for-beginners course; I couldn't just stand and stare without analysing... I stopped because... I like to stand and stare.
Does learning to hear what are, at best, tiny and very hard to measure differences result in an improved listening experience?
If learning this causes a loss of perceived pleasure, what is the point, except as a technical exercise?
I have some similar experience - working with visual image compression, I learned to spot things in images that pretty much no one else could see. While it enabled me to complete the job in hand, it definitely reduced my appreciation of some images - it distracted from the aesthetic, emotional enjoyment.
I subsequently had to learn to ignore it! (with some help from my neuroscientist friends!)
This is a good question & helps to delve into the workings of auditory perception. Maybe you can tell us how you subsequently learned to view images without the anomaly being the foremost focus of your attention?
But before you do: why would you train yourself to spot visual anomalies that 99% of people cannot see? When you say it helped to complete the job in hand, do you mean that there was a need to spot this & presumably eliminate it because, without doing so, it would interfere with most people's perception of the overall image, even though they couldn't spot the exact anomaly you can?
how is it that the results of Diffmaker (which compares the difference between input signal & output signal), posted on Gearslutz for an A/D to D/A loop, show that the output doesn't match the input to a significant degree? The results mostly show a correlation depth (the degree to which output matched input) no better than about -60 dB.
How does this square with the published specs for these A/D & D/A devices?
That type of testing looks questionable. Loopback tests check the A/D as much as the D/A; there is no way to tell whether one is causing a lot more of any problems than the other without additional testing. In some of the cases, it was clear that a poor A/D was used with a better D/A.
Also, it's not clear to me that the recording software's ability to correct for A/D and D/A hardware latency results in looped-back data that is exactly synchronized in time. If there are any sample offsets due to latency issues, that might result in false test results, or incorrect interpretation of the results' significance.
In other words, if the files differ mostly by a small timing offset, that's not the same as if they differ mostly due to distortion.
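To illustrate how much a tiny uncorrected offset matters, here is a small sketch (my own, assuming numpy): two bit-identical 1 kHz tones, one of them a single sample late at 48 kHz, already "differ" by about -18 dB when naively subtracted:

```python
import numpy as np

# Two identical 1 kHz tones at 48 kHz, one offset by a single sample.
fs = 48000
t = np.arange(fs) / fs
a = np.sin(2 * np.pi * 1000 * t)
b = np.roll(a, 1)  # same signal, one sample late

diff_db = 10 * np.log10(np.sum((a - b) ** 2) / np.sum(a ** 2))
print(f"{diff_db:.1f} dB")  # ~ -17.7 dB of apparent 'difference' from timing alone
```

And the penalty grows with frequency, so program material with treble content fares even worse.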
Does learning to hear what are, at best, tiny and very hard to measure differences result in an improved listening experience?
Sean Olive said, "basically, it ruins your life," which was of course partly a joke.
It can go the other way too. Played on a good system, something can sound really, noticeably better.
However, most systems, most records, most reproduction formats, etc. are flawed, or are imperfect, to some extent. If one only focuses on the shortcomings, that's not so good for enjoyment. Which also happens to be true for many areas of life. We need some time to enjoy things too.
That type of testing looks questionable. Loopback tests check the A/D as much as the D/A; there is no way to tell whether one is causing a lot more of any problems than the other without additional testing. In some of the cases, it was clear that a poor A/D was used with a better D/A.
Sure, I'm aware that both A/D & D/A are involved in the results. But this would always be the case - the A/D inside an analyser limits & possibly adds to the distortion seen in the results.
But the claim is often made that both A/Ds & D/As are transparent for all purposes, & there is quite a range of tested A/Ds & D/As in the list I linked to, which doesn't show anything near transparency.
There are also results using the same A/D in many of the tests, so in those cases it's possible to infer which is likely most responsible for the results - the A/D or the D/A.
Also, it's not clear to me that the recording software's ability to correct for A/D and D/A hardware latency results in looped-back data that is exactly synchronized in time. If there are any sample offsets due to latency issues, that might result in false test results, or incorrect interpretation of the results' significance. In other words, if the files differ mostly by a small timing offset, that's not the same as if they differ mostly due to distortion.
This adjustment for amplitude & drift is automatically done in Diffmaker. I'm not saying Diffmaker is perfect, but those who live by measurements have never explained what the problem is that leads to such reported differences between input & output of <-60 dB for supposedly transparent devices. There are corroborating results from MATLAB also given, so it's not relying on just one tool for analysis.
If the Gearslutz results are correct, then is there not an elephant in the room? If they are not correct, then where are they wrong?
If they are not correct, then where are they wrong?
I don't know for a fact, because I am not aware of anybody having carefully looked into it. But the numbers seem almost too poor to be true. When seemingly funny results are obtained from some experiment, it would seem to make sense to (1) verify the results by other means to confirm the numbers are correct, and (2) try to determine the cause of the apparent anomaly, or explain why it isn't really anomalous.
As one data point, Benchmark Media reported looping their D/A and A/D, and found that it took 17 times through the loop for any difference (compared to the original file) to be detected by ear, using a hardware ABX system.
Of course, we know there may be some limits to ABX, but how bad can it be if it takes 17 loops to hear a difference?
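As a rough back-of-envelope check (my own assumption, not anything Benchmark published): if each pass through the converters adds about the same amount of uncorrelated noise and distortion, the artifact power grows linearly with the number of passes:

```python
import math

# With equal, uncorrelated artifact power added per pass, N passes raise
# the artifact level by 10*log10(N) dB relative to a single pass.
def accumulation_db(passes):
    return 10 * math.log10(passes)

print(f"{accumulation_db(17):.1f} dB")  # ~12.3 dB
```

On that assumption, 17 passes sit only about 12 dB above one pass, which would put a single pass roughly 12 dB below the level that finally became audible.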
Looks to me like something is wrong somewhere, and I would like to see a full and clear explanation of why.
This adjustment for amplitude & drift is automatically done in Diffmaker. I'm not saying Diffmaker is perfect, but those who live by measurements have never explained what the problem is that leads to such reported differences between input & output of <-60 dB for supposedly transparent devices.
This problem existed long before Diffmaker; many simply ignored the Carver null test because there was no clear way to interpret a macroscopic look at the total difference between input and output.
I don't know for a fact, because I am not aware of anybody having carefully looked into it. But the numbers seem almost too poor to be true. When seemingly funny results are obtained from some experiment, it would seem to make sense to (1) verify the results by other means to confirm the numbers are correct, and (2) try to determine the cause of the apparent anomaly, or explain why it isn't really anomalous.
Sure, I agree, but as I said, if you look at each result you will see two sets of values - one direct from Diffmaker called "Corr Depth" (correlation depth) & another set of figures, "Difference*", derived using MATLAB in a separate test. AFAIK, it essentially takes the amplitude & drift corrections that Diffmaker reports, applies them to the same input & output files, & compares the resulting adjusted output file vs the input file. In all cases it comes up with a slightly better result than Diffmaker alone, but it confirms the correlation is still no better than about -60 dB. I didn't quote the lower "Corr Depth" figures, which show no better than -42 dB.
2) Yes, I agree - that's what my question was about - how can these abysmal results for what many claim are transparent devices be explained?
This problem existed long before Diffmaker; many simply ignored the Carver null test because there was no clear way to interpret a macroscopic look at the total difference between input and output.
From my understanding, this form of null testing predates Carver - the output is divided down to match the input signal level & phase-reversed in order to attempt a null between the two.
What sort of null depth did Carver achieve?
What needs to be interpreted in the results? The signal seems to be changed in passing through the DUT, & not just in amplitude or drift - or is this interpretation wrong in some way?
If the results are hard to explain, do we just accept that these devices are mostly not audibly transparent when handling complex signals such as music?
I believe in the past you tested some DACs using multitone test signals (which are a closer analog to music signals) & if I remember correctly you stated something like 'this separates the wheat from the chaff' or something equivalent. Can you say anything about these tests?
Without qualitative analysis of the data, no judgement of audibility/transparency can be made. Simple ripples in frequency response of tiny fractions of a dB, due for instance to the anti-aliasing/imaging filters in the different chipsets, could cause -60 dB errors. Any real speaker/headphone is orders of magnitude worse in this regard. If you put a dummy head at your favorite spot, played just about any piece of music, and fed the original and the recording to Diffmaker, you would essentially get garbage.
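To put a number on the fraction-of-a-dB point: a pure level mismatch of 0.01 dB at some frequency leaves a difference signal of |1 - 10^(-0.01/20)| times the original at that frequency, or about -59 dB. A quick check of the arithmetic:

```python
import math

# Residual level (dB) left by a pure amplitude mismatch of ripple_db:
# the difference signal is |1 - 10^(-ripple/20)| times the original.
def residual_db(ripple_db):
    return 20 * math.log10(abs(1 - 10 ** (-ripple_db / 20)))

print(f"{residual_db(0.01):.1f} dB")  # ~ -58.8 dB from a 0.01 dB ripple
```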
2) Yes, I agree - that's what my question was about - how can these abysmal results for what many claim are transparent devices be explained?
To start with, it might be helpful to avoid certain absolutes and extremes in construction of the question to be investigated. The results are not in all cases fully abysmal, nor are data converters perfectly transparent (at the very least, to test equipment).
The question is probably more like, "how can poor data converter measurement results be reconciled with very good data converter measurement results." Maybe we could even define what we mean by "poor" and "very good."
What sort of null depth did Carver achieve?
I believe in the past you tested some DACs using multitone test signals (which are a closer analog to music signals) & if I remember correctly you stated something like 'this separates the wheat from the chaff' or something equivalent. Can you say anything about these tests?
Bob Cordell and I have both tinkered with this; you need to do some phase adjustments in the attenuator to get a really deep null. The problem is in interpreting the residual: a -60 dB residual is a 0.01 dB match between amp and passive divider. Is it annoying crossover distortion or a tiny amplitude/phase mismatch?
I have done multitone tests because you can adjust the crest factor to be more music-like, you can exercise a device over its whole frequency range, and you don't need a special tool. I don't remember finding any really bad DACs, but this test in a loopback like the GS tests would make an interesting data point.
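For anyone who wants to try this, here is a minimal sketch of the idea (my own illustration, not the actual test setup described above): a multitone of log-spaced, equal-amplitude tones whose random starting phases set the crest factor:

```python
import numpy as np

rng = np.random.default_rng(0)
fs, dur = 48000, 1.0
t = np.arange(int(fs * dur)) / fs

# Equal-amplitude tones log-spaced across the audio band, random phases.
freqs = np.round(np.logspace(np.log10(50), np.log10(20000), 31))
phases = rng.uniform(0, 2 * np.pi, freqs.size)
x = np.sum([np.sin(2 * np.pi * f * t + p) for f, p in zip(freqs, phases)], axis=0)
x /= np.max(np.abs(x))  # normalize to full scale

crest_db = 20 * np.log10(1.0 / np.sqrt(np.mean(x**2)))  # peak/RMS in dB
print(f"Crest factor: {crest_db:.1f} dB")
```

Re-rolling or optimizing the phases pushes the crest factor up or down toward more music-like values.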
Without qualitative analysis of the data, no judgement of audibility/transparency can be made.
It might be interesting to take the loopback test files and the original source files, give a human two knobs (one for time delay on one of the files, and one for relative level adjustment) to try to null the files for minimum volume level, distortion, objectionableness, etc., and see how they tune the knobs to optimize for different difference characteristics. (Some make-up gain would be needed too, and results could differ depending on whether the make-up gain was automatically maintained or user adjustable.)
They might also tune the knobs differently at different sections of the music or with different test signals.
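An automated stand-in for those two knobs might look like this rough sketch (my own; a brute-force search over integer-sample delay with a least-squares gain fit at each delay, minimizing residual RMS):

```python
import numpy as np

def best_null(reference, output, max_delay=100):
    """Try integer-sample delays; at each, fit the least-squares gain and
    keep whichever (delay, gain) pair leaves the smallest residual."""
    best = (0, 1.0, np.inf)
    for d in range(-max_delay, max_delay + 1):
        shifted = np.roll(output, -d)
        gain = np.dot(shifted, reference) / np.dot(shifted, shifted)
        rms = np.sqrt(np.mean((reference - gain * shifted) ** 2))
        if rms < best[2]:
            best = (d, gain, rms)
    return best  # (delay_samples, gain, residual_rms)

# Toy usage: the 'output' is the reference delayed 3 samples and 0.5 dB quieter.
rng = np.random.default_rng(1)
ref = rng.standard_normal(4800)
out = np.roll(ref, 3) * 10 ** (-0.5 / 20)
print(best_null(ref, out))  # (3, ~1.059, ~0): the knobs undo the shift and gain
```

A human optimizing by ear for "least objectionable" rather than minimum RMS might well settle on different knob positions, which is the interesting part.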