DAC blind test: NO audible difference whatsoever

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
@QAMAtt,

It's a nice attempt, although it seems that you are already strongly biased in your opinions about the audibility of effects in general.

Did you provide a countermeasure for the familywise alpha error? I mean, you place a bet, so there is some risk.

In your ABX test, did you implement random assignment (in each trial) of music samples to "A" and "B"?

A quick look at the blog about the software reveals some inaccuracies/errors, which imo should be corrected.
An ABX test _does_ rely on memory, although the results are immediate

Doing 10-trial ABX tests is only recommended _if_ the detection ability of any participant is _really_ high; otherwise the beta error risk gets enormous.

One usually does not aim for a 90 - 95% success rate but for statistical significance at a certain level. The percentage of correct answers needed to get a positive result (i.e. the null hypothesis can be rejected) depends on the number of trials.

Let's assume the traditional SL = 0.05; that means in a 10-trial test you need 9 correct answers, for a 16-trial test you need 12 correct answers, for a 20-trial test you need 15 correct answers, and for a 100-trial test you need 59 correct answers to get a significant result.
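These thresholds fall straight out of the one-sided binomial tail. A quick sketch (Python, as an illustration only) that reproduces the numbers above:

```python
from math import comb

def min_correct(n_trials, alpha=0.05):
    """Smallest k such that the chance of scoring k or more correct
    by pure guessing (p = 0.5) is at or below alpha."""
    for k in range(n_trials + 1):
        tail = sum(comb(n_trials, i) for i in range(k, n_trials + 1)) / 2**n_trials
        if tail <= alpha:
            return k

for n in (10, 16, 20, 100):
    print(n, min_correct(n))   # 10->9, 16->12, 20->15, 100->59
```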
 
@Markw4 writes
>> QAMAtt: I think you should consider awarding the $50 to Jakob2 who has provided the proof you want and much more. It is in the form of published research using ABX protocol rather than your program, but it may be just as reliable or better. Again, Foobar ABX can be cheated; don't know about your program. On the other hand, Frindle is a serious engineer and researcher at Sony Oxford who has been doing this stuff for a long time. <<

But I didn't ask for people to cite research. I can generally find and read the research myself. The sad fact is that our most fundamental cancer research cannot be replicated today. And if the state of cancer research with its zillion-dollar budget is that poor, what do you think holds for audio R&D with its comparatively minuscule budget? Fletcher-Munson's work, groundbreaking as it was, sat flawed and unchecked for decades and is still being cited today in spite of some serious issues that would not be allowed to linger in a well funded environment.

Yes, the published research for audio is that poor IMO.

@Jakob2 writes
>> it is good tradition to look up what was already found in the past (and maybe try to replicate), and it was already reported by Clark (~198x) that in their ABX tests the limit for broadband level differences between music samples was a bit below 0.3 dB...." <<

See the bit above about the state of cancer science. Seriously, we can do better than relying on pink noise or Clark's 1984 ABX box with TL074 opamps and some back-to-back diodes for introducing distortion. That was more than 30 years ago...

>> Frindle listed some of the differences they were able to detect (using ABX tests) in their system <<

He states these differences in the paper, but I'm not aware of any published research from him confirming this, sample sizes, etc. Did he ever publish those studies along with the details? See above about cancer R&D's inability to replicate fundamental research. And then tell me if you think audio research is more or less disciplined and funded than cancer research?

>> It's a nice attempt, although it seems that you are already strongly biased in your opinions about the audibility of effects in general. Did you provide a countermeasure for the familywise alpha error? I mean, you place a bet, so there is some risk. <<

My opinions are merely informed by my own abilities and could be readily undone by contrary evidence. For example, my vertical jump is poor. If you told me someone could leap in excess of 40 inches, I'd not believe it based on my own capabilities. But if you showed me videos of people doing it, I'd flip my opinion instantly. It's that simple.

>> One usually does not aim for a 90 - 95% success rate but for statistical significance at a certain level. The percentage of correct answers needed to get a positive result (i.e. the null hypothesis can be rejected) depends on the number of trials. <<

Yes, the tradeoff, as you know, is that a small trial count makes it easy to guess your way to success. If we standardize on 10 trials, for example, we can set the pass threshold high enough that guessing your way to success is probably not worth it, while not overtaxing the listener.
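One way to see the guessing risk (and the familywise alpha error Jakob2 raised): a single 9-of-10 pass is hard to luck into, but repeated attempts inflate the odds quickly. A sketch, assuming a 9-of-10 pass criterion:

```python
from math import comb

# chance of scoring 9 or 10 out of 10 by coin-flipping alone
p_single = (comb(10, 9) + comb(10, 10)) / 2**10   # ~0.0107

def familywise_pass_chance(p_one, attempts):
    """Chance that at least one of several independent
    guess-only attempts clears the pass threshold."""
    return 1 - (1 - p_one) ** attempts

print(round(familywise_pass_chance(p_single, 1), 3))    # ~0.011
print(round(familywise_pass_chance(p_single, 10), 3))   # ~0.102
print(round(familywise_pass_chance(p_single, 50), 3))   # ~0.417
```

This is why retries (or many parallel testers) need a familywise correction before anyone pays out a bet.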

And I should add that what I'm suggesting isn't the absolute limit of human hearing abilities. There's an interesting concept from toxicology that could be readily applied to audio. LD50 (Lethal Dose 50) is the dose of a substance that will cause death in 50% of the population. It applies to cyanide, caffeine, weed killer...anything you might imagine. The benefit of an LD50 study is that it is incredibly easy and cheap to do: you feed a population (rats) the substance and see what happens over the next day or so. There's another threshold called "NOEL", or No Observed Effect Level. Establishing this threshold is very expensive, as you need control groups, doctors, long-term studies, extended lab space and time. But most interesting is that you can often just take the LD50 done in rats, divide by 1000 (a safety margin), and quickly arrive at a limit for humans. Caffeine has a lethal dose of 240 cans of soft drink and a safe dose of 10 cans. So that's a 24X safety factor, but determining that safety margin required very expensive testing.

OK, so back to audio. Imagine an ABX test that wasn't stressful. You simply aimed to determine whether or not gross differences could be heard via casual auditions. No furrowed brow. No fingers on temple. No closed eyes. No stress. Just a quick few seconds on each trial: Can you pick the difference?

And let's say you can reliably detect 12 bits of quantization (9 of 10 times) but you cannot hear 13 bits (6 of 10) on a particular piece of music. And thus you derate your answer by a safety margin of 20 dB, which is about 3.3 bits.
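The conversion is just 20·log10(2) ≈ 6.02 dB of dynamic range per bit; a quick check (a sketch, for illustration):

```python
from math import log10

DB_PER_BIT = 20 * log10(2)          # ~6.02 dB of dynamic range per bit

def db_to_bits(db):
    """How many bits of resolution a given dB margin corresponds to."""
    return db / DB_PER_BIT

print(round(db_to_bits(20), 2))     # a 20 dB safety margin ~ 3.32 bits
print(round(db_to_bits(40), 2))     # a 40 dB safety margin ~ 6.64 bits
```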

To me, that is what is so helpful about artificially degrading the stream in real time: You establish limits to the types of distortion you can readily hear, you pick a safety factor, and you are done. And you can have enormous sample sizes.

So, nobody wants to prove they can hear a 0.8 dB gain anomaly on their home system, while citing 20-year-old research sans details that says "less than" 0.1 dB can be "reliably detected"? I mean, if something can be reliably detected, then it should take just a minute or two to do it, right?
 
@Markw4 writes

The sad fact is that our most fundamental cancer research cannot be replicated today.

...Yes, the published research for audio is that poor IMO.

QAMAtt, Thank you for your thoughtful reply.

Having worked in cancer research, I am very aware of some of the problems with research in that and other fields, particularly research involving people. It is a very challenging and complex area in which to work.

Also, I would agree with you that a non-trivial amount of published audio research conducted primarily or exclusively by engineers involving human subjects is questionable and needs replication and/or updating.

The concerns in that regard would also apply to your efforts, which so far show a void of understanding in the areas Jakob2 has been trying to helpfully bring to your attention.

The above being said, conducting an ABX test and reporting results without making any claims about what can be inferred about the general population is drastically simpler than cancer research.

As Jakob2 has repeatedly tried to explain, and it's not clear why people seem to gloss over it, what can and can't be concluded from experiments depends on exactly what experimental question is being asked. Also, it is possible to prove a positive, but not possible to prove a negative. However, both positives and negatives can have probabilities, but one has to be very careful about the framing of claims and one's statistical work, which can in some cases be very counter-intuitive.

Anyway, I have not read Frindle's paper since it is behind a paywall. Don't know if you have read it and have particular concerns, or if your reservations are in principle only. Personally, I don't find Frindle's conclusions as presented by Jakob2 to be surprising for at least a few trained or otherwise very skilled or talented listeners. (Even with my hearing loss and tinnitus I could probably do most of them myself if I wanted to train with the damn ABX. Much easier with some other blind test protocol, again as Jakob2 has repeatedly pointed out, including with research citations.)

However, and maybe we could agree on this, I don't think most of the findings are representative of most listeners in the general population, not even close.
 
I mean, if something can be reliably detected, then it should take just a minute or two to do it, right?

The above is a debating type argument, not a scientific one. It's the way lawyers and politicians try to win an argument. It's about winning, not truth-seeking.

To put a different perspective on the question, we do know some physical senses take time to accommodate to changing conditions. With hearing, we know that masking occurs for a time after loud sounds. For seeing at very low light levels, it takes time to accommodate after being exposed to bright light. Even a short flash of bright light can affect low-light vision for quite a while. Complete adaptation from bright to dark vision can take up to 45 minutes (How long does it take our eyes to fully adapt to darkness? | Science Questions with Surprising Answers).
Fiddling with ABX controls to me is distracting and requires considerable re-adaptation for very sensitive listening tasks, but I haven't tried to measure it.

I have said that making a few changes to currently available ABX software might help a lot, but nobody who writes the software cares to try it. I wish somebody would be willing to work on it collaboratively.
 
The above is a debating type argument, not a scientific one. It's the way lawyers and politicians try to win an argument. It's about winning, not truth-seeking.

Did you miss the parable above about LD50 versus NOEL testing? All sciences and industries are comfortable with the concept of varying thresholds for different situations (relaxed versus stringent, let's say). What I'm advocating is that a skilled listener casually see how she or he does on a "speed" round of testing. Not the stressful kind where you loop 500 ms over and over with your eyes closed and jaw clenched for 20 minutes. Instead, I'm asking "What can you reliably deduce in a few seconds (5 to 30 sec or so) of listening in your normal listening environment before voting?"

And of course, you test your way to that limit. You start with a gross distortion that can readily be heard in a second or two, and then gradually reduce it until you feel you can no longer hear it reliably. And then share the last reliable measure you achieved.
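That descending procedure is essentially a staircase. A minimal sketch; `run_abx_round` here is a hypothetical stand-in for one short listening round, not anyone's actual program:

```python
def staircase_threshold(run_abx_round, start_db=-20.0, step_db=6.0, floor_db=-120.0):
    """Lower the distortion level until the listener first fails a
    round; return the last level that was still reliably detected.
    run_abx_round(level_db) -> True if the round was passed."""
    level = start_db
    last_reliable = None
    while level > floor_db:
        if run_abx_round(level):
            last_reliable = level      # still audible; make it subtler
            level -= step_db
        else:
            break                      # first failure ends the run
    return last_reliable

# toy listener who reliably hears any distortion above -60 dB
print(staircase_threshold(lambda db: db > -60))   # -56.0
```

The reported number is then derated by the chosen safety margin before anyone makes claims from it.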

In other words, I want to learn what LD50 is for audio. Historically, people treat ABX as a NOEL test.
 
In other words, I want to learn what LD50 is for audio. Historically, people treat ABX as a NOEL test.

You mentioned earlier in this thread your personal listening thresholds & how amplifier specs then define a device (amplifier) that is transparent for you. Two problems I had with this - one of which you have avoided answering so far (but both are related):
- how do you know what is the FULL range of distortions that you need to find your thresholds for?
- how do you know what are the specifications that fully characterise a device & define its performance as far as auditory perception is concerned?

What has LD50 got to do with any of what you said previously?

I would love to find a set of measurements/distortions that fully define auditory perception's capabilities but so far I haven't found anything that qualifies, have you?
 
In other words, I want to learn what LD50 is for audio. Historically, people treat ABX as a NOEL test.

It sounds a lot like you have an ABX test program and you are trying to find a use for it. In other words, it sounds like what you have is a solution in search of a problem.

Of course, you could do as you propose and try to measure something with your program, but so what?

You can also count the bumps on people's heads.

In that particular case people tried it and thought they were onto something using a lot of non-scientific reasoning, but they were wrong. Phrenology - Wikipedia

Look, we already have a situation in audio with THD. It is easy to measure, so we measure it. It means very little because it correlates very poorly with hearing perception. An amplifier could have a measured THD of 0.01% and, according to Earl Geddes, that could mean nobody could hear it or everybody could hear it. So he argues that it is useless.

Regarding the toxicological LD50, it means something useful outside of artificial laboratory conditions, which is why it is used. On the other hand, some kind of LD50 ABX test only has meaning under the artificial test conditions under which it is measured. For it to have any practical meaning, you would have to correlate it with some human ability occurring in normal non-artificial conditions. Arguing philosophically about it leads nowhere. You would have to measure a bunch of people for ABX LD50, then measure some corresponding human ability under some other actually meaningful conditions and show that ABX LD50 is a useful proxy for the meaningful thing using a lot of statistical numerical analysis.

At least for best-you-can-do type ABX testing, it's clear what it means as a practical matter in the real world: it proves it is possible to hear something (effectively, if someone can always get 100% right), or it can be used to calculate a probability that it is possible to hear something (although the resulting probability may be wrong, and probably is in many cases, since ABX is a low-sensitivity test without a lot of practice by test subjects).

Also, regarding the no-observed-adverse-effect level (NOAEL) as used in toxicology, I'm not seeing what that has to do with ABX testing. In toxicology it is a point at which an effect is so small it becomes unmeasurable, but why it becomes unmeasurable is not known or not the same for every substance. It might be there is no toxic effect at some level or not, it can't be measured is all we know. Maybe ABX reaches a measurement limit (no observed effect limit) with something like very small distortion because it is too insensitive to measure somebody's real ability to hear it? In that case it's just identifying a problem with ABX being useless after some detection limit (that probably varies from person to person and with the amount of training they have done).
 
....... In that case it's just identifying a problem with ABX being useless after some detection limit (that probably varies from person to person and with the amount of training they have done).
Yes, & it also depends on the sensitivity of that particular ABX test setup itself - calibration of such sensitivity (which includes the participants) is studiously avoided in these home-run tests, treating all ABX tests (including these amateur tests) as some sort of gold standard with conformance to some standard!! When questioned, most have no idea of the recommendations for perceptual testing & it's no different here.
 
how do you know what is the FULL range of distortions that you need to find your thresholds for
- how do you know what are the specifications that fully characterise a device & define it's performance as far as auditory perception is concerned?

What has LD50 got to do with any of what you said previously?

At the risk of repeating...LD50 is a sledgehammer of a test. It's very fast and very cheap. NOEL is not. It's very slow and expensive. To date, ABX testing has been like NOEL, I'm advocating LD50.

The big difference is the amount you derate. In LD50 testing, you might derate by 1000X, in NOEL you might derate by 10X.

Now, 20 dB in audio is an enormous amount. Huge. Massive. Giant.

Instead of a NOEL-style ABX test where you expect your hearing to be at or near the limits of the equipment, why not an LD50-style test, and then derate by a massive amount such as 20 or 40 dB?

And to answer your question re: auditory perception. You are chasing something that is unknown, something that is there but cannot be tested yet and might be present in a small % of the population. That's fine, that's not my area. I'm more interested in making statements like "If you achieve this level of performance, this amp will be indistinguishable from that amp by 99% of the population."

I think you are more interested in statements like "Under certain random and unknown conditions, 1% of the population might be able to discern these amplifiers"

Markw4:
It sounds a lot like you have an ABX test program and you are trying to find a use for it. In other words, it sounds like what you have is a solution in search of a problem.

We know the following, and I think each are fact:

1) ABX testing, when correctly done, is the gold standard for detecting whether or not differences can be observed. It's not perfect. But there's nothing better.

2) People that are highly trained tend to over-perform when put into demanding and stressful situations. People that are not trained tend to under-perform when put into demanding and stressful situations.

3) Training is an important part of ABX testing. If your goal is to see if differences can be heard, then you must help the person understand what the differences sound like.

4) It requires a fair level of skill to prepare ABX files.

Given the above, you don't see the benefit of an application that lets anyone take a pristine wave file and incrementally contaminate it with increasing levels of distortion of various types until they can no longer detect the distortion via ABX testing at each stage? And then they can prove the results to others?

In any case, we're just going in circles at this point, so I'll bow out. Sorry for the interruption. Good luck in your quest! I really thought this approach might help, but I guess not.
 
At the risk of repeating...LD50 is a sledgehammer of a test. It's very fast and very cheap. NOEL is not. It's very slow and expensive. To date, ABX testing has been like NOEL, I'm advocating LD50.

The big difference is the amount you derate. In LD50 testing, you might derate by 1000X, in NOEL you might derate by 10X.

Now, 20 dB in audio is an enormous amount. Huge. Massive. Giant.

Instead of a NOEL-style ABX test where you expect your hearing to be at or near the limits of the equipment, why not an LD50-style test, and then derate by a massive amount such as 20 or 40 dB?
I see your logic - establish what is perceived by 50% of the population & derate by a huge amount to be below perceivable thresholds. Hmm, good luck with that

And to answer your question re: auditory perception. You are chasing something that is unknown, something that is there but cannot be tested yet and might be present in a small % of the population.
Your definition of auditory perception is neither mine nor the recognised understanding of it - everyone uses auditory perception - it's what defines hearing itself. I don't know what you think it means?
That's fine, that's not my area. I'm more interested in making statements like "If you achieve this level of performance, this amp will be indistinguishable from that amp by 99% of the population."

I think you are more interested in statements like "Under certain random and unknown conditions, 1% of the population might be able to discern these amplifiers"
No, your classification of what I'm interested in is incorrect.
Again, if you answered the two questions I posed, it would bring you a long way towards understanding the problem with your approach. Until you do, you are not facing up to your premise with logic, simply belief.

Markw4:


We know the following, and I think each are fact:

1) ABX testing, when correctly done, is the gold standard for detecting whether or not differences can be observed. It's not perfect. But there's nothing better.
Incorrect, as has been shown by Jakob2 & previous research

2) People that are highly trained tend to over-perform when put into demanding and stressful situations. People that are not trained tend to under-perform when put into demanding and stressful situations.

3) Training is an important part of ABX testing. If your goal is to see if differences can be heard, then you must help the person understand what the differences sound like.

4) It requires a fair level of skill to prepare ABX files.
What is the FULL list of differences/distortions you are going to train for? Are you sure this represents all possible differences that can be perceived between audio devices?

Given the above, you don't see the benefit of an application that lets anyone take a pristine wave file and incrementally contaminate it with increasing levels of distortion of various types until they can no longer detect the distortion via ABX testing at each stage? And then they can prove the results to others?

In any case, we're just going in circles at this point, so I'll bow out. Sorry for the interruption. Good luck in your quest! I really thought this approach might help, but I guess not.

You mean, is there room for a training program? Sure there is, but I don't think you can assign the ultimate "Final Solution" importance you seem to assign to it
 
Come on, it's not rocket science. A DAC output (the original topic of this thread) is just an electrical waveform into a high value resistive load. And a waveform for which we actually have the ideal original to be reproduced (what's in the digital file, the original sound recorded is out of reach).

Once you measure variations in frequency response, phase, amplitude, noise and harmonic spectrum at all power and frequency points, measure IMD (again at various frequency/power points), check what's happening out of band and keep the system operating in spec (no digital clipping and the like), you pretty much have all you need to know about the ability of a particular DAC to be faithful to the input signal. There are a few more implementation concerns like crosstalk and EMI interference to account for. You can also check for jitter independently to isolate the problem.
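As an illustration of how mechanical one of those measurements is, here is a crude THD+N estimate (a sketch, not a substitute for a proper analyzer): least-squares-fit the fundamental, subtract it, and compare the residual RMS to the total.

```python
from math import sin, cos, pi, sqrt, log10

def thd_plus_n_db(samples, rate, f0):
    """Crude THD+N: fit the fundamental at f0, subtract it,
    and report residual RMS vs total RMS in dB."""
    n = len(samples)
    # least-squares fit of a*sin + b*cos at the fundamental
    a = 2 / n * sum(s * sin(2 * pi * f0 * i / rate) for i, s in enumerate(samples))
    b = 2 / n * sum(s * cos(2 * pi * f0 * i / rate) for i, s in enumerate(samples))
    residual = [s - a * sin(2 * pi * f0 * i / rate) - b * cos(2 * pi * f0 * i / rate)
                for i, s in enumerate(samples)]
    rms = sqrt(sum(s * s for s in samples) / n)
    rms_res = sqrt(sum(r * r for r in residual) / n)
    return 20 * log10(rms_res / rms)

# 1 kHz sine with a 0.1% third harmonic: expect about -60 dB
rate, f0 = 48000, 1000
sig = [sin(2 * pi * f0 * i / rate) + 0.001 * sin(2 * pi * 3 * f0 * i / rate)
       for i in range(rate)]
print(round(thd_plus_n_db(sig, rate, f0)))   # -60
```

A real measurement sweeps this over frequency and level, which is exactly the "at all power and frequency points" grind described above.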

It gets a bit more involved for amps since we now have an inductive/capacitive load and the strong likelihood of clipping. So we have to measure what's happening with various loads and how the amp deals with clipping (what distortion is generated and how it recovers). The importance of testing at various power/frequency points is greater since the strain put on the power supply will vary a lot.

For speakers... it gets crazy and that's not something I'm familiar with so I won't get into that.

The likelihood of something not yet discovered needing to be measured is very, very low, at least for amps and DACs. The crux of the matter is how to deal with what we are able to measure. The typical THD+N figure at 1 kHz/1 W is a bad metric; everyone agrees with that, and that's why Geddes came up with a new metric to make sense of measurements. How important is the order of distortion? Its level vs the sound level in the room? The ratio of various distortions?

Imo, the problem is not so much being unable to measure what's happening. The problem is that there are so many variables that the task is out of reach of casual attempts. Still, a manufacturer giving you graphs with THD+N vs frequency at a few power points, a few IMD graphs and a phase/frequency response graph is already being very honest about the abilities of its product.

There's also the fact that sometimes things are just good enough. Once you measure a product at a few critical points, the likelihood of a problem at other points is low, so it's not unreasonable to be satisfied with a smaller amount of data rather than the full picture.
 
We know the following, and I think each are fact:
If you had just said you think they are facts and left out the "we know" part, your introductory statement would have been more accurate.

1) ABX testing, when correctly done, is the gold standard for detecting whether or not differences can be observed. It's not perfect. But there's nothing better.

In your mind, sure. Not supported by scientific evidence in this case, however, which Jakob2 kindly pointed out and which you ignore.

2) People that are highly trained tend to over-perform when put into demanding and stressful situations.

Not clear what you mean by "over-perform." People perform however they do.

3) Training is an important part of ABX testing. If your goal is to see if differences can be heard, then you must help the person understand what the differences sound like.

Not exactly, I would say you must help the person start learning how to practice listening. It's more like coaching than instructing, if that makes sense.

4) It requires a fair level of skill to prepare ABX files.

It can, but it depends on what the goal of the particular experiment is.

It can also take good equipment. As people start to improve, they need more accurate DACs to practice with. Some computer sound cards have THD+N around -60 dB, which can potentially mask what is to be listened for. For the most demanding listening tasks, differences are clearest for me with the Benchmark DAC-3 and its headphone amp. Trainees need access to good equipment to learn how to perform at their best. Also, recorded files may need to be made with good ADCs to accurately capture analog data.

Given the above, you don't see the benefit of an application that lets anyone take a pristine wave file and incrementally contaminate it with increasing levels of distortion of various types until they can no longer detect the distortion via ABX testing at each stage? And then they can prove the results to others?

In principle there might be some benefit if people liked doing it or were otherwise motivated to use that methodology. Personally, I don't think we are ready to jump to a whole new kind of testing that would need to be validated as working as intended. That would take a number of users spending time with a prototype system and making adjustments to improve it until it could measurably meet some performance level.

I don't see all that happening; I can't even get anyone to make a few minor adjustments to the ABX we have now to see if there are performance improvements. And that would require far, far less effort than starting out with a whole new testing idea that may or may not pan out in the end.

In any case, we're just going in circles at this point, so I'll bow out. Sorry for the interruption. Good luck in your quest! I really thought this approach might help, but I guess not.

Everything else in your post sounds like you "guess so," which makes your ending sound bitter and rhetorical. Sorry if you feel that way.
 
Again if you answered the two questions I posed,

We're going in circles, but I'll type again:

- how do you know what is the FULL range of distortions that you need to find your thresholds for

I don't. I am interested in trying gross distortions one at a time to understand sensitivity to each. Quantization, amplitude, slew rate, clipping, tanh, etc.
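Most of those "one at a time" degradations are only a few lines each. The helpers below are hypothetical sketches for illustration, not QAMAtt's actual program:

```python
from math import tanh, floor

def quantize(x, bits):
    """Requantize a sample in [-1, 1) to the given bit depth."""
    steps = 2 ** (bits - 1)
    return floor(x * steps) / steps

def gain_db(x, db):
    """Broadband level change by db decibels."""
    return x * 10 ** (db / 20)

def hard_clip(x, limit):
    """Symmetric hard clipping at +/- limit."""
    return max(-limit, min(limit, x))

def soft_clip(x, drive):
    """tanh soft clipping; near-unity gain for small signals."""
    return tanh(drive * x) / drive

print(quantize(0.5, 12))           # 0.5 lands exactly on a 12-bit step
print(round(gain_db(1.0, -6), 3))  # ~0.501, i.e. a -6 dB level anomaly
```

Apply one of these per test run at a swept severity, and each distortion family gets its own threshold.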

how do you know what are the specifications that fully characterise a device & define it's performance as far as auditory perception is concerned?

I don't. But if we establish baseline sensitivities for each type of distortion, there's likely some interesting statements that could be made.

Incorrect as has been shown by Jakob & previous research

Again, more circles, but I'll type it again: Neither the Frindle paper nor the Clark papers were rigorous. There was nothing to be peer-reviewed, replicated, or analyzed. They were written at the level of a Radio-Electronics article. Interesting, but that's not science. I'm looking for sample sizes, mean, stddev, test setups...all the things required to replicate a test.

And as previously covered, the historic tests oft cited in audio are equally poor (Fletcher-Munson curves, etc).

If you think you are hanging your hat on solid science, you are incorrect. If Jakob2 could provide a link to a formal research paper with statistics that has been replicated, I'm all ears. But Frindle and Clark ain't it.

What is the FULL list of differences/distortions you are going to train for?

More circles. Asked and answered. See above for most recent recitation.

Sure there is but I don't think you can assign the ultimate "Final Solution" importance you seem to assign to it

You put "final solution" in quotes as if those were my words. They are not. Those are your words. I'm merely suggesting that being able to state that that a mean and sigma on an amplitude test with a few hundred skilled listeners that have trained themselves on their favorite system would be very interesting. You don't think so and would rather rely on 35 year old opinions. I get it.

If you want to go over something new, then by all means, I'm happy to. But if you just want to do your usual re-ask of the same thing over and over as if it's new, then I'm out.
 
Come on, it's not rocket science. A DAC output (the original topic of this thread) is just an electrical waveform into a high value resistive load. And a waveform for which we actually have the ideal original to be reproduced (what's in the digital file, the original sound recorded is out of reach).

Once you measure variations in frequency response, phase, amplitude, noise and harmonic spectrum at all power and frequency points, measure IMD (again at various frequency/power points), check what's happening out of band and keep the system operating in spec (no digital clipping and the like), you pretty much have all you need to know about the ability of a particular DAC to be faithful to the input signal. There are a few more implementation concerns like crosstalk and EMI interference to account for. You can also check for jitter independently to isolate the problem.
Great! Now can you produce all these measurements for two DACs, showing how much each DAC differs from the input signal, & then tell us how these measurements correlate to sonic differences? Are you talking about simple sine-wave test signals or something with the dynamic complexity of music as test signals?

And just saying that all measurements are below audibility doesn't cut it without showing the measurements.

Sure, it's not rocket science, but it seems it's too much to do all these measurements on a device

It gets a bit more involved for amps since we now have an inductive/capacitive load and the strong likelihood of clipping. So we have to measure what's happening with various loads and how the amp deals with clipping (what distortion is generated and how it recovers). The importance of testing at various power/frequency points is greater since the strain put on the power supply will vary a lot.
QAMAtt maintains that he chooses an amp as transparent for him based on specs. As I asked him before, I would like to see the specs he uses in his choice, but no answer so far.

......

The likelihood of something not yet discovered needing to be measured is very, very low, at least for amps and DACs. The crux of the matter is how to deal with what we are able to measure.
We just don't yet know what to measure, to what level & using what test signals & in what combinations.
The typical THD+N figure at 1 kHz/1 W is a bad metric; everyone agrees with that, and that's why Geddes came up with a new metric to make sense of measurements. How important is the order of distortion? Its level vs the sound level in the room? The ratio of various distortions?
Yes, and Geddes' metric was an attempt at moving towards a correlation with auditory perception.
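The question about distortion order can at least be made concrete. This is not Geddes' actual Gm metric, just a hedged sketch (Python/NumPy, synthetic signals) showing how per-harmonic levels can be pulled out of a measurement, so that higher orders, generally considered more objectionable, could be weighted differently from a second harmonic of the same level:

```python
import numpy as np

def harmonic_levels_db(x, fs, f0, nharm=5):
    """Levels of harmonics 2..nharm in dB relative to the fundamental."""
    n = len(x)
    spec = np.abs(np.fft.rfft(x * np.hanning(n)))
    def level(h):  # energy in a 3-bin window around harmonic h
        b = int(round(h * f0 * n / fs))
        return np.sqrt(np.sum(spec[b - 1:b + 2] ** 2))
    ref = level(1)
    return [round(20 * np.log10(level(h) / ref), 1) for h in range(2, nharm + 1)]

fs, f0 = 48000, 1000
t = np.arange(fs) / fs
# Same total THD (0.1%), very different harmonic structure:
low_order  = np.sin(2*np.pi*f0*t) + 1e-3 * np.sin(2*np.pi*2*f0*t)
high_order = np.sin(2*np.pi*f0*t) + 1e-3 * np.sin(2*np.pi*5*f0*t)
print(harmonic_levels_db(low_order, fs, f0))
print(harmonic_levels_db(high_order, fs, f0))
```

Both signals report identical THD on a single-number spec sheet; any perceptually weighted metric has to start from a breakdown like this.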

Imo, the problem is not so much being unable to measure what's happening. The problem is that there are so many variables that the task is out of reach of casual attempts. Still, a manufacturer giving you graphs of THD+N vs. frequency at a few power points, a few IMD graphs, and a phase/frequency response graph is already being very honest about the abilities of its product.

There's also the fact that sometimes, things are just good enough. Once you measure a product at a few critical points, the likelihood of a problem at other points is low, so it's not unreasonable to be satisfied with a smaller amount of data rather than the full picture.
Sure, sometimes things are good enough & sometimes they're not. Unfortunately, measurements that are commonly used are a very weak predictor of audibility.
You mention complex measurements "being out of reach of casual attempts" & what's mainly being discussed here is that the same applies to ABX tests of any worth.
 
Come on, it's not rocket science. A DAC output (the original topic of this thread) is just an electrical waveform into a high-value resistive load. And it is a waveform for which we actually have the ideal original to compare against (what's in the digital file; the original sound as recorded is out of reach).

Once you measure variations in frequency response, phase, amplitude, noise and harmonic spectrum at all power and frequency points, measure IMD (again at various frequency/power points), check what's happening out of band, and keep the system operating within spec (no digital clipping and the like), you pretty much have all you need to know about the ability of a particular DAC to be faithful to the input signal. There are a few more implementation concerns, like crosstalk and EMI interference, to account for. You can also check for jitter independently to isolate that problem.

According to Jakob2, a number of DAC artifacts have proven to be audible with ABX testing. I have attached an image he provided with a list of some below.

We could probably add intersample-over distortion to the list: Intersample Overs in CD Recordings - Benchmark Media Systems, Inc.
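The intersample-over effect can be demonstrated numerically: a sampled signal can sit at or below full scale on every sample while the reconstructed waveform between samples exceeds 0 dBFS. A sketch (Python/NumPy, crude FFT zero-pad upsampling; it ignores the even-length Nyquist-bin subtlety, which is negligible here):

```python
import numpy as np

def true_peak(x, oversample=8):
    """Approximate inter-sample (true) peak via FFT zero-pad upsampling."""
    n = len(x)
    X = np.fft.rfft(x)
    Xup = np.zeros(n * oversample // 2 + 1, dtype=complex)
    Xup[:len(X)] = X
    # irfft normalizes by output length, so rescale by the oversample factor
    return np.max(np.abs(np.fft.irfft(Xup, n * oversample) * oversample))

# Classic worst case: an fs/4 tone at 45 degrees phase. Every sample lands
# at exactly full scale, yet the waveform peaks ~3 dB higher between samples.
n = 4096
k = np.arange(n)
x = np.sin(np.pi * k / 2 + np.pi / 4) / np.sin(np.pi / 4)
print(np.max(np.abs(x)))   # ~1.0  (no sample clips)
print(true_peak(x))        # ~1.414 (+3 dB inter-sample over)
```

A DAC's interpolation filter has to pass this reconstructed peak through its digital stages, which is why headroom (or the lack of it) there is audible as clipping on "legal" 0 dBFS material.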
 

Attachments

  • Jakob2 table from Frindle1997_detected.jpg (96.6 KB)
In your mind, sure. Not supported by scientific evidence in this case, however, which Jakob2 kindly pointed out and which you ignore.

What is the alternative to ABX? See previous post re: Frindle and Clark article. Those are not rigorous. At all.

Not clear what you mean by "over-perform." People perform however they do.

See "Yerkes-Dodson" law.

Everything else in your post sounds like you "guess so," which makes your ending sound bitter and rhetorical. Sorry if you feel that way.

Well no, not bitter! Just tired of the circles and the continued falling back on papers that are absent of any scientific rigor. I'd think people would welcome a chance to build a database of results to help us all understand this better.
 
According to Jakob2, a number of DAC artifacts have proven to be audible with ABX testing. I have attached an image he provided with a list of some below.

Have you actually read the paper you are citing? There is nothing rigorous in there at all. There is nothing that can be replicated. There is nothing in terms of sample sizes or statistics. There are no references and no citations.

I hope this isn't what you are hanging your hat on.
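For what a statistically grounded version looks like: under the null hypothesis of pure guessing, each ABX trial is a fair coin flip, so the number of correct answers needed for significance comes straight from the one-sided binomial tail. A short sketch (Python, SL = 0.05), which reproduces the 9/10, 12/16, 15/20 and 59/100 figures quoted earlier in the thread:

```python
from math import comb

def min_correct(trials, alpha=0.05):
    """Smallest number of correct ABX answers whose one-sided binomial
    tail probability (p = 0.5, i.e. pure guessing) is <= alpha."""
    for k in range(trials + 1):
        tail = sum(comb(trials, j) for j in range(k, trials + 1)) / 2**trials
        if tail <= alpha:
            return k

for n in (10, 16, 20, 100):
    print(n, min_correct(n))
# prints: 10 9, 16 12, 20 15, 100 59
```

Note how demanding a 10-trial run is (9 of 10), which is exactly why short runs carry a large beta-error risk for any listener whose detection ability is not far above chance.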
 
Great! Now can you produce all these measurements for two DACs, showing how much each DAC differs from the input signal, and then tell us how these measurements correlate to sonic differences? Are you talking about simple sine-wave test signals, or something with the dynamic complexity of music as test signals?
Will you pay for my time and for the equipment required? Otherwise, just search for people who have done extensive RMAA testing on their DACs. Since you seem happy to send people on vague Google searches, you should be able to do it yourself.

As for the correlation between measurements and sonics, that's the whole point of QAMatt's approach. Don't mix two different issues: what we are able to measure (and how it fully defines the electrical signal) and what we should measure in practice.

Sine-wave testing is a perfectly valid substitute for audio measurements, btw (at least for DACs, under the constraint that we are dealing only with legit signals). Please do a search on that. The DAC knows nothing of the "dynamic complexity of music".

@markw4: all those points fall under the list I made. Exceptions are dithering, which depends on how much processing is done in a particular DAC, and the intersample stuff, which pretty much falls under illicit signals and digital clipping.
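One way to make "faithfulness to the input signal" concrete in a single number is a null (difference) test against the file's own waveform. A hedged sketch (Python/NumPy, synthetic data; a real capture would also need delay and clock-drift alignment, e.g. via cross-correlation, which is omitted here):

```python
import numpy as np

def null_residual_db(reference, capture):
    """Best least-squares gain match, then subtract; residual in dB
    relative to the reference (more negative = closer to the ideal)."""
    g = np.dot(capture, reference) / np.dot(capture, capture)
    residual = reference - g * capture
    return 10 * np.log10(np.sum(residual**2) / np.sum(reference**2))

rng = np.random.default_rng(1)
ref = rng.standard_normal(48000)                       # stand-in for the file
cap = 0.98 * ref + 1e-4 * rng.standard_normal(48000)   # gain error + noise
print(round(null_residual_db(ref, cap), 1))  # close to -80 dB
```

The gain fit absorbs trivial level differences, so the residual reflects only what the device actually changed about the waveform.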
 
Will you pay for my time and for the equipment required? Otherwise, just search for people who have done extensive RMAA testing on their DACs. Since you seem happy to send people on vague Google searches, you should be able to do it yourself.
You made the claim but now won't show the evidence - fair enough.

As for the correlation between measurements and sonics, that's the whole point of QAMatt's approach.
You have it backwards - QAMatt is showing something about a particular limited set of distortions & personal thresholds - nothing new - these online training programs have been around for a long time.
Don't mix two different issues: what we are able to measure (and how it fully defines the electrical signal) and what we should measure in practice.
I don't believe I'm mixing up anything but happy to hear what you think I'm mixing up.

Sine-wave testing is a perfectly valid substitute for audio measurements, btw (at least for DACs, under the constraint that we are dealing only with legit signals). Please do a search on that. The DAC knows nothing of the "dynamic complexity of music".
Ah, yes, I guess you think it 'good enough'.
 