Hires 96/24 listening test of opamps

Which of the files do you prefer by listening?

  • rr = LM4562

    Votes: 1 4.5%
  • ss = OPA2134

    Votes: 2 9.1%
  • tt = MA1458

    Votes: 2 9.1%
  • uu = TL072

    Votes: 9 40.9%
  • vv = OPA2134

    Votes: 1 4.5%
  • I can not hear a difference

    Votes: 7 31.8%

  • Total voters
    22
  • Poll closed.
Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Jakob, perhaps I misinterpreted what you meant by this? I read it as you suggesting a non-null conclusion. Did you more mean to say that no conclusion could be reached (results support neither a null nor non-null conclusion)?

That was related to an earlier post by PMA where he stated that the poll numbers wouldn't allow a statistical analysis. Therefore I was surprised that the poll numbers allowed two categorical messages/conclusions to be drawn. :)

If so we're in agreement, and my apologies.

No need for apologies. Not sure about the agreement, as I think the numbers allow the conclusion that the probability of getting these results by chance is low (p=0.021), but that we can't be sure about the reason due to the mentioned nonindependence.


And I wrote my explanation to be thorough about my line of thinking, not as an attack on your character (openly acknowledging the obvious acrimony between Mark and myself).

Sorry for that, obviously what I wrote wasn't well worded.
It was related to the apparent animosity between you/billshurv and Markw4.

You all seem to be smart people who aren't really hostile, so you should be able to find a workaround and not think about hidden motivations for posts.
 
Jakob, insofar as we're both reluctant to take any conclusions from this poll, I think we're in agreement. We hit on different reasons why we're not drawing any conclusions, but with the same end result--I didn't even look at the p value because there are multiple multivariate analyses that could be run to get wildly different results (and because of my own discomfort with p-values in general).

As to the last point, yes, there's some large difference in mindset that led to a lot of frustration (at least on my part) that I'm trying to distance myself from. Definitely a lesson in what buttons get me going.
 
That was related to an earlier post by PMA where he stated that the poll numbers wouldn't allow a statistical analysis. Therefore I was surprised that the poll numbers allowed two categorical messages/conclusions to be drawn. :)

I would elaborate a bit. The poll has been public so anyone could see previous results and preferences. This together with number of voters leads me to the conclusion cited by yourself.

Regarding "categorical messages": not a single participant of the test was able to post a valid ABX result confirming that he could hear the difference between the opamps, or even between the D/A - opamp - A/D chain and the original sound file. For that reason I dared to draw my conclusions.

Have a nice day :).
 
Regarding "categorical messages": not a single participant of the test was able to post a valid ABX result confirming that he could hear the difference between the opamps, or even between the D/A - opamp - A/D chain and the original sound file. For that reason I dared to draw my conclusions.

(1) There is no way to be sure no one was able to post a valid ABX result if almost no one tried very hard. It is only possible to conclude that no one did post what you requested.

(2) Is it possible you made your conclusions before the poll even started? And that the purpose of the poll was to confirm what you already firmly believed?

(3) How can anyone be sure that ABX tests are suitable for verifying ability to distinguish very small differences in distortion?

(4) What about the information provided by Jakob2 that successful ABX testing for small differences requires extensive training and practice?
 
(1) We do not know who has tried the ABX (and failed). It would be more objective if the ABX result were encrypted, not shown immediately, and could be decrypted only after the organizer reveals the decryption key [this is not possible with the foobar ABX plugin].

(2) This is not relevant; PMA's (dis-)belief should not influence others (in an ABX test).

(3) I also have another question. How can anyone be sure that, without an ABX test, it is possible to distinguish very small differences in distortion?
 
@meiro

Question (2) is relevant to the issue of whether or not the 'researcher' was unduly influenced in forming conclusions because of confirmation bias and/or other errors of cognition. Such issues are well known problems in research.

Regarding your question (3), if no one can be sure then people should refrain from drawing overconfident conclusions based on being very sure.
 
Bill, I would be very interested to know if you can provide a link to a post by me stating what you claim. In fact, I never directly compared the two files until Pavel said they were the same, and I noted that there was a failure to directly compare them.

To further set the record straight, before I started listening I had serious doubts that anyone could hear a difference between two unity gain op-amps under the best of circumstances. After listening I said what differences I did find were very small, and that I only "roughly" sorted them in some way as part of a process to try to find two candidates for ABX testing, not with the idea that I was going to prove anything.

It was only after the final results were posted, and Pavel claimed they were essentially random, that I noticed some patterns from other participants. It was at that point I started thinking maybe there was some indication other people were hearing things somehow similar to what I seem to hear, and I said so.
 
I think the vote lists do have a pattern: they show voters generally heard the biggest difference between 2134 and 2134.

Pardon?

I think I could not have been more clear:

ss(2134) -- the group of ss(2134), vv(2134), uu(tl072) are the better part.
between them ss seems to have extra sting in the high registers, makes it look like high resolution but I think it's a problem, not resolution.

vv(2134) -- like uu, but gives the feeling of even more insight, higher resolution.

uu(tl072) -- have good resolution and body, balanced, inviting.

I also said:
tt(MA1458) -- it seems like a middle way between rr and the rest: no big problems but some compression? flat? uninteresting? gray?

And I had a bad opinion on the music in general:

A last note, if I may: the general quality is quite below my best music samples, very far below what good 16-bit CD production is capable of giving today.
One simply cannot talk of 3D qualities: there are none. No air, ambience, or dynamics; it is extremely compressed.

In fact the LM4562 brought this forward very well:

rr(4562)-- shrieky, compressed, bad mp3, no body to the voices, and some shouting quality

P.S.: In the poll I voted for vv(2134).
 
If JosephK is voter6 (I only looked at the voter lists), he did identify the biggest difference as being between the 072 and the 4562 (or the original file), the 4562 being closest to the original. He ranked the one closest to the original as worst. He ranked all other differences as equal to or smaller than 2134-2134, which is not enough to swing the general pattern of the biggest difference being 2134-2134.
 
THERE WAS NO ORIGINAL FILE IN THE TEST.
And I could not care less about 'voterlists'.
They were created afterwards.
The judgement was really given in the voting mail.

And I did identify the identical samples as being in the same group, and even gave the same description to the same samples.
And identified the higher resolution part between TL072 and OPA2134.
Identified the bad sample as such, uninteresting sound (741)

Do better than that, in such a test.
 
I would even add:

Given that the voters were not provided the original file during the test, I would say that anybody who considers himself intellectually honest should refrain from forming any judgement based on information (the original file) outside the logic of the original test;

In this light, the only valid, judgeable feature of the test was precisely that: the repeated sample mixed in;

And I would like to point out the fact that the voters able to clearly identify the identical samples (voter 1; voter 6) were at the same time judging the 4562 as the worst sounding, and preferring the FET sound;

And the two voters judging the 4562 as best were performing the worst in identifying the identical samples. (voter 2; voter 4)

Ciao, George
 
@Joseph K:

Thank you for providing detailed information about your listening test results. They are interesting, and would tend to confirm that ABX testing may not be the only way to verify listening ability levels.

May I ask if you happened to give ABX a try?

After trying it myself I was left with the impression that I would need to practice more with it to get better at working within its methodology, at least with file differences as small as they seem to me. However, I know I don't hear as well now as I did 10 or 20 years ago. On the other hand, you may still be in your good years, and happily so, as I was hoping someone would come along who could do better than me.

While it is good to hear of your results, I would like to suggest being very careful regarding not claiming too much about what may be inferred from the overall results. Biomedical research is a very difficult area even for experts, as we often see in the newspapers when previous research later turns out to be wrong.

In particular, focusing on results in this particular test involving the two identical files as the most salient factor of the test could actually later turn out to be an illusory distraction. We don't know enough yet to be sure, would be my view.

What I think it would probably be fair to say is any evidence showing correlation between verbal descriptions of audible differences and measured differences of the op-amp circuits is likely to turn out to be important.

Not that we can draw very much in the way of conclusions now, but, at least IMHO, there is probably good reason to want to undertake more serious research, keeping in mind that someone who can hear about as well as you can should be on the research team to help validate the test setup.
 
A human's ability to concentrate is not consistent. In an ABX test, if one can hear the difference clearly in only a few rounds but fails in all the other 10 rounds, does it mean he failed to tell the difference? Statistically, it should be concluded that his successful rounds were just a coincidence, but that conclusion can be wrong. The truth may be that the difference is so small that he could tell it apart only a few times.

Then the question is, is that kind of subtle difference meaningful for audiophiles? The synergy gained from a lot of such subtle differences can be huge or not. I honestly don't know...
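
One way to put rough numbers on this question is a small power calculation. The sketch below assumes a simplified model that is not from the thread: in each round the listener genuinely hears the difference with probability `q` and otherwise guesses at 50/50. The function names, the 13-round session (a few heard rounds plus the other 10), and the 5% significance level are all assumptions chosen for illustration.

```python
from math import comb


def p_value(correct: int, trials: int) -> float:
    """One-sided probability of at least `correct` hits by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def pass_threshold(trials: int, alpha: float = 0.05) -> int:
    """Smallest number of correct answers whose guessing p-value is <= alpha."""
    for k in range(trials + 1):
        if p_value(k, trials) <= alpha:
            return k
    raise ValueError("no score reaches the requested significance")


def power(q: float, trials: int, alpha: float = 0.05) -> float:
    """Chance of a significant score when the difference is truly heard in a
    fraction q of rounds and guessed (50/50) in the rest (assumed model)."""
    p_hit = q + (1 - q) / 2  # per-round probability of a correct answer
    need = pass_threshold(trials, alpha)
    return sum(
        comb(trials, k) * p_hit ** k * (1 - p_hit) ** (trials - k)
        for k in range(need, trials + 1)
    )


if __name__ == "__main__":
    for q in (0.0, 0.2, 0.5):
        print(f"q={q:.1f}: power over 13 rounds = {power(q, 13):.3f}")
```

Under this assumed model, `q = 0` gives the false-positive rate of the test itself, and small values of `q` give a low chance of reaching significance in a short session. That would be consistent with the worry raised above, though the calculation says nothing about whether any particular listener actually fits the model.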
 
In an ABX test, if one can hear the difference clearly in only a few rounds but fails in all the other 10 rounds, does it mean he failed to tell the difference?

It does mean that it is highly probable that a difference cannot be detected. If it were possible, there would be a statistically significant number of correct answers.
If you do 10 or 20 tests in a situation where there is no audible difference, you will still get some right and some wrong. You can't say he was able to hear a difference only a few times.

If I flip a coin for each of 20 tests, I will probably get somewhere between 9 and 11 right. Does that mean I heard a difference 9 or 11 times? Of course not.

Jan
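
The coin-flip intuition can be checked against the binomial distribution. The sketch below assumes each of the 20 trials is an independent 50/50 guess; the helper names are hypothetical and chosen only for this illustration.

```python
from math import comb


def prob_between(lo: int, hi: int, trials: int) -> float:
    """Probability that pure guessing yields between lo and hi correct answers."""
    return sum(comb(trials, k) for k in range(lo, hi + 1)) / 2 ** trials


def guess_p_value(correct: int, trials: int) -> float:
    """One-sided probability of scoring at least `correct` by guessing alone."""
    return prob_between(correct, trials, trials)


if __name__ == "__main__":
    print(f"P(9..11 right of 20) = {prob_between(9, 11, 20):.3f}")
    print(f"P(7..13 right of 20) = {prob_between(7, 13, 20):.3f}")
    for correct in (11, 13, 15):
        print(f"P(>= {correct} of 20 by guessing) = {guess_p_value(correct, 20):.3f}")
```

Under these assumptions the 9 to 11 band covers roughly half of all outcomes (about 0.50), so a pure guesser lands a little outside it about as often as inside. A score has to sit well above 10 before guessing becomes an unlikely explanation.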
 