Hi-res 96/24 listening test of opamps

Which of the files do you prefer by listening?

  • rr = LM4562: 1 vote (4.5%)
  • ss = OPA2134: 2 votes (9.1%)
  • tt = MA1458: 2 votes (9.1%)
  • uu = TL072: 9 votes (40.9%)
  • vv = OPA2134: 1 vote (4.5%)
  • I can not hear a difference: 7 votes (31.8%)

  • Total voters: 22
  • Poll closed.
Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Mark (and PMA), reflecting some more, it probably wouldn't be possible to include an 'I don't hear a difference' option in the score. If you read some of the listening tests from the past, they often do a 'dry run' first and then cull those participants who, by their own account, did not hear a difference.

The reason is that there is no category for them in a preference listening test like this one. You could include it in a test where participants were asked whether they could hear a difference, but not if you ask for preferences.
In itself it is interesting to log those very honest participants, but you can't include them in the numeric score for preferences.

Jan

I hope this isn't too much of a distraction in this thread, as it is a complex matter.
The "no difference"/"no preference" option is tied to each participant's internal criterion problem, and using it creates problems in the statistical analysis.

The traditional answer is to avoid the so-called "tie" and to use forced-choice answers instead.

The more modern solution is to use forced-choice answers in combination with a negative control to establish a so-called identicality norm for the listener/participant group. That is the group's response to an identical stimulus, and it allows all further results to be tested against this identicality norm.
Statistical analysis is a bit more complicated, as the trial used to obtain the identicality norm and the other trials cannot be assumed to be independent.
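To make that concrete, here is a minimal sketch (Python, with invented counts, not data from this thread) of how a forced-choice preference split could be checked against an identicality norm obtained from a negative control. The dependence caveat above still applies; a paired, within-listener analysis would be stricter.

```python
# Minimal sketch with invented counts: compare the preference split on a
# real pair of files against the split seen in a negative control (two
# identical files), i.e. against the group's "identicality norm".
from scipy.stats import fisher_exact

# Negative control: both files are identical, so any split only reflects
# the response bias of the group.
control = [12, 10]   # "preferred A" vs "preferred B" on identical stimuli

# Real trial: the two files actually differ (e.g. two different opamps).
real = [18, 4]       # "preferred A" vs "preferred B" on different stimuli

# Fisher's exact test on the 2x2 table asks whether the real split differs
# from the split under the negative control. Note: this treats the two
# trials as independent, which (as said above) they are not if the same
# listeners took part in both, so it is only a first approximation.
odds_ratio, p_value = fisher_exact([control, real])
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
```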
 
Maybe one reason we have so few votes is that the "can't hear" option was not available until now.

Another reason might be that people are disappointed if they can't hear and perhaps tell themselves they might be able to hear if something or other was only different, such as if they happened to feel like trying harder. Allowing that they might feel more like it later, they may put off voting for now, and then indefinitely.

One way to help sort some of that out would be to keep track of how many times the files have been downloaded versus how many votes are cast or listening reports filed. It turns out counting downloads is possible, even when using Dropbox.

One way to do that is described here: How to Monitor Download Count in Dropbox - Hongkiat
 
AX tech editor
Another reason might be that people are disappointed if they can't hear and perhaps tell themselves they might be able to hear if something or other was only different, such as if they happened to feel like trying harder.

That's an uncertain path - if you feel so pressured that you need to hear a difference, what is the value of such an eventual vote?

Jan
 
Administrator
I think the response has been fairly positive to this test, and I know from past experience how much effort is needed to organise these kinds of things. It's been a decent response, imo.

When the test is over I will post my notes (the ones I sent to Pavel), which show how I couldn't at first differentiate these (in fact I seriously wondered about the quality of the material ;)) but then realised it was a limitation of the PC sound system.

Playing even the MP3 version through a better DAC and amp was much more revealing.

I got complacent; after all, how hard could it be to at least pick a 1458 out of the crowd? I was wrong... I needed to up my game.

I think it's been a great test all round, tbh :up:
 
Moreover, even a negative test result (can't hear a difference) would be important. And we had some trials here to get ABX results from these test files, none positive so far.

I have tried the Foobar ABX test a few times. Strangely, I seem to have a definite bias towards answering wrong, not random, but wrong. If A is X, then I tend to choose A is Y. Don't know why.

Also, I never hear file differences in Foobar as clearly as I do in Reaper. However, I have no particular reason to believe that Foobar or Windows is altering playback, such as by hidden SRC or some other mechanism. Therefore, I can only assume my brain's System 1 is somehow reversing the correct choices.

It could be that part of my problem is that there is no automatic looping available in Foobar ABX. I can only speculate that hearing differences at very low distortion levels involves some kind of adaptive brain DSP, and for such small distortion levels my particular DSP seems to have maximum sensitivity only for very short duration samples.

Also, any distraction at times when the DSP happens to need high processing priority seems to be problematic. Trying to manually loop a short section in Foobar every half second may interrupt the DSP too frequently for it to achieve some kind of recognition lock, whatever that may entail.

Maybe the only way to find out more about it would be if looping were added to the Foobar ABX module, then it could be tried.

Something else that would be of interest to me would be reports from anyone who may have tried the Reaper sorting methodology I described. If someone else finds it works more reliably for them than Foobar ABX does, it would be helpful to know that.

In addition, although PMA may or may not be interested in researching various psychoacoustic testing approaches, I certainly am. It seems possible that one reason prior listening research hasn't demonstrated audibility of distortion at lower levels is that there actually is some problem with ABX as it has been implemented so far. People can speculate, argue, etc., but the only way to really find out would be to experiment and then replicate the results.

With regard to trying different listening test approaches, ABX, sorting, or otherwise, for people having difficulty with the rather challenging files in PMA's current test, a set of easier-to-differentiate practice files might be useful. If we had some files that, say, maybe about half the participants could differentiate on their existing systems, then we might have enough responses to collect some useful information about potential sensitivities of alternative testing methodologies.
 
I have tried the Foobar ABX test a few times. Strangely, I seem to have a definite bias towards answering wrong, not random, but wrong. If A is X, then I tend to choose A is Y. Don't know why.

I think this is nothing more than proof that in an ABX protocol you do not hear a difference. I have the same results when I am unable to tell the difference. I think it is the foobar randomizer, which is capable of enhancing uncertainty ;) by taking into account the results of previous trials. We should not try to find excuses, IMO. We are either able to get a positive ABX result in the test or not. There are some tests where we are able to get a positive result, and others where we are not. That's the reason why we are testing :)
 
I think this is nothing more than proof that in an ABX protocol you do not hear a difference. I have the same results when I am unable to tell the difference.

That seems very odd. If you guess, you should get 50% wrong on average. I don't know of any way for Foobar ABX to make sure you will get, say, 80% wrong on average if you were guessing. Rather, 80% wrong on average would be an inverse correlation. And, correlation is not random. Guessing is random.

The thing is, you can't observe what 95% of your brain is doing. That 95% is sometimes referred to as System 1 Processing. The other 5%, System 2 Processing, is your conscious awareness; it is the only thing you know about, the part that you consider to be you.

System 1 is far more powerful than System 2. Among other things it does all your auditory and visual processing prior to those things popping into your conscious awareness. If there is inverse correlation in a listening test, almost certainly System 1 is responsible. It is responding to something it somehow recognizes, although its operations are not observable by your System 2. Since you don't know what your System 1 is doing, you can only infer that it is doing something because otherwise inverse correlation should be impossible.
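
To put a number on that (with made-up trial counts): scoring well below chance over a run of ABX trials is exactly as improbable under pure guessing as scoring equally far above it, which is why a persistent wrong-way bias looks like inverse correlation rather than randomness. A minimal sketch:

```python
# Minimal sketch (invented numbers): how unlikely is it to score *below*
# chance by guessing? Scoring 4/16 correct is exactly as improbable as
# scoring 12/16 correct, so a consistent inverse score is a correlation,
# not random guessing.
from math import comb

def binom_cdf(k_max: int, n: int, p: float = 0.5) -> float:
    """P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

n_trials = 16    # hypothetical number of ABX trials
n_correct = 4    # hypothetical score well below chance

p_low = binom_cdf(n_correct, n_trials)                         # P(4 or fewer correct)
p_high = 1.0 - binom_cdf(n_trials - n_correct - 1, n_trials)   # P(12 or more correct)
print(f"P(<= {n_correct} correct by guessing)  = {p_low:.4f}")
print(f"P(>= {n_trials - n_correct} correct by guessing) = {p_high:.4f}")  # same value by symmetry
```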
 
plasnu, my opinion at this point is that ABX is not the most suitable test for this. But others are still fixated on it, so it would be nice if someone could get comfortable with using it. I don't know how long before it times out. Lots of rest between trials might prevent the listening fatigue that comes from trying to force your way through the whole thing without wasting too much time. Playing the tracks over and over again in succession makes them all start to sound the same.

May I ask if you sent PMA your listening impressions? Also, what sound card are you using, and what frequency ranges sound most different to you between the files?
 
Moreover, even a negative test result (can't hear a difference) would be important. And we had some trials here to get ABX results from these test files, none positive so far.

I beg to differ; negative results (meaning the null hypothesis could not be rejected) don't really help if the reason for the negative is not known:

-) is it because there is no difference?
-) is it because a listener is having problems with the test protocols used?
-) is it because bias effects are too strong?
-) is it because the playback system masks the difference?
-) is it because the test was underpowered? (a quick power check is sketched below)
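
As a quick illustration of that last point, here is a minimal sketch (Python, with assumed numbers, not data from this thread) of how underpowered a short ABX run can be: with only 10 trials, even a listener who genuinely hears the difference 70% of the time reaches the usual p < 0.05 criterion only about 15% of the time.

```python
# Minimal sketch (assumed numbers) of an ABX power check: given n trials
# and an assumed true hit rate, how likely is a real listener to actually
# reach the usual p < 0.05 criterion?
from math import comb

def binom_sf(k_min: int, n: int, p: float) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

n_trials = 10
p_true = 0.7   # assumed real ability: hears the difference 70% of the time

# Smallest number of correct answers that gives p < 0.05 under pure guessing.
k_crit = next(k for k in range(n_trials + 1) if binom_sf(k, n_trials, 0.5) < 0.05)

power = binom_sf(k_crit, n_trials, p_true)
print(f"criterion: {k_crit}/{n_trials} correct; power at p_true = {p_true}: {power:.2f}")
```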
 
AX tech editor
A little intermezzo.

I was thinking about the situation where you cannot, at least consciously, hear a difference between two tracks.
In many situations like that (not just with sound, but any time a choice has to be made between two almost equal things), your subconscious has picked up a difference but you cannot articulate it.

If you have a gut feeling, follow that.

If you do not even have a gut feeling, try this: flip a coin. If the result makes you feel slightly relieved, that's the correct choice. If the result makes you feel slightly disappointed, select the other case.
This is a nice way to eke out things your subconscious 'knows' but which are normally not accessible to your consciousness.

If interested: Incognito: The Secret Lives of the Brain, by David Eagleman.

Jan
 
In my opinion, if someone does not hear a difference and there is no option like "can't hear a difference" to vote for, then the tester might be pushed to click one of the other options, which makes the vote completely useless. At least, IMO, the "can't hear" option says that, with the test samples and under the specific conditions of the individual private test, the tester was not able to hear a difference.
We are not performing a strictly scientific test, and given the conditions a web-based test allows, that is in fact impossible.
 
The difference between the worst and the best is evident to anyone who is not deaf, unless their equipment suffers from a lot of noise and interference, I think.

Last night (Sunday) my system sounded scandalously good, incredible. Today, Monday, it will not sound so good because I cannot further attenuate the noise/interference without affecting the dynamics of the sound, at least with my solutions.

Either I move house or I need to generate my own power, both problematic and expensive.

With those problems it is not logical to build or buy a better amplifier :mad:
 