Listening Test. Trying to understand what we think we hear.

Final answer 😉

Were there any questions? 😉

That was ordered by sonic quality, which is quite similar to my earlier post (my preference, as I have stated previously, would not change even if I could see all the opamps).

I have learnt what is required for music reproduction to be enjoyable, and sonic quality is the most important criterion. Unfortunately, there is a second criterion: the sound must not be fatiguing, which I have found correlates with high-order distortion and IMD. Unfortunately, we cannot see this second criterion in the spec, which only shows THD (but we can see it from an FFT simulation).
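As a concrete illustration of that FFT check, here is a minimal Python sketch (assuming numpy; the tanh stage is just a hypothetical stand-in for a gently nonlinear opamp stage, not any of the parts under test):

```python
import numpy as np

# Minimal sketch: push a pure tone through a stand-in nonlinearity and read
# the individual harmonic levels off the FFT -- the high-order content that
# a single THD number on a datasheet hides.
fs, f0, n = 48000, 1000, 48000          # 1 s of samples -> 1 Hz FFT bins
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f0 * t)
y = np.tanh(1.5 * x) / np.tanh(1.5)     # hypothetical gently-clipping stage
                                        # (odd-symmetric, so expect odd harmonics)
spec = np.abs(np.fft.rfft(y * np.hanning(n)))
spec /= spec[f0]                        # normalize to the fundamental
for k in range(2, 10):                  # 2nd through 9th harmonic, in dBc
    print(f"H{k}: {20 * np.log10(spec[k * f0]):8.1f} dBc")
```

Two parts with the same THD can show very different harmonic decay profiles in such a plot, which is the point being made here.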

Noise can be a big problem if there is IMD caused by the speaker design. My speakers have little issue with distortion, so noise from the electronics is not a big problem for me.

the FET input stage group should be the noisiest here given the very low circuit impedance throughout... can you tell though?

I didn't try to listen and sort based on noise (because noise is not the critical criterion for me). But is it correct that the 4560 has such high noise, or did I read the PDF wrongly?

(Increasing noise): LM4562 (2.7) - LM833P (4.5) - NE5532 (5) - TL072 (18) - TLE2072 (18) - RC4560 (1200)
(Increasing THD, %): LM4562 (0.00003) - LM833P (0.002) - TL072 (0.003) - TLE2072 (0.013) - NE5532 (not specified) - RC4560 (0.05)
(Decreasing CMRR): LM4562 (120 dB) - LM833P/NE5532/TL072 (100 dB) - RC4560 (90 dB) - TLE2072 (89 dB)
(Decreasing BW): LM4562 (55 MHz) - LM833P (16 MHz) - RC4560 (15 MHz) - NE5532/TLE2072 (10 MHz) - TL072 (3 MHz)
(Decreasing SR, V/µs): TLE2072 (35) - LM4562 (20) - TL072 (13) - NE5532 (9) - LM833P (7) - RC4560 (5.5)

NB: the numbers I took from the PDFs may not represent equal measurement conditions, so they are for quick reference only.
 
I felt a simple breadboard layout was sufficient. The 'dirty ground' was a connection back to the PSU.

Have you read the PDF of the TLE2072? There are cautions/warnings mentioned. They acknowledge that the opamp is hard to implement well.

Remember the previous opamp test, where I found the highest-slew-rate opamp, the LM4562 (SR = 20), to sound high-end but wrong, and I suspected an implementation issue.

In the current test, the fastest opamp is the TLE2072 (SR = 35). It will be fun if D and C turn out to be the TLE2072 or the LM4562 😀
 
Were there any questions? 😉

That was ordered by sonic quality, which is quite similar to my earlier post (my preference, as I have stated previously, would not change even if I could see all the opamps).

Just which you preferred. If you keep coming back to a specific file as being preferable, then that implies you can audibly tell the differences.

Many opamps have lots of cautions when applying them. I always confirm there are no obvious issues by using a 100 MHz analogue scope.
 
I also raised some objections to the test and the lack of a reference early in the thread. I don't want to assign a "sound" to some opamp when I don't know what else is in the circuit. Does the unknown part have a coloration so large that it masks differences in the DUT? I don't know.

The first question is: "What is the test?"

I perceive the test to be as follows:

Listen to this circuit:

[Schematic of the test circuit: the externally hosted image is no longer available.]


Implemented with the following op amps:

1/ JRC4560
2/ LM833P
3/ LM4562
4/ NE5532
5/ TLE2072
6/ TL072

The obvious reference for this test (has to be unchanging and readily available to all) is: A short piece of wire - probably audio cable.

The test setup is as follows:

Digital source file -> digital music player -> DAC -> UUT -> ADC -> Digital recorder.

So we need a total of 7 test files, the 7th being of the test setup with the op amp circuit replaced with a piece of wire (our reference).

Since there have been concerns over the audio performance of the rest of the test rig, which includes a recording chain with its own problems, it would be good to know the measured performance of the 7 UUTs; the flaws of the UUTs (other than the reference) need to be at least as large as those of the rest of the test system, or they will be masked.

The preferred evaluation method would be to first ensure that at least some listeners can hear the 7 test files as being different. The 7th file (the reference) is likely to have the most perfect sound due to its minimal processing. Therefore we need to do some kind of well-controlled listening test comparing the reference file, in rotation, against the remaining six files.

To avoid obvious errors, our listening evaluation methodology needs to avoid the following common errors related to listening tests:

(1) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they are not tests. That is, they do not involve comparison to a fixed, reliable standard.

(2) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they involve excessively long switchover times, which makes them highly susceptible to false negatives because the long gaps desensitize the listeners.

(3) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they do not involve proper level matching (a simple RMS match is sketched after this list), which makes them highly susceptible to false positives because people report the level mismatches as sonic differences.

(4) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they do not involve listening to the identical same piece of music or drama synchronized within a few milliseconds, creating false positives because people report the mismatched music as sonic differences in the equipment.

(5) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they constantly reveal the true identity of the UUTs to the listener, creating false positives because people report their prejudices and preconceived notions as sonic properties of the equipment.
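To make point (3) concrete, here is a minimal sketch of the kind of RMS level matching meant above, assuming numpy and two already-decoded mono float arrays:

```python
import numpy as np

def match_level(reference: np.ndarray, test: np.ndarray) -> np.ndarray:
    """Scale `test` so its RMS matches `reference`; mono float arrays assumed."""
    gain = np.sqrt(np.mean(reference ** 2) / np.mean(test ** 2))
    return test * gain

def mismatch_db(a: np.ndarray, b: np.ndarray) -> float:
    """Residual level difference in dB -- aim for well under 0.1 dB."""
    return 10 * np.log10(np.mean(a ** 2) / np.mean(b ** 2))
```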

Also, any particular set of tests needs the following levels of consistency to be ensured:

(a) Listening to the identical same music

(b) Listening via the identical same system

(c) Listening in the identical same room

(d) The identical same listener is listening

(e) The listener is seated in the identical same place.


It seems like editing the test files properly and using an ABX file comparator to perform the test will easily meet all of these requirements.
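For illustration, a bare-bones sketch of what an ABX comparator does on each trial (Python; `play_and_ask` is a hypothetical stand-in for the playback and listener-response step, not part of any actual ABX software):

```python
import random

def play_and_ask(x: str) -> str:
    """Hypothetical playback + listener response; here it just guesses,
    which is what a real listener does when the files are identical."""
    return random.choice("AB")

def run_abx(trials: int = 16) -> int:
    """Skeleton of the ABX protocol: on each trial X is secretly A or B and
    the listener says which; only the final tally is revealed."""
    correct = 0
    for _ in range(trials):
        x = random.choice("AB")          # hidden assignment for this trial
        correct += (play_and_ask(x) == x)
    return correct

print(run_abx(), "/ 16 correct")
```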
 
Just which you preferred.

I have posted it before, but here it is again. First, I ranked by sonic quality; then I moved those with fatigue (and a non-musical character) to the bottom of the ranking.

1.A
2.B
3.E
4.F
5.D
6.C

then that implies you can audibly tell the differences.

😕 What surprises me is why so many cannot tell the differences?? :scratch: As I have often said, hearing differences is very easy; what is difficult is forming a preference, because we need to relate what we hear to knowledge and experience. Spotting a difference takes me only seconds, but detecting fatigue??? I had to listen for an hour to A (and found no fatigue), and half an hour to C (when I felt the fatigue). And what if my "feeling" is not reliable? If I don't trust my feeling, I have to repeat the same procedure several times!

I always confirm there are no obvious issues by using a 100 MHz analogue scope.

Then I still have homework: to understand why I subjectively don't like C and D, which "objectively" have a high-end sound.
 
(1) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they are not tests. That is, they do not involve comparison to a fixed, reliable standard.

Revealing the reference afterwards, as in this test, is even better than presenting it up front.

(3) Audiophile Sighted Casual Evaluations are not admissible as reliable evidence because they do not involve proper level matching, which makes them highly susceptible to false positives because people report the level mismatches as sonic differences.

Agreed, with a little caveat: added distortion will increase the sound level. Does this imply that higher distortion will tend to be preferred (if the listener is not skillful at recognizing distortion, which is actually hard)?
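For scale, a quick check of how much level added distortion actually contributes, treating THD as the ratio of harmonic to fundamental RMS (plain Python, assuming the harmonics are uncorrelated with the fundamental so their powers add):

```python
import math

# If harmonics at a THD ratio d are added to a fundamental, the RMS level
# rises by 10*log10(1 + d**2) dB (powers add for uncorrelated components).
for d in (0.0005, 0.01, 0.10):       # 0.05 %, 1 % and 10 % THD
    print(f"THD {100 * d:5.2f} %: +{10 * math.log10(1 + d * d):.2e} dB")
# Even 10 % THD raises the level only ~0.04 dB, so any 'louder' impression
# is more likely the added harmonics themselves than a true level shift.
```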
 
😕 What surprises me is why so many cannot tell the differences??

Just guessing, but the only people surprised by this are those who:

(1) Don't know how well the test circuits probably measured. This is excusable because I haven't seen any relevant measurements, either.

(2) Don't know about the thresholds of hearing for common distortions and noises.

This is probably excusable because not that many people have studied psychoacoustics deeply enough to know what these are.

:scratch: As I have often said, hearing differences is very easy,

Saying stuff doesn't make it true, but given your probable history with doing good listening tests, saying this is also excusable.

Reality is that almost everybody's beliefs about sound quality are overwhelmingly influenced by the biases that are present during just about any listening test:

(1) Levels aren't matched well enough.
(2) Switching takes too long to hear that, once you address (1), a lot of stuff actually sounds the same.
(3) The music used to audition is not time-synched well enough, so in fact every listening session involves at least slightly different music, which of course sounds different (a sync sketch follows this list).
(4) Enough is known to the listeners about the test by non-audible means that they struggle to base their judgements on "just listening".
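As a sketch of what "time-synched well enough" takes in practice, here is one common way to find the offset between two captures of the same music, via the cross-correlation peak (assuming numpy, equal sample rates, and already-decoded mono arrays):

```python
import numpy as np

def sync_offset(a: np.ndarray, b: np.ndarray, fs: int = 44100) -> int:
    """Locate the sample lag at which the cross-correlation of two captures
    of the same music peaks, so one file can be trimmed to start at the
    same instant as the other."""
    n = min(len(a), len(b))
    corr = np.correlate(a[:n], b[:n], mode="full")   # O(n^2): fine for short
    lag = int(np.argmax(corr)) - (n - 1)             # clips; use FFT methods
    print(f"peak at {lag} samples = {1000 * lag / fs:.2f} ms")  # for long files
    return lag
```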

Bottom line is that, knowing what I know, I can pass just about any audiophile listening test and tell you what A & B are even when A & B are the identical same piece of equipment. Obviously sound quality has nothing to do with it. It's the poor-quality listening-test procedures.

In fact, I just did this with the files in the 6-chip test.

I can pick out one of them in any comparison with the rest of them, 16/16 in a level-matched DBT, in no time and with negligible effort. And that was, by audiophile standards, a relatively well-run test.
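For scale, the odds of scoring 16/16 by guessing alone are easy to compute:

```python
# Chance of guessing 16/16 in a forced-choice ABX (p = 0.5 per trial):
p = 0.5 ** 16
print(f"p = {p:.2e}")   # ~1.5e-05, i.e. 1 in 65536
```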

In science there is a critical test of any scientific apparatus or principle called falsifiability. In audio tests, falsifiability means, among other things, that a UUT will not be found to be different when compared to itself.
 
Between which?

Whatever you are comparing.

In this set of test files the level matching was, IME, very good, but level matching alone is not sufficient.

For example, a switchover time of more than a second or two flummoxes everybody's memory to the point where everything might sound different, because too much about the last sound has been forgotten to form a clear impression of "sounds the same".

There is a goodly list of things you have to do right to hear "Sounds the Same" (when it is true), but it is relatively easy to hear "Sounds Different" even when it is false.
 
Whatever you are comparing.

In this set of test files the level matching was, IME, very good, but level matching alone is not sufficient.

Thanks. The level match between all six original files should be 'perfect' because only the device under test changed.

Today's 'MICROMEGA' file may be detectably different in level simply because of output-stage differences between the MM and the opamp circuit. The MM file should be slightly lower because of a 68 Ω (or thereabouts) internal series resistance adding to the 2k2 + 2k2 divider present in the original setup (and present in the MM file). The omission of the input and output coupling caps will also be detectable as a difference in phase shift at LF. Nothing audible, but those with the file-handling skills will confirm those differences exist.
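A quick sanity check of the expected level offset, using the resistor values stated above (the exact source resistance is, as noted, an approximation):

```python
import math

# Level offset from a ~68 ohm internal series resistance feeding the
# 2k2 + 2k2 divider described above.
r_src, r_top, r_bot = 68.0, 2200.0, 2200.0
plain = r_bot / (r_top + r_bot)              # divider alone: 0.5
with_src = r_bot / (r_src + r_top + r_bot)   # with the MM output resistance
print(f"{20 * math.log10(with_src / plain):.3f} dB")   # about -0.13 dB
```

That ~0.13 dB is consistent with "nothing audible but measurable in the files".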
 
I have now listened to all the files again and compared them directly to both the original CD and the LP.

Speakers are Quad ESL57s and the amp is a Leak Stereo 20.

The Micromega file is closest but still not as good as the CD or LP. C is still my favourite of the others.

In general, it isn't really a matter of 'quality', more how I could relax and enjoy the music rather than the sound. However, the voice on all the digital files seemed to have a false resonance to it that I found fatiguing, and the ringing on some of the treble was considerably dulled.

The LP would still be my choice.
 