John Curl's Blowtorch preamplifier part II

Status
Not open for further replies.
Hi,

OK, Thorsten, thanks for clarifying your position. Are you saying that you have specific criticisms of ABX testing, but other types of double-blind test could be acceptable to you and might yield useful results?

I have specific criticisms of specific tests, promoted as a specific protocol, performed and generally publicised in a manner consistent with the lowest gutter press by a certain, mainly US-based, group (which I call the ABX Mafia) that prominently features Messrs Krueger, Clarke, Nousaine, Lipshitz and others. I feel that these tests are not scientific, that those performing them are not really interested in the truth (or they would long ago have adjusted their tests to counter the criticisms), and that each significant individual criticism suffices to remove any credibility or statistical power.

I also note that their usually sensationalist presentation, in the style of a sideshow ("Watch us make the arch-subjectivist jump through our hoops and rings of fire, and see how we demonstrate that he only imagines there are differences"), has netted them a high profile not warranted by their tests and results, while their critics, by using more formal academic routes of publication, remain obscure, as do those who actually carry out real research on the subject matter.

To me any test is acceptable that demonstrates:

1) That it eliminates bias (this usually means subjects must be ignorant of the nature of the supposed difference being tested).

2) That it minimises test stress and accommodates and compensates for attention-span issues.

3) That it provides adequate positive and negative controls, including demonstrating the ability to resolve known audible differences before attempting to assess unknowns.

4) That it uses data-collection methods which maximise the available data, so that sophisticated statistical analysis is possible.

5) That it uses statistical analyses appropriate to the sample size and the nature of the difference, and provides a sufficiently equal risk of Type I (alpha) and Type II (beta) statistical errors.

6) That it provides full disclosure of the test details and the resulting data, and limits any conclusions to those which the actual data can support.
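As a hedged illustration of point 5 (my sketch, not from the thread): exact binomial arithmetic shows how a small ABX run that controls the Type I (alpha) risk can leave an enormous Type II (beta) risk. The listener sensitivity `p_true = 0.7` is an assumed figure for someone who genuinely hears a difference.

```python
# Hypothetical sketch (not from the thread): exact binomial arithmetic for an
# ABX run. p_true is an assumed sensitivity for a listener who really does
# hear a difference; p = 0.5 is the guessing null hypothesis.
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def abx_errors(n, p_true, alpha=0.05):
    """Pass criterion and Type II risk for an n-trial ABX test at significance alpha."""
    # Smallest number of correct answers keeping the Type I risk <= alpha
    # under the guessing hypothesis (p = 0.5).
    k_crit = next(k for k in range(n + 1) if binom_tail(n, k, 0.5) <= alpha)
    beta = 1 - binom_tail(n, k_crit, p_true)  # chance a real listener still "fails"
    return k_crit, beta

for n in (10, 16, 50):
    k, beta = abx_errors(n, p_true=0.7)
    print(f"n={n:3d}: need >= {k:2d} correct; Type II risk ~ {beta:.2f}")
```

With only 10 trials the criterion is 9 of 10 correct, so a listener who is right 70% of the time fails roughly 85% of such runs: exactly the alpha/beta imbalance point 5 complains about.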

Sadly, I am unaware of any published blind tests that are above criticism in all six areas AND test what some call "Audiophile Nonsense".

However, I would be happy if anyone could make me aware of published tests that fulfil the above criteria and address issues of interest to high-quality audio.

Ciao T

PS: no need to point out Mr Ackerman's thesis research project to me; I am aware of it and will note that it fulfils most of the above stipulations, though it seems to lack controls...
 
<snip>The conclusion must be that controlled listening tests don't work so we should instead use uncontrolled listening tests? Concentration on the task in hand renders that task impossible, so to help listeners hear better we make them aware of price/styling/brand/design/designer and we can be confident that this extra information won't bias the results?<snip>

That would be jumping to conclusions. :)

The most plausible conclusion is that the design of a controlled listening test isn't as easy as most people seem to think.
The basic requirements for every experiment (it doesn't matter whether the experiment is a listening test or something else) are that it has to be objective, valid and reliable.

In the case of a listening test, the experimenter has to deal with a greater amount of uncertainty, due to the human detector.

The more artificial the conditions of a listening test, the bigger any impact on the performance of the detectors might be.
That point limits the usability of ABX testing in cases where listeners are not hunting for a unique sort of artefact but rather for a general distinction in reproduction quality.
Nevertheless it seems, according to the literature, possible to overcome these limitations if people are used to working with the ABX protocol.

An A/B test, preferred by Thorsten for example, is to a certain degree more similar to the normal listening routine people use when trying to find out what they like more.
If used as a discrimination test, it needs more trials than the ABX, as it is normally a two-sided test.
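A minimal sketch of that last point (my numbers, not from the post): at the same nominal alpha, a two-sided test may only spend alpha/2 in each tail, so for a given number of trials it demands more correct answers, which in turn costs power or extra trials.

```python
# Illustrative sketch: minimum correct answers out of n trials for
# significance at level alpha, one-sided vs two-sided binomial criterion.
from math import comb

def tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def criterion(n, alpha, two_sided=False):
    # A two-sided test halves the alpha available to each tail.
    a = alpha / 2 if two_sided else alpha
    return next(k for k in range(n + 1) if tail(n, k) <= a)

n = 16
print(criterion(n, 0.05))                  # one-sided (ABX-style): 12 of 16
print(criterion(n, 0.05, two_sided=True))  # two-sided (preference): 13 of 16
```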

But in every listening test the experimenter has to use positive controls to ensure that the listeners reach a sufficient level of sensitivity for the task. And he has to use negative controls to ensure that a detected difference was due to the EUT (equipment under test) and not to some other mechanism.
 
ThorstenL said:
1) That it eliminates bias (this usually means subjects must be ignorant of the nature of the supposed difference being tested).
For this to work you would have to ensure that the supposed difference being tested is the only difference which could be present, otherwise people might notice the 'wrong' thing. I'm not sure how you do this. For example, how would you test for distortion while guaranteeing identical frequency response? You could certainly never do this with real amplifiers, because any two real amplifiers will differ in lots of ways. You could only use specially-contrived test amplifiers, which may or may not be sufficiently similar to real amplifiers for the results to be useful.

Jakob2 said:
But in every listening test the experimenter has to use positive controls to ensure that the listeners reach a sufficient level of sensitivity for the task
Why? Everyone says this, but is it not also useful to know that (say) only 5% of the population can reliably detect a particular change? If you only test the 5%, then you might conclude that everyone can detect the change. If you filter your initial sample of test subjects using a different test, then you might have excluded people who can detect one thing but not the other. It seems to me that filtering people before the real test begins is a guaranteed way to get false results.
 
Hi,

For this to work you would have to ensure that the supposed difference being tested is the only difference which could be present, otherwise people might notice the 'wrong' thing. I'm not sure how you do this.

I agree, it is a problem.

Many of the blind tests I do tend to be about details of products, like different brands of capacitors of various values. I tend first to run the gear through a set of AP2 tests to make sure nothing is broken and that level differences are low enough not to cause trouble.

For example, how would you test for distortion while guaranteeing identical frequency response? You could certainly never do this with real amplifiers, because any two real amplifiers will differ in lots of ways. You could only use specially-contrived test amplifiers, which may or may not be sufficiently similar to real amplifiers for the results to be useful.

If I were (for argument's sake) testing an SE tube amp against a solid-state amp, I would probably add a suitable complex impedance, built from high-quality parts, to the output of the solid-state amplifier to give both options the same frequency response.

I have indeed done so on at least one occasion, where a potential customer's tube amplifier, which he had brought for comparison, produced excessive bass due to its high output impedance. Once I had artificially increased the output impedance of our own amplifier to a close match, both amplifiers showed essentially the same frequency response. I then adjusted the levels to within less than 0.1 dB of each other. This allowed a fair comparison of both amplifiers.
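For what it's worth, the level-matching step can be sketched numerically (the voltages below are invented for illustration, not Thorsten's actual measurements): the gain mismatch in dB between two measured output levels, which the post says was trimmed to under 0.1 dB.

```python
# Small numeric sketch of the level-matching check; the voltage figures
# are made up for illustration.
from math import log10

def level_difference_db(v_a, v_b):
    """Absolute gain mismatch in dB between two measured output voltages."""
    return abs(20 * log10(v_a / v_b))

# e.g. 2.000 V vs 1.995 V at the two outputs for the same input signal
print(f"{level_difference_db(2.000, 1.995):.3f} dB")  # ~0.022 dB, inside the 0.1 dB target
```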

Of course, it will still be very difficult to deliberately test a single variable.

I have the advantage that I am not out to prove a difference or its absence, but only to find out what my potential customers PREFER, provided they hear a difference at all (sometimes they do not).

Ciao T
 
<snip>
Why? Everyone says this, but is it not also useful to know that (say) only 5% of the population can reliably detect a particular change? If you only test the 5%, then you might conclude that everyone can detect the change. If you filter your initial sample of test subjects using a different test, then you might have excluded people who can detect one thing but not the other. It seems to me that filtering people before the real test begins is a guaranteed way to get false results.

The validity of a test is closely related to the question (or effect) that is to be evaluated. Normally there is an operationalisation period in the development of every experiment, during which the questions are expressed and, step by step, transformed into a test procedure.

And of course you're right: it makes a difference whether you are asking "is at least one listener able to detect a difference?" or "what percentage of the population is able to detect a difference?".

Our concern is normally the former question (due to the enormous effort of taking enough samples to conclude something about the whole population), so the test sample sizes are normally quite small.
It would therefore simply not be possible to generalise from such a small sample to the whole population, and hopefully during the statistical analysis and conclusion a statistician would tell the experimenter about this.

It has been some time since I last calculated the sample size needed for representative measures of the whole population (still not worldwide), but my estimate would be in the region of 10,000-30,000 samples, depending on the accuracy needed, and the people in this sample would have to be representative of the whole population.
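That estimate can be cross-checked with the standard textbook formula for estimating a population proportion, n = z^2 * p(1-p) / e^2, using the conservative worst case p = 0.5 (my sketch with illustrative margins; the post itself gives no calculation):

```python
# Back-of-envelope cross-check (standard proportion-estimation formula):
# respondents needed to estimate a population proportion within +/- margin
# at 95% confidence (z = 1.96), using the worst-case variance at p = 0.5.
def sample_size(margin, z=1.96, p=0.5):
    return round(z * z * p * (1 - p) / (margin * margin))

print(sample_size(0.01))   # +/- 1 percentage point: roughly 9,600 people
print(sample_size(0.005))  # +/- 0.5 points: roughly 38,000 people
```

The +/- 1 point case lands right at the lower end of the 10,000-30,000 region quoted above, before any allowance for stratifying the sample to make it representative.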

So we normally use keen listeners for tests, and use positive controls to ensure that they are still keen listeners under the specific test conditions, which is not a given.
 
Hi Thorsten,

To me any test is acceptable that demonstrates:

Personally, I have no interest in tests themselves, tests for the sake of proving or disproving anything. My sole interest is the sound quality of my own sound setup.

However, when it comes to my ability to note differences in sound, or to evaluate a change in sound quality from a certain change in the setup, I can do it only on a sound setup that I am well familiar with: either my own setup, or another setup I have heard many times and whose overall sound I like. When I hear a setup that I am not familiar with, the "sound stamp" of that setup overshadows differences; the differences become very small to my ears compared to the overall "sound stamp" of that setup.

This is much more prominent when I don't much like the overall sound of that setup. In such a case, changing a certain component, like a CD player, will be meaningless to me, since I wouldn't like the overall sound either with the first CDP or with the second one. Hence, I wouldn't be able to say which one I evaluate as "better". More than that, I may have so much discomfort with the overall sound that I may not note any difference at all between two sources, or two cables, or whatever.

Possibly this happens to other people as well. Hence, possibly, this is a major obstacle in any test, however scientific the procedure may be.
 
Of course, you are correct, JoshuaG. What are we in this for? It is difficult to hear minor differences, in most cases, without FORGIVING the rest of your playback system. I have FORGIVEN my present set-up and it sounds good to great to me; however, I have not knocked over any of my audiophile friends with it. It is the same, in general, when I listen to their systems. I can always point out problems, but to what avail?
In my opinion, you have to get your audio playback system up to a certain technical and listening-approved level to do meaningful comparisons.
This is often where Dr. L and his associates fail. They start with the concept that virtually everything sounds the same, so adding a 'virtually perfect' addition to the component under test is perfectly OK. I remember a test done over 30 years ago by Drs. L and V, where they ADDED a DYNA equalizer in series with the DUT, a Walt Jung-designed phono preamp. They did this to make the RIAA LESS accurate, in order to track their 'reference' preamp better. Guess what? They could not hear any difference. Now, what were they listening to? The DYNA or the Jung phono?
They stated plainly in their article that the DYNA was virtually flawless in reproduction.
This is science? And it was done by a couple of PhDs. So much for a heavy education in math and a light one in audio engineering. And so it goes!
 
This is often where Dr. L and his associates fail. They start with the concept that virtually everything sounds the same, so adding a 'virtually perfect' addition to the component under test is perfectly OK. I remember a test done over 30 years ago by Drs. L and V, where they ADDED a DYNA equalizer in series with the DUT, a Walt Jung-designed phono preamp. They did this to make the RIAA LESS accurate, in order to track their 'reference' preamp better. Guess what? They could not hear any difference. Now, what were they listening to? The DYNA or the Jung phono?
They stated plainly in their article that the DYNA was virtually flawless in reproduction.

This is science? And it was done by a couple of PhDs. So much for a heavy education in math and a light one in audio engineering. And so it goes!

Ph.D.s who have forgotten their research-design classes, if such are taught in engineering and/or math grad school.

On to saner things: I am personally glad to see you involved here as you're one of my all-time audio heroes.

Have you made any changes to your JC-1 pre-amp / phono-amp designs? Is there a place where I could find and download the schematics? It's been a number of years since I've warmed up the soldering iron, but I'm nearing retirement and would like to start back into the DIY audio hobby. I usually stick to hollow-state electronics, but your designs seem to have withstood the test of time if I'm reading many of the messages here correctly.
 
Bubba, are you talking about the Levinson JC-1, designed 39 years ago?
Sorry, I personally do not release schematics, so you should search around here for good ideas.

Yes, the JC-1 or any of its progeny. I've seen schematics here on diyAudio, but I always prefer to get them from the source. You never know what will be parading around as "the real thing."

At this point I'm looking for high-quality, relatively straightforward solid-state phono and line-amp designs to build. I normally use 12AX7s, but living in the southeastern US has its drawbacks, usually beginning about mid-to-end of March. I wish there were a great solid-state power amp that WASN'T Class A (and therefore a space heater, too).

Best wishes, and thanks for the reply.
 
Darth----, don't lose hope; there are a number of good, buildable designs on this website. It will just take a little time for you to find them. Perhaps some others here can give you some hints. My designs, developed over the last 30 years, are difficult for just about everybody to build, and that is why I discourage people from trying to make them unless they are fully prepared and have a lot of experience.
 
Can you point me to where Prof. Lipshitz has said this? I've read most of his published papers and somehow missed that remarkable assertion.
Ha, the resident lawyer at work. Of course, they do not say it in such clear words. However, I and many others on this earth live in a world where you do not need to say things openly; it suffices to say something in a roundabout way. The only people with whom you cannot communicate in this manner are .... well, I leave this to the reader.
 
I ask in order to understand where these odd ideas are coming from. I'm always surprised when people falsely attribute positions to others that they never took. In this case, talking about someone with a long publication record on important issues in audio and a great deal of work on the audibility of many phenomena, this is particularly head-scratching.
 
My 'odd ideas' come from 'The Audio Amateur' from 1978-1980, when I actually had letters-to-the-editor interchanges with Drs. L and V. Unfortunately, I am temporarily missing a few critical issues, so I cannot answer a direct challenge at the moment, but I should find them soon, and then we will get down to the true intent and 'lack of engineering precision' (my opinion) shown in that time period, which was the beginning of DBX testing.
 
Darth----, don't lose hope; there are a number of good, buildable designs on this website. It will just take a little time for you to find them. Perhaps some others here can give you some hints. My designs, developed over the last 30 years, are difficult for just about everybody to build, and that is why I discourage people from trying to make them unless they are fully prepared and have a lot of experience.

Well, I began by fixing my Dad's dead Heathkit preamp and mono amps when I was about 16 years old (back in the mid-'70s). Built the mandatory Hafler DH-200 and DH-101 when they came out. Started fabricating equipment from just the schematics, solid state and tube. Had a small audio repair business while in college. Wandered away for a few years while in grad school (Not! EE). Got back into it just a short while ago, now that the kids are grown and gone and no grandkids are around to stick their fingers into places where high voltage lives, not yet anyway. So, I've got a ton of raw components and a few unfinished projects to do SOMETHING with. :)

That dusty old pile of Audio Amateur and Glass Audio magazines is calling to me...
 
That still doesn't answer the question. Perhaps it's a language issue?

I'd say it was an implicit answer. :)

Obviously the listening tests of, for example, compression algorithms were mostly done according to ITU-R BS.1116. This recommendation relies on a trained expert-listener panel, which means that the participants should have at least half a day for training and half a day for the test.

OTOH, when people were willing in the past to do some listening tests with proper training, they were usually successful in the test; see for example Mike Fremer, the Swedish guys who did a test on CD players, or various tests from me and my colleagues on capacitors, CD players and amplifiers.

Another way is to have a bigger sample size; see for example the Stereophile amplifier listening session, the PCM/DSD comparison, the German Stereoplay magazine's test of vinyl vs. an A/D conversion of that vinyl, or the cable test from Olaf Stum.

The third way is to hide the test from the participants and simply let them evaluate two devices as they normally would, something we did with preamplifiers.
 