John Curl's Blowtorch preamplifier part III

Ego corrupts. Money corrupts. We can often see ego on here. We can't always see the money, which is why we sometimes ask.
But I've never seen anybody on here suggest that if you are negatively biased against there being an audible difference between amplifiers (or DACs, cables, whatever), then you are likely not a suitable listener for ABX testing of amplifiers, as you will be biased towards a null result - have you?

What if people don't know about, or won't admit to, their negative bias - how do you discover it? Should their results be accepted as fair & unbiased just because they are doing a DBT?

An attempt at humour?

For sure, the results of a sighted "listening test" must be far "better" (but for whom??).. And the "listener" can even be deaf.:rolleyes:
Such a "test" is ideal for people who need to get the "right" results, usually for commercial purposes..

Have you got some acceptable evidence that the results from 'sighted' listening, à la how Hoyt described it, are more wrong than the results from ABX listening?
 
...Have you got some acceptable evidence that the results from 'sighted' listening are more wrong than the results from ABX listening?

In the Clark Amp Challenge, intentionally associating sighted tests with the wrong amp resulted in a close to 100% positive correlation between sight and result, which statistically was a wrong correlation with the actual sound. I will repeat, this was not done to spite the person being tested; it was done to determine the effect of sighted associations on test results, and was performed following differential threshold setting via ABX testing.

As I have stated, I have issues with ABX as well, which does indeed often give more null (or wrong) results than one would expect given the differences being compared, especially with listeners unpracticed in audio testing. The sensitivity of positive results with ABX is proportional to training and repetition.

So to answer your question, it is apples and oranges. Sighted testing tests the listener's ability to preferentially ignore visual information, and concentrate on aural. ABX testing has so many variants, but in general tests the listener's aural memory acuity and training. For casual tests ABX may be problematic, but at least there is not a built-in positive bias.

Cheers,
Howie
 
In the Clark Amp Challenge, intentionally associating sighted tests with the wrong amp resulted in a close to 100% positive correlation between sight and result, which statistically was a wrong correlation with the actual sound. I will repeat, this was not done to spite the person being tested; it was done to determine the effect of sighted associations on test results, and was performed following differential threshold setting via ABX testing.
I don't know the details of this, but I'm talking about relaxed listening in many sessions over a long time period of days or weeks, as opposed to one-shot listening. I would note that just the word "Challenge" in the name of that 'test' introduces a bias.

As I have stated, I have issues with ABX as well, which does indeed often give more null (or wrong) results than one would expect given the differences being compared, especially with listeners unpracticed in audio testing. The sensitivity of positive results with ABX is proportional to training and repetition.

So to answer your question, it is apples and oranges. Sighted testing tests the listener's ability to preferentially ignore visual information, and concentrate on aural. ABX testing has so many variants, but in general tests the listener's aural memory acuity and training. For casual tests ABX may be problematic, but at least there is not a built-in positive bias.

Cheers,
Howie
Mostly agreed - ABX testing sensitivity is reliant on reducing many biasing factors, not least of which is the "challenge" & unnatural nature of the listening, which, as you say, requires participant training & the ability to deal with repetition.

Yes, it is apples & oranges in the sense that one type of listening is biased towards false positives & the other is biased towards false negatives - both of which are incorrect/wrong results. That's why I phrased my question as "Have you got some acceptable evidence that the results from 'sighted' listening, à la how Hoyt described it, are more wrong than the results from ABX listening?"

The point is that we have no overall statistics as to how many false positives 'sighted listening' produces vs. how many false negatives ABX produces - the assumption is usually made that all the ABX null results are correct rather than false negatives, & that therefore the audible differences reported in 'sighted listening' are incorrect.

If we wanted to compare the effectiveness/suitability of these two listening approaches, then something like this macro consideration is required rather than just making the above assumptions.
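
To make that "macro consideration" concrete, here is a minimal Python sketch of how the two error types could be tallied on the same footing. Every rate in it is a hypothetical placeholder - nobody has these statistics, which is exactly the point - so the numbers it prints mean nothing in themselves; it only illustrates the kind of accounting that would be needed.

Code:
import random

random.seed(1)

N = 10_000                   # simulated listening comparisons
P_REAL_DIFFERENCE = 0.3      # hypothetical share of comparisons with a real audible difference
P_SIGHTED_SAYS_YES = 0.9     # hypothetical chance a sighted listener reports a difference anyway
P_ABX_DETECTS = 0.5          # hypothetical chance ABX detects a difference that really exists

sighted_false_pos = 0
abx_false_neg = 0
for _ in range(N):
    real = random.random() < P_REAL_DIFFERENCE
    if not real and random.random() < P_SIGHTED_SAYS_YES:
        sighted_false_pos += 1   # "heard" a difference that is not there
    if real and random.random() > P_ABX_DETECTS:
        abx_false_neg += 1       # missed a difference that is there

print(f"sighted false positives: {sighted_false_pos} of {N}")
print(f"ABX false negatives:     {abx_false_neg} of {N}")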
 
I meant, "Agree it does not work for cancer," but seem to have left off the first bit. Sorry for the confusion.

The point I was trying to make is that all sorts of things can be used to treat diseases: exercise, eating habits, yoga (yes, doctors prescribe yoga), and they use placebo all the time to treat patients. To say that anything that doesn't involve prescription drugs or radiation isn't treatment of disease would be silly.

The brain is one organ in the body that interacts in complex ways with various other systems. We don't yet fully understand most of those things, if any of them at all in complete detail. That we don't understand mechanisms doesn't mean they don't exist.

It is an old and obsolete idea in medicine at least to think of the mind as separate from the rest of the body. I won't get into prohibited subject areas where separation beliefs may be taken further.

You are right about DC. Quack.

No worries, but you did leave me scratching my head. :)

I'm not in any way trying to discount the importance of treating the whole person (it's incredibly important!), just trying to separate placebo from medicine. There are tons of non-drug, non-surgical interventions that are incredibly helpful to a person's overall well-being (and physical health). You guys don't have to deal with me harping on diet, exercise, and sleep all the time. (Hypocritical about the latter though I may be)
 
If someone says he clearly hears a difference in a sighted test and he was listening to two bit identical files, then it is a false positive result produced by the sighted test.
Perhaps, but there can be other physical factors besides bit patterns that may not be the same between listening sessions & could result in audible differences. As I said, spot listening is one of the many pitfalls of ABX listening, & it's very different to relaxed listening to many different tracks/genres spread over many days/weeks, at many different times of day & in many different moods - as Jakob2 said, this averages out many factors which can influence a spot listening test.

It's probably difficult to evaluate the number of false positives that result from sighted listening without introducing some other technique for evaluating this, but I'm suggesting that a null result from ABX does not necessarily mean the 'sighted listening' was wrong - it could be that the ABX is wrong. ABX should not be the yardstick by which 'sighted listening' is judged, as is often the case.

How do I know that the ABX has produced a false negative result?
The use of some form of hidden control, as recommended in the ITU guidelines, is one possibility: it ascertains whether the participant is sensitive to a certain level of known audible difference. For instance, in one or more of the 16 trials of an ABX test, B could be an exact copy of A except adjusted in level by 1dB (or whatever is deemed appropriate); if X is then not identified correctly as A or B, we have an indication of a false negative for this particular type of difference - does it generalise to a lack of sensitivity to other small impairments? More than one trial & more than one run of ABX would be needed to evaluate the sensitivity of the participant/test to small impairments. A level offset is just one suggestion, easy to implement, to act as a hidden control - other differences are possible & other approaches have been suggested in the past.
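
For illustration, a minimal sketch of how such a hidden-control stimulus could be generated, assuming WAV files and the Python numpy/soundfile packages; the file names and the 1dB figure are placeholders, not a prescribed implementation.

Code:
import numpy as np
import soundfile as sf

data, rate = sf.read("stimulus_A.wav")      # presentation A (placeholder file name)
gain_db = -1.0                              # known, nominally audible level offset
control = data * 10 ** (gain_db / 20.0)     # copy of A attenuated by 1 dB
sf.write("stimulus_B_control.wav", control, rate)

# In the control trial X is drawn from {A, B_control}; a listener who cannot
# identify X here gives an indication of insensitivity to a 1 dB difference,
# which helps calibrate how much weight the other null results should carry.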
 
Really? Why is that?

Thinking out loud, I guess it is because each and every individual has a life and history that leads up to the very moment of the ABX test, and that in itself brings all of the psychological baggage and inner psychic chatter into the listening test environment.

To filter out all that 'noise' and to achieve an observable and repeatable objective result would take either a considerable number of untrained participants, or a rigorously select few of highly trained participants. Both approaches have their advantages and disadvantages.

The main disadvantage is that there are relatively few individuals out there who are interested at all. They like the music, and don't really care all that much how it gets into the ear canal.
 
^ The food guys manage to train folks fairly quickly to be at least decent, some/a lot of it coming in the form of "dry running the protocols" to gain familiarity with the selections and how the test will actually proceed, which goes a long way towards removing test anxiety (this goes for any protocol). On the other hand, there is still the mental burden of the test itself.
 
What happened? Did my 'Coke' test reference confuse you? What I was relating to is how the managers at Coca-Cola decided, through some sort of tests, to change 'old Coke' to 'New Coke' and then met a great deal of resistance from the public who actually liked to drink the stuff. So they switched back to something close to 'old Coke', called it 'Coke Classic', and quietly phased out 'New Coke' for all intents and purposes. Of course, they could not admit their mistake, so they double blind tested the major advocate for getting the 'old Coke' back and PROVED that that guy could not tell the difference, by getting a null in the test. Some test huh?
 
The use of some form of hidden control, as recommended in the ITU guidelines, is one possibility: it ascertains whether the participant is sensitive to a certain level of known audible difference. For instance, in one or more of the 16 trials of an ABX test, B could be an exact copy of A except adjusted in level by 1dB (or whatever is deemed appropriate); if X is then not identified correctly as A or B, we have an indication of a false negative for this particular type of difference - does it generalise to a lack of sensitivity to other small impairments? More than one trial & more than one run of ABX would be needed to evaluate the sensitivity of the participant/test to small impairments. A level offset is just one suggestion, easy to implement, to act as a hidden control - other differences are possible & other approaches have been suggested in the past.

In our offline tests as they are suggested here (like my tests in the "Everything Else" forum), the participants have a free choice of

- volume level used
- number of trials used
- length of time interval used, one test may take 24 hours or more if the participant likes,
- number of trial repetitions
- sound system used that is familiar to the participant, headphones or speakers

So, IMO, all your suggestions are fulfilled: the participant has an absolutely free choice of almost anything, with no stress, and he does not have to go somewhere and follow a test protocol in an unknown place with unknown people.

So, taking into account this almost absolute freedom of choice and lack of pressure, why are there almost no positive ABX results in tests like a wire vs. a tube preamp with 0.5 - 1% distortion, or even when comparing the original file vs. a recording made through a hybrid power amp with the same distortion, loaded with speakers? Is it because of "stress", or because the sound differences are at the threshold of hearing resolution, once hearing is cleaned of sighted biases like looks, brand, a good high-end story etc.?
 
@PMA, have you ever tried to test how sensitive your test is by embedding hidden controls in it, as I suggested? Until you do, you will never know if it's the test itself that is suppressing people's ability to discern differences.

Btw, the first two of your choices are not free
- if people use too high a volume, differences can be audible which would be inaudible at normal listening levels
- 16 trials is the normally accepted min number for statistical significance
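
For anyone who wants to check the arithmetic behind that minimum, here is a quick standard-library Python calculation of the one-sided tail probability under the pure-guessing null:

Code:
from math import comb

def tail_p(k, n):
    # probability of k or more correct out of n trials under pure guessing (p = 0.5)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

for k in range(10, 17):
    print(f"{k}/16 correct: p = {tail_p(k, 16):.4f}")

# 12/16 is the lowest score with p < 0.05 (about 0.038); 11/16 is not (about 0.105).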
 
@PMA, have you ever tried to test how sensitive your test is by embedding hidden controls in it, as I suggested? Until you do, you will never know if it's the test itself that is suppressing people's ability to discern differences.

Btw, the first two of your choices are not free
- if people use too high a volume, differences can be audible which would be inaudible at normal listening levels
- 16 trials is the normally accepted min number for statistical significance

I agree on both. Too high a volume can pick up noise-only related issues. 16 trials at least, agreed.
Because of the possible "too high volume" issue, I constructed both of my latest tests to be immune to this manipulation. This is done by the choice of sample music, by cutting the no-signal beginning and ending of the files, and by further tests and file comparison.
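
As an illustration only - this is a guess at one way the trimming could be done, not PMA's actual script - a short Python sketch using numpy/soundfile, with placeholder file names and a placeholder -60 dBFS threshold:

Code:
import numpy as np
import soundfile as sf

data, rate = sf.read("test_file.wav")                 # placeholder file name
mono = data if data.ndim == 1 else data.mean(axis=1)  # fold to mono just for the level check

threshold = 10 ** (-60 / 20)                          # treat anything below -60 dBFS as "no signal"
active = np.flatnonzero(np.abs(mono) > threshold)
trimmed = data[active[0]:active[-1] + 1] if active.size else data

sf.write("test_file_trimmed.wav", trimmed, rate)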

Regarding testing the sensitivity via hidden controls, IME I did what I could.
 