Can you tell original file from tube amp record? - test

Which file is the original and which do you prefer

  • Apricot is the original file
    Votes: 7 (46.7%)
  • Avocado is the original file
    Votes: 5 (33.3%)
  • I prefer Apricot by listening
    Votes: 7 (46.7%)
  • I prefer Avocado by listening
    Votes: 7 (46.7%)
  • Total voters: 15
  • Poll closed.
No, it's not the messenger who is being attacked. It's the lack of any good evidence or support for claims. If you are just blowing smoke, you'll be challenged on it. Rightfully so.

You simply object to online ABX tests. Fine, we understand that. Please stop trying to embellish that objection.

I object to these online ABX tests, yes, & for very good reasons. As I said already, all the evidence & research backing this up was laid out in other threads by both Jakob2 & me, and like SW, I'm not going to go back & rehash it.

If you have a problem with someone showing the errors in your 'tests' then you should stop the many objections to technical tests posted on this forum - just to mention one current one, I see many objections on the JC thread about the Bybee tests & reported results.

The problem seems to be that listening tests are perceptual tests & not many here seem to know (some really don't care to know) how to conduct such tests in order to achieve results of value. If pointing out the issues & the research is a problem, then just state that openly.
 
p > 0.05 😉 (I'm being a pedant, so please read that in a lighthearted manner)

In any case, a LOT of academics are moving away from this being a reliable marker unto itself, as it lets a lot of leakage through. (unless it's supplemented with orthogonal tests, positive/negative controls, etc.)

Of course tests like Pavel's here (and I apologize for not taking the time to do the test myself) are highly problematic, because so much of the test is dictated by the end-user. That's obviously nothing PMA or anyone else can necessarily control. I would similarly be reluctant to combine trials, but would also say that it's going to err towards false negative rather than false positive. (unless there was some perverse file-specific playback effect)

Totally agree with this - pretty much what I have been saying. The only way I can see to remove the great problem of relying on the end user being studiously rigorous by following a script (PMA would have liked users to test their systems, AFAIK) is to include hidden controls in such online ABX tests. I've suggested one such way to achieve this, but I note that the ABX test utilities in Foobar & Lacinato ABX never even considered implementing such hidden controls, even though it would have been simple to do. Such hidden controls would be a far more useful 'test of the test' than the sine wave test that was done here after the main listening test. They would have allowed the results to be evaluated in light of how many times the differences in the hidden controls were missed during the 'test', i.e. how skewed that particular test (end user, playback system, environment, etc.) was towards false negative results.
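For what it's worth, the hidden-control idea can be sketched in a few lines - a hypothetical illustration only (the trial counts and the 20% threshold are my assumptions, not numbers from this test):

```python
# Sketch: using hidden positive controls to qualify an ABX result.
# Positive controls are trials with a known-audible difference inserted;
# the fraction missed estimates how prone this listener/system/session
# is to false negatives. All numbers below are hypothetical.

def control_miss_rate(controls_presented: int, controls_detected: int) -> float:
    """Fraction of known-audible differences the listener failed to hear."""
    return 1.0 - controls_detected / controls_presented

miss = control_miss_rate(controls_presented=10, controls_detected=6)
print(f"hidden-control miss rate: {miss:.0%}")

# A simple qualification rule: only treat a null ABX result as evidence
# of inaudibility if the setup caught most of the hidden controls.
if miss > 0.2:
    print("test setup too insensitive - null result is uninformative")
else:
    print("setup passed the control check - null result carries weight")
```

The point is that the miss rate on controls gives a per-session estimate of how skewed the setup is towards false negatives, which is exactly the 'test of the test' described above.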
 
The thread was closed at least a month ago, I'm not going to waste time looking for it.

I asked for a link to any conclusive equipment tests that follow an approved protocol; you continue to say I ask the same thing over and over, and then just refer again to the same articles. I gather there aren't any?

I read the ITU spec; there was no appendix containing examples and results. I am assuming that it is necessary and sufficient and that your other articles don't discredit it.

Yes, the ITU guidelines are exactly that: guidelines for how to conduct certain types of blind tests. But they have to be understood & translated into the procedures necessary for other types of blind tests, such as ABX tests.
 
p > 0.05 😉 (I'm being a pedant, so please read that in a lighthearted manner)

Nothing wrong with being pedantic in this, but you missed that it is the _alternative hypothesis_, which is in this case H1: p > 0.5.
It is not the significance criterion, like SL = 0.05.
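The distinction being corrected here (alternative hypothesis vs. significance level) can be made concrete with a minimal exact binomial test for an ABX run - a sketch with hypothetical trial numbers, not the thread's actual data:

```python
# For an ABX test with n trials and k correct identifications:
#   H0: p = 0.5   (guessing)
#   H1: p > 0.5   (genuine discrimination)
# The significance level (SL, often alpha = 0.05) is the decision
# threshold - a separate thing from the hypotheses themselves.
from math import comb

def binom_p_value(k: int, n: int) -> float:
    """One-sided exact p-value: P(X >= k) under H0: p = 0.5."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Hypothetical example: 12 correct out of 16 trials.
p = binom_p_value(12, 16)
print(f"p = {p:.4f}")   # then compare against SL = 0.05
```

Note that the p-value is computed from the data under H0, while SL is fixed before the test; conflating the two is the very confusion the post above points out.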

In any case, a LOT of academics are moving away from this being a reliable marker unto itself, as it lets a lot of leakage through. (unless it's supplemented with orthogonal tests, positive/negative controls, etc.)

Fortunately the NHST ritual is slowly but steadily changing; although SL = 0.05 is a useful pointer that something is happening, it should never be used as a hard threshold criterion for deciding whether something is "significant" or "not significant" - hence the mention of the needed replications.


<snip> I would similarly be reluctant to combine trials, but would also say that it's going to err towards false negative rather than false positive. (unless there was some perverse file-specific playback effect)

PMA raised the question in another context and I tried to explain under which conditions it is justified to combine results and when it is not.

As he still thinks it is a matter of "different thinking" I obviously failed miserably in my explanation attempt, but I'll leave it there.

That it is not a valid approach to combine results from totally different tests/experiments (with zero knowledge about the actual detection abilities and confounding variables), I've already explained at length in several posts in a discussion with DF96.

Btw, I already explained back then why it is, on the other hand, a valid approach to combine results under the assumption that the null hypothesis is true (for analysis purposes) - with similar "success" 😱
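The distinction between naive pooling and a defensible combination can be sketched as follows - all listener counts here are hypothetical, not the thread's results. Adding raw counts across listeners silently assumes every listener has the same true ability; Fisher's method instead combines the individual p-values, assuming only that each test is valid under its own null hypothesis:

```python
# Sketch: combining ABX results across listeners/tests.
from math import comb, exp, log

def binom_p(k: int, n: int) -> float:
    # One-sided exact p-value under H0: p = 0.5.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def fisher_combined(p_values):
    """Fisher's method: X = -2*sum(ln p_i) ~ chi^2 with 2k df under H0."""
    x = -2.0 * sum(log(p) for p in p_values)
    k = len(p_values)
    # Closed-form survival function for chi^2 with even df = 2k:
    # sf = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, sf = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2) / i
        sf += term
    return exp(-x / 2) * sf

# Hypothetical results from three separate listeners:
results = [(9, 12), (7, 12), (10, 12)]
p_each = [binom_p(k, n) for k, n in results]
print("individual p-values:", [round(p, 3) for p in p_each])
print("Fisher combined    :", round(fisher_combined(p_each), 4))
# Naive pooling (26/36 correct) assumes identical listeners throughout:
print("naively pooled     :", round(binom_p(26, 36), 4))
```

The two combined figures generally differ, and only the pooled one depends on the strong (and here unverifiable) assumption that all listeners, systems, and conditions are interchangeable - which is the objection made above.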
 
Yes, the ITU guidelines are exactly that guidelines for how to conduct certain types of blind tests but they have to be understood & translated into the procedures necessary for other types of blind tests, such as ABX tests

I missed it too when reading Scott's post the first time, but he is indeed asking for an example of an actual test _done_ (on whatever equipment or effect) where you (and/or I?) were really satisfied with the chosen protocol and (I assume) the execution.
 
PMA raised the question in another context and I tried to explain under which conditions it is justified to combine results and when it is not.

As he still thinks it is a matter of "different thinking" I obviously failed miserably in my explanation attempt, but I'll leave it there.

That it is not a valid approach to combine results from totally different tests/experiments (with zero knowledge about the actual detection abilities and confounding variables), I've already explained at length in several posts in a discussion with DF96.

This seems to be a nice misinterpretation of my posts. I was wondering whether it could be possible to combine the results of different members, and I explained my strong reservations about and disagreement with such a combination, especially when it was shown that some of the equipment used was unable to reproduce a sine tone with distortion low enough. I would appreciate it if you would not speak for me. And it was not me who came here with the sum of results of the participants.

Can you tell original file from tube amp record? - test
 
I missed it too when reading Scott's post the first time, but he is indeed asking for an example of an actual test _done_ (on whatever equipment or effect) where you (and/or I?) were really satisfied with the chosen protocol and (I assume) the execution.

Yes

Is it reasonable to assume that the same rigor (I guess without hidden controls) applies to sighted listening? I would then also assume most "forum" sighted listening deserves equal scrutiny.

BTW, I never said all forum sighted listening was casual, but some, by their own admission, were not rigorous and lacked any controls.
 
This seems to be a nice misinterpretation of my posts. I was wondering whether it could be possible to combine the results of different members, and I explained my strong reservations about and disagreement with such a combination, especially when it was shown that some of the equipment used was unable to reproduce a sine tone with distortion low enough. I would appreciate it if you would not speak for me. And it was not me who came here with the sum of results of the participants.

Can you tell original file from tube amp record? - test

Sorry PMA, but you explicitly mentioned "thinking differently", and it is not a matter of "thinking differently".

You simply fail to understand the specifics of the statistical analysis on that point. No problem - it is counterintuitive sometimes - and I therefore asked a specific question to help you understand it; you refused to answer - for reasons I can't imagine - and you did so after I repeated my question several times.
Why do you think I insist on it???

Btw, the line:

"And it was not me who came here with the sum of results of the participants."

shows exactly that you still don't get it, but what can I do?
 
Still all smoke, no fire. Sorry, I'll pass.
I only became involved in this thread when the sine wave test was done & the results showed something of interest - a control listening test that revealed the flaws in the main blind listening test. It was a chance to understand something about blind testing from an actual test.

I already asked you these questions but you dodged answering:

"Is that what this listening test set out to do? What therefore was the point of the listening test? What would have been the outcome if PMA wasn't challenged to post sine wave test signal - would your & others test results have been considered valid?"

Care to consider & answer these questions? Did you ever consider that your HDMI receiver was NOT sufficiently discriminating to reveal such differences, i.e. that your receiver is at fault?