What kind of evidence do you consider as sufficient?

Status
Not open for further replies.
What challenge? Let's leave it at this: sufficient evidence = some participants take a test for which they agree on all parameters. If the test is not taken, there is no sufficient evidence.

I've agreed to all your conditions, use the ITU spec, pick any controls you want, no complaints from me.

Again, I have to ask what you are talking about. You seem to be in your own bubble, talking about something which was never raised. Nobody suggested that a blind test would be the result of this thread or discussion - this is all your own imagining.

What's this about "I've agreed to all your conditions" - when did I or anybody suggest that you need to agree to anything? Your understanding of the need for positive controls, à la my explanations & Jakob's, is welcome if indeed you do understand this, but your dodging of Jakob's question to you doesn't really signify that you have learned anything.
 
This is a bullet point on the front page of the software
  • Will play high sample rates and bit-depths (limited by your audio system only) but make sure your operating system isn't downsampling. E.g. in Windows you must set your fancy sound interface as the default interface, and you need to choose the default sample rate / bit depth to use for the device. Lacinato ABX/Shootouter will not change the sample rate or bit depth as it plays: it is at the mercy of your operating system. You can set your OS to the highest rate you will be testing, but if you play files of a lower sampling rate, upsampling will occur.

Yup, and in many cases you can do nothing about it. I have several sound cards that work fine with ARTA but, when used as the Windows default device, do not work at high resolution. This bullet indicates that they use the Windows default MME interface, the worst choice. Try it; I might be wrong.
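
For anyone who wants to check ahead of time whether the OS will be resampling, here is a minimal sketch. It assumes the test files are plain PCM WAV and that you type in the device rate yourself (the value chosen in the OS sound settings); nothing here queries Windows or the player. The function names `wav_format` and `needs_resampling` are made up for illustration.

```python
import io
import wave


def wav_format(source):
    """Return (sample_rate_hz, bit_depth) for a PCM WAV file or file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8


def needs_resampling(file_rate_hz, device_rate_hz):
    """True if the OS would have to resample to play the file at the device rate.

    device_rate_hz is an assumption: the rate you selected in the OS sound settings.
    """
    return file_rate_hz != device_rate_hz


# Demo: build a tiny 96 kHz / 24-bit stereo WAV in memory so the sketch is self-contained.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(3)          # 3 bytes per sample = 24 bit
    out.setframerate(96000)
    out.writeframes(b"\x00" * 6 * 10)  # 10 silent stereo frames
buf.seek(0)

rate, bits = wav_format(buf)
print(rate, bits)                      # 96000 24
print(needs_resampling(rate, 48000))   # True: a 48 kHz device setting forces resampling
```

If the two rates differ for either file in a comparison, whatever you hear may include the OS resampler as well as the files themselves.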
 
It's hilarious how people have argued tooth & nail with me about positive controls in ABX testing & now have switched tack to challenging me or Jakob to a blind test. For what? What's your logic, here?

What exactly do you think the controls do within a blind test & what are you hoping to achieve with your challenges?

It's completely nonsensical ranting.
 
I guess you are right, I would think he would want the chance to prove he can hear the difference. I would certainly accept a reasonable test protocol.

It seems you still misunderstand what controls are for - do you think that including controls somehow makes for a positive ABX result? Otherwise what is the ABX challenge about?

Please explain your thinking
 
Not the stats, which would be a separate issue if one wanted to get into that. They did, and presumably still do, describe what the displayed stats mean about the chances of someone guessing. The description is incorrect, but it is not a big deal.

The story I was referring to is Foobar could print out your score with a checksum included. If you feed the whole printout into a website they put up, it could verify that the score numbers were not altered by recalculating and verifying the checksum. This was believed to be of value to prevent cheating when reporting scores. However, cheating was still quite possible which I generally alluded to but didn't describe in any detail. A diyaudio moderator then said I would be doing a service to the community if I would disclose details of the potential cheat, because he said, they suspected there had been some cheating going on and they didn't know how it was possible. So, I complied with the request.
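
To illustrate the general idea only: the sketch below is not Foobar's actual scheme, which isn't described here. It assumes a plain SHA-256 digest appended to the log text, and the names `sign_log` and `verify_log` are hypothetical. What it shows is the limit of any such checksum: it can tell you the printout was not edited afterwards, but it says nothing about how the trials themselves were run.

```python
import hashlib

MARKER = "\nchecksum: "  # hypothetical log layout, chosen for this sketch


def sign_log(log_text: str) -> str:
    """Append a SHA-256 digest of the log text (illustrative, not Foobar's format)."""
    digest = hashlib.sha256(log_text.encode("utf-8")).hexdigest()
    return f"{log_text}{MARKER}{digest}"


def verify_log(signed: str) -> bool:
    """Recompute the digest over the log body and compare it with the stored one."""
    body, sep, stored = signed.rpartition(MARKER)
    if not sep:
        return False  # no checksum line found
    return hashlib.sha256(body.encode("utf-8")).hexdigest() == stored.strip()


signed = sign_log("Total: 9/10")
print(verify_log(signed))                           # True
print(verify_log(signed.replace("9/10", "10/10")))  # False: edited score is detected
```

A log produced by a run that was not actually blind would verify just the same, which is why a checksum alone cannot rule out cheating.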

This is getting long and I hate to drag it all along, however,

I'm not interested in what other people tell me they can hear. I want to know what I can hear.

I attempt to be honest with myself, and evaluate myself and my ability without attempting to skew what others may or may not claim they can hear.

To this end, I am questioning whether an honest person can use Foobar 2000 ABX to determine whether or not they can hear a difference between two files with sufficient probability to say there are differences which can be heard.
 
To this end, I am questioning whether an honest person can use Foobar 2000 ABX to determine whether or not they can hear a difference between two files with sufficient probability to say there are differences which can be heard.

I would say yes, the cheats require some deliberate intervention, the basic ABX function if you take it as it is gives a valid result within its scope.
 
Good enough for me.

I will leave you to your discussion as it is a bit beyond me for the most part as I haven't used statistics and probability much since I graduated in 1981.

A quote from "The Point", by Nilsson, narrated by Ringo Starr, when Oblio and Arrow met the Rock Man:

"The thing is you see what you want to see and you hear what you want to hear - dig. Did you ever see Paris?" Oblio said, "No". "Did you ever see New Delhi?" Oblio said, "No". "Well that's it - you see what you want to see and you hear what you want to hear", said the Rock Man, and with that the Rock Man fell soundly asleep, leaving Oblio and Arrow once again all alone.
 
To this end, I am questioning whether an honest person can use Foobar 2000 ABX to determine whether or not they can hear a difference between two files with sufficient probability to say there are differences which can be heard.

If the differences are easy to hear, then it works okay. If they require a lot more focused attention to differentiate, then it may be that you can't tell a difference in the test when you know very well that you can, but only if not distracted by the program. There are other ways that an honest person can test one's self blind that are probably more accurate if the differences are very small and require more focused attention.

Maybe look at it like this: most people can remember a series of around 7 digits and hold them in working memory. For some people it is more, for some less. If normally you could hold 7 and you were asked to hold 7 and then somebody knocked on the door to ask you for simple directions to the next block, you might find that the distraction caused you to forget the number. If you can normally hold 7 digits but this time I ask you to hold 8, and you are repeating them over and over so as not to forget with this harder task, and then I ask you what you had for breakfast this morning and at what time, that might be enough to cause you to lose it.

So, we all have limits on how much detail we can hold, for how long, in the presence of how much distraction. Remember, small differences between two files can be a lot of detailed information to hold in memory if you don't have any tricks to encode it. By that I mean you can remember a pattern of, say, 16 lit and unlit LEDs if you encode it in hex. Then you have only four hex digits (two bytes) to remember instead of 16 separate states. With some sound differences we aren't used to hearing, we don't have a good way to encode them, so you have to remember all 16 things individually.
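
A quick sketch of the LED example, for anyone who wants to see the encoding spelled out. The pattern is made up and `leds_to_hex` is just a name picked for illustration; the point is only that 16 on/off states pack into four hex digits (two bytes).

```python
def leds_to_hex(leds):
    """Pack a sequence of lit (1) / unlit (0) LEDs into an uppercase hex string."""
    value = 0
    for led in leds:
        value = (value << 1) | (1 if led else 0)  # shift in one LED state per bit
    width = (len(leds) + 3) // 4                  # one hex digit per 4 LEDs
    return f"{value:0{width}X}"


# Made-up pattern of 16 LEDs, grouped in fours for readability.
pattern = [1, 0, 1, 1,  0, 0, 1, 0,  1, 1, 1, 1,  0, 1, 0, 0]
print(leds_to_hex(pattern))  # B2F4
```

Remembering "B2F4" is far easier than remembering sixteen separate on/off states, which is the kind of encoding trick we usually lack for unfamiliar sound differences.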

What it all means is that you can be quite sure you hear a difference and reliably tell which file is which as often as you want, so long as nothing distracts you into forgetting the information you are holding in memory. But run foobar, loop it by hand, find a section to listen to, remember which sound was A, which was B, which was X, which was Y, find the correct button to push, fiddle around with the mouse, etc., and you may find you lose it very, very easily and with much frustration, even though all those things are normally as trivially easy for you as saying what you had for breakfast, or answering the door.

And, all that even though it was simple as pie to differentiate two files so long as you didn't have to be distracted.

I keep asking for a few very minor changes to Foobar ABX to ease the distraction and make it easier to find a section to listen to and memorize it. Never happens, probably won't ever happen. If you want to know a better way to test yourself blind, I could tell you but there is a learning curve so we probably shouldn't waste time with a lot of details if nobody really cares.
 
....To this end, I am questioning whether an honest person can use Foobar 2000 ABX to determine whether or not they can hear a difference between two files with sufficient probability to say there are differences which can be heard.
If you are referring to me, then yes despite changes introduced by Foobar ABX, I am able to reliably discern fine differences between two copies of the same file.
My method was to (importantly) first listen to X and I usually knew which one it is, and then flip to A or B depending on that decision to confirm (or deny) that first decision.
A further few quick flips from X to A, or X to B reinforced my decisions.
I did try two further experiments and scored 9/10 twice, but by then I had become bored and the task became tiresome... I was close to going to bed.
I say it is possible to discern very fine differences reliably with Foobar ABX, but the conditions need to be optimal, i.e. late at night so clean power, no environmental acoustic noise, complete attention to the task, etc.
Just sayin'.


Dan.
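
For reference, a minimal sketch of what a 9/10 score means under the usual assumption that a listener who hears nothing guesses each trial with probability 0.5. This is the standard one-sided binomial tail, not necessarily the figure Foobar displays; `abx_p_value` is a name chosen for illustration.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by pure guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


print(f"{abx_p_value(9, 10):.4f}")   # 0.0107  (11 chances in 1024)
print(f"{abx_p_value(18, 20):.6f}")  # 0.000201 (two 9/10 runs pooled as 18/20)
```

So a single 9/10 run would come up by guessing about once in 93 attempts, and pooling two such runs makes guessing a far less plausible explanation, provided both runs were decided on in advance and every run taken is counted.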
 
Why do you NOT consider listening to a product at home in your setup, as evidence?

Because for it to be useful to anyone other than myself, it doesn't pass the "smell test". It is simply anecdotal evidence, which is not, taken alone, a valid form of evidence.

Secondly, we don't know how valid my results actually are. A single example is not statistically significant, even if we agree that a result could be valid when performed by a single participant.

For example the question of what, exactly, I prefer must be considered. Maybe I like excessive levels of 3HD (such as the typical profile of a recording on magnetic tape) versus near-zero 3HD. Maybe I like over-emphasized mid-bass over "flat" (no need to define "flat" for the purpose of this argument, aside from just agreeing that the commonly accepted definition will suffice).

It is possible to train the ear of almost anyone to recognize the usual distortions typical of audio reproduction equipment, but how do we know I have "passed" such training? With reviewers with generally accepted expertise, we have their body of work to use as a validation of that skill. And in my OP I cited such a skilled reviewer's conclusions as only one component of what I consider "evidence"; without the other two it is merely a trend or a subjective conclusion.

To be clear, I am quite willing to accept my own conclusions as sufficient evidence that I prefer one Device Under Test (DUT) over another, or even over all DUTs that I am aware of.

But no-one else should accept it; thus it is not, in and of itself, "evidence", as we understand the scientific method to work and with the caveat that I think we should agree that the OP is referring to a robust form of evidence. (If we don't accept it as such, the discussion is pointless, even if it might be entertaining).

At its most generous, it is merely an input to what may, with other inputs, form part of the evidence. We must keep in mind that the evidence we seek must be useful, and must be reasonably described as useful to everyone, not just me (or if we don't accept that, it nonetheless answers the OP's question as to what I consider "evidence" versus my "personal preference"). I can distinguish one from the other.

Put another way, I consider "evidence" to be a higher standard than my personal preference alone. That does not preclude the fact that I can accept my personal preference to put to my own uses and to form my own decisions when a decision is required to solve a perceived personal audio system problem or deficiency. I can accept "I like it" as the sole required factor for my own system, while at the same time accepting that maybe the whole planet may prefer my component choice to be better employed as a boat anchor.
 