What kind of evidence do you consider as sufficient?

mmerrill99 · 2018-07-21 2:16 am

planet10 said:
It is why i figure the DUT that requires less fill-in is likely the better.

dave

Yes, the best-fit analysis requires processing power - the less processing that is required (because the auditory signals are better defined and/or the sound patterns are more stable), the more relaxed we find the experience.

abraxalito · 2018-07-21 2:17 am

planet10 said:
It is why i figure the DUT that requires less fill-in is likely the better.

...or it could be that the DUT which permits the most fill-in sounds clearer. Given that it's not just about 'filling in the gaps' - to paraphrase someone else 'it's fill-ins all the way down'.

DPH · 2018-07-21 2:18 am

Evenharmonics said:
If you meant satire, no, those are not satire. They are accurate to real life events. That is unless you can debunk any one of the 10 with proof.

Neither you nor Peter, whom I generally respect, can account for the very fact that, depending on the protocol, people's performance changes. In a positive control test. How do you explain this body of work, which comes from doing double blind differentiation testing? This goes along with the embarrassment you made of yourself in terms of test methodology back in November. Please, if for no other reason than to preserve your overinflated ego, just stop while you're behind.

I mean I get that you have a mindset and think you're the smartest guy in the room, and flash that by also being the most insufferable, but you're dead wrong about KNOWN effects and doubling down on your assertions. Which makes you a peer in fallacies to the very ideology ostensibly you oppose. It's like the folks that live on the polar opposite extremes of political ideology are closer to each other than anyone else.

mmerrill99 · 2018-07-21 2:23 am

abraxalito said:
...or it could be that the DUT which permits the most fill-in sounds clearer. Given that it's not just about 'filling in the gaps' - to paraphrase someone else 'it's fill-ins all the way down'.

You mean the better DUT presents signals which are cleaner & easier for auditory processing to reach its moment-by-moment auditory decisions?

Evenharmonics · 2018-07-21 2:23 am

DPH said:
Neither you nor Peter, whom I generally respect, can account for the very fact that, depending on the protocol, people's performance changes. In a positive control test. How do you explain this body of work, which comes from doing double blind differentiation testing? This goes along with the embarrassment you made of yourself in terms of test methodology back in November. Please, if for no other reason than to preserve your overinflated ego, just stop while you're behind.

I mean I get that you have a mindset and think you're the smartest guy in the room, and flash that by also being the most insufferable, but you're dead wrong about KNOWN effects and doubling down on your assertions. Which makes you a peer in fallacies to the very ideology ostensibly you oppose. It's like the folks that live on the polar opposite extremes of political ideology are closer to each other than anyone else.

So this is what you do when you run out of rebuttal material, just throwing cheap shots.

DPH · 2018-07-21 2:31 am

Evenharmonics said:
So this is what you do when you run out of rebuttal material, just throwing cheap shots.

I'm glad you finally agreed you're dead wrong! We're making progress.

I don't need to provide a rebuttal, I haven't moved the goalposts and provided citations with evidence.

Where's your content, any content? Not an inflammatory article from Audio Critic, but something with some experimental backup.

Evenharmonics · 2018-07-21 2:35 am

DPH said:
I'm glad you finally agreed you're dead wrong!

I haven't agreed to anything. Only in your own imagination.

I don't need to provide a rebuttal,

You don't have any.

I haven't moved the goalposts and provided citations with evidence.

You sure did, evidence of irrelevant subject.

Where's your content, any content? Not an inflammatory article from Audio Critic, but something with some experimental backup.

It's inflammatory to those who believe in audio voodoo.

Greg Erskine · 2018-07-21 2:47 am

DPH said:
- I point out that you haven't done your homework: there's a body of research into human preference testing that points out that preference/difference detection *is* mentally hard. I go ahead and cite some of it for you.

I "thought" this applied only when there were more than 2 choices. Our brain is wired to make an instant decision between 2 alternatives. The more alternatives, the more confusion, which apparently results in errors. Now, did I hear this somewhere or am I just making it up?

I thought this is why I like to shop at Aldi.

DPH · 2018-07-21 2:47 am

Evenharmonics said:
You sure did, evidence of irrelevant subject

Willful ignorance? It's irrelevant because it's inconvenient to your argument.

Explain to all of us how the mental differentiation and decision engine processes in our brain suddenly change when it's a different sensory stimulus. Magnitudes are going to change depending on the exact test (that's something that will change with protocols, so a reason for positive controls), but this is a known effect and it's not *just* for food.

Here's one for audio memory where they did an a-x and an abx test. The abx underperformed the a-x even after accounting for the greater statistical power of the abx.
https://link.springer.com/content/pdf/10.3758/BF03204857.pdf
Edit, since most people don't have access behind the paywall: A common basis for auditory sensory storage in perception and immediate memory | SpringerLink

Again, where's your evidence?

planet10 · 2018-07-21 2:51 am

abraxalito said:
...or it could be that the DUT which permits the most fill-in sounds clearer. Given that it's not just about 'filling in the gaps' - to paraphrase someone else 'it's fill-ins all the way down'.

Filling in means your brain is doing more work and cannot be as relaxed.

dave

mmerrill99 · 2018-07-21 3:01 am

planet10 said:
Filling in means your brain is doing more work and cannot be as relaxed.

dave

I believe "filling in" is a misleading term - "resolving ambiguity" would be better & more explanatory of what's happening in underlying processing.

Resolving ambiguity implies analysis & processing, more so than filling in might suggest.

planet10 · 2018-07-21 4:06 am

Resolving ambiguity is probably a more appropriate term. Adding enuff to make it make sense to the listener. The Laurel/Yanny thing is a good example of living on the edge.

dave

Evenharmonics · 2018-07-21 4:09 am

DPH said:
Willful ignorance? It's irrelevant because it's inconvenient to your argument.

Explain to all of us how the mental differentiation and decision engine processes in our brain suddenly change when it's a different sensory stimulus.

You were talking about picking preference, not detecting difference. You've moved the goal post.

Magnitudes are going to change depending on the exact test (that's something that will change with protocols, so a reason for positive controls), but this is a known effect and it's not *just* for food.

Here's one for audio memory where they did an a-x and an abx test. The abx underperformed the a-x even after accounting for the greater statistical power of the abx.
https://link.springer.com/content/pdf/10.3758/BF03204857.pdf
Edit, since most people don't have access behind the paywall: A common basis for auditory sensory storage in perception and immediate memory | SpringerLink

Again, where's your evidence?

My evidence of what? Once you answer my question on post #26, I'll move forward.

abraxalito · 2018-07-21 4:23 am

planet10 said:
Adding enuff to make it make sense to the listener.

Right - sense is quite literally made in the brain. By made I mean constructed, out of the paucity of the data. It's harder work to create sense when some of the data is inconsistent or tending towards greater ambiguity than towards less.

planet10 · 2018-07-21 4:37 am

A good system will provide more information to make interpreting what is happening easier.

dave

DPH · 2018-07-21 4:45 am

Evenharmonics said:
You were talking about picking preference, not detecting difference. You've moved the goal post.

I corrected myself quickly and the testing methodology for both overlap, as does their characterization. Of course you're the ABX expert so you know that already. ;-) And playing games of semantics means you don't have any refutation. We call that grasping at straws.

My evidence of what?

That's exactly the problem. You have made assertions you cannot back up. The onus is on you to provide evidence for your own assertions. I am not ransom to your games.

Johnny2Bad · 2018-07-21 4:54 am

Well, firstly, we must gather evidence and come to conclusions. The preferred method (and the one practiced by business to make decisions) is to use the best available information and go with that. It is a relatively quick decision method ... you don't wait for every bit of data to come in, you use what you know now.

So you may take a position based on what you know now.*

You might come across measurements online. Firstly, you have to determine if they are valid ... it's not unusual for some offshore vendors to fake an Audio Precision readout, for example. You have to know what a genuine AP screen grab should have and to be wary of evidence of cut-and-paste elements.

Just an example ... the point being if you don't do the measurements yourself you must have reasonable confidence they were performed correctly and with what equipment (which might mean limitations that are important to a particular measurement).

Next we come to listening tests.

Anecdotal evidence is just that. It should be treated the same way as any "story" someone might offer; in other words it represents a low bar of evidential confirmation or assertion. Again it's better if you have an opportunity to listen for yourself. This is a much higher bar of validity. None the less, it cannot be said to be a perfect level of validity.

Group Listening Tests come in two distinct flavours. The first is the essentially uncontrolled demonstration, where listening impressions are made by the various participants. Although anyone can train themselves to listen for various known aberrations (e.g. to identify harmonic distortions), for the most part people don't do that, so you have to rely on their experience as a listener.

Another issue is that some listeners are very critical of accuracy (e.g. they expect the Yamaha Grand Piano to sound like a Yamaha Grand Piano versus a Steinway Grand Piano) while others are less critical of that and just prefer sonics that sound "good" to them. That may be a situation where they prefer the less accurate reproduction, as there are many euphonic colorations possible with HiFi.

Finally we come to the Blind and Double Blind Auditions. Be wary of those who advocate DBT testing but have never participated in one. They are espousing an opinion based essentially on the first line in this post ... gathering the best information (from others) and making a decision (or coming to a conclusion).

It is not particularly easy to conduct a proper DBT. Listeners must be relaxed, but a DBT is essentially an exercise with a higher stress level than normal relaxed listening. There is a strong likelihood of a listener seeking to "hear what everyone else hears" as the ego starts to play a role in conclusions. Listeners *must* be free to come to their own independent conclusions, something that is not actually easy to achieve in a room setting with multiple listeners.

It's common knowledge that our brains tend to prefer a presentation that is louder. How do you match levels? Is the response of the level pot or switch linear ... in other words would it change if you adjusted the levels higher or lower? Exactly how loud do you want to present the musical score? At what frequency should the levels be matched? Standard procedure is 1 kHz, but that is above the fundamental frequency of the human voice and most (not quite all) instruments playing their highest notes. What if the DUTs differ at some lower frequency than 1 kHz? Should you level match at some other frequency like A440 (440 Hz)? What do your DBT results tell you about that?
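To put a number on the level-matching question, here is a minimal sketch, assuming you can read the RMS voltage each DUT delivers at a given test frequency. The function name and the voltage readings are made up for illustration; they are not measurements of any real equipment. It only shows that two devices matched closely at 1 kHz can still differ in level at 440 Hz.

```python
import math

def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of DUT A relative to DUT B in dB, from RMS voltages at one frequency."""
    return 20 * math.log10(v_a / v_b)

# Hypothetical RMS voltage readings (DUT A, DUT B) at two test frequencies.
readings = {1000: (2.83, 2.80), 440: (2.83, 2.70)}

for freq_hz, (v_a, v_b) in readings.items():
    diff = level_difference_db(v_a, v_b)
    print(f"{freq_hz} Hz: A is {diff:+.2f} dB relative to B")
```

With readings like these, a match that looks fine at the standard frequency says little about the match elsewhere, which is why the choice of matching frequency is worth stating alongside any result.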

The test must be unsighted. No-one should be able to see the Device(s) Under Test, and that includes prior to the audition. No-one should know what, exactly, is being auditioned (no prior information about the DUT, no knowledge of whether they are auditioning amps or loudspeakers, etc). Auditioning loudspeakers is particularly difficult as the placement and interference of racks or competing speakers is a problem. Moving speakers means more personnel, as for a DBT test to be properly conducted, the person operating the test switching etc must not know which DUT he is switching to and from. He too should not know what, exactly, is being tested (same as listeners noted above).

Then there is the number of audience participants. Many DBTs do not use enough listeners to reasonably exceed statistical criteria. A DBT result with hundreds of listeners is far more valid than one with 40 listeners.
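As a rough illustration of why the listener count matters, here is a minimal sketch using an exact one-sided binomial calculation. It assumes each listener contributes one independent forced-choice answer with a 50% chance of being right by guessing, which is a simplification and not a description of any particular test. The function name and the counts are invented for the example.

```python
from math import comb

def binomial_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of getting at least `correct` answers right by pure guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Same 60% hit rate, two different (hypothetical) panel sizes.
p_small = binomial_p_value(24, 40)    # 24 of 40 correct
p_large = binomial_p_value(240, 400)  # 240 of 400 correct

print(f"40 listeners:  p = {p_small:.4f}")
print(f"400 listeners: p = {p_large:.6f}")
```

Under these assumptions the identical hit rate is easily explained by guessing with 40 answers but not with 400, so a small panel can only confirm fairly large effects.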

By far the easiest and least taxing method is to simply listen for yourself and decide. All that is required then is to properly understand that your conclusions are yours alone, and to offer them to someone else has a low level of reliability to that other person.

Many people do not trust their own ears. That is unfortunate, but they can be trained to listen for many known aberrations. This is not, as some might conclude, an exercise in making them "Golden Ears Reviewers"; it is simply an exercise in confidence in their own assessments.

Fear of "making a mistake" when buying or assessing audio components is real and can be reduced or eliminated. Once you find something that you confidently like, you can just quit the "work" and enjoy your HiFi. Some people have "Nervosa" where they are never truly confident of their choices. Ear Training might help those people.**

* John Kenneth Galbraith, the renowned economist, was often criticized for changing his economic advice. He is credited with the following quote (the exact wording has been reported differently by different people attributing the quote to him, but any version is pretty succinct).

" ... When I discover new facts, I change my opinion. What do you do, sir? ..."

** Ear Training is not difficult. For example to teach someone to recognize 2nd Harmonic Distortion, you simply start with musical examples that contain obvious levels of 2HD and slowly reduce the % until they are able to recognize low levels. Just an example, but the basic principle works with regard to all aberrations that might be likely with different equipment.
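For anyone who wants to try the idea at home, here is a minimal sketch of generating training material with a known amount of 2nd harmonic. It uses a plain sine tone instead of music, which is a simplification, and the function name, the 440 Hz test tone, and the list of percentages are all assumptions chosen for illustration.

```python
import math

def tone_with_2hd(freq_hz: float, percent_2hd: float,
                  seconds: float = 1.0, rate: int = 48000) -> list[float]:
    """Sine at freq_hz plus a 2nd harmonic at percent_2hd of the fundamental's amplitude."""
    ratio = percent_2hd / 100.0
    n = int(seconds * rate)
    return [
        math.sin(2 * math.pi * freq_hz * i / rate)
        + ratio * math.sin(2 * math.pi * 2 * freq_hz * i / rate)
        for i in range(n)
    ]

# Start with obvious distortion and work down (illustrative levels only).
training_levels = [10.0, 5.0, 2.0, 1.0, 0.5]
examples = {pct: tone_with_2hd(440, pct) for pct in training_levels}
```

The samples would still need scaling and writing to a sound file before listening; the point is only that the distortion percentage is a single parameter you can step down as the listener improves.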

Johnny2Bad · 2018-07-21 5:15 am

Greg Erskine said:
I "thought" this applied only when there were more than 2 choices. Our brain is wired to make an instant decision between 2 alternatives. The more alternatives, the more confusion, which apparently results in errors. Now, did I hear this somewhere or am I just making it up?

I thought this is why I like to shop at Aldi.

Absolutely true. A good HiFi salesman will ask the potential customer a set of questions designed to narrow down what the buyer is actually looking for (and because people do not trust salesmen in general, he may conclude it is something other than what the customer claims to seek. For some reason a common customer tactic is to lie about what they are truly seeking; they see it as some kind of "test" of the sales staffer) and then present three choices.

Never present more than three; the chances the customer will walk go up astronomically when presented with too many choices. Sales staff are trained in this.

One of the three probably will be an "outlier" that is dissimilar to the other two. It is there to either be dismissed by the customer or perhaps it's a product that is generally believed by the Audio community to be superior but not a consumer favourite. Say, a Cambridge Audio receiver when the other two are well known mass market Japanese brands. (Just a quick example, not a reflection of the relative merits of those brands).

In either case, the customer will either choose or reject the third choice rather quickly. It is a sales mistake to offer three very similar choices, making this step difficult to achieve.

Then it comes down to two, and the final choice is simply done via comparisons. Each step is designed to progressively reduce the stress level of the customer.

mmerrill99 · 2018-07-21 12:24 pm

abraxalito said:
Right - sense is quite literally made in the brain. By made I mean constructed, out of the paucity of the data. It's harder work to create sense when some of the data is inconsistent or tending towards greater ambiguity than towards less.

planet10 said:
A good system will provide more information to make interpreting what is happening easier.

dave

Yes, & the question/debate is - what are the cues that are improved in the better playback systems which lead to better clarity? Better clarity, more realism of the illusion, seem to be the psychoacoustic results of better playback systems but what are the measurable differences that result in this better playback? It would seem that we are not currently measuring this correctly or we are interpreting our existing measurements incorrectly?

Jakob2 · 2018-07-21 12:33 pm

Folks,
although fascinating, can you please discuss "Aldi and liverwurst" in another thread?
Also, the discussion about what our brain needs in order to construct a convincing impression of the processed sound field (convincing in the sense of being compatible with our experiences) deserves its own thread.

Waly said:
Wrong question, to start with. You need to define a hypothesis before looking into a) what kind of test would fit and b) what would be considered proof (or disproof) of the hypothesis. There is no one size fits all approach.

Maybe, but I'm just asking for the personal criteria that members want to see met before accepting any "result/procedure" as evidence, even though they believed something different.

So although there is no "one size fits all" procedure, there still seem to be certain features considered mandatory, e.g. "blind", "double blind" etc.

Narrowed a bit more, the question relates to the demand for "blind tests" posted in many threads. It is my impression that often only negative results from these "blind tests" find acceptance, which suggests that additional criteria must exist (or seem to exist).
That "blinding" alone isn't sufficient is well known, so let's assume level matching as a given, but what else?

What kind of evidence do you consider as sufficient?
