Test your ears in my new ABX test

Have you been able to discern the files in an ABX test?

  • Yes, I was able to discern the files and have a positive result
    Votes: 3 (20.0%)
  • No, I was not able to discern the files in an ABX test
    Votes: 12 (80.0%)
  • Total voters: 15
  • Poll closed.
Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
It could be that foobar ABX is trying to level match the files, or something like that. No doubt there is a way to intercept the digital output of foobar ABX and capture the stream into file(s) for analysis. Maybe I will do it when I have time, but don't have time right now.
 
I would like to explain why I decided to make and post this latest ABX test. A few months ago, I built an A/B box with a line-level stereo path and a power stereo path for speaker signals. The purpose was to run blind tests of power amplifiers. The line-level path serves to precisely match the volume of both amplifiers, and the relay path switches the outputs of the two amplifiers to one pair of speakers. I started to test the amplifiers. Provided they had low output impedance and a flat frequency response, they all sounded the same in the A/B test. No difference. The OPA549 chipamp-based amplifier was the one with the worst measurements, so I decided to make this test of the original data vs. the recorded output of the amplifier (loaded by real speakers). And this is the result. So far, I do not see any valid proof that the files under test were discerned. This is interesting, because in my A/B testing it was always amp1 vs. amp2, not ampX vs. data.

Years ago, when I first came to diyaudio, I was more on the subjectivist side regarding sound evaluation, though I always did all the measurements. As time goes by, I am moving toward the objectivist position, and I tend not to believe any audiophile impressions that are not supported by verifiable proof.
 
Hi Pavel
You seem to have done a nice job, my compliments.
The "probability" number in the screen shot, however, looks seriously suspect.
I don't think this is your work, do you have any information on how it is calculated?

Best wishes
David
It appears to be the cumulative binomial probability. It's statistically valid, but mislabelled - it's the chance of getting at least x correct answers out of n trials by guessing, not the probability that the result is due to guessing.
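
As a rough check of that reading, here is a minimal Python sketch of the cumulative binomial tail, assuming each trial is an independent 50/50 guess. The function name `p_at_least` is made up for illustration, and this is not taken from the foo_abx source; it only shows that the formula reproduces the percentages printed in the 11/12 and 9/16 reports in this thread.

```python
from math import comb


def p_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Chance of getting at least `correct` answers right out of `trials`
    when every answer is an independent guess with success chance `p_guess`."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Scores taken from the two foo_abx reports posted in this thread.
for correct, trials in [(11, 12), (9, 16)]:
    print(f"{correct}/{trials}: {p_at_least(correct, trials):.1%}")
# prints:
# 11/12: 0.3%
# 9/16: 40.2%
```

Note that this is the chance of such a score given guessing, which is the point of the post above: it is not the chance that the listener was guessing given the score.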

I am sorry, but something seems to be going very wrong with this foobar. I have just done the test with an 11/12 result. I did the first 4 trials by listening, very seriously. Then I just continued by clicking X, X, X ... without any listening, until I reached 12 attempts. The result is 11/12 and the protocol is valid ...
Something is going very wrong.

Code:
foo_abx 2.0.2 report
foobar2000 v1.3.7
2017-11-03 13:28:01

File A: cc.wav
SHA1: 4f7cea25c5c93af637dc3dd7ad416402aa40eac4
File B: oo.wav
SHA1: eb0584e9e01746a0ee00ef60decf8eaca5832fcb

Used DSPs:
Resampler (PPHS)

Output:
WASAPI (event) : Speaker (USB Sound Blaster HD), 24-bit
Crossfading: NO

13:28:01 : Test started.
13:30:00 : 01/01
13:30:17 : 02/02
13:30:34 : 03/03
13:30:59 : 04/04
13:31:01 : 05/05
13:31:03 : 06/06
13:31:04 : 06/07
13:31:06 : 07/08
13:31:08 : 08/09
13:31:09 : 09/10
13:31:10 : 10/11
13:31:13 : 11/12
13:31:13 : Test finished.

 ---------- 
Total: 11/12
Probability that you were guessing: 0.3%

 -- signature -- 
3bb6ce0ca03becfeda5f0864820c400539af5c98
Given the 0.3% probability, one would expect about 1 in 300 complete test runs to have at least this many right by chance alone. That may seem like a low probability, but given that the Foobar ABX Comparator is used a lot, it is pretty much guaranteed to happen eventually. Thus the apparent improbability of this result could just be due to "the law of truly large numbers".
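
To put a number on that argument, here is a small sketch under stated assumptions: every run is a 12-trial test answered purely by guessing, runs are independent, and the run counts in the loop are arbitrary illustrative values, not usage statistics for the Foobar ABX Comparator. The names `P_RUN`, `expected_runs_per_hit` and `p_at_least_one_hit` are hypothetical.

```python
from fractions import Fraction

# Exact chance of 11 or 12 correct out of 12 by guessing:
# (C(12,11) + C(12,12)) / 2**12 = 13/4096, which rounds to 0.3%.
P_RUN = Fraction(13, 4096)


def expected_runs_per_hit(p: Fraction) -> float:
    """Average number of guessing runs needed to see one such score."""
    return float(1 / p)


def p_at_least_one_hit(p: Fraction, runs: int) -> float:
    """Chance that at least one of `runs` independent guessing runs
    reaches the score, i.e. 1 - (1 - p)**runs."""
    return 1.0 - (1.0 - float(p)) ** runs


print(f"about 1 in {expected_runs_per_hit(P_RUN):.0f} runs")
for runs in (1, 100, 300, 1000):  # illustrative run counts only
    print(f"{runs:>5} runs: {p_at_least_one_hit(P_RUN, runs):.1%}")
```

With the exact fraction, the "1 in 300" figure is closer to 1 in 315, and the chance of at least one such score climbs quickly as the number of runs grows.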
 
It could be that foobar ABX is trying to level match the files, or something like that.

No, because all of these suspicions are easily measurable during playback. If there is an audible difference, then it is measurable and also distinguishable by ear. If the difference is only measurable and not audible, it remains measurable and inaudible.

IMO this is just an excuse to explain away unsuccessful ABX results.

I was complaining about the randomizer (the order of files played), which does not seem very random to me.
 
My first pass. Thought I could tell, Foobar says otherwise. :)

Code:
foo_abx 2.0.4 report
foobar2000 v1.3.12
2017-11-03 12:35:47

File A: cc.wav
SHA1: 4f7cea25c5c93af637dc3dd7ad416402aa40eac4
File B: oo.wav
SHA1: eb0584e9e01746a0ee00ef60decf8eaca5832fcb

Output:
DS : Primary Sound Driver
Crossfading: NO

12:35:47 : Test started.
12:38:21 : 00/01
12:38:47 : 01/02
12:39:24 : 02/03
12:40:48 : 03/04
12:41:55 : 04/05
12:43:22 : 05/06
12:44:00 : 05/07
12:44:50 : 06/08
12:45:19 : 06/09
12:46:37 : 07/10
12:47:55 : 07/11
12:48:59 : 07/12
12:50:06 : 07/13
12:50:48 : 07/14
12:51:24 : 08/15
12:51:53 : 09/16
12:51:53 : Test finished.

 ---------- 
Total: 9/16
Probability that you were guessing: 40.2%

 -- signature -- 
e1c486f467d3f8dfd35bda60595b0e88124ff8a4
 
I'll be slightly provocative without intention to be impolite: generally, is ABXing the correct method? If you take one single file, make a copy with a different name, and let the audience take the ABX test in good faith, you will force results that reflect only the individual/psychological response to the test method, not the quality of the file itself. If you study how listening (like every individual cognitive process) works, you will recognize how difficult it is to get "objective" results from a subjective process. Interpreting the results correctly is even more complicated...;-)

That's a good point, but we have already been through it here. The audience was asked for preference, not for ABX; the ABX was an option only. The same files with different names were judged to sound different in an impression-driven test. I do not want to mix the two up now.
 
Could easily be a wrong answer getting counted right, too. Impossible to say with how it reports on the summary.

After the first four honest trials with listening, I clicked through 8 more without listening and still got 11/12. Is this alarming, or just accidental?

This is going nowhere. If there is a real audible difference, like more noise or an additional pop/click, the foobar ABX result is always 100% right. No wrong answers counted as right, no excuses. The excuses start at the moment when there is no listening success, when the difference is inaudible. Without ABX, people are just self-assured that they hear the difference. Psychoacoustics, a normal, standard situation. Some are willing to admit they just fool themselves, some are not.
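
For what it is worth, the 11/12 log shows 7 of the 8 quick clicks counted as correct. A brute-force sketch, assuming a fair randomizer and independent trials (which is exactly what is being questioned), enumerates all outcomes of 8 blind clicks and counts how many have at least 7 right. The function name `count_outcomes` is made up for illustration.

```python
from itertools import product


def count_outcomes(trials: int, min_correct: int) -> tuple[int, int]:
    """Enumerate every right/wrong pattern for `trials` blind clicks and
    count the patterns with at least `min_correct` right answers.
    With a fair randomizer, every pattern is equally likely."""
    hits = sum(
        1
        for outcome in product((0, 1), repeat=trials)  # 1 = right, 0 = wrong
        if sum(outcome) >= min_correct
    )
    return hits, 2**trials


hits, total = count_outcomes(8, 7)
print(f"{hits}/{total} = {hits / total:.1%}")
# prints: 9/256 = 3.5%
```

Under those assumptions, roughly one blind run of 8 clicks in 28 would do this well, so the result is unusual but on its own does not show the randomizer is broken.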
 
... The audience was asked for preference, not for ABX; the ABX was an option only. The same files with different names were judged to sound different in an impression-driven test...

Yes, that's what I mean. Asking for a preference in usual human communication implies there IS a difference! It makes me look for and FIND the difference all the more eagerly and successfully, the more I strive to comply with the social context (even if there is objectively NO difference!). Social acceptance in my pack counts much more than subtle (and futile :) truth. So we live, the naked apes! Not really ready to face the truth - google for Asch conformity experiment...
 
If you want to find the difference in mastering against your files, then, please, measure DR of your file or show the time envelope.
I did just that and found that the version you have has been compressed about 6dB to make it louder. Such is the beauty of "remastering". :rolleyes:
Below are the waveforms of your rip and my rip. Both have a peak value of -2dB FS
 

Attachments

  • Gaucho.png (55.7 KB)
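
For anyone who wants to check a claim like "compressed about 6dB" numerically rather than by eye, here is a minimal sketch that compares peak and RMS levels. It assumes samples are floats with full scale at 1.0 and uses a synthetic sine and a hard-clipped copy as stand-ins for the two rips; hard clipping is only a crude stand-in for mastering compression. Crest factor (peak minus RMS) is a rough proxy and is not the algorithm used by the "DR" meter mentioned above.

```python
import math


def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0). Assumes non-silent input."""
    return 20 * math.log10(max(abs(s) for s in samples))


def rms_dbfs(samples):
    """RMS level in dB relative to a full-scale DC level of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; it shrinks when peaks are squashed."""
    return peak_dbfs(samples) - rms_dbfs(samples)


# Synthetic stand-ins: 10 whole cycles of a sine, and a hard-clipped copy.
n = 1000
sine = [math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
clipped = [max(-0.5, min(0.5, s)) for s in sine]

for name, data in (("sine", sine), ("clipped", clipped)):
    print(f"{name}: peak {peak_dbfs(data):.2f} dBFS, "
          f"crest factor {crest_factor_db(data):.2f} dB")
```

Two versions of the same track normalised to the same peak but with clearly different crest factors would be consistent with one of them having been compressed or limited.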
Compression adds distortion and reduces HF. Look-ahead limiting would be another matter, but 6dB is a lot of squeezing in any case. Adds distortion, needs redithering. Added distortion may help mask other distortion such as from an amplifier.

I sure don't see how the files could sound the same, given the claim that one is a source and the other has been through a DAC, an amplifier, and an ADC. I had a DacMagic+ here on trial and I sent it back. It had a sound compared to the Benchmark DAC-1, and it wasn't cleaner; it would be a step down.
 