AES Objective-Subjective Forum

Status
Not open for further replies.
janneman said:



So, you wouldn't recognize a picture of your mother if I held the picture upside down?

I just tried it. I can. I can also read upside down (slower), and tell time from my watch when it is upside down.
Maybe you were raised in down under?😉

Jan Didden

See Kanwisher, et al, Cognition 68:1 (1998).




Inversion severely impairs the recognition of greyscale faces and the ability to see the stimulus as a face in two-tone Mooney images. We used functional magnetic resonance imaging to study the effect of face inversion on the human fusiform face area (FFA). MR signal intensity from the FFA was reduced when greyscale faces were presented upside-down, but this effect was small and inconsistent across subjects when subjects were required to attend to both upright and inverted faces. However when two-tone faces were inverted, the MR signal from the FFA was substantially reduced for all subjects. We conclude that (i) the FFA responds to faces per se, rather than to the low-level visual features present in faces, and (ii) inverted greyscale faces can strongly activate this face-specific mechanism.
 
Wavebourn said:
Jan;
if scientists believe that hearing abilities are lost forever, they can't explain why people can tell the difference between real and recorded sounds. But if scientists believe that what is lost is the ability to pay attention to details we hear subconsciously, that explains everything.

There is a tribe in Russia, living close to the North Pole, that has 40 names for snow! They can tell the difference. They can tell the difference consciously! But that does not mean that somebody from an African tribe can't understand that kinds of snow are different...

In 2004 I attended a DHE (Design Human Engineering) seminar with Dr. Richard Bandler and John La Valle. Among other things we were trained to calibrate our sensory perception. For example, to see in darkness. The trick was simple: stop expecting to see colors and you will see in darkness!
Stop expecting to see what's written on the money and you will know its value by touch! And so on.
A funny case happened after one German guy calibrated his ability to tell distance; an American ultrasound device was used to measure the distance. Smoking outside during the break he laughed: "Now, what I will do with your foots and inches?!"

So, what spoils the result is expectation.
We "don't see" and "don't hear" what we don't expect to see or hear. But actually we do see and hear it, and then ignore it.
[snip]

Very interesting! Again, it shows the major difference between hearing (that is, getting air vibrations converted to impulses going into your brain) and perceiving, which is the 'sound' you are aware of. There's often only a very loose correlation between the two.

As to your snowflake example, I think any baby human can learn to distinguish 40 different snowflakes. It all has to do with the very flexible brain wiring right after birth; a lot of the connectivity is uncommitted and although there are of course some 'pre-wired templates', a lot of the actual connectivity takes place early in life under the influence of your experiences. Learning to walk, with the very intricate balance between motor commands and negative feedback signals, is a good example. The basic structure is there, but you need to train and use it to fine-tune it.

There is a school of thought that everybody is born with absolute pitch but that in most it just degenerates because that particular wiring setup is really never used in youth, so the capacity is taken over by other functionalities like skateboarding 😉 . There's a constant, aggressive competition in the brain for resources.
An interesting example is the new findings in treating people with stroke who, say, lose the use of the right arm. Most treatment focuses on using the good arm in a better way. But that appears to be the wrong way, because the unused brain capacity from the bad arm is very quickly taken over by the good arm. That significantly reduces any chance of regaining that bad arm. In some trials, what they did was to immobilize the GOOD arm, to keep it from taking over resources, forcing the bad arm to regain control as much as possible. Use it, or lose it.


Wavebourn said:
[snip]There are 3 major things our conscious mind does in order to be able to navigate in an ever-changing environment: Distortion, Deletion, Generalization. Thanks to these 3 things we can tell the difference between say apples and tennis balls, and we can tell that all apples we see are apples, regardless of shapes, colors, sizes.

That's also how I learned to understand it. It's a matter of survival under the onslaught of a myriad sensory inputs. You NEED to process it, like you say, distort, delete and generalize.

Jan Didden
 
Jan;
after learning NLP and DHE I no longer believe in "wiring" and "rewiring", because learning and re-learning happen so fast if you speak a language native to the subconscious mind. For example, a phobic reaction from which a person has suffered for many years may be "rewired" in 5 minutes. People who could not tell the difference between notes can sing after 10 minutes of proper calibration. That is not enough time to grow nerve cells. :whazzat:

My point was: we still have "lost" abilities, but we are not aware of them consciously.
 
Wavebourn said:
Jan;
after learning NLP and DHE I no longer believe in "wiring" and "rewiring", because learning and re-learning happen so fast if you speak a language native to the subconscious mind. For example, a phobic reaction from which a person has suffered for many years may be "rewired" in 5 minutes. People who could not tell the difference between notes can sing after 10 minutes of proper calibration. That is not enough time to grow nerve cells. :whazzat:

My point was: we still have "lost" abilities, but we are not aware of them consciously.



Anatoliy, indeed it is not rewiring in the sense of growing new nerves. It is establishing a specific connection pattern or network, which can happen very fast.

For instance, if you call up a memory, you don't get some data from a memory location like a computer would. Instead, you quickly connect existing neurons in your brain to bring up the same 'pattern' as when you had the initial experience you try to remember. The better the pattern resembles the initial experience, the more accurate the memory recall. Since this is never 100% accurate, your memories are never 100% accurate either.

Jan Didden
 
Another interesting DB test. This one was done by a guy called 'billmilosz' on the Yahoo group for the DEQX. You can find this particular post, and a whole discussion, there.

/start of quote
"Here was the test:
I used a few different pieces of music, recorded professionally at 96 kilosamples per second with 24 bit depth, in stereo. For example,
Dvorak violin concerto from http://01688cb.netsolhost.com/samplerdownload/

Then, using Adobe Audition I did high-precision resampling down to 44.1 kilosamples per second with 16 bit depth, same as a CD.
So I had two files of the same music- one which had a lot more information in it (24 bit / 96 kHz sample rate) and one in which this extra information had been removed, leaving only the amount of information that one finds in a normal CD.
I wrote a little program for my computer using the C++ language with which I have some familiarity. This program allowed me to start two software audio players at the same time - one playing the 96 / 24 bit version of the file, the other playing the 44.1 / 16 version. They stay in perfect sync whilst playing.

Then this same little bit of software I wrote waits for any key on the PC keyboard to be hit. When a key is hit it either keeps the same version playing (96/24 or 44.1/16) or it switches from one version to the other- there's a large table of true random numbers that's used to randomize the action, so this is truly double-blind. Then it waits for the test subject to enter "Y" or "N" using the keyboard. The test subject is told to enter Y if he / she heard a difference between the two versions, and an N if they did not. The software keeps a text file as a log of the files played and the answers from the test subject.

The output of the PC sound card (a Creative X-Fi with quite respectable performance at both 192 / 24 and 44.1 /16 rates) was fed to a pair of Monarchy SM-70 Pro amps in mono, these are good class-A amps. These were driving a pair of ESL-57's, with refurbished panels and HT sections by highly-regarded Quad guru Wayne Piquet. This was in a fairly small, quiet room. The speakers were placed fairly close to the listening position, so listening was essentially nearfield. Detail, linearity, transient response etc of this amp / speaker system is very good. Quad ESL-57's are very revealing.

I did this with around 45 test subjects over the past year or so. It's very easy to do, the gear is always set up in one of my rooms because that's where I use it, I just have to select the PC as the source to the power amps and fire up the program, so pretty much any visitor to my home gets badgered by me into doing the test.

I know a lot of musicians, sound engineers, producers, and also a lot of guys who consider themselves highly-skilled "golden eared audiophiles." I also used some non-music / non-audio types, and also a few children (around 7~10 years old; they can hear far higher in frequency. Very few people over 30 can hear well above 16 kHz, this is a medical fact.) I didn't use any rock type musicians or producers, typically their hearing is pretty shot from listening to loud shows. The musicians were a mix of jazz, folk and classical. Some were professionals and the rest were studying in MFA programs at one of the local colleges. One was with the Chicago Symphony (I live in Chicago)- some of you may have heard of them. The producers were mostly radio (NPR) types, with some film and one theater audio designer. There was also a composer / music professor. Some of these guys were also audio nuts. The other audio guys were just audio hobbyists, they are accountants, lawyers, a cab driver, software engineer, and one art museum curator. There was one audio store owner.

All the test subjects are asked is to try to see if they can hear a difference in the sound.

Correct answers averaged out just below 50%. No one listener got more than 53% correct. This is pretty much what you'd expect from chance.

To me, this experiment shows that a fairly decent sample of folks who make a living with music and sound, along with people who consider themselves skilled at listening, simply cannot hear the differences between 96 /24 and 44.1 /16 audio.

I suppose it could be argued that using a better audio card is necessary, but I reject that argument. This is not about SOUND QUALITY, it is simply CAN YOU HEAR ANY DIFFERENCE. The Creative X-Fi is a well engineered card with low noise and distortion, etc., and if the differences between 96 / 24 and 44.1 / 16 are REALLY audible, then SOMEONE should have heard a difference.

But no one did.

By the way, about half of the audiophiles said they COULD hear a BIG difference between 96/24 and 44.1/16 files when they KNEW which file they were listening to. But this was apparently just a psychoacoustic effect of expectation: they THOUGHT a 96/24 file SHOULD sound better, and in their brains IT DID.

Once they no longer knew which file they were listening to, this "BIG DIFFERENCE" in sound TOTALLY VANISHED. That tells me that, in fact, they COULD NOT ACTUALLY HEAR A DIFFERENCE.

I am guessing that upsampling a 44.1 /16 CD to some higher rate would also prove to be inaudible in a double blind test. I will make a pair of files for this and add it to the tests I subject my dinner guests to.

Most of the people involved in the test were quite interested in the results. Only one is no longer speaking to me.....

FYI I am considered a good cook. So, even if they have to sit still for a few minutes of batty psychoacoustic testing, people I invite over for dinner rarely turn down the invitation. Punjabi lamb or chicken with artichoke in mole apparently compensate for the audio test mania."
/end of quote

Jan Didden
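
For readers trying to picture the keep-or-switch protocol in the quoted post, here is a minimal sketch of the trial loop and log. It is not billmilosz's program (that was written in C++ and was never posted). The names `run_session`, `make_guesser` and `score` are hypothetical, no audio is played, and Python's pseudo-random generator stands in for the "large table of true random numbers" he mentions. It also assumes an answer counts as correct when "Y" follows a real switch or "N" follows no switch.

```python
import random

def run_session(n_trials, listener, rng=None):
    """Simulate the described protocol: on each key press the program either
    keeps the current version playing or switches to the other one, then
    records the listener's Y/N answer in a log."""
    rng = rng or random.Random()
    playing = "96/24"
    log = []
    for trial in range(n_trials):
        switched = rng.random() < 0.5          # keep or switch, 50/50 (assumption)
        if switched:
            playing = "44.1/16" if playing == "96/24" else "96/24"
        answer = listener(switched)            # True means the subject typed "Y"
        log.append({
            "trial": trial,
            "playing": playing,
            "switched": switched,
            "answer": "Y" if answer else "N",
            "correct": answer == switched,
        })
    return log

def make_guesser(seed):
    """A listener who cannot hear any difference and simply guesses."""
    rng = random.Random(seed)
    return lambda switched: rng.random() < 0.5

def score(log):
    """Fraction of answers that matched what the program actually did."""
    return sum(entry["correct"] for entry in log) / len(log)

if __name__ == "__main__":
    session = run_session(100, make_guesser(seed=1), random.Random(2))
    print(f"guessing listener scored {score(session):.0%}")
```

A listener who really hears the switch would score near 100% in this scheme, while a guesser hovers around 50%, which is the comparison the quoted results rest on.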
 
This protocol I much prefer over M&M's. Quibbles in order of severity:

System - Headphones would be preferred over dipole speakers in an unfamiliar room if the objective is to strictly determine the audibility of 96/24. However as a test of the format's value in a very good home system the choice of gear appears perfectly valid.

Expectation Loading - 'billmilosz' doesn't specify how test subjects were prepared beforehand. Was it set up as a challenge? Did they know of his and prior results?

Re-sampling - Was 'Bit-Matched Playback' enabled on the X-Fi? It's not by default and the card would have been re-sampling as required to the default setting otherwise. Given his conclusions, 44.1/16 wouldn't be an unreasonable guess. The high competence of his protocol suggests such a detail was captured, however it also raises the expectation such an obvious one would be explicitly noted.

Statistics - "No one listener got more than 53% correct." Bolding mine. That's an extremely unlikely result in a sample of 45. Even pure chance predicts a bell distribution which, if the two truly couldn't be differentiated, would tend towards the mean in retesting. His short blurb doesn't clarify. Does he explain?
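
To put a rough number on the statistics quibble, here is a small sketch of the binomial arithmetic. It assumes every answer is an independent 50/50 guess and that all 45 listeners did the same number of trials. The post gives neither figure, so the trial counts below are purely hypothetical.

```python
from math import comb, floor

def prob_at_most(k_max, n, p=0.5):
    """P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

def prob_nobody_exceeds(fraction, trials, listeners=45):
    """Chance that none of `listeners` independent guessers scores above
    `fraction` correct, each doing `trials` trials."""
    k_max = floor(fraction * trials + 1e-9)   # small epsilon guards float rounding
    return prob_at_most(k_max, trials) ** listeners

if __name__ == "__main__":
    for trials in (20, 50, 100):              # hypothetical trial counts
        p = prob_nobody_exceeds(0.53, trials)
        print(f"{trials} trials each: P(no one above 53%) = {p:.2e}")
```

Under these assumptions the probability that all 45 guessers stay at or below 53% comes out far below one in a thousand for each of these trial counts, which is why the "no one got more than 53%" figure looks odd if it refers to individual scores rather than to group averages.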
 
janneman said:



Anatoliy, indeed it is not rewiring in the sense of growing new nerves. It is establishing a specific connection pattern or network, which can happen very fast.

Ok, if it happens very fast I buy your theory, except for the accuracy of recall, which may be achieved in a state of deeep traaance. 😀

janneman said:

I used a few different pieces of music, recorded professionally at 96 kilosamples per second with 24 bit depth, in stereo.

This is a usual weak point. What does "recorded professionally" mean?

Also, what if his software had a bug and switched nothing? 😀
I honestly don't understand how he could keep the players in sync while starting and stopping them.
 
Wavebourn said:


Ok, if it happens very fast I buy your theory, except for the accuracy of recall, which may be achieved in a state of deeep traaance. 😀



This is a usual weak point. What does "recorded professionally" mean?

Also, what if his software had a bug and switched nothing? 😀
I honestly don't understand how he could keep the players in sync while starting and stopping them.


I don't know the mechanics of that comparison and the switching, I've just quoted his post. I'll see if I can get him to post here.

Jan
 
I think the usual problem with these tests is the incomplete reporting.

I can see how he could write a program to play two files simultaneously in sync (relative to the system clock) at two different rates. He doesn't explain how he does this.

Neither does he show how he verified that the DAC was in fact switching between 44.1 and 96KS/s as you posit.

As with the M/M report, what was the measured response of the system in the room?

So in the end we still have doubts and our own experiences.
 
I don't have a problem accepting the results of Jan's 'billmilosz' experiment.
I believe the electrical signals from both sampling rates were different and the resulting sound in the room was different.
It was just below the subject's hearing threshold.

How maddening to have a real difference that you can't hear.
 
It's trivial to verify the DAC was switching. There are two FILES. Each has a different rate / depth. The way the data flows, if the DAC doesn't match the sample rate, it WILL NOT PLAY. Period. End of that discussion.

The point is:

No one heard a DIFFERENCE.

Let me repeat that.

NO ONE HEARD A DIFFERENCE.

If there is an audible difference between these two sample rates / depths, it damn well ought to be audible when the same hardware is asked to play each in turn. If the difference is so bleeding subtle as to be inaudible except under very very special conditions, then that audible difference is far too tiny to be worthwhile.

If you need half a million dollars worth of gear to hear the difference, the difference is pointless.

If the difference can't be heard by a population of people which includes a fair concentration of professional musicians and audio engineers / producers on VERY GOOD GEAR, then as far as I am concerned there is NO DIFFERENCE. Quad ESL-57's are very good. Monarchy SM70 Pro's are very good for driving Quads. Creative X-Fi DAC is more linear / lower noise than the DACs in some very costly outboard D/A boxes, especially at the higher sample rates.

Even if the recording is only mediocre, if the difference is AUDIBLE it should STILL be audible when listening to the downsampled version!

Anyway, have you listened to the recordings I used? They're all available for download. None of the professional musicians- including the guy from the Chicago Symphony - said they were bad recordings.

Perhaps you've heard of the Chicago Symphony? Many consider it the best orchestra on THE PLANET. You think they have guys with tin ears in the Chicago Symphony?

You know, if the difference is really audible, it should show up in statistics even with REALLY LOUSY GEAR and HOPELESSLY BAD RECORDINGS. If you take a cassette and compare it to a CD, the difference is ALWAYS audible on all but the most DREADFUL equipment. I can hear the difference between the CD and the cassette of something on a $100 boombox!

SO: No one so far- in any decent study- has shown that these formats have ANY audible differences. I am certainly willing to admit that maybe some study can be concocted to show some kind of audible difference does exist under very, very special circumstances. However, I am satisfied that no SIGNIFICANT audible differences exist. Certainly nothing that would warrant the cost of replacing my 5000 44.1/16 CD's with a high rez format.
 
milosz said:
It's trivial to verify the DAC was switching. There are two FILES. Each has a different rate / depth. The way the data flows, if the DAC doesn't match the sample rate, it WILL NOT PLAY. Period. End of that discussion.

Perhaps I wasn't clear enough: BY DEFAULT THE SOUND CARD RE-SAMPLES ALL AUDIO TO A SINGLE RATE UNLESS SPECIFICALLY SET TO 'BIT-MATCHED PLAYBACK' MODE. Other ways are available, were they employed? Anyone with prior Soundblaster product experience knows the cards re-sampled everything to 48kHz and were often avoided for music because of it. I ran into the issue at work connecting it SPDIF to a Ramsa digital console. We dropped them for M-Audio.
Iain further brings up a very valid point; what special measures were required to keep the card in 'Bit-Matched Playback' mode while the hardware was accessed by two players simultaneously? Is it possible? Was it monitored and confirmed?
 
Don't get me wrong - I too believe that there is no audible difference between 44.1/16 and higher rates/resolutions. My informal, unscientific explorations have borne that out but I've never used sufficient rigor to be able to present any results.

My rationale is that properly implemented 44.1/16 gets very close to perfect playback of 20 kHz signals and 90 dB of dynamic range when the proper reconstruction filters are used and proper dithering is applied to the recording. If I like to listen at levels of 80-90 dB SPL average and assuming a 20 dB peak-to-average ratio (i.e. old school uncompressed recordings) then the peaks are at 110 dB SPL. 90 dB dynamic range means that the low level detail will be at 20 dB SPL. The ambient noise in my room is 30-40 dB SPL on a good day which will mask much of the low level detail. Why do I need 24 bits?
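
The arithmetic in that rationale can be checked with a short sketch. It uses the textbook figure for an ideal N-bit quantizer driven by a full-scale sine (about 6.02 N + 1.76 dB), which is an assumption brought in here, not something stated in the post. The function names are hypothetical, and the 90 dB working figure is the poster's own, somewhat below the ideal 16-bit value.

```python
import math

def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine,
    roughly 6.02 * N + 1.76 dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

def lowest_detail_spl(average_spl, peak_to_average_db, dynamic_range_db):
    """SPL at which the bottom of the usable dynamic range sits, given the
    average listening level and the recording's peak-to-average ratio."""
    peak_spl = average_spl + peak_to_average_db
    return peak_spl - dynamic_range_db

if __name__ == "__main__":
    print(f"ideal 16 bit: {ideal_dynamic_range_db(16):.1f} dB")
    print(f"ideal 24 bit: {ideal_dynamic_range_db(24):.1f} dB")
    # Poster's numbers: 90 dB SPL average, 20 dB peak-to-average, 90 dB range
    print(f"lowest detail at {lowest_detail_spl(90, 20, 90)} dB SPL")
```

With the poster's numbers the lowest-level detail lands at 20 dB SPL, below the 30-40 dB SPL room noise he quotes, which is the point of his question.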

I am also aware of many potential factors in CD playback, amplifiers and speaker-room systems that can significantly degrade from the ideal 44.1/16 capability, the Microsoft-soundcard interface probably being the worst offender.

I'm not disputing your (or M/M's) results, but if a test is to be accepted by the community then all potential pitfalls must be addressed. The equipment you use is perfectly appropriate and the methodology may well be correct. My point was that there is insufficient information in the write-ups to assure me that all the bases were covered.
 
I agree with Ian. I've seen too many streams resampled on the fly by the player/computer to assume it's working right.

He may well have taken care in this matter - we would hope so, and the rest of the set up seems very good. But at this point we just don't know.
 
There obviously could be problems with the resampling used which could invalidate the conclusions, but for me a telling part of the quoted experiment was this,
"By the way, about half of the audiophiles said they COULD hear a BIG difference between 96/24 and 44.1/16 files when they KNEW which file they were listening to."
 
janneman said:
...edit...
The output of the PC sound card (a Creative X-Fi with quite respectable performance at both 192 / 24 and 44.1 /16 rates) was fed to a pair of Monarchy SM-70 Pro amps in mono, these are good class-A amps. These were driving a pair of ESL-57's, with refurbished panels and HT sections by highly-regarded Quad guru Wayne Piquet. This was in a fairly small, quiet room. The speakers were placed fairly close to the listening position, so listening was essentially nearfield. Detail, linearity, transient response etc of this amp / speaker system is very good. Quad ESL-57's are very revealing
....edit...

Jan Didden
An interesting test, but only one D to A was used, and that one was on a computer sound card. This is usually a very poor environment for analog signals.

Redbook can achieve a 96 dB signal to noise ratio. If the card/DAC can't, then any difference will be obscured by the quality of the sound card.

The sound card should have a digital out stream. Feed this stream to one of the audiophile recognized high end DACs (Levinson, Enteck, Burmester, etc) and repeat the test. Even if the sound card is quite good, until the test sequence is repeated on multiple brands of equipment you have only proved that with a Creative X-Fi card no difference was heard, not that no difference exists.

I add this post because I used to own a Denon 20 bit DAC. With this piece of equipment no difference could be heard between a 14 bit and a 16 bit version of a song on a Stereophile test record. With my Levinson (No. 36 and later a No. 360) the difference is quite noticeable.
 