Measuring the Imaginary

If those statements were true then I would not (rpt not) be able to get a very wide, deep and high soundstage using headphones. The signal arriving at my ears contains all necessary ambient information from the original recording.
Your comments re depth and width are true; your comments re "high" require some qualification...

If you are listening to binaural material, then as I stated in my first response, the situation is different and height information is retrievable - best so if you have a head the same shape as the dummy one used for the recording (and, less well known, the same-sized shoulders too).

If you are listening to stereo derived from, say, a B-format recording with height included, or from a coincident crossed pair, then you could perceive some height information: much as early reflections imply depth, but giving an impression of height instead. The impression you perceive will, however, be significantly influenced by your experience of similar acoustics and is not necessarily related to the actual information in the recording - an effect, if you like.

And if you are listening to stereo derived artificially or from a dry recording, then you are definitely perceiving an effect - deliberately engineered or otherwise.
 
Different recording methods require different reproduction means for best results.
True. It's true in many ways. Maybe someday programs like Roon will contain databases that provide various information on optimal playback processing of particular recordings. Right now we don't have that. Given what we do have today, I still don't find shuffling useful in the general playback case.
 
I still don't find shuffling useful in the general playback case
Purely as an example, Nimbus have a whole catalogue of recordings made with a single soundfield microphone. They are also UHJ encoded, but you can find a UHJ decoder if you wish, or use the so-called "superstereo" mode. Nevertheless, you can use a shuffler with these recordings to move from greater ambience (being there) to greater focus. The choice is yours, and there are no correct settings. Nimbus also supply recording locations and photographs, so you can, for example, compare reproductions of your recordings of Haydn's symphonies to actual performances given in the Haydnsaal, or visit it; that is, hear the very environment in which the symphonies were intended to be performed. Then you will certainly learn to appreciate the value of a shuffler.
 
May I ask if you convert CD to DSD256 or higher with a recording-optimal algorithm and then use a SOA DSD RTZ FIRDAC, or possibly one with SDPWM instead of RTZ, for conversion to analog? If not, you will surely learn to appreciate it when you do. Or maybe you see it as unnecessary and/or even unwanted?

That said, I might consider trying the Nimbus recordings. However, I am a little concerned that there are no correct settings. Doesn't that mean it's an effects box sort of thing? Not that it would bother me personally, but we have some very purist types here in the forum who are in general opposed to effects in playback systems.
 
Just read through and thought I’d drop in a thought. How about:

Set up a system. Then have a listener map their head position (height, distance from walls, etc.) in multiple places throughout the room. Have the listener rate the soundstage on some scale/gradient. Then use a dual-mic system with fake ears, like the one miniDSP make, to take measurements in each position. Then use the detailed measurements to train an AI to “learn” the differences. Once 50 or so room coordinates have been measured, rearrange the system by lowering and raising the speakers. Then move from short wall to long wall, maybe, or rearrange the furniture. Then use a different system in a different room or house.

🤷🏻‍♂️
 
Exactly my interest. Even slight changes to width came at the price of damage to the center image. And it didn't sound natural at all.
Of course there is a crossover frequency adjustment, but any frequencies chosen for shuffling were adversely affected.

Of course the center image is affected: it's the whole principle at play. It's the same when varying the width in a simple 'MS matrixer'. Hence my comment on the source message: if it's not needed then don't use it; it can only be detrimental one way or another. It's not a 'pleasing' or 'aesthetic' effect as used during the production process, but a corrective one. And 90% of commercially available music has already been treated 'the good' way at the mastering stage.
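For readers who have not met an MS width control, here is a minimal sketch of the idea in Python. It is only an illustration: the function name `ms_width` and the `width` parameter are made up for this example, and it assumes plain lists of samples rather than any particular plug-in or hardware matrix.

```python
def ms_width(left, right, width=1.0):
    """Scale the side (L-R) part of a stereo pair by `width`.

    width = 1.0 returns the input unchanged, 0.0 collapses to mono,
    values above 1.0 raise the side content relative to the centre.
    `ms_width` and `width` are hypothetical names for illustration.
    """
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # what both channels share (centre image)
        side = 0.5 * (l - r)   # what differs between the channels
        side *= width          # the only thing the control changes
        out_l.append(mid + side)
        out_r.append(mid - side)
    return out_l, out_r
```

A purely central (mono) sample passes through at the same level, but everything else gets louder or quieter around it, so the balance between centre and edges shifts as soon as `width` leaves 1.0.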

As you have a background in sound engineering, let me use an analogy: an old LC EQ (e.g. the one found on the Neve 1073) will sound pleasing on most sources but will fall short on a 2-track mix (for multiple reasons: lack of finesse in the choice of frequency or Q, subtle distortions induced by the inductors (L) which are 'too much' on a 2-track, ...). On the other hand, if you use a GML 9500 (or a Medicis from Mr Neve) you might miss 'the character' (or it'll be great because of that!), but it'll offer many more options on a 2-track thanks to its accuracy and its lack of a sound of its own. Two cases, two different tools.


Regarding the known science, little if any of it was conducted on systems with Sound Lab ESLs, along with other very high quality reproduction equipment. Unlikely anyone would bother even if they could afford the cost, since most listeners will be listening on more modest equipment anyway. Moreover, the best reproduction equipment available today is measurably better than the best of 40 years ago. Don't even get me started on evolution of high quality CD playback.

In my view there is a misconception about what the science is targeting. What is the point of the study: to understand human behavior and the way our hearing works and processes sound, or to evaluate the gear? If the former, the general quality of the gear has no real importance (within some obvious limits set by what is being studied).
If it is to evaluate gear, then human subjectivity has to be taken out of the equation; hence either measuring tools, or double-blind tests if humans have to be used for the evaluation. As the latter is not easy to implement, measuring tools are most often used, which obviously leads to erroneous interpretations of results, as people think they know how to read graphs and compare things without understanding the limitations. And people then say 'look, this sounds better than that despite that one's better 'numbers', so measurements are a lie...' 🙄 Weird world we are living in.

That said, I might consider trying the Nimbus recordings. However, I am a little concerned that there are no correct settings. Doesn't that mean it's an effects box sort of thing? Not that it would bother me personally, but we have some very purist types here in the forum who are in general opposed to effects in playback systems.

So you think there is a correct location of mics during a recording? How would you define that? ;)
There is no such thing, as there is only preference in that. Such a system enables people to find their OWN best location or, better said, rendering. It's not an effect as such.
 
If it is to evaluate gear...
It's not to evaluate the gear. But the gear should be evaluated first to make sure it is not overly skewing the effort to measure humans. If all we do is say an amplifier with 0.01% THD+N should be good enough, a speaker that is ±2 dB from 20 Hz to 20 kHz should be good enough, etc., then we are making a number of assumptions to the effect that what is easy to measure and commonly measured is all that matters for doing quality research on humans. IOW, it's the "streetlight effect" as it affects scientific research. It's a problem, so say some people, me included. https://en.wikipedia.org/wiki/Streetlight_effect

So you think there is a correct location of mics during a recording?
No, not exactly. There is only the best someone knows how to do with the mics they have. Again, in case it wasn't clear, I am not a purist myself. But a lot of people are. They want a wire with gain, no coloration anywhere, just play back the recording strictly as it is, with no added effects. Now RIAA is okay, because purists understand that. But shuffling? I don't think so.
 
IMHO the very best soundstage can only be achieved by using a test CD with a speaker set-up track, e.g. the XLO Test CD, which eliminates a lot of guesswork. “Sufficient” room treatment is a must, too. Furthermore, I’d opine that generally speakers should be closer together rather than farther apart. A good place to start for small to medium size rooms is 4 feet apart initially, then work your way out. Also, when the ideal locations are determined the speakers should not (rpt not) be toed in or out. Toeing in is usually a sign the speakers are too far apart.

Trying to find the ideal speaker locations without a methodology is like trying to solve x simultaneous equations in x + n unknowns. - audiophile axiom
 
Suppose I want to hear it from the best seats in the house as though I were in the audience. Maybe 6th row, or thereabouts. Maybe first row of the first balcony. Best you can do with that in mind.

OTOH, I believe it's fashionable today to capture some more small, close-up details of the instrument sounds for audiophiles who like that. So maybe more product can be sold with that type of recording?
Maybe the solution is to capture two recordings at once and make it a double-CD album?
 
Stereophonic reproduction contains NO height information whatsoever. Where height information is perceived, it is often ascribable to distortions introduced by the loudspeakers, for example. The same effect is unlikely to be reliably reproduced over different loudspeakers or with different listeners.
My initial caveat: I see you put a caveat or two on this in a later post, but I wanted to discuss this point further.

We can add height cues via processing on the reproduction end, either in DSP or via the frequency response of a passive speaker. With Atmos bouncers you get some degree of both in my understanding, with the speaker response tuned to the HRTF as it pertains to height and additional processing in the AVR to provide the same.

The majority of stereo content we hear is produced, engineered, mixed. In that process, material of any origin can be manipulated along the same lines to provide height information.

All of this could be called distortion, as could a lack of it, depending on perspective. Without listening in the room at the time of mixing and/or final approval, it's hard to determine the intended height of the recording. If an accidental distortion exists in the studio reproduction system playing the final mix, then we should be listening with that same distortion at home--the lack of distortion is distortion.
 
you are listening to stereo derived from say a B-format recording with height included, or derived from say a coincident crossed pair, then you could perceive some height information much like the early reflections imply depth but to give an impression of height instead. The impression you perceive will, however, be significantly influenced by your experience of like acoustics and not necessarily related to the actual information in the recording - an effect if you like.
I’m saying the soundstage is, ideally, a 3-dimensional sphere or hemisphere, with coordinates x, y, z, that’s produced by the acoustic information captured during the recording, including echo, decay, secondary reflections, etc. That’s why on a lot of those early RCA and Mercury recordings you can identify the symphony hall where they were recorded. It’s just that retrieval of ambient, dimensional information like “air” is usually hampered by a great many factors in practice. I suspect few people get to experience the complete holographic 3-D imaging because of all the problems I’ve already detailed, such as room anomalies. That’s why I’m jumping up and down like a little girl regarding my phenomenal new soundstage on headphones. 20 feet in diameter of the expanding sphere! And it’s so much more than soundstage: it’s dynamic range, tonality, definition and depth of low frequencies, separation of instruments. Hel-loo!
 
Of course the center image is affected...
Not clear to me if it has to be that way with a shuffler. What is the waveform of the shuffling control signal? Is it sinusoidal, triangle, what? What if it were, say, sort of hyperbolic? Slow or compressed in the center but a little more intense at the extremes. Maybe it could be shaped with Bézier curve control points? Would it be perceptually any different from simply changing the width statically with MS processing?
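To make the comparison with static MS width concrete, here is a rough Python sketch of a frequency-dependent width control. It rests on assumptions: it takes "shuffler" to mean a fixed (non-modulated) process that changes the width only below a crossover frequency, and it swaps in a simple one-pole low-pass for whatever filters a real unit uses. The name `shuffle`, the 700 Hz default and the 1.5 low-band width are arbitrary illustrative choices, not settings from any actual device.

```python
import math

def shuffle(left, right, low_width=1.5, crossover_hz=700.0, fs=48000.0):
    """Widen only the part of the side signal below `crossover_hz`.

    Illustrative sketch only: one-pole split, hypothetical defaults.
    """
    # One-pole low-pass coefficient for the chosen crossover frequency.
    a = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / fs)
    low = 0.0
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r)
        low += a * (side - low)          # low-passed side signal
        high = side - low                # complementary remainder
        side = low_width * low + high    # only the low band is rescaled
        out_l.append(mid + side)
        out_r.append(mid - side)
    return out_l, out_r
```

With `low_width=1.0` this reduces to the unprocessed signal, and if the same factor were applied to both bands it would be ordinary static MS width; the only difference in this sketch is that the factor depends on frequency.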
 
However, I am a little concerned that there are no correct settings. Doesn't that mean it's an effects box sort of thing?
There are no correct settings because it is a matter of preference. The balance of direct to reverberant energy is generally an engineering choice. The advantage of soundfield recording is the ability to change the balance "in the mix". What makes this possible is preserving the energy of the content, i.e. the information. So changing the balance is still reproduction of the highest fidelity regardless of personal preference - which might even change from listen to listen.
 
then you hear your room playing a trick on you, not your loudspeakers which image outside the stereo triangle.
Conventional: yes. True: no. On my system, simple 2-mic recordings employing physical acoustic shadowing techniques can create immense enveloping sound fields in nearly all directions. DG multi-mic classic/horror shows on the same system are flat as a pancake, a musicians-on-a-clothesline affair. Same room, same walls.
Two-mic, two-speaker stereo is a very intuitive and very broken protocol for delivering much of the auditory localisation information available to a listener at the original event, unless interaural delays and acoustic head shadowing, at a minimum, are captured by the recording. It's basic to how we hear.
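To give a feel for the size of the interaural delays mentioned here, the sketch below uses Woodworth's classic rigid-sphere approximation, ITD = (r / c)(θ + sin θ), for a distant source at azimuth θ between 0° and 90°. The head radius of 8.75 cm and the speed of sound of 343 m/s are assumed typical values, not measurements from anyone in this thread, and the model ignores head shadowing (level differences) entirely.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference in seconds.

    Woodworth's spherical-head formula, valid for a far-field source
    with azimuth between 0 (straight ahead) and 90 degrees (to one side).
    The default radius and speed of sound are assumed typical values.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))
```

With these defaults the delay grows from zero straight ahead to roughly 0.65 ms for a source directly to one side, which is the scale of timing cue a recording technique has to preserve.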
 
we are making a number of assumptions to the effect that what is easy to measure and commonly measured is all that matters for doing quality research on humans.
I would suggest the most important misconception is the assumption of linearity in our hearing. Our learning capabilities are (predominantly) irreversible and therefore very non-linear. Unless rigorous blind testing is carried out on a per-listener basis (which will often be extremely difficult in practice), there is little reliable information to separate the well-trained listener from a delusional one when we approach commonly accepted hearing thresholds. Regardless of our physiology, we all have a capability to train our hearing on minute details that an untrained listener would find inaudible. Likewise we all possess a substantial capability to delude ourselves into hearing something we are not. Perception and sensation are not the same thing...
 
I’d opine that generally speakers should be closer together rather than farther apart.
Actually the best imaging results when different drivers in a multi-way system are different distances apart because we localise sounds via different mechanisms at different frequencies. A shuffler would likely be a better way of achieving the desired target, however (and does not preclude single diaphragm drivers from gaining the same benefit either).
 
As I’ve been commenting the last few weeks the Schumann frequency is critical to getting all the information out of the recording. Synchronizing the brain’s neurons and synchronizing the left and right hemispheres of the brain improves the brain’s functionality in very interesting ways for audio. The brain processes information better, focus is much improved, sensory perception is improved, vision and hearing. A key element in achieving a detailed, dynamic and living soundstage.
 