What is the point of expensive coaxials with ragged response?

P.S. More than once at an art gallery, with two paintings side by side viewed with one eye shut, one painting would be holographic and the other emphatically not. The former had probably been painted using a single-eye perspective technique. This holds for both Western art and Chinese -- especially the famous Early Qing painter Shitao, whose bamboo was stunningly 3D-correct, like a model. Shitao called his method The One Method but, misleadingly, never disclosed his actual techniques. Now, AI has been successfully trained to reconstruct a 3D spatial scene from a 2D picture....
 
I think this is right, the ears-brain (and eyes-brain, see below) has evolved to snap-to-attention and localize predator/prey even amidst a background-noisy environment. The coherence of a sound not its loudness is probably what toggles the brain.

How do you define 'coherence'? I mean, I kinda agree, but the term probably has a different meaning for each of us.

Since measurement tools and methods haven't fought for survival over millions of generations (in fact only four human generations), stereo sound perception by machine analysis is primitive at best, compared to animals.

This is highly questionable because recorded and reproduced stereo is an illusion. You can't directly compare reality to the illusion created by recording/reproducing the effect.

When a stereo sound is both coherent and extends to a high enough frequency (hence more directional and more attenuated by distance), the soundstage gains holographic, focused depth; otherwise it sounds flat, like a 2D picture seen with both eyes (the brain detects no parallax).

Sorry, but no. It's all about the type of microphone pair used and where it is located with respect to the source. It is an illusion... and you can't draw a parallel between how our brain reacts to sight and to sound. There is some similarity for sure, but that's all.

Here are some examples of the different renderings these different microphone pairs give when recording the same instrument; as you'll hear, some have 'depth' and some do not... same room, different locations and different techniques used...
http://recordinghacks.com/2010/04/03/drum-overhead-microphone-technique-comparison/


This psycho-acoustic effect is maximally facilitated by the so-called stereo-triangle placement, with the speakers toed in to aim axially at the ears from the front-L/R directions where hearing is most acute (more than straight ahead or 90 deg to the side). When everything clicks, the effect can be incredible.

Equilateral-triangle placement of the loudspeakers and listening position is not because of hearing: it's a prerequisite for the illusion to happen.
Toe-in has nothing to do with how we perceive this illusion or how our auditory system reacts to it: it's a technical tradeoff/compromise of the loudspeakers' own parameters/properties (e.g., a design goal where the system's frequency response is optimised with a toe-in of some arbitrary angle, or where the toe-in, in conjunction with the room size and loudspeaker location, produces a given delay of early reflections, which will offer a certain 'kind of rendering' -- see page 7 of this document, which is worth a read as a whole imho: https://pispeakers.com/Pi_Speakers_Info.pdf ).

We all know this. Yet the sum-quality of stereo imaging has never been machine-measured -- only some putative component parts like frequency response and phase.

I think that's wrong. First because many don't know this ( 😉 ), and second because I've come across tools in the pro world which help assess stereo imaging.
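For what it's worth, the simplest of those pro tools is the phase-correlation meter, which just computes the normalized correlation between the L and R channels. A minimal Python sketch of the idea -- the function name and test signal are mine, not any particular product's:

```python
import numpy as np

def phase_correlation(left: np.ndarray, right: np.ndarray) -> float:
    """Phase-correlation meter value in [-1, +1]:
    +1 = mono (identical channels), 0 = decorrelated,
    -1 = out of phase. The standard pro-audio stereo meter."""
    num = np.sum(left * right)
    den = np.sqrt(np.sum(left**2) * np.sum(right**2))
    return float(num / den) if den > 0 else 0.0

# A mono signal panned center reads +1; a polarity-flipped copy reads -1.
t = np.linspace(0, 1, 48000, endpoint=False)
sig = np.sin(2 * np.pi * 440 * t)
print(phase_correlation(sig, sig))   # ≈ +1.0
print(phase_correlation(sig, -sig))  # ≈ -1.0
```

Real meters run this over short windows so the needle tracks the program material; the one-shot version above is just the core computation.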

What has surprised me is that monophonic sound can have depth well beyond the speaker, in the sense of a psycho-acoustic effect (when things click: coherent phase and high frequencies, both), a palpable spatial sense where every sound has its natural place in relation to the others.

Depth in a recording depends on cues from the space in which the recorded material was originally played. In other words, it depends on the location of the microphones with respect to the source and the room. Early reflections (and the comb filtering/coloration they bring) play a big role in how we perceive this.
It has been easy to simulate for many years now, thanks to reverberation effects (mono or stereo).
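As a toy illustration of that simulation, here's a minimal Python sketch that gives a dry mono signal a sense of "depth" by convolving it with a sparse early-reflection pattern; the delays and gains are made-up values, not measured from any real room:

```python
import numpy as np

def add_depth(dry: np.ndarray, fs: int = 48000) -> np.ndarray:
    """Fake 'depth' on a mono signal by convolving it with a sparse
    early-reflection impulse response (hypothetical taps)."""
    # (delay in ms, gain) -- illustrative values only
    taps = [(0.0, 1.0), (11.0, 0.35), (17.0, 0.28), (23.0, 0.22), (31.0, 0.15)]
    ir = np.zeros(int(fs * 0.05))          # 50 ms impulse response
    for ms, gain in taps:
        ir[int(fs * ms / 1000)] = gain     # place each reflection tap
    return np.convolve(dry, ir)            # dry signal + delayed echoes
```

A real reverberator adds a dense decaying tail after these early taps, but even this handful of reflections is enough to push a dry mono source "behind" the speaker.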


I can detect this phenomenon if, and only if, the speaker is coherent (such as a decent fullrange driver, especially pointing up).

I doubt this: I can send you mono files from synthesized sources (including synthesized human voices) where you'll easily hear depth whatever loudspeakers you use.

With a fullrange pointing at your ceiling, you are generating a particular pattern of early reflections which gives a sense of 'envelopment'; your room and loudspeakers work together to generate a feeling of depth, not really different from the one I can create using a reverberator unit. It can be pleasing, but it's an 'effect' you add to the original message. It can work wonderfully on some sources, and not at all on others.

The fact that your loudspeakers are 'coherent' (depending on your definition of it) may have an effect on your impression of it, maybe. But most of this effect comes from the interaction with the room.


Then when two such speakers play stereo the soundstage is spread out horizontally as well.

No. The effect is then also produced by lateral early reflections. This has a dramatic consequence: since there is a 99% chance the loudspeaker-to-ceiling distance is smaller than the loudspeaker-to-sidewall distance (or at least to one of the sidewalls), you shift the time delay of those early reflections, which ends up as a different 'flavor' of 'envelopment'. Here again it can be beneficial or detrimental, depending on your preferences regarding the reproduced signal.
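To put rough numbers on the ceiling-vs-sidewall point: a first-order reflection's extra delay follows from the image-source construction. A small Python sketch with hypothetical room distances (the geometry and figures are mine, purely illustrative):

```python
import math

C = 343.0  # speed of sound in m/s at ~20 °C

def extra_delay_ms(d: float, h_src: float, h_lst: float) -> float:
    """Extra arrival time (ms) of a first-order reflection over the
    direct sound. Image-source construction: d is the source-listener
    separation projected onto the reflecting surface; h_src and h_lst
    are their perpendicular distances from that surface."""
    direct = math.hypot(d, h_src - h_lst)
    reflected = math.hypot(d, h_src + h_lst)   # path via the mirror image
    return (reflected - direct) / C * 1000.0

# Hypothetical setup: speaker/ear ~1.2-1.3 m below the ceiling,
# but ~2 m from the nearest sidewall, listener 2.5 m away.
ceiling = extra_delay_ms(2.5, 1.2, 1.3)
sidewall = extra_delay_ms(2.5, 2.0, 2.0)
print(ceiling, sidewall)  # the ceiling bounce arrives a few ms earlier
```

With these made-up distances the ceiling bounce trails the direct sound by roughly 3 ms versus roughly 6.5 ms for the sidewall, which is exactly the kind of timing shift that changes the 'flavor' of envelopment.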

If you allow me a food analogy, it's like using spices: sometimes a bit too much of them can kill the meal you cook, and sometimes they make it sublime.
If your goal was to have the raw flavour of the ingredients, then it's no good.

The easiest (only?) way I know of to achieve this level of imaging is by the sequence of steps given earlier: first, time-align acoustic centers; second, tweak XO to align phase around XO frequency (1st-order being best for linear phase and amplitude) and to ensure flat high frequency response.

Easiest if using passive crossovers. DSP with FIR filters allows this with even greater accuracy, imho.

Is it possible to time-align acoustic centers without correcting/normalizing both drivers' phase first? Well, I checked the KEF LS50 Meta 1793 coaxial for acoustic-center alignment by using two speakers, one as midbass and the other as tweeter, without XO, and confirmed a test-tone max sum at zero offset (it's possible I got lucky by playing the intended XO frequency). So I think time-alignment this way works.

You got lucky to own a 1793. 😉 I know because I own a pair too, plus some Tannoys, and I have used a number of coaxes of P.A. origin in various setups, and I can confirm this is one of the very few coaxes which is almost coincident for both drivers. In other words, you can use them without needing DSP for time alignment. But even Tannoy offer DSP presets for time alignment in their loudspeaker range.

That said, as pointed out before, observing a max sum or a null does not prove that the drivers are time-aligned. The frequency domain cannot display time-domain information.
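That ambiguity is easy to demonstrate: the summed level of two equal tones is periodic in the offset between them, so a single test tone cannot distinguish zero offset from a full-period offset. A quick Python sketch (my own toy illustration, not anyone's measurement method):

```python
import numpy as np

fs = 96000
f = 4000.0                  # test-tone frequency in Hz
t = np.arange(fs) / fs      # one second of time samples

def summed_rms(offset_s: float) -> float:
    """RMS level of two equal tones summed with a time offset
    between them -- what a 'max sum' test actually observes."""
    a = np.sin(2 * np.pi * f * t)
    b = np.sin(2 * np.pi * f * (t - offset_s))
    return float(np.sqrt(np.mean((a + b) ** 2)))

# Zero offset and a full-period offset (1/f = 0.25 ms) both give the
# maximum sum -- indistinguishable from this tone alone:
print(summed_rms(0.0), summed_rms(1.0 / f))
# A half-period offset gives a null instead:
print(summed_rms(0.5 / f))
```

So a max sum at one frequency only pins down the offset modulo one wavelength; that's why the two-frequency approach discussed above (or a true time-domain measurement) is needed.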


Now, stereo vision. Twenty-five years ago I serendipitously discovered a surprising psycho-visual effect: monocular viewing of a picture, with just one eye, gave a strong sense of depth. Back then I was involved in 3D stereo photography (I won an internet 3D-photo contest for macro), so I had truly mixed feelings -- the cheapest 3D viewer was just shutting one eye! With both eyes open and no parallax, the brain says "flat", no second thought. With just one eye open, the brain amplifies the information and reconstructs the scene in real time, to a perceived depth of ~65-70% of that of a true stereo pair of images seen through a 3D viewer.

With sound, maybe it's similar -- sound-field depth can be perceived from a monophonic source, under the right conditions of information content and internal consistency, or "coherence".

You can't compare vision with ears.

Think about the fact that you can trick the eyes/brain with 25 pictures per second. I can assure you that you would hear a very real difference between a 25 Hz sampling rate and a 44100 Hz or 96000 Hz one. 😉

Another example is morphing: when M. Jackson released the video for 'Black or White' in 1991, most people couldn't believe what they saw.

Circa 2005 we had a bunch of audio plug-ins claiming to morph between sounds... I was never convinced by the effects at the time. The technology is now mature and you can trick our ear/brain with such things, but it requires a lot of computing power and took almost 30 years more research than for the eyes....

In a way, our ears are much more difficult to trick about some things than our eyes. Stereo is not one of them. Blumlein's experiments (the first stereo pair ever tried) date from the end of the 1920s, iirc. But like morphing, it's an illusion, a trick which is not that difficult to achieve, but not easy either.

If you are interested in how this illusion is created and the issues with it, I suggest you read this, which is a nice introduction for 'real' sources:

https://web.archive.org/web/2011071....com/images/uploads/The_Stereophonic_Zoom.pdf
 
Assuming a frequency band of interest where a driver's acoustic center, and the phase thereof, are (approximately) unchanging, elementary geometry allows both -- i.e., the time offset -- to be solved from phase measurements at just two frequencies (it's easier to think in terms of wavelengths). I've never taken a course on audio, but this is just math. For two drivers without a XO to be phase-aligned at two frequencies (within said band of interest), they must be time-aligned and in-phase at their common acoustic center; otherwise there would be group delay.
 
Not in this thread, but elsewhere I've conjectured that the active ingredient of a holographic sound-field may be the background "air" (whatever that is: sense of space, ambience, hall sound/echoes, reverberant field, room-effect speaker design...), because the fundamental frequency range (and harmonics) of the musical instruments played (e.g. double bass, cello, or violin) made no difference at all to the optimum driver offset for achieving the holographic effect. If optimizing for time-alignment, coherence, etc. relies on paying attention to, or amplifying, this "air" -- for example by up-firing -- perhaps a recorded track of just ambient "air" would help even more as a tool.
 
For two drivers without XO to be phase-aligned at two frequencies (within said band of interest) they must be time-aligned and in-phase at their common acoustic center; else there would have been group delay.
For the more general case of unknown acoustic center offset and unknown phase thereof, I've thought of a solution using only max-min-sum phase-alignment at two frequencies against a driver with known values serving as a standard.
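Assuming unwrapped phase and a band where the acoustic center is constant (as stated above), the two-frequency solution reduces to two linear equations in two unknowns. A Python sketch of that idea -- the function and the example numbers are mine, purely illustrative:

```python
import numpy as np

def solve_delay_and_phase(f1, phi1, f2, phi2):
    """Given unwrapped phase measurements phi(f) = phi0 - 2*pi*f*tau
    at two frequencies, solve the two unknowns: tau (time offset of
    the acoustic center) and phi0 (intrinsic phase)."""
    tau = (phi1 - phi2) / (2 * np.pi * (f2 - f1))
    phi0 = phi1 + 2 * np.pi * f1 * tau
    return tau, phi0

# Round trip with made-up values: 0.12 ms delay, 30 deg intrinsic phase
tau_true, phi0_true = 0.12e-3, np.deg2rad(30)
f1, f2 = 2000.0, 3000.0
phi1 = phi0_true - 2 * np.pi * f1 * tau_true
phi2 = phi0_true - 2 * np.pi * f2 * tau_true
print(solve_delay_and_phase(f1, phi1, f2, phi2))  # recovers (tau, phi0)
```

The practical catch is the "unwrapped" assumption: measured phase comes wrapped modulo 2*pi, so the two frequencies must be close enough (or the rough delay known well enough) that the wrap count is unambiguous.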
 
What you describe as holographic sound resembles awfully much what I have here when I reduce the listening distance, and what David Griesinger lectures about. Basically, when the auditory system picks up an important sound, it gets its own neural stream, and the rest of the sounds -- the room -- gets another, background stream. Only when this happens is there envelopment in the perception: the room sound, what you call background air. When this stream separation is active, the brain gives you full detail on the foreground stream (the direct sound) and kind of suppresses the room noise into the background, but you can actively listen to both.

If you back further away from the speakers -- I mean move physically further away -- so that the early reflections become loud enough compared to the direct sound, you lose it. The auditory system doesn't pick the direct sound out of the "noise" anymore; it basically considers the sound noise, not important, and the room and direct sound end up in one neural stream, not worth having their own and wasting resources. Perceptually, the sound now localizes in front as a hazy, not-that-well-defined thing, and doesn't have the envelopment. Your recording, electrical signal chain and speakers are still the same as one step before, but your brain's processing changed as you moved further away!
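A rough way to put a number on "how far is too far" is the classical critical distance from room acoustics, where direct and reverberant levels are equal (related to, though not identical with, Griesinger's Limit of Localization Distance). A Python sketch with made-up room figures:

```python
import math

def critical_distance_m(room_volume_m3: float, rt60_s: float,
                        directivity_q: float = 1.0) -> float:
    """Distance at which the direct and reverberant sound levels are
    equal, in the Sabine diffuse-field approximation:
        d_c = 0.057 * sqrt(Q * V / RT60)
    Beyond roughly this distance the room dominates the direct sound."""
    return 0.057 * math.sqrt(directivity_q * room_volume_m3 / rt60_s)

# Hypothetical living room: 60 m^3, RT60 = 0.5 s, speaker directivity Q = 5
print(round(critical_distance_m(60.0, 0.5, 5.0), 2))  # → 1.4 (metres)
```

So with these illustrative figures the "step inside the hologram" transition would plausibly sit somewhere around a metre or two from the speakers, which matches the shrink-the-triangle advice: higher-directivity speakers (larger Q) or a deader room push it further out.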

Could you do a listening test? Start slowly moving backwards from your listening spot, eyes closed: do you lose the holographic sound you describe?

In my experience this happens fast, in one step, on the constant-directivity system I have. It feels like stepping inside the holographic sound, or out of it. Perceptually it is quite exactly what Griesinger calls the Limit of Localization Distance, thus I think that's what I have here: the brain switching state.

This is the missing link in audio playback, in my opinion: people do not consider their own brain's processing before the perception. It is in the audio chain just like the electronics, speakers and room -- the last link -- and it processes everything into your conscious perception. Thus, in my opinion, holographic sound is partly the speakers, but also the room and also the brain, and it doesn't happen when the brain cannot lock in. So the brain is the most important thing in the audio playback chain, and it means the speakers and room must be such that this can happen in the brain.

Imagine this: people chase this perceptually great audio all their lives, changing speakers and amplifiers, never understanding it might be right in front of their noses -- just take a few steps forward, shrink the listening triangle until your brain can pick the direct sound out of all the early reflections (assuming the speakers are fine).

Most hifi systems are set up so that this cannot happen: they are optimized to "enhance" the sound by utilizing lateral early reflections to widen the soundstage, which is one good sound, but it completely prevents this better one from happening. Which is fine -- this is a relaxing sound, as the brain doesn't pay attention to it 😉, and some records sound better this way.

Hence, the best thing a hifi enthusiast can do is to find the transition (listening distance) where the brain switches state and mark it down. Then one can switch the sound by moving the listening chair a bit, and can do so per recording and per mood. There is no reason one should optimize for just one of these perceptual sounds.

There are lots of implications from all this, and I've been writing about it in many posts here and even more on ASR, so I'm not going to make this post any longer 😀

If you find this relevant to what you perceive there, please comment.
 
I dug out the 15" EchoTech big-basin (actually more than meets the eye) and realized I could just bounce the tweeter off the dustcap, so I gave it a try. I whipped up a series 1st-order XO around 4 kHz (tweeter in reverse polarity -- for convenience or some subconscious rationale, I'm not sure), and simply moved the ceramic dome around, pointed at the dustcap, while playing test tones 1k-12 kHz or high violin music. I got feedback very quickly -- high frequencies do bounce, and this way I got pretty even response to 11.5 kHz (the limit of my hearing confidence) and even dispersion to ~25 deg on the side not in line with the tweeter. I fitted a steamer rack and also tried a flat disk over the dustcap, etc. For this quick test the listening distance was up to ~1.2 m (standing with head lowered, looking sideways). The tweeter was a couple dB less sensitive than the widerange but acceptable once time- and phase-aligned; before, with the tweeter facing front, I had made a plastic-tape whizzer for it. Sounded great. Next holiday I might try comparing the two versions.

Of course, if I kept the 15" up-firing I'd just point the tweeter toward me (LX).

I suspect that might be indeed the best possible nearfield setup. But my wife would kill me.
Have you ever heard anything close to this by quality but in more practical shape?
 
I suspect that might be indeed the best possible nearfield setup. But my wife would kill me.
Have you ever heard anything close to this by quality but in more practical shape?
Thanks. For near-field wide-dispersion correct tonality without noticeable comb-filtering, many of my "minimalist" LX sound good enough (say 1/3 to 1/2 as good as the PrimeRadiant Axia). But the very high frequencies >10 kHz, necessary for perfect transients, the finest detail, and holographic imaging, are beholden to the "tweeter" geometry and entirely directional -- in my case solved by reflection off a convex spherical surface. (I obtained a large Peavey constant-directivity horn for comparison; excellent, but it has limits, such as vertical dispersion -- none.) The "snow-cone" MAOP10 I posted last night has several caveats (to be detailed later) but is otherwise potentially 2/3 as good, perhaps with a more sophisticated reflector.

(Sadly, after a final early morning listen, I damaged the cone rushing to put it away....)

 