MTM vs coax for point-source design

e.g. a step response. The problem with it, and the reason it lacks popularity and credit, is that there is no reference and no unit for measuring its quality.
The benefit of a good looking step response, or square wave etc., is that it is appealing to the eye. The ear is another story.

As it turns out group delay is often used, even if in a slightly different way than above. We know that we cannot hear group delay variations below a certain amount so it is a figure of merit regarding transient response.

I just have a tendency not to take prisoners 🙂
Any conscientious designer will do that wherever they don't have a reasonable awareness of the consequences.
 
The benefit of a good looking step response, or square wave etc., is that it is appealing to the eye. The ear is another story.
Right Allen, how would you back this slightly authoritative opinion? 🙂 Is it your experience, and/or some publications?
I'd be more than happy to have a look, I love reading new stuff in this subject. From what I have read and deliberated it actually is very much the story for the human ear 🙂
 
Could you please share a link to a report you think most convincingly proves this point? It is quite interesting for me, as I would rather say the timing aspects are severely underestimated as a condition for high sound fidelity.
Not easy to do a search here. But the posts following yours do a better job of exploring the question from a listener's point of view although still theoretical (AKA speculative).

The main band of concern is the crossover region because that's where drivers overlap. But the performance of the electric crossover circuitry, the walls (AKA DI, directivity index), where you are sitting, and the fact that you are using both of your ears all suggest that moving a tweeter around (mechanically or electrically) is just a little kludge in a large mess of impacts.

The fix for anybody seeking the best in fidelity is having no XO by using an electrostatic speaker covering 5 octaves or so (especially if you can reach as low as, say, 120 Hz). Or at least to choose a design with a single driver covering a broad middle band so there is no problematic XO in the middle. It is easy and inexpensive to find such a driver and then easy to add good bass and treble with specialized drivers.

B.
 
how would you back this slightly authoritative opinion ?
You'll see I also mentioned group delay, and said we can't hear variations below a certain amount. Now, group delay and step response and square waves are related to each other. Therefore these variations are allowed to show up in the other forms. Do you also feel the same way about this group delay issue or is it something else?
 
Yeah, there is about 1/4 wavelength of leeway around the crossover before issues show up in the design-axis frequency response. If the phase difference between the drivers exceeds about +/-90 degrees in the direction you are inspecting, it shows as a dip in the frequency response, simple as that. If the drivers are not coincident, there is going to be a dip anyway in some directions along the axis the drivers are stacked on. One needs to take into account the path length difference from each driver to the ear: if the tweeter is at ear level and the woofer below, then the distance to the woofer is longer, and this makes the difference in phase. Or just lift the speaker up to equalize the distance from both drivers to the ear. With coincident drivers there are not many issues with this, except if the tweeter is crossed over too high, e.g. the woofer is too big and its response starts to narrow below the crossover.

Example in numbers: the wavelength at 2 kHz is ~17 cm, so the path length from the drivers to the ear can vary by +/-4 cm (+/-0.125 ms) and still give a pretty much flat response towards the ear. Basically it's the frequency response in other directions, or lobing, that one sees on polar maps, which is the result of phase varying with angle at the crossover. This in turn affects the response at the listening position, because first reflections have a different frequency response than the direct sound.
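A rough sketch of the arithmetic above, for anyone who wants to try other crossover frequencies. It assumes a speed of sound of 343 m/s and two ideal, equal-level sources; the function names are made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def wavelength_m(freq_hz):
    """Wavelength in metres at the given frequency."""
    return SPEED_OF_SOUND / freq_hz


def path_tolerance(freq_hz, fraction=0.25):
    """Return (distance in m, time in s) for a fraction of a wavelength."""
    dist = wavelength_m(freq_hz) * fraction
    return dist, dist / SPEED_OF_SOUND


def summed_level_db(phase_deg):
    """Level of two equal-amplitude sources with a phase offset,
    relative to the perfectly in-phase sum (0 dB)."""
    half = math.radians(phase_deg) / 2.0
    return 20.0 * math.log10(abs(math.cos(half)))


fc = 2000.0
dist, delay = path_tolerance(fc)
print(f"wavelength at {fc:.0f} Hz: {wavelength_m(fc) * 100:.1f} cm")
print(f"quarter wavelength: {dist * 100:.1f} cm, {delay * 1000:.3f} ms")
print(f"level at 90 degrees offset: {summed_level_db(90.0):.2f} dB")
```

A quarter wavelength corresponds to 90 degrees of phase, which for two equal sources is about a 3 dB drop relative to the in-phase sum. Real drivers with crossover filters will differ from this idealized case.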

Then there is the effect of group delay, but studies have demonstrated that it needs to be quite large to become audible. This means that if there is a single low order crossover at 2 kHz in a speaker system, the group delay shouldn't matter much for perceived quality, in which case any perceived difference would come from the frequency response, either on the design axis or off-axis through first reflections. Basically, watch out for the power response / directivity index.

To me this simply means that one should make the speaker so that the DI is smooth. In addition, try to make the sound towards the first reflection points as similar to the direct sound as possible. Now the timings shouldn't matter much, since they were fixed just by watching the frequency response plots. It's pretty easy to notice anomalies in frequency response, i.e. linear distortion. There is the case where high order crossovers and multiple ways in the speaker (or any stupid mistakes, for example) add so much excess group delay that it starts to be audible, but again this shouldn't be much of a problem with passive crossovers because the slopes are probably quite low order. With modern DSP it is possible to wipe out the excess group delay at will (FIR), but it's also possible to create bad excess group delay, so just be aware of what it is on the system. Hence the first priority is to build the thing so that, with the crossover, the system has a very smooth DI and a nice smooth frequency response in the important directions. Then just check that nothing stupid is going on with the filters, so that excess group delay is reasonable, and one should be golden. If not, it's time to plan the system, the physical construct, again so that such sound is possible!

Well, fact check please, thanks 😀 Here are some quick sims with ideal coincident drivers. When the crossover order is low, even a small change in delay shows up quickly in the frequency response, and even with quite steep filters a delay past 1/8 wavelength shows up in the design-axis frequency response as >1 dB. For example, +/-2 cm or ~0.06 ms at 2 kHz with 4th order LR filters makes the axial response vary ~1 dB. So the tolerance is about twice as tight as I guesstimated in the previous post, i.e. half the leeway.
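The 4th order LR case can be approximated without a simulator. The sketch below is not the VituixCAD sim from this post; it assumes ideal textbook Linkwitz-Riley transfer functions, ideal coincident drivers, and a pure delay on the high-pass branch, and the function names are made up for illustration.

```python
import cmath
import math


def lr4_sections(f, fc):
    """Return (lowpass, highpass) complex responses of an ideal
    4th-order Linkwitz-Riley pair (two cascaded 2nd-order Butterworths)."""
    s = 1j * f / fc                        # normalized complex frequency
    b2 = s * s + math.sqrt(2.0) * s + 1.0  # 2nd-order Butterworth denominator
    lp = 1.0 / (b2 * b2)
    hp = (s * s) * (s * s) / (b2 * b2)
    return lp, hp


def worst_dip_db(fc, delay_s, f_lo=200.0, f_hi=20000.0, points=2000):
    """Deepest dip (dB) of the summed response when the high-pass
    branch is delayed by delay_s, scanned on a log frequency grid."""
    worst = 0.0
    for i in range(points):
        f = f_lo * (f_hi / f_lo) ** (i / (points - 1))
        lp, hp = lr4_sections(f, fc)
        total = lp + hp * cmath.exp(-2j * math.pi * f * delay_s)
        worst = min(worst, 20.0 * math.log10(abs(total)))
    return worst


print(f"no delay:      {worst_dip_db(2000.0, 0.0):.2f} dB")
print(f"0.06 ms delay: {worst_dip_db(2000.0, 0.06e-3):.2f} dB")
```

With zero delay the ideal LR4 pair sums flat, and with 0.06 ms of delay the dip lands in the same ballpark as the ~1 dB seen in the sims. Changing `delay_s` shows how quickly the dip deepens.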

It's also very interesting to have the impulse response window open in VituixCAD and play with the delay to see where the hair is. Note also the group delay when the "tweeter" is leading or lagging the "woofer". I have no idea how audible these are. Peak pressure in the initial hit of the step response varies some: it seems to be much less if the tweeter leads the woofer and not that different if the tweeter lags some. Group delay looks worse the other way around, with a more distinct bump when the tweeter lags the woofer. Frequency response stays about the same with either "positive or negative phase difference" as the drivers are coincident. In the non-coincident case the lobe would steer up or down depending on which driver is leading.

Contrary to my previous post quoted above, "timing" needs to be much closer than I thought, especially with low order filters, in order not to show up in the frequency response plot.

distance-LR12-pos.png, distance-LR12-neg.png, distance-LR24-neg.png, distance-LR24-pos.png, distance-LR48-neg.png, distance-LR48-pos.png
 
Yeah, we need first and foremost to agonize over the perceived sound in a room. This means figuring out the off-axis response of any system, which as one lump is just the power response. And to be able to judge what kind of off-axis response is needed, one should know about psychoacoustics and perhaps have some preferred sound to aim for, perhaps optimizing some angles over others, and it's somewhat frequency dependent as well.

Those who like MTM style speakers probably like them because of fewer vertical reflections and more horizontal reflections somewhere in the critical midrange. Coaxial is different: it's got roughly as wide vertical as horizontal directivity, depending on the enclosure, so about as much vertical as horizontal reflection arrives at the listening spot.

We gotta remember this mostly affects the midrange: above the frequencies whose wavelength matches the speaker size and below the beaming of the treble, or more likely up to a few kilohertz, above which the "comb" gets so tight that the hearing system averages it out. This could be roughly between, say, 300 Hz and 3 kHz or something; make a WMTMW with narrow response over this bandwidth and most of the advantage is reaped. Conversely, reducing vertical reflections past a few kilohertz has diminishing returns in this regard (not knowing all the effects it might have on perceived sound, just looking at the effect on timbre as a frequency response anomaly).

Attached is a simulated response of a comb filter similar to what a single floor reflection would make with the direct sound (ideal sources in VituixCAD, one delayed by 50 cm relative to the other, but that is irrelevant as the intention is just to demonstrate that the hearing system would smooth out the tight comb). With REW's psychoacoustic smoothing it smooths out past a few kHz.
View attachment 1062878, View attachment 1062879
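For a feel of where the comb gets "tight", here is a small sketch that lists the notch frequencies for a given path length difference and checks how far apart neighbouring notches are in octaves. It assumes 343 m/s for the speed of sound, and the 1/3 octave threshold is only an assumed stand-in for the ear's frequency resolution, not what REW's psychoacoustic smoothing actually does.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def comb_notches(path_diff_m, f_max_hz):
    """Notch frequencies of direct sound plus one equal-level reflection.
    Notches fall where the reflection is an odd number of half cycles late."""
    delay = path_diff_m / SPEED_OF_SOUND
    notches = []
    n = 0
    while (2 * n + 1) / (2.0 * delay) <= f_max_hz:
        notches.append((2 * n + 1) / (2.0 * delay))
        n += 1
    return notches


def spacing_octaves(f_notch, path_diff_m):
    """Distance from this notch to the next one, in octaves."""
    spacing_hz = SPEED_OF_SOUND / path_diff_m
    return math.log2((f_notch + spacing_hz) / f_notch)


def first_tight_notch(path_diff_m, f_max_hz=20000.0, limit_oct=1.0 / 3.0):
    """First notch whose distance to the next notch is below limit_oct."""
    for f in comb_notches(path_diff_m, f_max_hz):
        if spacing_octaves(f, path_diff_m) < limit_oct:
            return f
    return None


notches = comb_notches(0.5, 5000.0)
print([round(f) for f in notches])
print(round(first_tight_notch(0.5)))
```

For the 50 cm case the notches are evenly spaced in Hz, so on a log frequency axis they crowd together as frequency rises, and past a few kHz they sit closer than the assumed 1/3 octave.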

I'm not very educated on this, just what comes out of the hobby by reasoning and reading random bits from here and there. I don't have any MTM speakers, or coaxials, to compare. I liked the coaxial and fullrange systems I've had in the past, but my current prototype trumps them all and it's neither, just controlled directivity. As always, it's not a direct comparison because there is so much more difference than just the topology, but it means not all point sources are best: just make a problem-free speaker system that works with the room and it's good no matter what the topology.
I quoted the above to go a bit further into the effect of the room on what we perceive...

singlevsarray.gif

The orange line is a VituixCAD prediction of the effects of floor and ceiling reflections of a single driver vs an array of (25) similar sized drivers.
Having more vertical sources spreads the reflections and averages out the single space dips only one source creates. Basically it makes it seem
like floor and ceiling bounce don't exist for the array. Not included in the above graph is side walls etc. They will also have their own influence
that needs to be considered.

Even a coax will still suffer from floor and ceiling reflections. As will a MEH (multiple entry horn) as that can be seen as a very tightly spaced MTM and it acts as one source.
A WMTMW can be created to largely avoid the floor and ceiling reflections. As does the Fractal array demonstrated here: Fractal Array CBT by bbutterfield.

Which is best? That's the biggest question of them all I suppose. I have been on a similar quest as the OP, be it for different reasons. I wanted the sound to have the proper
time alignment as it arrives at my ear(s) within my room. The lesson I've learned from that quest is that the room dominates what we hear by a large margin. However, our
brain is very good at hiding that fact, making it way harder to believe its influence is that large. Record the sound you get at the listening position and play it back on
either headphones or, if one dares, on the same speakers you just recorded. That will make it quite obvious what the brain is hiding from you. (doubling up the room effects)

I'm still on the quest of keeping the time alignment in check, but it isn't the key player with indoor listening. So my bigger quest has become to control the direct vs indirect
sound and even manipulate it to create the most attractive rendition of the stereo effect. Things like interaural crosstalk etc. begin to play a role once you soften up/reduce
the effects of reflections far enough.
 
^^ Yeah, the issue with just a two-driver array like an MTM is that it can be optimized to produce narrow directivity over only a few octaves. More drivers are required for wide-bandwidth narrow directivity, basically just more size, a tall array. A MEH could also do it, but would have to be Tall And Deep 😀

For what it's worth, here is a WMT speaker, and the situation if the mid is duplicated to make a WMTM system. Also with floor and ceiling reflections only.
WMT.png, WMTM.png
Here with REW psychoacoustic smoothing.
WMT-psychosmooth.png, WMTM-psychosmooth.png

Much of the big dips in the midrange below 1 kHz can be made to go away, which is nice. If the crossovers were a bit higher, in other words with a smaller waveguide and mid transducers with reduced spacing, the system would perhaps perform a bit better, with the narrow directivity bandwidth extending a bit higher to better clean up some of that 1-2 kHz region. Perhaps a waveguide with a narrower vertical coverage angle as well. Obviously much more interference is seen here than with your full height array.

edit.
Just for the sake of completeness, here is the above WMTM with front and side wall first reflections included as well 🙂 But it's missing all the other corners in the room, so I'm not sure if this tells anything other than that rooms and speaker positioning have a huge effect on response, at least in simulation.
front-and-side-included.png

And here is an ideal point source in the same location in the simulator, to illustrate that even an ideal flat response point source (omni) has some issues with the room 😀 Some narrower coverage than omni is probably beneficial in any room. Unless the hearing system is just clever enough to mostly process the reflections away, as they have exactly the same response as the direct sound, only delayed and attenuated some.
idea-point-source-same-location-pshychosmooth.png
 
I'd like to see such a tall and deep MEH 😀. If only in simulation. I've often wondered about a MEH within an array (MEH being one of the contributors for the array).
But it is easier said than done. I quite like the fractal array shown, as it accomplishes a lot for the space it uses. In my humble opinion whatever you choose, make
sure it can work with the room you're in.
 
The fix for anybody seeking the best in fidelity is having no XO by using an electrostatic speaker covering 5 octaves or so (especially if you can reach as low as, say, 120 Hz). Or at least to choose a design with a single driver covering a broad middle band so there is no problematic XO in the middle. It is easy and inexpensive to find such a driver and then easy to add good bass and treble with specialized drivers.
Do you ever ponder why electrostats can sound so good?
Is it possibly because they are time and phase aligned over a wider range of octaves than any other speaker design? I believe so.

It seems to me you contradict yourself saying timing between tweeter and woofer doesn't matter, and then offer a speaker solution that eliminates that issue.
Also, the timing offset, which is a constant delay, has far more impact on relative phase between drivers than the phase rotation of the xovers.
If phase rotation from xovers is worth eliminating, then logically the grosser error.... fixed timing issues.... is too.
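To put numbers on the constant-delay part: a fixed offset turns into a phase difference that grows linearly with frequency. A minimal sketch, assuming 343 m/s for the speed of sound and a purely hypothetical 5 cm acoustic offset between drivers; it does not model the crossover's own phase rotation, so it does not settle which of the two is larger in a given design.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def phase_from_offset_deg(offset_m, freq_hz):
    """Phase difference in degrees caused by a fixed path length offset."""
    delay_s = offset_m / SPEED_OF_SOUND
    return 360.0 * freq_hz * delay_s


offset = 0.05  # hypothetical 5 cm offset, for illustration only
for f in (500.0, 1000.0, 2000.0, 4000.0):
    print(f"{f:6.0f} Hz: {phase_from_offset_deg(offset, f):6.1f} degrees")
```

Doubling the frequency doubles the phase difference, which is why the same physical offset matters more the higher the crossover sits.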
 
I think both answers from you, Mark100 and Wesayso. We could also add the fact that there is no box and (I suppose) the diffraction, which must be low (after all, those are minimal baffle designs).

About time alignment: sure, it works for only one point in space, but there is some evidence that for some frequencies (the low end), even if not perfect, time alignment works over a relatively large space (Mitchba presents measurements of his own time aligned system, valid over a whole couch area, in his ebook).
 
Or does it have something to do with the limited vertical dispersion they have 😉.
Very true !!!

I've always wanted to hear the tall Soundlab electrostats....over the years I've heard about all the rest.
For a while i had two pair of 4 ft high Acoustat X, one stacked on top of the other, to make an 8ft high stat.
Didn't work as well as hoped, because the room was too small causing a lower frequency imbalance (and it was in my audiophool days when i shunned EQ).

Which gets to your point about fitting to a room. I think lines are a great solution for many rooms.
I'd like to someday build a multi-line speaker, something with separate ribbon, mid, and sub lines.

That said, my latest syns sound so good, both indoors and even more so outdoors (I've done all the indoors experiments you suggest 😉)
that I'm having a hard time getting motivated by another speaker project.
Also must say I'm getting less convinced that floor bounce is a big deal (my indoors has fairly high vaulted ceiling causing late reflects only).
Currently I'm prioritizing holding horizontal pattern control as low in frequency as possible, ahead of narrowing the vertical pattern.
Another reason I'm having trouble finding further motivation is....as you know, pattern control just comes down to size, and i really like to keep things portable for outdoors.
 
A frequency sweep recorded at your chair and plotted without smoothing looks fiercely jagged, even with planar drivers. And that's the simple truth of what reaches one of your ears (and a bit different plot for the other ear, assuming there is some meat object separating the two). All the influences (such as floor and wall reflections, broad DI and micro-DI variations, 10% tolerance in the caps in your XO and 25% in the coils.....), all these influences add-in together to the final delivery to your ear.

Various factors can make sound better in a room*. But I doubt the adjustments in this thread matter much.

Since ESLs have entered the thread, many of us who listen to ESLs think about the old wisdom, "Like trying to make a silk purse out of a sow's ear" when we question waving floppy cardboard to make sound waves.... instead of waving SaranWrap.

B.
* just about every photo of a music room posted on this forum looks like it would sound truly awful due to absence of absorbent old-time carpets and furnishings. Fix room before moving tweeter.