Ambiophonic optimal source distribution experiment (part 1: Introduction)

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
This is a report on some fun I have had over the last couple of weeks trying OSD. It was stimulated by a remark by "dwk123" in the thread:-

http://www.diyaudio.com/forums/showthread.php?s=&postid=1647170#post1647170

I'll post in 3 sections:-

Part 1: Introduction
Part 2: Equipment
Part 3: Setting it up

The idea of optimal source distribution instantly appealed as a way to make an ambiophonic setup with a normal total radiated power spectrum, so that, even though the ambiophonic experience can only be had along the centre line, and more or less at the right distance, the overall tonal balance is unremarkable in most places in the room.

Normally with 2 full range loudspeakers the filter needed to generate the cancellation produces a quite horrible off axis response. This was always a big negative to me when trying ambiophonics (as I have for some time, on and off).

The idea with OSD is that the spectrum is broken up into bands that are dealt with by speakers positioned such that the cancellation filter is friendly. After a brief and remarkably pleasant try with two bands (split at 2kHz or so), it seems that 3 bands can work very well. Better than I would ever have imagined given the at first nonsensical arrangement of drive units around the room!

In brief the plan is

tweeters: 5kHz and up, spaced such that a 1 sample delay can be used in a RACE like filter (no strongly audible response artifacts). For my listening distance the spacing is 40cm (i.e. both 20cm from the centre)

upper mid: 2kHz to 5kHz spaced 4 times as much, again the filter need not have any strong artifacts

mid: 300Hz to 2kHz: spaced 9 times as much as tweeters

bass: no ambiophonic effect applied below 300 Hz (described later).

The filters were generated using MATLAB following the recipe hinted at at ambiophonics.org. In the first approach (which is quite good but not perfect) the "crossovers" applied to the speakers were put in the simulation. This leads, as might be expected, to some peaks and notches - some refinement work is needed.
The impulse response from MATLAB was entered into the stereo-convolver Foobar2000 DSP plugin.

So the RACE filters, done with a "loss" factor of 0.7 (also tried 0.85 for the Mid), had delays of 1, 4, and 9 samples for the 3 bands (they don't need to be integer values, but it worked out that way).
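
For anyone who wants to see what such a filter looks like, here is a minimal sketch in plain Python. It assumes the commonly described RACE structure, in which each output is its own input minus an attenuated, delayed copy of the opposite output, and it treats the "loss" factor as that attenuation. The name `race_impulse` and the 32-sample length are made up for illustration, and the per-band crossovers included in the MATLAB simulation are left out.

```python
def race_impulse(delay, atten, length=32):
    """Impulse responses of a recursive crosstalk canceller for a unit
    impulse on the left input only. Returns (left_out, right_out).

    Assumed structure (not taken from the post):
        out_L[n] = in_L[n] - atten * out_R[n - delay]
        out_R[n] = in_R[n] - atten * out_L[n - delay]
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 sample")
    left = [0.0] * length
    right = [0.0] * length
    for n in range(length):
        in_left = 1.0 if n == 0 else 0.0
        prev_right = right[n - delay] if n >= delay else 0.0
        prev_left = left[n - delay] if n >= delay else 0.0
        left[n] = in_left - atten * prev_right
        right[n] = 0.0 - atten * prev_left  # right input is silent
    return left, right


# The three bands described above: delays of 1, 4 and 9 samples, factor 0.7.
for band, delay in (("tweeter", 1), ("upper mid", 4), ("mid", 9)):
    same, cross = race_impulse(delay, 0.7)
    print(band, [round(x, 3) for x in same[:4]], [round(x, 3) for x in cross[:4]])
```

The same-side response is a decaying train of positive taps at even multiples of the delay, and the opposite-side response is a train of negative taps at odd multiples. Each pair of lists could be written out as one row of the 2x2 impulse set for a stereo convolver.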

The shocking thing is that this works tolerably well on 9/10 recordings, excellently on many, and sounds bad only on a few (that are usually not that good in stereo either). How is that possible when the tweeter for each channel is 60cm from the upper mid that is in turn 1m further out?

A large number of measurements shows quite a reasonable overall sound field throughout a wide spot near the middle of the room (where I listen). Nothing very remarkable about the peaks and troughs compared to the usual stereo case (differ in detail, but not really when taken as an ensemble - without statistical analysis!)

In case the description is unclear, the rough layout is

B .............................................B

M.........UM......T...T .....UM........M

though the speakers are in fact on an arc, quite precisely centred on the usual listening position.
 
Part 2: Equipment

Brief notes on the equipment:-

PC with Foobar 2000, stereo_convolver and crossover plugins. Output via Echo Gina 3g.

Channels and processing:-
Analog 1 and 2 go to a Behringer Ultradrive which sets the crossovers (and minimal EQ) for B, M and T. Analog 3 and 4 drive the UM amplifier. Digital 1 (mono) drives two subs via a home-made DAC and a parametric EQ. The subs see the same signal, but one has a low pass, gain and phase adjust.

Drivers, enclosures and amplifiers:-

T: SEAS H1499 (27TBCD/GB-DXT) - the only part bought specially for this. I needed a small baffle-tolerant tweeter, and the horn loading makes it easy to use with only minimal care regarding diffraction. Two are mounted 40cm apart on a custom stand, at ear level. Driven by a Tripath TA2024 amplifier (plenty for >5 kHz), 4th order highpass (near Butterworth). These have moderately controlled directivity in the frequency band in which they are used.

UM: B&C DE250 on 18Sound XT1086 with heavy stuffing of horn with absorbent material (wadding of unknown type). This gives a beautifully smooth response well beyond the 2 to 5 kHz band used (no EQ needed in that band). Same type Tripath amp as T. Positioned at +/- 0.8m from the centre line, at ear height. These have moderately controlled directivity in the band.

M: B&C 8PE21 in a small box with short front-back distance and heavy stuffing. Works well from 300Hz to 2kHz (just). Driven by Hypex 180st. 4th order acoustic filters at both ends. Positioned +/- 1.8m. These are omnidirectional at the low end, gradually narrowing towards 2kHz.

B: PD12SB30, in 120l, reflex loaded. Response to 45Hz in room -6dB. 4th order low pass (no high pass except "rumble filter"). Another UCD amp. Omnidirectional.

Subs: PD1550 Tapped Horn (mentioned in the Collaborative Tapped Horn thread in subwoofers), covers 20Hz to 50Hz then falls off (all the higher peaks removed). This is in the middle of a long wall. Signal also goes to a BK Electronics Monolith (100l 12" reflex sub) in a corner working from about 30Hz to 60 Hz. These subs partly overlap the B. (Sort of Geddes-like.)

I think those are the main points.
 
Part 3: setup

To recap, the RACE filters in each band are designed for the speaker spacing in each band such that they all "come together" at the same listening point. This seems to have worked, but required extremely precise speaker positioning.

The rough optimum distance for each band was checked and was close to 3m in each case (the in-out direction is least critical in the RACE method).

The listening position was fixed and a string used to mark the speaker positions to within a couple of mm. The effective position of each unit was obtained by impulse response measurement at two distances, and taken into account in placing the speakers on an arc.

The tweeters must be within ~1mm of their desired front-back location (relative, not absolute), and to make this easier they are mounted on a single "baffle", on a stand that can be rotated finely.

Since there are no traditional crossovers (!) the units were individually balanced within their respective bands. The surprising thing was that they integrate very well. (The target power response was the usually desired gentle fall-off with increasing frequency, and is achieved.) A pair of UniQ speakers (opposite extreme in terms of time coherence) was used as a listening reference for overall balance - not that they sound particularly wonderful, but a reference is needed.

The bands were as described in part 2, with the UM band defined by the Foobar crossover, as was the low-pass for the subs (needed to kill the TH resonances above about 100 Hz).

The SEAS tweeters give a very smooth response at most measuring positions (slight sign of the expected diffraction dip). The RACE filters have an artifact of rising response above 10kHz, which nicely matches the SEAS. No EQ.

The DE250s give a deliciously smooth response on the horns (no EQ needed in band, HF droop used as part of desired low-pass).

The Mids show a slight sign of the first reflection coming through the cone; this was EQ'd (not spatial).


Result:

Apart from the power response which was measured, the rest is subjective (hence of almost no value, but ...)

Provided my head is within a couple of cm left-right of the correct position, and within perhaps 50cm of the correct front-back position, there is a solid, detailed and robust image. A predicted (by others) and remarkable property is that the image is robust against head rotation by +/- 20 degrees at least.

There is very little attachment of the image to the speakers. Recordings without tricks give an image about +/- 40 degrees (roughly) though with tricks and room echoes coming from just about anywhere (sometimes convincing). Quite often the sense of "space" is pleasant.

I've listened to ~100 recordings from large scale orchestra (Reference Recordings, Waterlily included), chamber works, pop, rock, jazz,... and most seem to work about as well as I've ever experienced: not what I was expecting (of course it could all be just novelty that wears off, but not so far ...).

Ken

once again thanks to dwk123 for the remark that led to all this fun.
 
SWEET!

Nice job. Good to know that my aimless ramblings are good for something :)

Very ambitious setup for a first cut IMHO - I was thinking of a simple 2-way plus sub for the first try, but I guess in-for-a-penny in-for-a-pound.

I'm intrigued by your results. On one hand, +-40 is much narrower than I experienced in my casual ambio setup. On the other hand 9/10 success rate is off the charts compared to what I experienced. For me it was 'reference recordings only need apply'.

Arrrgh. Now I *really* need to find time to try this out.

[edit] I tried to email you through the forum, but you have it disabled. I'd be interested in chatting offline about this - drop me an email through the forum if you are so inclined.
 
dwk123 said:
SWEET!
I'm intrigued by your results. On one hand, +-40 is much narrower than I experienced in my casual ambio setup. On the other hand 9/10 success rate is off the charts compared to what I experienced. For me it was 'reference recordings only need apply'.

I'm probably not quite there yet with the image, but what I find is that many recordings have a solid well defined "core" (say +/-40 degrees) with a surrounding, less well-defined "cloud" - often due to effects but sometimes giving at least the impression of something genuine.

The 9/10 distinguishes those that are listenable from those that really do not work; perhaps my acceptance criterion is too low. If I look for any flaw in the image then it is fair to say that very few are utterly convincing. The main thing is that tonally most recordings don't fall apart in the way that many did with my implementations of simple ambiophonics.

I've got ideas for tweaking the filters - when (if) I get a really reliable recipe I'll publish it. At the moment there is a bit of judgement needed.


Ken

ps. I minimise email (I regard it as a necessary evil), the one I use for the forum is never read now.
 
More ramblings: doing it "properly" now

I eventually got time to digest "Optimal Source Distribution for Virtual Acoustic Imaging, T. Takeuchi and P. A. Nelson, ISVR Technical Report No 288 February 2000" and made up the filter to do OSD correctly as per the original idea of those authors.

In very brief summary, the required transfer functions are Li to Lo = Ri to Ro = 1 (Li = left in etc.), Li to Ro = Ri to Lo = -i*g. Here i is the complex unit and g is an amplitude scaling factor to account for the left ear hearing the right speaker a little quieter than it does the left speaker.

Even with the animations at ISVR, I didn't find it all that easy to see exactly how this works, compared to RACE, which is why I hesitated to try it out.

Implementing the 90-degree (i) phase shift, i.e. a Hilbert transform, was done using a 151 tap FIR filter generated by the remez function in MATLAB (using the fdatool). The lower and upper limits (for the target function) were 100 Hz and 19kHz. The lower 1dB point came out about 200 Hz. These parameters were found as a result of trial and error - I did not want the filter to be too long, but felt that it would need to fit to better than 1dB over the important range.
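
For anyone without MATLAB, a 90-degree FIR of this kind can be sketched in plain Python. Note the swap: this is a Hamming-windowed ideal Hilbert response, not the equiripple remez design described above, so the band edges and the 1 dB point will not match exactly. The function names and the 44.1 kHz sample rate are assumptions for illustration, and the sketch only handles odd tap counts.

```python
import cmath
import math


def hilbert_fir(num_taps):
    """Hamming-windowed ideal Hilbert transformer (odd length, antisymmetric)."""
    if num_taps % 2 == 0:
        raise ValueError("this sketch handles odd tap counts only")
    mid = (num_taps - 1) // 2
    taps = []
    for k in range(num_taps):
        n = k - mid  # index relative to the centre tap
        # Ideal response: 2/(pi*n) for odd n, zero for even n (including n = 0).
        ideal = 0.0 if n % 2 == 0 else 2.0 / (math.pi * n)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def response(taps, freq_hz, fs_hz):
    """Complex frequency response of the FIR at one frequency
    (includes the bulk delay of the filter)."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    return sum(t * cmath.exp(-1j * w * k) for k, t in enumerate(taps))


taps = hilbert_fir(151)
fs = 44100.0  # assumed sample rate; not stated in the post
for f in (100, 200, 400, 1000, 5000, 19000):
    mag_db = 20.0 * math.log10(abs(response(taps, f, fs)))
    print(f"{f:>6} Hz: {mag_db:6.2f} dB")
```

Printing the magnitude at a few frequencies is a quick way to see where the low-frequency roll-off starts for a given length, which is the trade-off between filter length and fit discussed above.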

A Simulink diagram, used to generate the 2x2 impulses for the Foobar2000 Stereo Convolver contained the following:
4th order low and high pass at 200 Hz, low passed signals are fed straight through to the output summation, high pass signals go through the OSD matrix.

The OSD matrix is as follows:

Li to Lo (etc.): simple delay of half the Hilbert FIR filter length.
-Li to Ro (etc.): gain factor g followed by the FIR filter.

The outputs of low-pass, delay, and FIR, were then summed for each channel.

This also works! It is much easier to generate than the approximate RACE version I reported earlier (which needs quite careful choice of some parameters: I was a bit lucky first time round with it as it could have worked much less well with different crossover frequencies etc.).
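
The matrix described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the 200 Hz crossover split is omitted, the FIR is assumed to have an odd number of taps so that the matching delay is a whole number of samples, and the sign of the cross path depends on the phase convention and on how the FIR was designed, so it should be checked rather than trusted. The function names and the 3-tap stand-in filter are hypothetical.

```python
import math


def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def osd_matrix(hilbert_taps, g):
    """Return (direct, cross) impulse responses for the 2x2 matrix.

    direct: pure delay equal to the FIR's group delay.
    cross:  gain g followed by the 90-degree FIR.
    """
    n = len(hilbert_taps)
    if n % 2 == 0:
        raise ValueError("odd-length FIR assumed so the delay is a whole number of samples")
    direct = [0.0] * n
    direct[(n - 1) // 2] = 1.0
    cross = [g * t for t in hilbert_taps]
    return direct, cross


def osd_process(left_in, right_in, hilbert_taps, g):
    """Each output is its own input (delayed) plus the opposite input
    through the cross path. Inputs must have equal length."""
    direct, cross = osd_matrix(hilbert_taps, g)
    left_out = [a + b for a, b in zip(convolve(left_in, direct), convolve(right_in, cross))]
    right_out = [a + b for a, b in zip(convolve(right_in, direct), convolve(left_in, cross))]
    return left_out, right_out


# Toy 3-tap antisymmetric stand-in for a real Hilbert FIR (placeholder only).
toy_taps = [-2.0 / math.pi, 0.0, 2.0 / math.pi]
left_out, right_out = osd_process([1.0, 0.0, 0.0, 0.0], [0.0] * 4, toy_taps, g=0.85)
print(left_out)
print(right_out)
```

Feeding a unit impulse into one input, as in the usage lines, yields one row of the 2x2 impulse set; swapping the inputs yields the other.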

The only adjustable parameters in the correct method are g and the LF cutoff (which is not all that critical, I think). I tried g = 0.95 and g = 0.85, independent of frequency. I think the former is better (but am at an early stage in evaluation). The truth is g should depend on frequency so I need to find a reasonable HRTF model to study to find a good choice of g. Suggestions gratefully received!

The speaker setup is (so far) exactly as described earlier (the positions are probably not optimum for the chosen crossover frequencies, but they are not far off).

Ken
 
update

In the event that anyone is interested ...

i) the Hilbert transformer was 150 taps (not 151)

ii) my bass speakers are not far enough apart to do the OSD down to 200Hz (needs almost 180 degrees), so the filters were adjusted up to 400Hz

iii) the speaker positions (mid and up) are close to optimal for 4m listening distance (and my head size)

iv) the 0.85 g-factor is better, and studying HRTFs shows that it is a better fit to typical examples (and a constant value is not as bad an approximation as it might seem at first thought - because the speakers subtend a larger angle at low frequency)
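
To get a feel for why the low band needs such a wide span, here is a rough sketch of the geometry. It assumes a simple free-field model with the ears as two points, and takes the ideal condition to be a quarter-wavelength path difference between the ears (which corresponds to the 90-degree phase shift). The 0.17 m ear spacing and both function names are assumptions; a real head shadows and diffracts, and each speaker pair is in practice used over a band around the ideal point, so treat the numbers as a guide only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature


def ideal_frequency(span_deg, ear_spacing=0.17):
    """Frequency at which a speaker pair with total span span_deg (as seen
    from the listener) gives a quarter-wavelength path difference between
    the ears. ear_spacing in metres is an assumed value."""
    half_angle = math.radians(span_deg / 2.0)
    return SPEED_OF_SOUND / (4.0 * ear_spacing * math.sin(half_angle))


def ideal_span(freq_hz, ear_spacing=0.17):
    """Inverse of ideal_frequency: total span in degrees, or None when the
    simple model says even 180 degrees is not wide enough."""
    s = SPEED_OF_SOUND / (4.0 * ear_spacing * freq_hz)
    if s > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(s))


for span in (5, 30, 60, 120, 180):
    print(f"span {span:>3} deg -> about {ideal_frequency(span):7.0f} Hz")
for freq in (200, 1000, 3000, 8000):
    print(f"{freq:>5} Hz -> span {ideal_span(freq)}")
```

The required span grows quickly as frequency falls, which fits the experience above that the bass speakers need to approach 180 degrees.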

I'm enjoying this!


Ken
 
glad to see other people interested in OPSODIS. I ran it for about two years using a foobar FIR crossover and waves crosstalk cancellation for each channel. Ultimately, I abandoned the project because of connectivity limitations, audio latency and room setup.

Since everything ran through foobar and AudioMulch for processing, everything needed to stay at a fixed sample rate (in this case I chose 48 kHz). I could only have one analog input and one external digital input, and it was constantly resampled using a Creative X-Fi. Further, the FIR-based crossover filters added latency in the 100 ms+ range, which was noticeable while watching TV. I moved back to IIR crossover filters, but the sound was not as convincing as with FIR.
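
As a rough guide to where that latency comes from, the delay of a linear-phase FIR is half its length. The sketch below only counts the filter's own delay, not buffering in the player or sound card, and the function names are made up for illustration.

```python
def linear_phase_latency_ms(num_taps, fs_hz):
    """Delay of a symmetric or antisymmetric (linear-phase) FIR, in ms."""
    return (num_taps - 1) / 2.0 / fs_hz * 1000.0


def taps_for_latency(latency_ms, fs_hz):
    """Tap count of a linear-phase FIR whose own delay equals latency_ms."""
    return int(round(2.0 * latency_ms / 1000.0 * fs_hz)) + 1


print(linear_phase_latency_ms(151, 44100.0))  # a short 151-tap filter
print(taps_for_latency(100.0, 48000.0))       # length implied by 100 ms at 48 kHz
```

A filter of around 150 taps adds under 2 ms, whereas a delay of 100 ms at 48 kHz corresponds to a filter of roughly ten thousand taps.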

Lastly, having speakers arcing across a room looked, well, rather ugly. With the precise setup required, you really have no choice on where you can place the speakers.

I hope marantz continues implementing OPSODIS in products, as I really want the technology to go mainstream.
 
"glad to see other people interested in OPSODIS."
I'm surprised how few (well in retrospect, perhaps not). It is so much easier to live with than ambiophonics, yet that is more often discussed (of course it is older).

"Since everything ran through foobar ...."
Indeed it is hard to avoid latency (no concern to me, as I only listen to music played in Foobar).

"I hope marantz continues implementing OPSODIS in products"

I'll be pleasantly surprised if they can pull it off on a significant scale.

The OPSODIS extension (mentioned earlier) is hard to find out about - no freely published info at all (beyond one talk abstract).

Ken
 
I've already ordered a 400 Hz passive line-level crossover from Marchand so I can implement OSD. I'm really looking forward to it. I'm actually very happy with the stereo dipole as it is.

In my opinion OSD could be an addition to an ambiophonic set-up but cannot substitute for it. Ambience through convolution is still a must for ambiophonics as its name suggests.
 
poldus said:
I've already ordered a 400 Hz passive line-level crossover from Marchand so I can implement OSD. I'm really looking forward to it. I'm actually very happy with the stereo dipole as it is.

In my opinion OSD could be an addition to an ambiophonic set-up but cannot substitute for it. Ambience through convolution is still a must for ambiophonics as its name suggests.

Hi poldus,
say more: I'm all ears!

My problem with the crosstalk-cancelled stereo dipole was that only a few recordings struck a good tonal balance (as opposed to with a barrier which was one step too far on a permanent basis). Perhaps this is worse in my relatively "live" room.

As you know OSD allows the cancellation to be done so very simply, and at the same time the 90 degree phase shift minimises tonal issues.

Are you suggesting this should be convolved with 2x2 ambience impulses? Concrete suggestions would be most welcome.

I'm still studying HRTFs, and am starting to conclude I'd better measure my own as a starting point for further experimentation.

Ken
 
kstrain:
I know nothing about the technicalities of crosstalk cancelling through software.
I use a physical barrier and it works for me. After a few years I've learnt to live with it (I might even miss it if it were removed).

I also need all my DIY room treatments (absorptive panels and lately a large number of bass traps) and all the speakers surrounding me providing a concert hall simulation through three Yamaha AX1 processors (absolutely frowned upon by purists, as you know).

I listen to orchestral music and for me this is the most satisfying way of doing so. However, the range from roughly 90 to 400 Hz has room for improvement spatially speaking, since the barrier cannot space it out. Delivering it through standard stereo holds a lot of promise, since the fundamentals of many notes will then also be coming from the direction where the harmonics are.

I'll comment when I try it this Christmas. Meanwhile I'm following your effort with great interest. I find it surprising that so few people are disappointed enough with standard stereo to try other ways. I guess we don't all hear the same way.
 
Hi kstrain,

Would you mind answering some questions?

My biggest concern, since I also want to try this kind of setup, is the practicality. For example: is the sweet spot big enough to allow the inevitable movements of the head, let's say by 20 cm in all directions, or can one identify artefacts in the sound when doing this? Generally speaking, how compatible is this speaker setup with comfortable listening?

Some questions about your system design:
Why didn't you use the DE250 as the tweeter, with a crossover around 1-2 kHz, or put differently, how do the supertweeters help?
Why mono below 300 Hz? Couldn't one use +/- 90 degree spaced midbass speakers in the 80-300 Hz area?

As a bottom line, I would like to say thanks for sharing your experiences in such great detail.
 
Btw... this OSD should really harmonise with a fully horn-loaded system, since you actually NEED the driver spacing, which is otherwise forced onto you by the big horns. This would also help to negate the efficiency losses due to the crosstalk cancellation filter. And finally, having another reason to build big horns can't be wrong. ;)
 
Hi MaVo,

"Would you mind answering some questions?"
Happy to do so.

"Generally speaking, how compatible is this speaker setup with comfortable listening?"
Not very! The worst case is left-right movement of the head - there even a few cm movement swings the image around in a disconcerting way. Modest amounts of head rotation, height, and front-back movement are all no problem at all. The image is quite stable with head rotation, which is nice.

Of course, I write from very limited experience, and there may be ways to trade slightly reduced cancellation for larger position tolerance. That may be as simple as just moving 0.5m closer than the optimum, for example. (I'd say it is definitely easier than with a barrier though!)

"Some questions about your system design:
Why didn't you use the DE250 as the tweeter, with a crossover around 1-2 kHz, or put differently, how do the supertweeters help?"

The source angle would, optimally, be a smooth function of frequency. The trickiest part to get right is above a few kHz, which needs a pair of tweeters spaced only 5 or so degrees apart. Then the various midranges can be much further apart, as I described. Since the power requirements of tweeters for use above ~5kHz are very modest it seemed easiest to get the domes (and the "waveguide" ones have quite reasonable directivity above 5 kHz). That left the DE250/XT1086 to cover the upper mid. In other words those dome tweeters are quite adequate for the very limited demands placed on them - and cheap too. Also I'm not sure I can completely eliminate higher order cavity modes in the XT1086 - even though I have stuffing in them that absorbs more than Geddes' foam. I feel there are problems still (it is very hard to measure this though).

Perhaps it also helps to explain that I just don't get the expected bandwidth out of my 8PE21s - they have a quite smooth fall-off from just below 2kHz, without the peak shown in the data sheet and in independent measurements. Together with the XT1086 lower cutoff (they lose vertical control at ~1.6kHz), this essentially forces me to put a crossover somewhere around 1.6 to 2 kHz.


"Why mono below 300 Hz? Couldn't one use +/- 90 degree spaced midbass speakers in the 80-300 Hz area?"
I need to check what I wrote, but the impression I've given you is wrong: the 80-300Hz band is stereo. I'd love to extend OSD down to ~100 Hz, but my room is not wide enough, and there are obstructions where the bass speakers would need to go.

"As a bottom line, I would like to say thanks for sharing your experiences in such great detail."

I've learned a great deal from these forums, and am happy to try to give a little back, especially in this area which is so rarely discussed.

"Btw... this OSD should really harmonise with a fully horn-loaded system, since you actually NEED the driver spacing, which is otherwise forced onto you by the big horns. This would also help to negate the efficiency losses due to the crosstalk cancellation filter. And finally, having another reason to build big horns can't be wrong. ;) "

Yes, that thought has definitely crossed my mind! With normal stereo the inter-driver spacing is such a big problem. Yet here I have large horizontal gaps between speakers, and so a room full of interference "combs". The Hilbert transform/crosstalk matrix changes everything. It is quite surprising how normal the sound is when listening casually from a random point in the room (e.g. when walking around).

Note that there is no efficiency loss with the 90-degree filter - that is what is so special about OSD (and why the reflected field is not messed up as with an ambiophonic dipole).

Building big horns is beyond my skill level (I tried a few basic ones), but yes, I think OSD could work beautifully with the right array of horns!

Ken
 
Simulink diagram for OSD

In case this helps someone get started (I can also provide MATLAB code for most of the subsystems, and help with the others).

I've added a few labels to the diagram.

Ken
 

Attachment: osd.png

Looking back I realise my description of how the speakers are set up is not all that clear. I'll try to add a photo later, and here is a schematic diagram.

The crossovers are shown notionally; the sub and umid ones are done in the Foobar2000 crossover (including delay correction for both, and there is parametric EQ and some other filtering on the multiple subs). The other 3 bands are done in a Behringer Ultradrive box, along with a few parametric tweaks - the basic filters are about 4th order but of course it is not as simple as that.

The sketch shows the simplified idea, for 1 channel and the subwoofers.

Ken
 

Attachment: filtersketch.png

This thread really got me on track to make my own OSD system. I am especially interested in building my own crosstalk cancellation in a modular environment like Reaktor.

Implementing ambience channels would also be great, because one can mess with the sound of the recording a lot by placing it in different environments. I think one can get quite creative there.
 
MaVo said:

Implementing ambience channels would also be great, because one can mess with the sound of the recording a lot by placing it in different environments. I think one can get quite creative there.

I finally got hold of some excellent binaural recordings of music that I like (from Mike Skeet of cornucopia-music.co.uk). The result with OSD is very impressive with as great a sense of "being there" as I can imagine in my listening room.

Though I was stimulated to thought by Poldus' earlier post, I can't really see a reliable way to "implement ambience channels" other than as an effect.

Ken
 
small update (big change)

Eventually, I managed to find time to get my bass speakers (i.e. those that cover up to 300 Hz) spread out to provide a 180 degree span. I changed the OSD filter to provide OSD down to 150 Hz, with stereo below that on the bass speakers (from around 50 Hz, so hardly any stereo, and it could probably be mono) and mono on the subwoofers as before.

The result was a big expansion of the image "volume" (particularly on those Mike Skeet recordings). Of course there is no height information so I should say "image width x depth".

The change in image shape (as opposed to extent) tells me that the solution I had before was probably not the right one when the speakers do not span 180 degrees at LF (just in case someone tries to follow my meanderings - it is best to aim for 180 degree span from the start, even if that means refurnishing the room).

The result is so very different from what you might expect when looking at the locations of the speakers ...

Ken
 