CamillaDSP - Cross-platform IIR and FIR engine for crossovers, room correction etc.

If I understand correctly, you would still have the timing issues mentioned in my response.

You have L1 and R1 coming in and L[1-4] and R[1-4] going out (the vertical case). The LN outputs will still not be synced with the RN outputs, because they are 2 different async USB devices, with 2 different clocks and 2 unsynced streams.

Last I checked, simultaneous instances of AlsaCDSP will not play nicely together.

If each copy of CamillaDSP is started with its own config file (like what AlsaCDSP does), simultaneous copies may run if they don't share any temp files or mmap areas.
OK, let me talk my way through it and see if that makes sense (maybe not!):

Two channels of external analog audio are available to the ADC and are perfectly synchronized. CamillaDSP gets samples from there (e.g. via ALSA), does its DSP stuff, and then puts samples again via ALSA to the DAC side, where the interface spits them out according to its clock/oscillator. Since the ADC and DAC may not run on the same clock (even within the same interface), Camilla will either add or subtract samples via its async resampling mechanism. If you think about it, the sample rate is really not all that relevant, only the ratio of samples in to samples out; the async resampling is a way to account for the imbalance. Under this scheme, I would think that the whole computer and software is like a black box: analog data comes in, analog data goes out, and overall timing is preserved, with delay added due to the internal buffering and processing lag. However, that delay should remain fixed and not change over time; if it did, there would be drift and eventually xruns of some kind. Is this correct?
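To make that "ratio of samples in to samples out" mental model concrete, here is a minimal Python sketch of a buffer-level feedback loop. This is only an illustration of the principle, not CamillaDSP's actual rate-adjust code; the chunk size, gain, and 30 ppm clock error are made-up numbers.

Code:
CHUNK = 1024           # samples captured per loop iteration
TARGET = 4096          # desired playback buffer fill, samples
PPM = 30e-6            # assumed: playback clock runs 30 ppm fast

def simulate(rate_adjust, iterations=50_000, gain=1e-5):
    fill = TARGET      # current playback buffer level, samples
    ratio = 1.0        # output samples produced per input sample
    for _ in range(iterations):
        if rate_adjust:
            # buffer too full -> emit fewer samples; too empty -> emit more
            ratio = 1.0 - gain * (fill - TARGET)
        fill += CHUNK * ratio          # resampler output into the buffer
        fill -= CHUNK * (1.0 + PPM)    # DAC consumption
    return fill - TARGET

print("deviation without rate adjust:", round(simulate(False), 1))  # ~ -1536 and still growing
print("deviation with rate adjust:   ", round(simulate(True), 1))   # settles near -3

Without the feedback, the buffer drains without bound (eventually an xrun); with it, the level parks a few samples below target, which matches the intuition that the overall delay stays essentially fixed.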

So far I have talked about one CamillaDSP process and one audio interface. Now we plug another interface into the same computer and we allow another CamillaDSP process to run it, just like when we had only one interface. If one process can run without drift, why would two processes drift apart?
 
OK, let me talk my way through it and see if that makes sense (maybe not!):

Two channels of external analog audio are available to the ADC and are perfectly synchronized. CamillaDSP gets samples from there (e.g. via ALSA), does its DSP stuff, and then puts samples again via ALSA to the DAC side, where the interface spits them out according to its clock/oscillator. Since the ADC and DAC may not run on the same clock (even within the same interface), Camilla will either add or subtract samples via its async resampling mechanism. If you think about it, the sample rate is really not all that relevant, only the ratio of samples in to samples out; the async resampling is a way to account for the imbalance. Under this scheme, I would think that the whole computer and software is like a black box: analog data comes in, analog data goes out, and overall timing is preserved, with delay added due to the internal buffering and processing lag. However, that delay should remain fixed and not change over time; if it did, there would be drift and eventually xruns of some kind. Is this correct?

So far I have talked about one CamillaDSP process and one audio interface. Now we plug another interface into the same computer and we allow another CamillaDSP process to run it, just like when we had only one interface. If one process can run without drift, why would two processes drift apart?
Maybe a diagram would help my feeble perception skills:

[ADC] -> CamillaDSP -> [USB DAC_1] is analogous to
[PIPE] -> CamillaDSP -> [USB DAC_1] and should work fine.

The pipe acts as a bungee cord, and as long as it doesn't run empty (or overflow), timing is fine. If the input pipe runs empty, the DAC output will suffer accordingly.


"... There are two interfaces. Each has four inputs and four outputs. ..."

I take this as also saying that each of the 2 interfaces has its own independent clock.

The problem comes in here:

[PIPE] -> CamillaDSP --+--> [ASYNC USB DAC_1 "four outputs, CLOCK_1", timeline_1]
                       |
                       +--> [ASYNC USB DAC_2 "four outputs, CLOCK_2", timeline_2]
2 async USB output devices will drift apart from each other since they are not Master/Slave clocked together.
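For scale, a quick back-of-the-envelope in Python on how fast two free-running clocks separate. The 50 ppm mismatch is an assumed figure for illustration; consumer crystal oscillators are commonly specced somewhere in the 10-100 ppm range.

Code:
fs = 48_000          # nominal sample rate, Hz
ppm = 50             # assumed relative error between the two DAC clocks

skew_per_second = ppm * 1e-6              # seconds of skew per second
samples_per_second = fs * skew_per_second

print(f"skew: {ppm} us/s = {samples_per_second:.1f} samples/s at {fs} Hz")
print(f"after one minute: {skew_per_second * 60 * 1e3:.1f} ms apart")
# -> 2.4 samples/s, 3.0 ms after a minute, with nothing pulling them back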
 
No, that is not correct. Remember, I am asking about running two completely separate camilladsp processes. You only show one. I am talking about this:

[USB INTERFACE_1 ADC] -> CamillaDSP_process_1 -> [USB INTERFACE_1 DAC]
[USB INTERFACE_2 ADC] -> CamillaDSP_process_2 -> [USB INTERFACE_2 DAC]

Two Camilladsp processes, each running independently under their own pid, etc. Each one is getting/putting data to/from a single interface, never two.

My question is really whether two separate CamillaDSP processes can be run concurrently. Since I do not know all the details about how it runs behind the scenes, I do not know whether the two processes would step on each other in memory, or wherever else they do business in the OS.
 
No, that is not correct. Remember, I am asking about running two completely separate camilladsp processes. You only show one. I am talking about this:

[USB INTERFACE_1 ADC] -> CamillaDSP_process_1 -> [USB INTERFACE_1 DAC]
[USB INTERFACE_2 ADC] -> CamillaDSP_process_2 -> [USB INTERFACE_2 DAC]

Two Camilladsp processes, each running independently under their own pid, etc. Each one is getting/putting data to/from a single interface, never two.
That is irrelevant. Both interfaces will still suffer the same random drift (which is why I left those details out of the diagram). Adding separate inputs AND separate processes only exacerbates the problem.

If the random drift is not an issue for the intended use case, disregard my comments.
 
OK, then what is the magnitude of that (random drift)? It can't be unbounded, or that would lead to xruns. So I assume it would lie within the buffering available to CamillaDSP? So the playback can speed up or slow down within the buffer as needed?

TBH, if the synchronicity between left and right can be kept to a low value, some drift would not really matter. Low means +/-100 usec, which is only about five samples at 48 kHz!
 
OK, then what is the magnitude of that (random drift)? It can't be unbounded, or that would lead to xruns. So I assume it would lie within the buffering available to CamillaDSP? So the playback can speed up or slow down within the buffer as needed?

In my experience rate adjust will result in changes of +/- a few hundred samples from the nominal buffer level. I think the net effect of running two separate interfaces with their own free running clocks for left / right would be shifting of relative delay between left / right of about 0-10 ms.

Michael
 
So at this point, what I am hearing is that the mechanism within CamillaDSP allows for speedup and slowdown by a certain amount, which may be as much as +/-10 milliseconds? That seems like a lot of deviation.

I could actually measure this using ARTA. I did that previously to test the synchronicity between two independent clients using Gstreamer receiving RTP streams over my LAN.
 
There are at least two problems at play: inter-sample sync and sample sync.

Think of how most streams are written to a multi-channel DAC using a single clock. Each sample is interleaved.

If your 2 x 4-channel DAC setup supports 32-bit, each sample frame would be 8 x 4 bytes (8 x 32 bits, i.e. 32 bytes).

Example of the ideal format seen by a single device. Inter-channel synchronization and error recovery are done at the sample level.

S=Sample
C=Channel
I=Integer


Code:
S1 = C1_I32, C2_I32, C3_I32, C4_I32, C5_I32, C6_I32, C7_I32, C8_I32
S2 = C1_I32, C2_I32, C3_I32, C4_I32, C5_I32, C6_I32, C7_I32, C8_I32
S3 = C1_I32, C2_I32, C3_I32, C4_I32, C5_I32, C6_I32, C7_I32, C8_I32
SN = C1_I32, C2_I32, C3_I32, C4_I32, C5_I32, C6_I32, C7_I32, C8_I32

Now break that up into 2 devices, each carrying half the channels. Now synchronization and error recovery are complicated by inter-sample recovery issues, even assuming both streams were started at exactly the same time (more on that later).

Code:
Device 1:
S1 = C1_I32, C2_I32, C3_I32, C4_I32
S2 = C1_I32, C2_I32, C3_I32, C4_I32
S3 = C1_I32, C2_I32, C3_I32, C4_I32
SN = C1_I32, C2_I32, C3_I32, C4_I32

Device 2:
S1 = C5_I32, C6_I32, C7_I32, C8_I32
S2 = C5_I32, C6_I32, C7_I32, C8_I32
S3 = C5_I32, C6_I32, C7_I32, C8_I32
SN = C5_I32, C6_I32, C7_I32, C8_I32
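The split itself is trivial, as the toy Python below shows; the hard part, as described next, is keeping the two devices on the same sample index afterwards. Sample values here are dummy integers.

Code:
def split_frames(frames, ch_per_device=4):
    """Split interleaved 8-channel frames into two 4-channel streams."""
    dev1 = [f[:ch_per_device] for f in frames]   # channels 1-4
    dev2 = [f[ch_per_device:] for f in frames]   # channels 5-8
    return dev1, dev2

# three dummy frames: sample s, channel c encoded as s*10 + c
frames = [[s * 10 + c for c in range(1, 9)] for s in range(1, 4)]
dev1, dev2 = split_frames(frames)
print(dev1[0])   # [11, 12, 13, 14] -> S1, C1-C4
print(dev2[0])   # [15, 16, 17, 18] -> S1, C5-C8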

Now, add in separate tasks. Separate tasks add separate OS interrupt latencies, separate write timings, and inter-process synchronization issues, further complicating and worsening the problem. This can be measured using various OS-specific latency tests, and that doesn't account for the start/stop/sync stream issues stacked on top of it. Both processes would have to know when AND WHERE the "Nth sample" is in order to recover.

Now add separate inputs feeding separate processes. Each input would have to guarantee the exact same start of stream. What would happen if one channel was silent and the other was non-silent? Would you stall one channel waiting for signal, or just assume the channel was silent? How would you determine the start of the streams, since an ADC probably can't distinguish external stimuli from background noise? One method would be to add sample identifiers into the initial base streams to serve as sync markers downstream (one possible shape of this is sketched below).
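One possible shape of that sync-marker idea, in Python: reserve an extra channel (or spare LSBs) for a running frame counter, so that both consumers can trim to a common starting point. This is a hypothetical illustration, not a feature of CamillaDSP or ALSA.

Code:
def add_marker(frames, start=0):
    """Append an incrementing frame counter as an extra channel."""
    return [f + [start + i] for i, f in enumerate(frames)]

def align(frames_a, frames_b):
    """Drop leading frames until both streams begin at the same counter."""
    first = max(frames_a[0][-1], frames_b[0][-1])
    return ([f for f in frames_a if f[-1] >= first],
            [f for f in frames_b if f[-1] >= first])

a = add_marker([[1, 2]] * 5, start=0)   # device A started 2 frames early
b = add_marker([[3, 4]] * 3, start=2)
a2, b2 = align(a, b)
print(a2[0][-1], b2[0][-1])             # both now begin at counter 2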

The further back you separate the streams, the more issues are involved, and we haven't even begun to address the async in "async USB".

Commercial DSP software vendors do NOT recommend using USB mics and USB DACs to generate DRC/speaker filters, because clock drift undermines the attempt to develop accurate FIR filters and driver distances.
 
In my experience rate adjust will result in changes of +/- a few hundred samples from the nominal buffer level. I think the net effect of running two separate interfaces with their own free running clocks for left / right would be shifting of relative delay between left / right of about 0-10 ms.

Michael

A delay of 10 ms is half a wavelength at 50 Hz (about 3.4 m or 11 feet of path difference), meaning that frequencies from 50 Hz on up could be completely cancelled out somewhere within a 0-10 ms delay range between channels (assuming the same frequency is played in both channels, e.g. center image).

Audio Wavelength chart
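Checking those numbers in Python (343 m/s assumed for the speed of sound):

Code:
c = 343.0                      # speed of sound, m/s
for t_ms in (1, 5, 10):
    t = t_ms / 1000
    f_null = 1 / (2 * t)       # first cancellation frequency, Hz
    path = c * t               # equivalent path-length offset, m
    print(f"{t_ms:2d} ms -> first null at {f_null:5.1f} Hz, "
          f"~{path:.2f} m ({path * 3.281:.1f} ft) of path difference")
# 10 ms -> first null at  50.0 Hz, ~3.43 m (11.3 ft) of path difference

Odd multiples of the first null frequency cancel as well, so a delay that wanders between 0 and 10 ms sweeps comb-filter notches across the whole audible band.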
 
A delay of 10 ms is half a wavelength at 50 Hz (about 3.4 m or 11 feet of path difference), meaning that frequencies from 50 Hz on up could be completely cancelled out somewhere within a 0-10 ms delay range (assuming the same frequency is played in both channels, e.g. center image).

Audio Wavelength chart

I am aware :).

My main point was that in practice the buffer level is not that consistent, which implies that the differences between the capture clock and playback clock are not that consistent and vary by a non-trivial amount. This suggests to me that if you are using different USB DACs for left / right, the relative delay can be quite large and will not be consistent, even if rate adjust is able to manage the relationship between capture and playback clocks for each DAC.

Michael
 
I am aware :).

My main point was that in practice the buffer level is not that consistent, which implies that the differences between the capture clock and playback clock are not that consistent and vary by a non-trivial amount. This suggests to me that if you are using different USB DACs for left / right, the relative delay can be quite large and will not be consistent, even if rate adjust is able to manage the relationship between capture and playback clocks for each DAC.

Michael
That is another significant variable that I left out of my previous post about the signal chain.

Your example shows a random variance, so it is even worse than a constant offset.

It is interesting to listen to constant delays applied to a single channel while playing a pan track, and to perceive what it does to the sound stage. This Stockfische Djembe walk track is a good test track. Adding delay to one channel can make a sound that should start from the right sound like it starts from the left and then jumps immediately to the right. Add variance to that and you have a total mess to cope with.

Stockfische Djembe track
 
I am considering a new CamillaDSP setup that uses two USB 4-channel audio interfaces to obtain 8 total output channels. Two input channels need to be processed. Each channel will undergo the exact same DSP steps (left and right channels of a loudspeaker crossover). The audio interfaces are asynchronous, so each will have its own clock. I'm wondering what is the best way to use CamillaDSP under this scenario.

You must first synchronize both output cards by resorting to a master-slave setup: one card has to run as the clock master, the other as its slave.

So you will have a hardware problem as long as the two soundcards both run asynchronously in master mode. Different clocks clocking different DACs is a no-go for phase-coherent outputs. So your planned system is hardware-wise flawed right from the beginning, and you will fix this neither by a twin-CamillaDSP approach, nor by a config or buffer tweak. More precisely, your 8 (=4+4) outputs must not be fed from two asynchronously running soundcards. Otherwise you will have a time-variant shift between the two 4-channel output sets. This will not necessarily provoke xruns, as long as both sets of buffers are filled enough. But for sure it will completely mess up the phase coherence of the 8 channels.

You may instead get a more or less valid system in another setup with two different, asynchronously running soundcards: a master-clocked input soundcard feeding a master-clocked output card. So, e.g., 2 channels in from soundcard A, 8 channels out to soundcard B. There, at some point, there will be xruns because of buffer over- or underruns between A and B. But at least phase coherence is maintained across the outputs.

I once had a system with two RME 9632 cards working along with brutefir, one card for each of the R and L channels. You can set these 9632s, firmware/software-wise, either as master or as slave. If one card is set as slave, it will need a clock from outside, in this example from the master-clocked card. As in your proposed setting, at first I erroneously set both cards up as master. So each one had its own clock, as in your planned system: they ran asynchronously. The system worked fine at startup, but slowly and steadily the stereo image got more and more blurred. No xruns, because the buffers were kept filled enough. But the result was not pleasant at all. This twin-card system worked flawlessly as soon as I switched one of the two cards to run as slave to the other as master, in order to synchronize them. And of course I had to provide a cable connection between them in order to feed the clock signal from the master to the slave.

So then, there is no free lunch. Either you will have to resort to a dedicated 8-channel soundcard, or you will have to modify your setup so that one of these cards can be slaved to its master counterpart, in order to synchronize your two soundcards. Well-equipped soundcards offer the choice of running either as master or as slave, and maybe your 4-channel cards provide this option. If not, then have a look here, for example. The info you will find is very outdated in terms of hardware, but still valid in terms of principles. And in case you don't use Linux/ALSA, then only look at the first part, about the modifications needed to the soundcards:

http://quicktoots.linux-audio.com/toots/el-cheapo/

So this is definitively no longer a CamillaDSP topic, if I understand well. It's a basic multi-card computer audio setup matter you are struggling with. Have a look at master-slave setups: there is a lot of info about master-slave setups in studio environments available on the net, such as on the site of RME (related to a product of theirs):

https://www.rme-audio.de/downloads/fface_uc_e.pdf
 
Charlie, I think I kind of understand where you are aiming: having a separate soundcard for each channel (1 ADC in -> 4 DAC out), keeping the processing separate. From that point of view it would work OK. CDSP can run in any number of instances; just configure individual network ports for the websocket server. But inevitably each process will have a different initial latency between capture and playback start. The two threads are synced at start with the Rust barrier facility ( https://github.com/HEnquist/camilladsp/blob/master/src/alsadevice.rs#L738 , https://github.com/HEnquist/camilladsp/blob/master/src/alsadevice.rs#L831 ), but while the capture thread starts reading right away, the playback thread does not start writing until the first processed audio chunk arrives from the processing thread ( https://github.com/HEnquist/camilladsp/blob/master/src/alsadevice.rs#L418 ). The initial processing delay will vary for each process due to OS scheduling. I do not see any way to solve this, apart from processing all channels in one audio chunk - i.e. only one CDSP process running.
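To illustrate the multiple-instances point, here is a minimal Python launcher for two independent instances, each with its own config and its own websocket port. The config file names are hypothetical, and you should verify the websocket port flag with camilladsp --help for your version.

Code:
import subprocess

# two independent CamillaDSP processes, one per 4-channel interface
procs = [
    subprocess.Popen(["camilladsp", "-p", "1234", "left.yml"]),
    subprocess.Popen(["camilladsp", "-p", "1235", "right.yml"]),
]

for p in procs:
    p.wait()   # each runs under its own PID, with its own start-up timing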

Even when capturing 2 channels from card A, processing both channels at once, and asynchronously resampling one quartet of the resultant channels for the other card B (this would have to be added to CDSP), the playback start on card A would have to wait for the resampling for card B to finish so that both cards' playbacks can start roughly simultaneously. But the resampling also always fluctuates a bit, resulting in some delay fluctuation of playback B vs. A. I do not think this is a viable path.
 
Hi Pavel, thanks for your thoughts, especially about the variable delay to first sample output. I didn't think about that.

Anyway, I am already running Gstreamer-based software that I wrote that does all of this using RTP+RTCP and runs my ACDf LADSPA filters. This allows me to set up each of the left and right speakers using completely separate clients (two separate computers, each with its own audio interface) and send audio to them over WiFi from the source (another computer on my home LAN). Each client receives one channel, does all the DSP crossover processing, and spits out 4 channels to the amplifiers. The Gstreamer RTPbin includes some mechanisms by which the client reports back to the sender about the playback status and timing, and this is somehow used to update the playback WRT the server's clock. I had to tweak it a lot, and it is not perfect in that there are occasional audible ticks where samples are added, removed, or the playback pointer is moved. But this happens only occasionally and becomes less frequent with time (the system seems to stabilize after a while). The stereo image remains pretty stable. It's about the best I can do using an all-software solution.

Regarding my idea to use two 4-channel interfaces, this was just because I happen to have a few of them and I am putting together a new system for live demos that would use live (not LAN streamed) analog audio inputs. I thought that CDSP would have some internal magic that would make it work, but the devil is in the details and I can see that it is probably not going to work out any better than what I am using now, with the exception that the resampling in CDSP would eliminate the audible issues I mentioned above.
 
There are good answers already, I don't have much to add.
CamillaDSP doesn't support outputting to (or capturing from) more than one device. That can be worked around with an ALSA multi plugin, but it doesn't work well unless the devices have their hardware clocks synced to each other somehow. Some people are using this successfully with simple USB sound cards that lock to the USB bus, but it's no good for asynchronous USB devices, meaning basically all nice modern DACs and interfaces.
Analog input makes things a little easier. Running two camilladsp instances, one for each 4-channel interface, should work. Assuming that each interface uses the same clock for capture and playback, the only problem is that there will be a random difference in latency between the two processes, of up to a couple of milliseconds. This difference stays constant until something is restarted.
Enabling rate adjust and resampling can fix that, but the cost is that the relative latency can drift. Probably not more than a few milliseconds, but that is probably already too much.
 
I can imagine a few use cases in which it could be helpful, like having two channels going to your speaker system and two others to a headphone system (with different EQ) or a Bluetooth transmitter, so you could switch without the need to get into the settings. Or the other way around: having two inputs always active, so that you could just start/stop sources with no need to make any other adjustment...