CamillaDSP - Cross-platform IIR and FIR engine for crossovers, room correction etc

Perhaps the combination of using avail instead of delay (delay should include the device latency, unlike avail) with the pipewire device is the cause. Maybe the pipewire device behaves differently, causing large fluctuations in the loopback rate adjust. I have not looked at the trace logs yet; I will tomorrow.
 
Yeah, the problem is that I had 50 config files to convert (different experimental settings, different FIR filters, etc), and it was a lot easier to do it in bulk with the right regexes rather than stepping through the gui for each of them.

When it comes to documentation, more clarity is always a good thing.
 
@QuickDraw McGraw : Looking at your trace log with target level 6k - the silence checker seems to kick in several times, interrupting playback while keeping the capture running. The code seems to correctly avoid inserting newly captured chunks into the processing queue while paused, but there is no trace log around https://github.com/HEnquist/camilla...de888c24c17182b63d051/src/alsadevice.rs#L1039 to log the insertion of a new chunk into the queue - @HenrikEnquist please can you add a trace log there?

@QuickDraw McGraw - let's try disabling the silence checking (removing the silence_xxx params from your config) and retest with the larger target_level + trace logs. Thanks.

Logically, if the playback side has a soundcard buffer of 8k samples, which corresponds to some 170 ms, it cannot accumulate a latency of many hundreds of ms. But if chunks were being accumulated in the queues between the threads without being consumed/played, that could increase the latency significantly.
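A rough check of these numbers, as a minimal sketch: the 48 kHz sample rate is an assumption (it is not stated here, only inferred from 8k samples being about 170 ms), and the 2048-frame chunk is the chunk size mentioned later in the thread.

```python
def frames_to_ms(frames: int, samplerate: int) -> float:
    """Convert a number of audio frames to milliseconds."""
    return 1000.0 * frames / samplerate


SAMPLERATE = 48_000    # assumed, inferred from "8k samples ~ 170 ms"
BUFFER_FRAMES = 8192   # soundcard buffer size discussed in the thread
CHUNK_FRAMES = 2048    # chunk size mentioned later in the thread

buffer_ms = frames_to_ms(BUFFER_FRAMES, SAMPLERATE)
chunk_ms = frames_to_ms(CHUNK_FRAMES, SAMPLERATE)

print(f"full soundcard buffer: {buffer_ms:.1f} ms")
print(f"one chunk:             {chunk_ms:.1f} ms")

# Every chunk stuck in a queue between threads adds one chunk time
# on top of whatever the soundcard buffer itself holds.
for queued_chunks in (1, 5, 10):
    extra_ms = queued_chunks * chunk_ms
    print(f"{queued_chunks:2d} queued chunk(s) add {extra_ms:.0f} ms")
```

So the soundcard buffer alone caps out at roughly 170 ms, while a handful of chunks waiting in a queue is enough to reach several hundred ms.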
 
Yes, please, trace logs for both presumably working and non-working target level.
Thanks.

OK, you already have the log of the non-working version; I posted it yesterday.
The working log is attached.
I do not see how a larger target level well within the buffer size could cause a gradually growing latency, apart from a larger latency to start with.
Looks like something performing arithmetic operations on float data and dropping the decimals. That would introduce that kind of bias.
 


@QuickDraw McGraw - let's try disabling the silence checking (removing the silence_xxx params from your config) and retest with the larger target_level + trace logs. Thanks.
Configured with
Code:
  #silence_threshold: -80
  #silence_timeout: 5
Under evaluation with an audio file (completed) and a video (ongoing).

Log attached
 


Very good. Then the assumption that the pauses by the silencer cause the extra latency seems valid.
I suspect the current silencer has some issue with pausing the capture and playback threads, which comes into effect when the target level is larger than one chunk. Something like the processing queue gradually growing with extra chunks sent from the capture while the playback is already paused.

I asked Henrik for a few added traces. I hope that with those we should be able to diagnose the issue.

What settings would you recommend?
If you do not need the silencer (which I think is the case), I would opt for a target level of several chunks; it's safer for the timing. But of course the latency is bigger.
 
Thanks a lot. I used the silencer with the CMI8738 card because it was so noisy. Now I can live without it. Would you recommend 6143 or 6144 for the target level?

With 6143, I noticed lines like these in the log:

Code:
2025-01-17 14:10:56.914588 TRACE [src/countertimer.rs:153] Averager: added value 6144, nb. 399
2025-01-17 14:10:56.960539 TRACE [src/countertimer.rs:153] Averager: added value 6144, nb. 400
2025-01-17 14:10:57.001941 TRACE [src/countertimer.rs:153] Averager: added value 6144, nb. 401
2025-01-17 14:10:57.047838 TRACE [src/countertimer.rs:153] Averager: added value 6144, nb. 402
2025-01-17 14:10:57.085237 TRACE [src/countertimer.rs:153] Averager: added value 6144, nb. 403
2025-01-17 14:10:57.128229 TRACE [src/countertimer.rs:153] Averager: added value 6144, nb. 404

It's an old habit I kept from my programming era to subtract 1 from a power of 2, e.g. 127, 63, 255...
I asked Henrik for a few added traces. I hope that with those we should be able to diagnose the issue.

I'll be happy to test with the enhanced verbosity.
 
@QuickDraw McGraw and @phofman I have an idea about what is going on. If I'm right, the difference between the target level and the ALSA buffer size needs to be larger than one chunksize. The value in the log, 6144, is simply 8192 (buffersize) - 2048 (one chunk). That is the largest value it is able to measure as it works now, even if the true level may go higher.
Use a smaller target level for now, and I'll get back when I have had a proper look at it.
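To make the arithmetic of this hypothesis concrete, here is a minimal sketch. It only models the idea described above (a measurement capped at buffer size minus one chunk); the function names are hypothetical and this is not CamillaDSP's actual code.

```python
def max_measurable_level(buffer_size: int, chunksize: int) -> int:
    """Largest buffer level the measurement could report under the hypothesis."""
    return buffer_size - chunksize


def measured_level(true_level: int, buffer_size: int, chunksize: int) -> int:
    """Hypothetical model: the reading saturates at the measurable maximum."""
    return min(true_level, max_measurable_level(buffer_size, chunksize))


def has_headroom(target_level: int, buffer_size: int, chunksize: int) -> bool:
    """True if buffer size minus target level is larger than one chunk."""
    return buffer_size - target_level > chunksize


BUFFER_SIZE = 8192
CHUNKSIZE = 2048

print(max_measurable_level(BUFFER_SIZE, CHUNKSIZE))   # 6144
print(measured_level(7500, BUFFER_SIZE, CHUNKSIZE))   # reading saturates at 6144
print(has_headroom(4096, BUFFER_SIZE, CHUNKSIZE))     # True
print(has_headroom(6144, BUFFER_SIZE, CHUNKSIZE))     # False
```

If the reading saturates like this, a rate controller aiming at a target near the cap cannot see the level going above the target, only below it.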
 
Would you recommend 6143 or 6144 for the target level?

With 6143, I noticed lines like these in the log:
You should be getting basically the same output with 6144 too.

Target level is the average output buffer fill, in audio frames, which the feedback rate control (direct for loopback/USB gadget, or async resampling for other input devices) strives to reach and keep. When the first processed chunk arrives from Processing, the Playback thread is put to sleep for the target-level time https://github.com/HEnquist/camilla...8c24c17182b63d051/src/alsadevice.rs#L118-L120 . After this delay the first chunk gets written to the output buffer and the loop continues. Assuming that the processing fetches chunks at a steady pace (no major CPU load variations affecting the processing time), a target level's worth of samples has accumulated in the queue to the playback thread during this sleep (including the first chunk). This first sleep basically determines the latency of the playback thread, because once the soundcard starts running and has no xruns, it in principle keeps the latency constant.

Then the feedback controller (Henrik has added a nice PID regulator in v.3) regulates the capture samplerate towards the buffer-fill target at target_level.
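For readers unfamiliar with this kind of regulation, the idea can be sketched with a generic textbook PID controller. This is not CamillaDSP's implementation; the gains, the update interval and the sign convention of the rate factor are all assumptions made up for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Generic textbook PID controller (illustrative only)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        """Return the controller output for the current error."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def capture_rate_factor(pid: PID, measured_fill: float,
                        target_level: float, dt: float) -> float:
    """Relative capture rate: below 1.0 slows the capture side down.

    Assumed convention: a playback buffer fuller than the target means
    the capture side delivers too fast, so its rate is reduced.
    """
    error = measured_fill - target_level
    return 1.0 - pid.update(error, dt)


# Hypothetical gains and update interval, chosen only for the demo.
pid = PID(kp=1e-6, ki=1e-7, kd=0.0)
for fill in (6144, 6144, 4096, 4096, 2048):
    factor = capture_rate_factor(pid, fill, target_level=4096, dt=0.5)
    print(f"measured fill {fill}: rate factor {factor:.6f}")
```

Note that the controller only ever sees the measured fill, so its behaviour is limited by how accurate and how fine-grained that measurement is.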

As you see it really does not matter if your target is 6144 or 6143, it's just some target level the feedback aims at. It's never precise.

Now you can ask why the measured buffer fill (as listed in "Averager: added value 6144") is so constant, when the timing of the measurement can never be that exact and the soundcard consumes samples from the buffer continuously. The time precision of the buffer fill value depends on the soundcard device driver/ALSA plugin. Some drivers update the value continuously, e.g. the USB gadget, which updates it after every incoming USB data packet (e.g. every 125 µs). But most drivers update these numbers at period boundaries, when they are woken by an IRQ from the soundcard. The period time thus determines the granularity of this number: the more numerous and shorter the periods, the more exact the value. It's nicely seen in your log:

Code:
grep -o -e "Averager: added value [0-9]*" camilladsp.log  | awk '{print $NF}' | sort | uniq -c
      2 2048
   3083 4096
   5317 6144
   2323 8192

Only chunk multiples were read from the playback soundcard (the pipewire ALSA plugin in this case). Your period size is in fact only 1024, but the buffer fill reading is taken right after writing the chunk (= 2 period sizes), hence the multiples of the chunk size = 2 period sizes. IMO if your target level were set e.g. at 2.5 times the chunksize, we would see the odd multiples of 1024 too, but again rounded to a multiple of 1024 because that's the smallest granularity of the reported value.

This shows how difficult it is to keep the feedback control precise, because the input data are extremely granular for some sound devices.
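The granularity effect can be illustrated with a simplified model, assuming a driver that exposes its playback position only at period boundaries while the application always writes whole chunks. The function and variable names are hypothetical; real drivers and plugins differ in the details.

```python
PERIOD = 1024  # period size from the discussion above
CHUNK = 2048   # chunk size = 2 periods


def reported_fill(written_frames: int, played_frames: int,
                  period_size: int = PERIOD) -> int:
    """Buffer fill as seen by the application in this simplified model.

    The driver is assumed to publish its playback position only at
    period boundaries, so the visible position lags the true one.
    """
    visible_played = (played_frames // period_size) * period_size
    return written_frames - visible_played


written = 4 * CHUNK  # the application has written four whole chunks
for played in (0, 500, 1023, 1024, 2047, 2048, 3000):
    true_fill = written - played
    seen = reported_fill(written, played)
    print(f"played {played:4d}: true fill {true_fill:4d}, reported {seen:4d}")
```

In this model every reported value is a multiple of the period size, whatever the true fill is, which is the kind of coarse input the rate controller has to work with.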
 