I have a Mac mini M2 with 3 DACs combined in a main audio unit. The sampling rate can be 768 kHz, 32 bit.
I also have BlackHole 16ch, which should also be capable of 768 kHz, 32 bit.
However, when I set CamillaDSP to 768 kHz, chunksize 8192, resampling enabled (AsyncSinc, Accurate), target level 16383, I get pops and stutters. Using 384 kHz works just fine.
So what channels and rates are configured? Your CDSP config is quite important 🙂
"When I tried doing 10 channels @48000 taps the thing just got stuck."
First I would suggest checking the per-thread CPU load while it is running (e.g. top -H). One of the threads may be near 100% load of a CPU core.
Yes. That's what I did.
It's not supposed to happen at such low requests like 10ch 12k taps, when you can do crazy stuff like 60k taps on 24 channels @192 kHz on inferior hardware (e.g. an RPi 4) at something like 50% load.
And the load was what?
You talk about async resampling too. That's why I asked about your config.
"So what channels and rates are configured? Your CDSP config is quite important 🙂"
I have 6 channels. Which rate setting specifically? The sample rate should be 768 kHz for all virtual and real devices. Is that what you were asking for? I can certainly share my config, if that is helpful. 🙂
Well, async resampling to 768 kHz for 6 channels may take quite a lot of CPU. I do not know your configuration or thread loads, and it's quite difficult to do any troubleshooting without this most basic info.
"I tried the multithreader option and will keep it for the moment."
Thanks for testing and sharing the numbers! The higher total CPU load with multithreading is to be expected, since there is some overhead involved. I get similar numbers.
"768khz, Chunksize: 8192, resampling enabled AsyncSinc, Accurate, Target Level 16383, I get pops and stutters."
The Sinc resampler is quite heavy on the Accurate preset, it's very likely too heavy at 768 kHz. Either lower the sample rate, or use the Balanced preset.
"I get 40-50% dsp load (on one/4 cores) with 10 channels of 1200 taps@48khz and 10 channels 4800 taps@48khz"
That seems a bit odd. Please share your complete config.
Or recompile CDSP with the 32bit feature. It reduces the load substantially on poorly HW-accelerated platforms, and the noise deterioration is inaudible (-150 dB noise floor in a sox spectrogram).
Did some testing with dummy FIR, 262k, 8 channels, added in async resampling from 48 to 192khz.
I increased the chunk size from 128 to 1024 and the problem disappeared.
The chunksize has to be 342 or above. At 341 or below the program stalls and the DSP load goes to 400% while the cores do nothing.
Attached is a config with a chunk size of 128. I do async resampling from 48 kHz to 48 kHz with target level 128; this, together with silence at -70 dB, is the only thing that helped against stalling and against not starting on silence.
Thanks!
"Chunksize has to be 342 or above, if 341 or below, program stalls and dsp load goes to 400%, cores do nothing."
IME at low chunksizes it's good to check the actual ALSA period and buffer sizes against the chunksize. I have seen a buffer size equal to the chunksize, which cannot work without permanent xruns. The issue is that the chunksize is fixed, but the period/buffer sizes are determined by the HW driver; the values requested by CDSP are just hints (the xxx_near ALSA methods). The easiest solution is then using chunksize for which CDSP generates usable period and buffer values.
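A rough sketch of the check described above, in Python. It assumes the negotiated sizes are available as text in the format ALSA reports for an open PCM stream (lines such as period_size and buffer_size). The sample text, its numbers, and both function names are made up for illustration. The only rule it encodes is the one stated above: a buffer that is not larger than the chunksize is flagged.

```python
def parse_hw_params(text):
    """Extract period_size and buffer_size (in frames) from hw_params-style text."""
    sizes = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ("period_size", "buffer_size"):
            # Take the first token after the colon as the frame count.
            sizes[key] = int(value.split()[0])
    return sizes


def buffer_has_headroom(chunksize, buffer_size):
    """True if the device buffer is larger than one chunk (frames)."""
    return buffer_size > chunksize


# Illustrative values only, not taken from any real device.
SAMPLE = """access: RW_INTERLEAVED
format: S32_LE
channels: 2
rate: 48000 (48000/1)
period_size: 64
buffer_size: 128
"""

sizes = parse_hw_params(SAMPLE)
for chunksize in (64, 128):
    ok = buffer_has_headroom(chunksize, sizes["buffer_size"])
    print(f"chunksize {chunksize}: buffer {sizes['buffer_size']} frames, headroom: {ok}")
```

How much headroom is actually enough depends on the driver and the system load, so treat a passing check as necessary rather than sufficient.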
"I increased the chunk size from 128 to 1024 and the problem disappeared."
128 is very small for convolution. When the chunk size is smaller than the fir filter length it does segmented convolution. The more segments it needs, the less efficient it gets.
The FFT used in the convolution is also more efficient at easy chunk sizes such as powers of two. 341 is a particularly bad value since it's the product of the two primes 11 and 31.
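To put numbers on the segment count, here is a small Python sketch. It assumes the filter is simply cut into chunk-sized pieces, so the count is the filter length divided by the chunk size, rounded up. The function name is made up, and "262k" from the test earlier in the thread is assumed to mean 262144 taps.

```python
def num_segments(filter_taps, chunksize):
    """Chunk-sized segments needed to cover a FIR filter (ceiling division)."""
    if filter_taps <= 0 or chunksize <= 0:
        raise ValueError("filter_taps and chunksize must be positive")
    return -(-filter_taps // chunksize)


# Assumption: "262k" taps means 2**18 = 262144.
for chunksize in (128, 1024):
    print(f"chunksize {chunksize}: {num_segments(262144, chunksize)} segments")
```

Under these assumptions, going from a chunk size of 128 to 1024 cuts the number of segments by a factor of eight.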
"When the chunk size is smaller than the fir filter length it does segmented convolution."
I've read this in your README on GitHub, I just didn't expect the effect to be that strong.
"The easiest solution is then using chunksize for which CDSP generates usable period and buffer values."
I'm at that point now, just help me understand one other thing 🙂
Since CamillaDSP requires a static input sample rate, I'll add a hardware ASRC in front of its TOSLINK input, running at a fixed 96 kHz, as soon as the parts arrive.
Do I calculate the latency of a chunk from the input sample rate or from the internal operating sample rate of CamillaDSP?
Thank you both!
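For reference, the arithmetic behind the question is just frames divided by rate. The Python sketch below does not settle which rate applies; it only shows how the duration of one chunk changes with the rate that is plugged in. The function name is made up, and the real end-to-end latency also includes device buffers, which are not modelled here.

```python
def chunk_duration_ms(chunksize, samplerate_hz):
    """Duration of one chunk in milliseconds at the given sample rate."""
    if chunksize <= 0 or samplerate_hz <= 0:
        raise ValueError("chunksize and samplerate_hz must be positive")
    return 1000.0 * chunksize / samplerate_hz


# The same chunk size lasts half as long when the rate doubles.
for rate in (48000, 96000, 192000):
    print(f"1024 frames @ {rate} Hz: {chunk_duration_ms(1024, rate):.2f} ms")
```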
"341 is a particularly bad value since it's the product of the two primes 11 and 31."
If you have the time for a quick 30s explanation as to why that would matter, that'd be great! If it's too complex to explain, don't bother, thanks!
I just find that particular info very interesting.
Most FFT algorithms work by dividing a long transform into shorter ones. For example, length 512 is 16x32 and can be calculated with FFTs of length 16 and 32 (plus some really clever math tricks). Those length-16 and length-32 FFTs can be split further, until each transform is trivial.
341 can be split to 11 and 31, but then we can't split those any further and we have to calculate those transforms without any fancy tricks. But that is still better than for example 347 that is a prime itself, so can't be split at all.
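The three lengths mentioned here can be compared with a few lines of Python. This is plain trial-division factorization, not what any FFT library does internally; it only lists the prime sizes each length can be broken down into.

```python
def prime_factors(n):
    """Prime factorization by trial division, smallest factors first."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left after dividing out small factors is itself prime.
        factors.append(n)
    return factors


for length in (512, 341, 347):
    print(length, prime_factors(length))
```

512 breaks down into nine factors of 2, 341 stops at 11 and 31, and 347 cannot be split at all.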
"The Sinc resampler is quite heavy on the Accurate preset, it's very likely too heavy at 768 kHz. Either lower the sample rate, or use the Balanced preset."
Hi @HenrikEnquist and @phofman,
I think I got it working based on your feedback.
I thought resampling was needed in order to calculate the chunk size properly. I was wrong 🙂
I recently changed the chunk size to 4096, disabled rate adjust and resampling.
It is way better now. My CPU load (Mac mini M2) is quite relaxed, about 2 percent for the CamillaDSP terminal 🙂
Thank you both 🙂
"I recently changed the chunk size to 4096, disabled rate adjust and resampling."
If your capture and playback devices run asynchronously, they will eventually hit buffer issues without the async resampling in CDSP.
@marixfifteen8 are you capturing from BlackHole? If yes, enable rate adjust (and leave resampling disabled). That will let CamillaDSP sync the virtual clock of BlackHole.