Linux USB-Audio Gadget (RPi4 OTG)

well the endgoal is to use something like spdif/rca, so a rpi + hat (or a usb bridge, tho a usb bridge mostly uses the same usb implementation as a dac, so a rpi might even be better). that way i can avoid usb altogether and also get better clock performance, since my dac doesnt have "fancy" clocks, plus better isolation from the pc
I am not sure I understand the clock part. The majority of RPi DAC hats use the by-design jittery clock from the RPi and do not have a proper master clock, unlike most modern USB DACs, which do have a proper clock and use proper USB async mode.

im pretty much speculating since i havent tested for differences here but the endgoal is to improve the pc output where i can do everything i want, even heavy resampling which isnt possible on a pi
CamillaDSP on RPi4B should be able to resample at best quality with no problem, the tests show so.

but especially before using the fedora/easyeffects/pipewire combo the desktop pc was basically unusable for good audio compared to a dedicated rpi streamer
Did you check whether pipewire was resampling? You can always tell PW to avoid your USB DAC and use alsa directly, possibly with CDSP if needed. IMO no need to complicate with usb -> RPi -> USB just for that purpose.
 
I am not sure I understand the clock part. The majority of RPi DAC hats use the by-design jittery clock from the RPi and do not have a proper master clock, unlike most modern USB DACs, which do have a proper clock and use proper USB async mode.
pi2aes and ian canada hats seem to have quite good clocks/performance, but you are right, there are a lot of "low-end" hats

CamillaDSP on RPi4B should be able to resample at best quality with no problem, the tests show so.
ah true... i was running camillaDSP on one core and with just one core on the rpi4 its not possible to resample to 192k in the best quality
i think with 384khz/768khz 4 cores might even struggle with it, a desktop pc just has more juice but its more noisy..

Did you check whether pipewire was resampling? You can always tell PW to avoid your USB DAC and use alsa directly, possibly with CDSP if needed. IMO no need to complicate with usb -> RPi -> USB just for that purpose.
yes, you can tell pipewire to connect to your usb dac in the native samplerate ... a second stream in another samplerate then gets resampled to the first one, so if you only listen to music pipewire avoids resampling. for me this is "bitperfect" enough since i do EQ processing anyway, and it gives you all the positives of a pc setup (multi-stream) with the least SQ degradation. at least i havent found a better solution yet (also sq wise there was a big difference going from windows/equalizerapo to fedora/pipewire/easyeffects)
with 24bit/32bit dynamic range reduction of (modest) software volume shouldnt matter that much either
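
To put a rough number on the software volume point: a minimal sketch, assuming a plain integer sample container and ignoring dither and the DAC's real analog noise floor. Each ~6.02 dB of digital attenuation gives up about one bit of resolution. The function names are made up for illustration.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def bits_lost(attenuation_db: float) -> float:
    """Bits of resolution given up by a digital attenuation of attenuation_db."""
    return attenuation_db / DB_PER_BIT


def remaining_bits(container_bits: int, attenuation_db: float) -> float:
    """Bits still available for the signal after software volume is applied."""
    return container_bits - bits_lost(attenuation_db)


for att in (6, 20, 40):
    print(f"{att:2d} dB: 24-bit -> {remaining_bits(24, att):.1f} bits, "
          f"32-bit -> {remaining_bits(32, att):.1f} bits")
```

Under these assumptions even 20 dB of attenuation in a 24-bit container still leaves more than 16 bits for the signal.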


but back to the alsaloop issue... there seem to be a lot of reports with cpu usage going high and it seems to be an unfixed issue... so i will probably try to setup pipewire later today
 
pi2aes and ian canada hats seem to have quite good clocks/performance, but you are right, there are a lot of "low-end" hats
pi2aes has proper master clock. Is there a Ian Canada hat with master clock and slaved RPi I2S? I have seen only FIFOs but I may have missed it.

i was running camillaDSP on one core and with just one core on the rpi4 its not possible to resample to 192k in the best quality
i think with 384khz/768khz 4 cores might even struggle with it, a desktop pc just has more juice but its more noisy..

IIUC resampling in CDSP runs in a single thread (the capture thread) = core.

also sq wise there was a big difference going from windows/equalizerapo to fedora/pipewire/easyeffects
DSPs vary a lot and can affect the sound (it's their job after all :) ). I thought you were striving for a bit-perfect chain.

but back to the alsaloop issue... there seems to be alot of reports with cpu usage going high and it seems to be an unfixed issue...
Yes, alsaloop has that CPU-load problem, I hit it a number of times too. AFAIR the shorter the latency requested, the higher the chances of the lockup. There is some bug in the code. Since alsaloop does all its work in a single thread, its main-loop code is quite difficult to understand and troubleshoot.
 
pi2aes has proper master clock. Is there a Ian Canada hat with master clock and slaved RPi I2S? I have seen only FIFOs but I may have missed it.
TransportPi Digi seems to have master clock mode
is master mode actually superior to FIFO? or does implementation matter?

DSPs vary a lot and can affect the sound (it's their job after all :) ). I thought you were striving for a bit-perfect chain.
well, what im really after is the "perfect dsp/eq" chain, of course its not bitperfect but it should be quality wise on par minus the equalizer/dsp effects themselves
after all doing it on pc should be in theory way better than implementations like minidsp where it gets converted between analogue/digital a bunch of times
i dont wanna live without EQ/dsp either, but implementation seems to matter quite a lot

for example i recently found out that easyeffects offers FIR/SPM processing of the EQ plugin... and both sound superior to IIR processing (for me SPM is actually the winner, it even sounds better than FIR and i never heard of it before)

Yes, alsaloop has that CPU-load problem, I hit it a number of times too. AFAIR the shorter the latency requested, the higher the chances of the lockup. There is some bug in the code. Since alsaloop does all its work in a single thread, its main-loop code is quite difficult to understand and troubleshoot.
kinda mind boggling that something like this isnt fixed asap but yea... i will look for other solutions... jack seems to be an easy alternative too
 
TransportPi Digi seems to have master clock mode
It has master clock (like most FIFOs do), but IIUC again it's a FIFO behind master-mode I2S, instead of slaving the I2S to the clock and avoiding the FIFO.
is master mode actually superior to FIFO?
This is my view https://audiosciencereview.com/foru...-distortion-on-spdif-input.41709/post-1506815

kinda mind boggling that something like this isnt fixed asap but yea...
It's not trivial to fix, very few people use alsaloop in a production mode, and even fewer are capable of fixing that.

i will look for other solutions... jack seems to be a easy alternative too
IMO the best current option for merging clock domains is CDSP.
 
hmm at least from my (limited) perspective they are the same or FIFO has even an advantage
masterclocks "connect" to the RPI to "replace" the existing clocks
FIFO buffers the i2s stream and basically reclocks it "freshly" by better clocks
in the end both should provide good clocking but masterclocks have the disadvantage to connect to the "noisy" rpi itself while fifo can isolate the pi more
correct me if im wrong tho

IMO the best current option for merging clock domains is CDSP.
hmm i guess you are hinting at the rate adjust feature to sync capture and playback device to avoid buffer under/overruns
alsaloop has the same function, i wish it would just work...

not sure how pipewire/jack does it tho

tried to install jack but i cant get it to start the server... "ALSA: cannot set period size to 1024 frames for capture"
 
FIFO has even an advantage
masterclocks "connect" to the RPI to "replace" the existing clocks
FIFO buffers the i2s stream and basically reclocks it "freshly" by better clocks
in the end both should provide good clocking but masterclocks have the disadvantage to connect to the "noisy" rpi itself while fifo can isolate the pi more
correct me if im wrong tho
A clock cannot be replaced, unless the outgoing clock is tied somehow to the incoming clock, typically via PLL. Otherwise you will always experience buffer under/overruns. The delay till the buffer issue depends on the buffer length (i.e. inserted latency) and the difference between the two clocks (which is never zero, unless one clock is derived from the other one).

Some solutions check for low levels in the stream and try to add/remove samples in those stream sections. Again - it's a hack, a general stream does not have to have any silent parts.

You can use I2S isolation and reclocking (just flip-flops or very short fifos) for the slaved I2S interface too, if you are after isolation. But most modern DACs need clean master clock for the DA conversion and the I2S bus just delivers data.

These things have been discussed here and other sites many times.

hmm i guess you hinting to the rate adjust feature to sync capture and playback device to avoid buffer under/overruns
CDSP runs in (at least) three threads, it's safer, unlike alsaloop. It covers all alsaloop features, and MUCH more.
 
A clock cannot be replaced, unless the outgoing clock is tied somehow to the incoming clock, typically via PLL. Otherwise you will always experience buffer under/overruns. The delay till the buffer issue depends on the buffer length (i.e. inserted latency) and the difference between the two clocks (which is never zero, unless one clock is derived from the other one).

Some solutions check for low levels in the stream and try to add/remove samples in those stream sections. Again - it's a hack, a general stream does not have to have any silent parts.

You can use I2S isolation and reclocking (just flip-flops or very short fifos) for the slaved I2S interface too, if you are after isolation. But most modern DACs need clean master clock for the DA conversion and the I2S bus just delivers data.

These things have been discussed here and other sites many times.
hmm oh, i never read about FIFO being inferior in such a way, this seems to be a big deal then, thanks for clearing this up

i guess the same happens to all connection software without rate adjust? so the only real options are camilladsp and alsaloop?

i got jackd running now, period size needs to be 512 for the gadget device capture card and i had to play music to the rpi4 gadget device before jack was able to connect to alsa for some strange reason...
i certainly get pops sometimes ... already raised period count from 2 to 16 and it helps but it doesnt seem perfect
 
hmm oh, i never read about FIFO being inferior in such a way, this seems to be a big deal then, thanks for clearing this up
Many users never experience any issue since FIFOs tend to be long (up to a second) and their playback sessions short, so the clock diff does not have enough time to eat up all the FIFO margin.
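
A back-of-the-envelope sketch of why that is, assuming a constant frequency offset between the two free-running clocks and a FIFO that starts half full. The 1 s FIFO length and the ppm figures are illustrative assumptions, not measurements of any particular product.

```python
def seconds_until_buffer_issue(margin_s: float, clock_diff_ppm: float) -> float:
    """Time until margin_s seconds of buffered audio is eaten up by clock drift.

    margin_s:       free margin in the FIFO, in seconds of audio
    clock_diff_ppm: relative difference between the two clocks, in ppm
    """
    if clock_diff_ppm <= 0:
        raise ValueError("clock_diff_ppm must be positive")
    return margin_s / (clock_diff_ppm * 1e-6)


fifo_length_s = 1.0          # assumed FIFO length
margin_s = fifo_length_s / 2  # FIFO assumed to start half full

for ppm in (5, 20, 100):
    hours = seconds_until_buffer_issue(margin_s, ppm) / 3600
    print(f"{ppm:3d} ppm difference: about {hours:.1f} h until under/overrun")
```

With these assumed numbers the margin lasts for hours, which fits short listening sessions never hitting the problem, while a stream left running for long enough eventually does.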

i guess the same happens to all connection software without rate adjust?
Yes, all chains where two independent clocks are merged without proper resampling. Rate adjust in CDSP is not resampling; instead it slaves the capture clock to the playback clock. On Linux it works only for the ALSA loopback (by tweaking the loopback's internal timer) and the audio gadget (by sending appropriate async feedback messages to the USB transmitter to adjust the transmitting samplerate).
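
The principle can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: it is not CDSP's actual controller, and the function name, the proportional gain, the buffer target and the 50 ppm offset are all hypothetical. The playback clock runs slightly fast; with rate adjust enabled, the measured buffer level is fed back to nudge the capture-side rate, so the buffer level stays bounded instead of draining.

```python
def simulate_buffer(seconds, fs=48000.0, playback_ppm=50.0,
                    target=2048.0, gain=1e-6, rate_adjust=True):
    """Return the second at which the buffer under/overruns, or None if it survives.

    All parameters are illustrative assumptions:
      fs            nominal sample rate of the capture side
      playback_ppm  how much faster the playback (DAC) clock runs
      target        desired buffer level in samples (capacity is 2 * target)
      gain          proportional gain: relative speed change per sample of error
    """
    capacity = 2 * target
    playback_rate = fs * (1 + playback_ppm * 1e-6)
    level = target
    for second in range(1, seconds + 1):
        error = target - level
        # With rate adjust, the source is asked to run slightly faster or slower.
        speed = 1.0 + gain * error if rate_adjust else 1.0
        level += fs * speed - playback_rate
        if level <= 0 or level >= capacity:
            return second
    return None


print("without rate adjust:", simulate_buffer(3600, rate_adjust=False))
print("with rate adjust:   ", simulate_buffer(3600, rate_adjust=True))
```

No samples are created or dropped here: the capture side simply ends up producing samples at the same long-term rate at which the playback clock consumes them.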

so the only real options are camilladsp and alsaloop?
Only CDSP and alsaloop can tweak the loopback and gadget rates to avoid resampling. Other options do async resampling, such as pulseaudio combine plugin, jackd zita-njbridge/zita-ajbridge, combining audio devices in OSX, etc.
i had to play music to the rpi4 gadget device before jack was able to connect to alsa for some strange reason...
Clearly jackd does not like that the gadget alsa device does not fetch any samples while being open - that's what e.g. https://github.com/pavhofman/gaudio_ctl tries to help with. CDSP also can handle the "stuck" device gracefully, AFAIR.
 
Iancanada FIFO_Pi products are galvanically isolated from the input I2S source (i.e. RPi GPIO bus). At that point the RPi I2S information is data only, as though it were still in a computer. IOW, it can jitter as much as it wants and it shouldn't make any difference to the FIFO output (so long as there aren't buffer underruns or overruns; presumably the buffer is reset during periods of silence to help avoid such problems, ...and there are other tricks as well). Also, after I2S data leaves the FIFO as clocked out by the dac clock, the I2S bus signals are reclocked once again by D flip-flop chips also using the dac master clock before then being sent on to the dac chip.

So operating RPi I2S bus in slave mode should arguably be of little or no benefit. Again, RPi I2S is basically just data at that point, not a time reference for the dac. The RPi clocks are used only to clock data into the FIFO, that's it.

The only theoretical disadvantage from the FIFO solution is the time delay latency of the FIFO, and the risk of possible buffer under or overruns.

Maybe the best solution of all would be asynchronous USB. No FIFO latency to speak of (a tiny bit in the XMOS processing). No GPIO bus jitter. Off the shelf USB isolators are available if needed to isolate dac ground from RPi ground. And the dac can be located far enough away from RPi to not be affected by radiated EMI/RFI.
 
IOW, it can jitter as much as it wants and it shouldn't make any difference to the FIFO output (so long as there aren't buffer underruns or overruns)
But my whole point is about these buffer under/overruns :)

So operating RPi I2S bus in slave mode should be of no benefit.
The benefit is the elimination of the buffer issues, that's quite a major one.

The only theoretical disadvantage from the FIFO solution is the time delay latency of the FIFO, and the risk of possible buffer under or overruns.
Well, that's not theoretical, but very real. Both the large added latency (which changes in time as the FIFO gets gradually filled/emptied) and the certainty of the buffer issue, eventually. That's simple logic, any two clocks will eventually deviate enough to exceed any buffer size. It's only a question of the buffer size and the clock difference, how long it's going to take. Look at that audioscience thread about the Wiim Pro issues - most likely the very same cause, only in software buffers. The principle is identical.

Slaved I2S bus signals can be easily reclocked with the same master clock, this time with no latency nor buffer issues. Also galvanic isolation is the same, just the bitclock isolator goes in the other direction. And the RPi I2S driver DTS config needs one parameter changed, to switch it from complete master to bitclock slave (while keeping the wordclock master to utilize the built-in clock divider of the I2S peripheral).

Maybe the best solution of all would be asynchronous USB.
The UAC2 async receiver also runs its I2S transmitter in slave mode, by design. The very same can be provided by the SBC/SoC directly, no need for further USB interfacing in this regard.
 
Understood about potential buffer under/overruns. However, I have never heard a complaint from a FIFO_Pi user about it. There is a jumper to increase the buffer size (and latency, if needed). Other than that the buffer is probably reset to half-full during silence between songs and/or between clock family changes. Beyond that it may be possible to gracefully degrade the audio by occasionally either dropping or repeating a sample. At higher sample rates something like that may not be noticeable, maybe not even at 44.1, don't know. All I do know is I don't see complaints about under/overruns from users.
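
For scale, a small sketch of how often a sample would have to be dropped or repeated to absorb a given clock difference, assuming a constant offset. The ppm values are illustrative assumptions, not measured figures for any FIFO or RPi clock.

```python
def seconds_between_corrections(sample_rate_hz: float, clock_diff_ppm: float) -> float:
    """Average time between single-sample drops/repeats needed to absorb the drift."""
    surplus_samples_per_second = sample_rate_hz * clock_diff_ppm * 1e-6
    return 1.0 / surplus_samples_per_second


for fs in (44100, 192000):
    for ppm in (5, 20, 100):
        interval = seconds_between_corrections(fs, ppm)
        print(f"{fs:6d} Hz, {ppm:3d} ppm: one sample every {interval:.2f} s")
```

Whether corrections at that rate are audible is a separate question that this arithmetic does not answer.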

EDIT: That said, I would agree that it should be possible to operate RPi in I2S Slave mode. Perhaps Ian was concerned that he might not always be able to do that if the I2S source device was not RPi. If that was a concern, then he may have looked at other ways to deal with the problem more generally.
 
The UAC2 async receiver also runs its I2S transmitter in slave mode, by design. The very same can be provided by the SBC/SoC directly, no need for further USB interfacing in this regard.
Understood. However, there may be issues with conducted and radiated EMI/RFI. Certainly there are with RPi. Of course LVCMOS I2S can be converted to LVDS to allow for some physical separation, but that adds complexity and cost too.

Moreover, USB allows optional use of off the shelf dacs. The user need not be wedded to I2S input dacs only. As someone pointed out earlier, it's very hard to compete with Topping, Behringer, etc., on cost. By the time a custom dac is designed, implemented, tested, etc., it doesn't usually end up being lower cost for a given quality of data conversion.
 
hmm i guess even something like "scream" (the linux program that streams audio over ethernet) would suffer from the buffer issue... i kinda cant believe how hard it is to get audio from one pc to another in the digital realtime domain

i will try camilladsp later since i wanna avoid resampling
if this doesnt work reasonably well i kinda cant see a way around a usb bridge/ddc like the gustard u18

btw my reasoning behind this is/was that i might get better audio quality going through the raspberry pi into i2s and a good quality hat instead of going through a xmos chip, tho i have no proof of this yet. second reason was that i might get better bang for the buck going with a hat where i can exchange clocks myself
but xmos chips seem quite fragile or people wouldnt hear differences between cables/filters/isolators/usbsources etc...
 
I would agree it has to do with noise. Question is whether or not the noise is primarily affecting the XMOS chip or maybe going around it to other parts of the dac. There are more sensitive things in a dac than the XMOS chip, is all.

It's that data is digital up until some point where it gets converted to analog. That's a very sensitive part of the circuitry. It's not just the dac chip, it's also the dac's time reference, the dac's voltage reference, the dac's other power supplies, and of course its ground.

Another part of the circuitry that is sensitive is the analog part, the output stage and its power supplies. The clocks are analog too, more so than they are digital.
If there were a PLL and/or an ASRC, those things tend to be sensitive to noise too.

IOW, the XMOS may be one of the less sensitive things. Which is not to say it couldn't be made to glitch, but maybe not as easily as some other circuitry.
 
hmm i guess even something like "scream" (the linux program that streams audio over ethernet) would suffer from the buffer issue... i kinda cant believe how hard it is to get audio from one pc to another in the digital realtime domain
The reasoning is quite simple. There can be only one clock in the chain, typically located by the DAC. So the chains which do not require adaptive resampling are typically "pulled" from the sink to the source for playback, and from the source to the sink for capture (i.e. always from the HW clock onwards)

As for network protocols - some include feedback from the receiver to the transmitter - just like USB async does. E.g. DLNA, the squeezebox protocol (very well designed at that time), IIRC pulseaudio as well as jackd network streams. Typically these are point-to-point protocols with only one receiver. But nothing prevents selecting one receiver as the master and slaving the remaining receivers to that one. The slaves would inevitably have to solve their clock-domain synchronization somehow, if they had their own independent clocks.

BTW e.g. mplayer syncs video to the audio sink, because it's less obtrusive to drop/duplicate video frames. First I was surprised but then it started making sense...
 
As for network protocols - some include feedback from the receiver to the transmitter - just like USB async does. E.g. DLNA, the squeezebox protocol (very well designed at that time), IIRC pulseaudio as well as jackd network streams. Typically these are point-to-point protocols with only one receiver. But nothing prevents selecting one receiver as the master and slaving the remaining receivers to that one. The slaves would inevitably have to solve their clock-domain synchronization somehow, if they had their own independent clocks.
if i understand this right some network protocols are able to request more samples if the output dac needs them to avoid under/over runs, yes?
thanks for mentioning jack, i didnt know netjack, i thought scream is the only good low latency one, i might give it a shot
i wish dlna/upnp would have less latency... its only good for streaming video and audio at the same time unfortunately

BTW e.g. mplayer syncs video to the audio sink, because it's less obtrusive to drop/duplicate video frames. First I was surprised but then it started making sense...
yea i would definitely prefer this over dropped audio samples
 
So, i tried camilladsp, unfortunately i still get pops, even with "Rate_Adjust = 10" set. also i noticed that chunksize 8192 doesnt work: up to 4096 it works, but with 8192 camilladsp cant open the alsa device (usb dac) anymore

also those dont sound like your regular pops, more as if you have a bad wifi connection for 1 sec where sound goes in and out rapidly, every few minutes
any suggestions here?? (could resampling inside camilladsp help? then i can test the resampling quality of camilladsp too (see below))

what seems to work quite nicely is zita-ajbridge + jack, but its resampling and i noticed a change in sound signature. im also upsampling on my desktop machine with pipewire at highest quality (really nice resampling by pipewire, best i heard so far, a discovery i made because its tricky to get dynamic samplerate through the raspberry pi... another quirk :/)