Path to noiseless Linux streamer...

Actually every USB communication is bursty, as audio frames are packed into USB packets. It's true that UAC1/2 requires the flow to be as "smooth" as possible, but from the audio stream's POV it's bursty. A buffer is required on the recipient side to convert the bursty stream into continuous, DAC-clocked I2S.

UAC3 is focused on low power consumption - short, large bursts of data interleaved with USB hardware sleep (while the DAC keeps consuming the cached samples from the previous burst).
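A minimal sketch of that receive-side buffer (illustrative Python, not any real driver; the names and the 64-sample capacity are made up): USB delivers a packet's worth of samples at once, while the DAC side drains exactly one sample per clock tick.

```python
from collections import deque

class SampleFIFO:
    """Toy FIFO decoupling bursty USB packet arrival from steady I2S consumption."""
    def __init__(self, capacity):
        self.buf = deque()
        self.capacity = capacity

    def push_burst(self, samples):
        # Called when a USB packet arrives: a whole packet's worth at once.
        samples = list(samples)
        if len(self.buf) + len(samples) > self.capacity:
            raise OverflowError("overrun: host sent faster than DAC consumes")
        self.buf.extend(samples)

    def pop_one(self):
        # Called once per DAC clock tick: exactly one sample, at a steady rate.
        if not self.buf:
            return 0  # underrun: play silence
        return self.buf.popleft()

fifo = SampleFIFO(capacity=64)
fifo.push_burst(range(6))                    # bursty producer side
steady = [fifo.pop_one() for _ in range(8)]  # steady consumer side
print(steady)                                # last two ticks underrun to silence
```

The underrun/overrun branches are exactly the failure modes the buffer exists to paper over: the host refills in bursts, and the DAC clock never waits.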
 
Member
Joined 2010
Paid Member
In USB audio the host sends about the same amount of data in every (micro)frame. This is regardless of whether the isochronous audio endpoint is synchronous, asynchronous or adaptive. In USB audio only small variations of the data rate are possible.

The data transfer rate is not necessarily the same as the audio processing rate.

In a typical smartphone's SoC, the multimedia block will DMA big chunks of music data into its own cache, while the DAC proper uses a different clock to load the data into the audio converter. Same for video. If you use a USB OTG connection to an external DAC using USB 2 or above, then the DMA drives the chunks of data to the USB controller, not the multimedia block. The DAC's controller clock is internal to it, not shared by the data link, nor the network layers... not even the application (the media player).
 
Actually every USB communication is bursty, as audio frames are packed into USB packets. It's true that UAC1/2 requires the flow to be as "smooth" as possible, but from the audio stream's POV it's bursty. A buffer is required on the recipient side to convert the bursty stream into continuous, DAC-clocked I2S.

UAC3 is focused on low power consumption - short, large bursts of data interleaved with USB hardware sleep (while the DAC keeps consuming the cached samples from the previous burst).

Not only that, but long bursts are more efficient in terms of transmission. The overhead of transmission control, setup and final teardown is fixed, so the longer the burst of data, the lower the cost of the transmission control in percentage terms.
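A quick back-of-the-envelope illustration of that fixed-overhead argument (the 20-byte per-burst overhead is an arbitrary illustrative figure, not any specific protocol's):

```python
def efficiency(payload_bytes, overhead_bytes=20):
    """Fraction of a burst that is payload; the overhead is fixed per burst,
    so its percentage cost shrinks as the burst grows."""
    return payload_bytes / (payload_bytes + overhead_bytes)

for size in (64, 512, 4096):
    print(size, round(efficiency(size), 3))
# 64 bytes carries ~76% payload, 4096 bytes carries ~99.5%
```

Same fixed cost, bigger burst, smaller percentage overhead - the efficiency asymptotically approaches 100% as the burst length grows.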
 
(1) I drive Tidal HiFi masters at 24/96 rates off the Internet with an i7 laptop into a NuForce DDA-100. No drops whatsoever, even though I'm also doing video, email and browsing on the same machine.

Sure, no reason for that, I am not disputing the internet bandwidth is sufficient. The question is how much gets cached on your PC.
(2) Slim, Squeezebox, Roon, etc... are all closed development products based on technology four or more years old. Many of those devices were stuck on isochronous Red Book USB 1 for over a decade after USB 2 had come out. PC-based players and DACs bypassed those products eons ago. I just don't bother with them at all.

Please let's unify the terminology - a DAC is either I2S+DAC or USB+I2S+DAC. A DAC cannot stream directly from the internet (if it can, it's a PC + DAC in one case, in our terminology :) )

Still, protocols like that are being used and are perfectly relevant today - all the network renderers need them. Otherwise they need to implement some sort of resampling - e.g. see the effort of @CharlieLaub on synchronized networked playback.

(3) High-speed Internet is common nowadays. Streaming data from the source is throttled by the receiver over TCP/IP. No resampling is done by the receiver. Data is then sent to the DAC, where it gets sampled as needed. Mostly it is native nowadays.
Again - it depends on the protocol. E.g. internet radios are network streams too, and you need to have resampling to the audio clock somehow.
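One common way a renderer closes that loop is adaptive resampling driven by buffer fill: nudge the software resampling ratio so the buffer level drifts toward a target. A toy sketch of the control idea (the function name and gain constant are made up; real implementations use a proper control loop):

```python
def resample_ratio(fill, target, gain=0.01):
    """Return a consumption-rate correction around 1.0: above-target fill
    means consume slightly faster, below-target means slightly slower."""
    return 1.0 + gain * (fill - target) / target

print(resample_ratio(480, 480))  # on target: consume 1:1
print(resample_ratio(960, 480))  # buffer filling up: speed up a touch
print(resample_ratio(240, 480))  # buffer draining: slow down a touch
```

The point is only that the sender's clock and the DAC clock are never exactly equal, so some mechanism - a feedback endpoint, adaptive resampling, or dropped/duplicated samples - has to absorb the drift.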

Latency and cache size are not necessarily coherent... it is possible to fill the cache at a higher rate than it is being drained for playback.

Yes, if the sender can send the stream faster than the audio clock. This is possible for cloud music services with pre-recorded content, but not for interactive services (internet radio, online streaming, etc.).

We are not in dispute, just each of us talking about a different type of streaming :)
 
IMO that is not what is normally meant by bursty communication (as in a varying data rate). Many have the misconception that host load impacts the UAC data rate.
IMO it's only about terminology - the data on the USB bus is bursty because the USB data clock is much faster than the audio data clock. But it's just words; we know what we are talking about :)
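Some rough numbers to illustrate that clock-rate gap (the USB figure is the high-speed signalling rate, not usable throughput, so this overstates the headroom somewhat):

```python
# Order-of-magnitude comparison: 24-bit stereo 96 kHz audio vs USB 2.0 high speed.
audio_bps = 96_000 * 2 * 24   # audio payload rate: 4,608,000 bits/s
usb_bps = 480_000_000         # USB 2.0 high-speed signalling rate

print(audio_bps)
print(round(audio_bps / usb_bps * 100, 1))  # audio uses roughly 1% of the wire rate
```

With the wire roughly a hundred times faster than the audio payload rate, each (micro)frame's worth of samples occupies the bus only briefly - hence "bursty" at the packet level even though the average rate is constant.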
 
Sure, no reason for that, I am not disputing the internet bandwidth is sufficient. The question is how much gets cached on your PC.


Please let's unify the terminology - a DAC is either I2S+DAC or USB+I2S+DAC. A DAC cannot stream directly from the internet (if it can, it's a PC + DAC in one case, in our terminology :) )

Still, protocols like that are being used and are perfectly relevant today - all the network renderers need them. Otherwise they need to implement some sort of resampling - e.g. see the effort of @CharlieLaub on synchronized networked playback.


Again - it depends on the protocol. E.g. internet radios are network streams too, and you need to have resampling to the audio clock somehow.



Yes, if the sender can send the stream faster than the audio clock. This is possible for cloud music services with pre-recorded content, but not for interactive services (internet radio, online streaming, etc.).

We are not in dispute, just each of us talking about a different type of streaming :)

I write device drivers, internetworking, OS, schedulers, etc... Not for Windows, btw. I don't care for Windows development.

To me, the DAC is a black box. I get data from a TCP stream and I need to send data to a DAC. This DAC is on either a PCI or a USB interface... I2S is hidden within it (*); how the audio data is being processed is a black box to me, I don't need to know. All I need to know is that I must provide enough data to keep the DAC running at whatever "data rate" it must maintain. I also must listen to the device's back pressure, and I must provide back pressure to the network connection upstream of me.

Using back-pressure-based throttle control adapts itself nicely to cached and non-cached systems. But again, I don't require it; it's an option. I can always run polled; however, I'm most efficient running under interrupts and minimizing the number of transmissions by maximizing the transaction size.
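That back-pressure chain can be sketched with a bounded queue: when the downstream buffer is full, the producer blocks, which with a real socket would stop reads and let TCP flow control throttle the sender in turn. A toy Python sketch (all names invented; the queue stands in for the DAC driver's buffer):

```python
import queue
import threading

device_buf = queue.Queue(maxsize=4)  # stand-in for the DAC driver's buffer

def network_reader(chunks):
    for chunk in chunks:
        # put() blocks while the buffer is full: that stall IS the back
        # pressure, and with a real socket it would stop reads, shrinking
        # the TCP receive window seen by the sender.
        device_buf.put(chunk)

def dac_writer(n, out):
    for _ in range(n):
        out.append(device_buf.get())  # drains at the "DAC" pace

played = []
t = threading.Thread(target=network_reader, args=([b"x"] * 10,))
t.start()
dac_writer(10, played)
t.join()
print(len(played))  # all 10 chunks arrive, paced by the consumer
```

No explicit rate control anywhere: the bounded buffer plus blocking semantics propagate the consumer's pace all the way upstream.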

As a hobby, I don't do stuff like "radio streaming"... I only do stuff like Tidal, Netflix and some HTML... then locally I use NASes with Plex, Foobar and VLC. So my life is very easy. I also use my old version of Cubase to process audio.

I also don't mess with closed systems. If I can't do it with Android, Windows or a Raspberry Pi, I just don't bother. It's my hobby.

At work it's different; there I tend to use VxWorks, Integrity, Linux, Android, ThreadX, etc...

(*) Although I did write part of an I2S driver once.
 
The input cache in mpd works now... meaning: it loads the whole song into RAM within about 2 seconds, and then Ethernet is not used anymore... If you have large RAM, like the 16 GB I have on the Pink Faun type of PC, you will have something like 10-15 songs in RAM, even in HD. You no longer use Ethernet for receiving any data to be played... all your songs are already in RAM locally.

On the PC thing... I need to play a bit further with this... SoCs which have a direct USB out or I2S out on GPIOs have the advantage of not going through other chips/buses/clocks... so I am not a big fan of first making stuff messy and then later spending xxx$ on a USB card with OCXO clocks. "Just don't mess up my signal from the beginning" I like better.

On USB vs. I2S... no, that is not automatically a no-brainer. I2S is not automatically better sounding, as you need isolators as well, and you have to look at whether you really have a good clock on the source side and a straight I2S line from the SoC vs. some more chips in between. So far it's a toss-up.
I'm curious where you got the information on how the MPD input cache functions.

Please post a link to the document.
 

https://mpd.readthedocs.io/en/stable/user.html#advanced-configuration

Configuring the Input Cache

The input cache prefetches queued song files before they are going to be played. This has several advantages:

  • risk of buffer underruns during playback is reduced because this decouples playback from disk (or network) I/O
  • bulk transfers may be faster and more energy efficient than loading small chunks on-the-fly
  • by prefetching several songs at a time, the hard disk can spin down for longer periods of time
This comes at a cost:

  • memory usage
  • bulk transfers may reduce the performance of other applications which also want to access the disk (if the kernel’s I/O scheduler isn’t doing its job properly)
To enable the input cache, add an input_cache block to the configuration file:

input_cache {
    size "1 GB"
}

This allocates a cache of 1 GB. If the cache grows larger than that, older files will be evicted.

You can flush the cache at any time by sending SIGHUP to the MPD process, see Signals.
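The "older files will be evicted" behaviour quoted above amounts to a size-bounded, oldest-first cache. A toy sketch of that policy (this is not MPD's actual code, just the idea it describes):

```python
from collections import OrderedDict

class InputCache:
    """Size-bounded cache that evicts the oldest entries first.
    A sketch of the behaviour described in the MPD docs, not MPD's code."""
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.files = OrderedDict()  # path -> size; insertion order = age
        self.used = 0

    def add(self, path, size):
        self.files[path] = size
        self.used += size
        # Evict oldest entries until we are back under the limit.
        while self.used > self.max_bytes:
            _old_path, old_size = self.files.popitem(last=False)
            self.used -= old_size

cache = InputCache(max_bytes=100)
cache.add("a.flac", 60)
cache.add("b.flac", 60)   # pushes usage to 120 > 100, so "a.flac" is evicted
print(list(cache.files))  # only the newer file remains
```

This also explains the observation below that caching "may or may not happen": whether a given song survives in the cache depends on the configured size versus the file sizes queued.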

I checked its behavior on memory with htop and on network I/O with cat /proc/interrupts... it does exactly what they promised... so we no longer have the OS noise of network I/O at all, and we got one core back... which is nice on a quad-core setup. I now give mpd two isolated cores, USB its own, and the rest of the system one.
 
I am not using streaming services with MPD (yet). There are some plugins to use Spotify, I believe, and maybe others, but someone using those needs to comment. You can see this by simply doing cat /proc/interrupts and seeing what your network I/O (eth0) does when streaming... but I guess it's not downloading a whole song... it is streaming, no?
 
What if the network stream does not provide the back pressure feedback? E.g. an RTP/RTSP stream over UDP?

Why would a stream use UDP?

By definition, a stream must be reliable, hence it should use TCP.

If, however, you wanted to do it the hard way, you might use UDP but you still need to ensure reliability of the transfer, hence you have to provide some means of ACK'ing the received data and thus you implement back pressure.

Honestly, I have never used a data "stream" mechanism that didn't use back pressure in an asynchronous interface OR commands in a synchronous handshake. The reason is that the sender (master) needs to know the status of the receiver (slave). Even more important if the slave is a proxy of sorts that is caching the data.
 
Why would a stream use UDP?

By definition, a stream must be reliable, hence it should use TCP.

If, however, you wanted to do it the hard way, you might use UDP but you still need to ensure reliability of the transfer, hence you have to provide some means of ACK'ing the received data and thus you implement back pressure.

Honestly, I have never used a data "stream" mechanism that didn't use back pressure in an asynchronous interface OR commands in a synchronous handshake. The reason is that the sender (master) needs to know the status of the receiver (slave). Even more important if the slave is a proxy of sorts that is caching the data.

Actually, TCP is not a good idea for audio streaming. This is because TCP will keep retransmitting a packet until it gets through, and this blocks the flow of the remaining data. For a real-time application this can cause a problematic delay. UDP doesn't care whether a packet arrives, or whether it is in order, but this proves to work better for audio. The UDP RX end just has some code to deal with missing packets and performs re-ordering of out-of-order packets.
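That RX-side handling can be sketched as a simple reorder buffer keyed by sequence number, with silence substituted for lost packets (a toy model, not any real RTP jitter buffer implementation):

```python
def reorder(packets, expected_count):
    """Toy RX buffer: packets are (seq, payload) pairs. Out-of-order
    packets are re-slotted by sequence number; missing ones become
    silence so playback never stalls waiting for a retransmit."""
    slots = [None] * expected_count
    for seq, payload in packets:
        if 0 <= seq < expected_count:
            slots[seq] = payload
    return [p if p is not None else b"\x00" for p in slots]

# Packets 1 and 2 arrived swapped; packet 3 was lost entirely.
rx = [(0, b"A"), (2, b"C"), (1, b"B")]
print(reorder(rx, 4))
```

The key design point is the one made above: the receiver never waits for a missing packet, it just plays on - trading a tiny audible glitch for bounded latency.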
 

https://mpd.readthedocs.io/en/stable/user.html#advanced-configuration

Configuring the Input Cache

The input cache prefetches queued song files before they are going to be played. This has several advantages:

  • risk of buffer underruns during playback is reduced because this decouples playback from disk (or network) I/O
  • bulk transfers may be faster and more energy efficient than loading small chunks on-the-fly
  • by prefetching several songs at a time, the hard disk can spin down for longer periods of time
This comes at a cost:

  • memory usage
  • bulk transfers may reduce the performance of other applications which also want to access the disk (if the kernel’s I/O scheduler isn’t doing its job properly)
To enable the input cache, add an input_cache block to the configuration file:

input_cache {
    size "1 GB"
}

This allocates a cache of 1 GB. If the cache grows larger than that, older files will be evicted.

You can flush the cache at any time by sending SIGHUP to the MPD process, see Signals.

I checked its behavior on memory with htop and on network I/O with cat /proc/interrupts... it does exactly what they promised... so we no longer have the OS noise of network I/O at all, and we got one core back... which is nice on a quad-core setup. I now give mpd two isolated cores, USB its own, and the rest of the system one.
Thanks. I was looking for more of a technical breakdown on the caching strategy or the algorithm used.

I've observed that caching may or may not happen depending on the machine's RAM size, the cache size specified, and the file sizes being played.
 
I agree. UDP is like isochronous streaming in USB - only CRCs for error checks but no resends, which would ruin the real-time capability. But it's beneficial only for latency-critical/real-time situations.
I don't think USB isochronous transfer has any error check/correction, because it would be too time-consuming. https://www.silabs.com/documents/public/application-notes/AN295.pdf

There was a similar document for XMOS "Fundamentals of USB audio" but the link is broken.
 
Thanks. I was looking for more of a technical breakdown on the caching strategy or the algorithm used.

I've observed that caching may or may not happen depending on the machine's RAM size, the cache size specified, and the file sizes being played.
Well, I found some old threads where the input cache seemed not to be fully working in 2019... all I can say is that with the latest version of mpd it works nicely on my machines. But I did not study the source code.
 
Nice to see I'm not the only person who noticed that OS noise is a big contributor to sound quality.
I made a thread a while ago where I showed my modded moOde settings:
https://audiophilestyle.com/forums/topic/65106-moode-latencysq-optimizations/#comment-1181905

I also think the Raspberry Pi is the way to go, just because it concentrates on the necessary stuff without much stuff "around it"; the same goes for Linux vs. Windows.

I also noticed the input cache of mpd makes a good difference, especially if you consider that many people believe in storage and Ethernet sound (I didn't test it myself so far; it gets expensive quickly). Those things shouldn't matter at all with the input cache; the only thing that should then matter SQ-wise is the RAM itself and its connection to the CPU. Well, probably the noise introduced by storage and Ethernet matters more than the SQ change; I think that's why I prefer WLAN over LAN for the Raspberry Pi, with an external USB WLAN stick (the onboard WLAN makes things worse).

I also have a USB DAC, and one thing I'm not sure about: maybe all this stuff just matters with the crappy USB connection (crappy because USB seems to be very variable: PC ports sound awful, different USB hubs sound different, USB cables change the sound... iSilencer and other filters matter... galvanic isolation helps too; it's actually mind-boggling and I wish it were as easy as objectivists say). I'm curious how an IsolatorPi/reclocker/SPDIF hat compares to an optimized system with USB; maybe the OS and Raspberry Pi don't matter anymore then.
This is also something I want to try out if I get around to building my new CM4 streamer.

Things that also matter:
UNDERCLOCKING, very big time.
If not needed, don't run mpd on 2 cores!! Just use 1, since the thread will jump between cores otherwise and this introduces more jitter.
And if not needed, disable unneeded cores; this reduces overall noise, I guess. Keep the system as minimalistic as possible.
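Pinning a process to a single core, as suggested here, is normally done on Linux with taskset (or isolcpus/systemd's CPUAffinity); from Python the same scheduler affinity call is exposed directly. A small sketch (Linux-only API; the function name below is made up):

```python
import os

def pin_to_core(core):
    """Pin the calling process to a single CPU core (Linux only).
    This keeps the scheduler from migrating the thread between cores,
    which is the cause of the core-hopping jitter mentioned above."""
    os.sched_setaffinity(0, {core})      # 0 = this process
    return os.sched_getaffinity(0)       # the resulting allowed-core set

# e.g. pin_to_core(1) would restrict this process to core 1, mirroring
# what `taskset -c 1 mpd` does from the shell.
```

For mpd itself you would do this from outside the process (taskset, or isolated cores handed to it via isolcpus), since you can't call into its scheduler setup directly.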

Edit:
One thing I also noticed while doing some testing (as you can see in the last post in the audiophilestyle thread) is that underclocking INCREASES jitter, because there is more time between clock cycles, BUT underclocking still sounds better, so I'm guessing that the noise introduced by higher clocks actually matters more than the jitter.
 