
symphonic-mpd

Hi papalius,

thanks for the in-depth explanation.
It's really great to have you here; it's so nice to talk to the actual
developer, you can learn so many new things...

Well, I didn't know about DVFS, so I will go back to the older firmware.

But I'm still not convinced about shairport; an on/off switch somewhere
would be nice. Why leave it up and running when you don't need it?
But it's no big deal.
I will not try to disable or uninstall it myself and risk disrupting your
well-balanced system.
 
Hi, M_Balou

Let me explain a bit more about the mechanism that automatically stops unnecessary processes.

The automatic stopping and restoring of processes can be found in
Code:
/usr/local/bin/pipe_event.sh
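Roughly speaking, it is a start/stop hook along these lines (a simplified sketch only; the actual script and the list of services it manages are not reproduced here, and the service name below is hypothetical):

Code:
#!/bin/sh
# Sketch of a start/stop hook -- not the real pipe_event.sh.
case "$1" in
  start)
    # Park non-essential services while the RTDM driver is playing.
    systemctl stop some-nonessential.service
    ;;
  stop)
    # Restore them once playback has stopped.
    systemctl start some-nonessential.service
    ;;
esac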

It is called from aplay-rt, one of the core components of symphonic-mpd's playback engine.
aplay-rt is music playback software implemented with the Xenomai API; as the name implies, its function is as simple as aplay's.

When aplay-rt starts, it initializes the PCM device and then monitors the ring buffer that receives PCM data.
When PCM data arrives, it runs "pipe_event.sh start" to stop the unnecessary processes and starts playback with the RTDM driver.

The RTDM driver keeps playing as long as PCM data is supplied; when the supply is cut off, control returns to aplay-rt.

aplay-rt then waits for more PCM data to arrive; when the flow stops for about 0.5 seconds, it judges that playback has stopped, runs "pipe_event.sh stop" to restart each process, and resumes monitoring the ring buffer.
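In pseudo-C, the control flow described above looks roughly like this (a simplified sketch; the real aplay-rt uses the Xenomai API and the RTDM drivers, while plain POSIX calls stand in here just to illustrate the start/stop logic; /dev/xsink is the ring buffer introduced below):

Code:
#include <fcntl.h>
#include <poll.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/xsink", O_RDONLY);
    if (fd < 0)
        return 1;

    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        /* Idle: block until PCM data arrives in the ring buffer. */
        poll(&pfd, 1, -1);

        /* Data arrived: park unnecessary processes and start
           playback through the real-time driver path. */
        system("/usr/local/bin/pipe_event.sh start");

        /* Keep draining while data keeps coming; a ~0.5 s gap
           is treated as "playback stopped". */
        while (poll(&pfd, 1, 500) > 0) {
            char buf[4096];
            if (read(fd, buf, sizeof buf) <= 0)
                break;
            /* ... hand buf to the RTDM driver ... */
        }

        /* Supply cut off: restore the parked processes and go
           back to monitoring the ring buffer. */
        system("/usr/local/bin/pipe_event.sh stop");
    }
}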

The ring buffer that receives PCM data is /dev/xsink.
mpd uses the pipe output plugin, shairport-sync uses the pipe backend, and spotifyd writes PCM data to /dev/xsink using the ALSA library's file plugin.
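On the mpd side, the pipe output looks something like this in mpd.conf (values are illustrative, not symphonic-mpd's actual configuration):

Code:
audio_output {
    type    "pipe"
    name    "xsink"
    command "cat > /dev/xsink"
}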

Using the same method as spotifyd, any music playback software that supports ALSA can feed the symphonic-mpd playback engine (aplay-rt, rtalsa.ko, xsink.ko) and benefit from its high-quality playback.
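A minimal ALSA configuration in that spirit (again illustrative only) routes the default PCM of any ALSA player into /dev/xsink via the file plugin:

Code:
# /etc/asound.conf
pcm.!default {
    type file
    slave.pcm "null"     # discard the slave stream
    file "/dev/xsink"    # raw PCM goes to the ring buffer
    format "raw"
}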


I should add one more point about shairport-sync.

There were four choices for mDNS: avahi-daemon, systemd-resolved, shairport-sync built-in mDNS, and spotifyd built-in mDNS.

Previous versions for RPi3 initially used avahi-daemon and switched to systemd-resolved from v0.8.x.

In v1.0.x for RPi4, systemd-resolved was used during beta testing, but we decided to use shairport-sync's built-in mDNS for stability and other reasons.
Therefore, by default, shairport-sync has auto-start enabled.

If you operate with a fixed IP address and don't need mDNS, you can run
"systemctl disable shairport-sync" to disable auto-start.
There is no web UI for this, so you need to execute the command via ssh.
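For example (the hostname is illustrative; use whatever address your player has):

Code:
ssh root@symphonic-mpd.local
systemctl disable --now shairport-sync   # stop it and disable auto-start
systemctl enable --now shairport-sync    # later: re-enable and start it again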
 
IIUC the output thread of MPD writes the received samples to a kernel-based circular buffer (your module xsink.ko). Another process in user space (aplay-rt) monitors this circular buffer and, if it contains data, passes the data to the kernel module rtalsa.ko (written in/for the Xenomai framework), which somehow writes the data to the FIFO DATA register of the PCM interface. When new data starts coming, aplay-rt calls a shell script to set up the whole chain.

Versus regular alsa:

The MPD output thread copies the received samples directly to the kernel buffer (device hw:X) via the snd_pcm_writei system call, filling a whole period portion of the buffer (e.g. several hundred ms - I do not know the max period size for the I2S driver yet) at once in one block write. The memory area is directly accessed by the DMA controller, which, independently of the CPU, keeps writing the samples in two 32-bit bursts (one stereo frame) to the FIFO DATA register of the PCM interface, as configured and started by the I2S alsa driver. For the rest of the period the MPD output thread sleeps (blocked in the snd_pcm_writei system call) and lets the DMA controller do all the work, with nothing else running on the CPU regarding the audio output.
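That is essentially the textbook alsa write loop; a minimal sketch (device name, rate and period size are arbitrary):

Code:
/* gcc -o play play.c -lasound */
#include <alsa/asoundlib.h>

int main(void)
{
    snd_pcm_t *pcm;
    short buf[2 * 4410] = { 0 };   /* one 100 ms period: 16-bit stereo @ 44.1 kHz */

    if (snd_pcm_open(&pcm, "hw:0,0", SND_PCM_STREAM_PLAYBACK, 0) < 0)
        return 1;
    /* 44.1 kHz, S16_LE, stereo, no resampling, ~200 ms total buffer */
    if (snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED,
                           2, 44100, 0, 200000) < 0)
        return 1;

    for (;;) {
        /* ... fill buf with samples ... */
        /* One call copies a whole period into the kernel buffer and
           blocks until there is room again; meanwhile the DMA
           controller feeds the PCM FIFO on its own. */
        snd_pcm_sframes_t n = snd_pcm_writei(pcm, buf, 4410);
        if (n < 0)
            snd_pcm_recover(pcm, n, 0);
    }
}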
 
Hi, phofman

Your understanding is correct, except that aplay-rt does not copy data between kernel space and user space.

The resources used by the DMA are the DMA controller, the memory controller, SDRAM, the L2 cache, and the AXI bus.
All of these resources, except for the DMA controller, are also used by CPU tasks.

Even while the MPD output thread has finished writing data and gone to sleep, the DMA is still exposed to shared-resource contention with the CPU; in particular, large memory accesses contaminate the L2 cache, hampering burst transfers and increasing transfer time.

Even though the same I2S signal comes out, the two approaches look very different inside the SoC in terms of which resources are used, how much, and with what cycle and timing.
If you have evidence that rules out these differences in activity (how many CPU cycles are consumed, how many DMA requests are sent, how many ldm/stm instructions are issued) as a source of noise or jitter degrading the quality of the I2S signal (especially BCLK), please provide it.
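Some of those numbers can at least be counted on a running system. Illustrative commands only; the perf events actually available vary with the kernel and SoC:

Code:
# CPU cycles, instructions and cache traffic on core 3 during 10 s of playback
perf stat -C 3 -e cycles,instructions,cache-references,cache-misses sleep 10
# context switches and hardware IRQ entries system-wide over the same window
perf stat -a -e context-switches -e irq:irq_handler_entry sleep 10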
 
So you have basically replaced the hardware multichannel DMA controller, a device dedicated to and optimized for its task, with a software loop running on a dedicated CPU core. And that is something which magically improves the sound, even though you could not measure the improvement by any means.


The resources used by the DMA are the DMA controller, the memory controller, SDRAM, the L2 cache, and the AXI bus.

Apart from the DMA controller, what does your solution use less of than the standard DMA-read kernel buffer used by alsa? How do you write sample by sample to the PCM interface FIFO_A register without using the memory controller, L2 cache, and AXI bus, when the PCM interface is connected via AXI and the ARM CPU's virtual addresses must be translated by the MMU?

All of these resources, except for the DMA controller, are also used by CPU tasks.

Just like in your case.

Even while the MPD output thread has finished writing data and gone to sleep, the DMA is still exposed to shared-resource contention with the CPU; in particular, large memory accesses contaminate the L2 cache, hampering burst transfers and increasing transfer time.

Have you measured the transfer time from RAM to the PCM interface via AXI, with your direct writing by software vs. DMA transfers initiated by the DREQ signals issued by the PCM interface? I have no idea how you would do it without access to the SoC internals. If I were to choose, I would always pick a hardware-accelerated solution over a software-based one, if I had the hardware available. And apparently all hardware/kernel engineers think alike.

Back to your previous claims:

You were also concerned about the increase in IRQs.
Hardware interrupts and software interrupts each require different countermeasures, but beyond those countermeasures you'll notice the inefficient side of the ALSA libraries and ALSA drivers.
For every single period of playback, you're generating tons of memory accesses, tons of system calls, register reads, and a disgusting number of context switches.

After taking those away, what remains is symphonic-mpd.

Please identify the 'tons of system calls, register reads, and a disgusting amount of context switches' in the standard method: the mpd output thread writes a period of data into the RAM buffer via one system call and blocks until the next period time, while the DMA continuously passes the samples to the PCM register. Where are they?
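Anyone can count them on a live system, e.g. by attaching strace to mpd during playback and stopping it with Ctrl-C to get a per-syscall tally (illustrative commands):

Code:
strace -c -f -p "$(pidof mpd)"          # system-call counts for all mpd threads
grep ctxt /proc/"$(pidof mpd)"/status   # voluntary/involuntary context switches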

If you have evidence that rules out these differences in activity (how many CPU cycles are consumed, how many DMA requests are sent, how many ldm/stm instructions are issued) as a source of noise or jitter degrading the quality of the I2S signal (especially BCLK), please provide it.

?? You claim your method makes a difference in the sound (which I will not believe until shown a proof - that has never happened yet), and I am the one who is to provide evidence? Again - the PCM interface has a 64-sample FIFO; why do you keep talking about transfer time and latency having an effect? Because these are words which 'audiophiles' who have no clue about real software/hardware operation like to read?

You claim the standard way is very inefficient and that your method solves the claimed inefficiency, so you are the one who has to provide the arguments. So far I have seen none, sorry. A heap of computer terms does not count; only real measured data and logically and technically correct arguments do. Developing proper software support for existing hardware is an exact discipline, not psychological voodoo.

BTW, the rpi kernel folder linux/sound/soc/bcm (branch rpi-4.19.y of the raspberrypi/linux repository on GitHub) contains many drivers for different DACs, configuring/switching the sample rate, adjusting volume, etc. on the boards via I2C, GPIOs, etc., along with the core I2S transfer. How are these devices supported when the core I2S driver using hardware DMA is replaced by your software loop?
 

TNT
Yes, it seems so: a software dream with a closed group that kisses the bottom of a turkey at annual meetings. This project, like others, lacks the basic technical grounding to actually achieve anything valuable. I suppose there is a NW switch software product to match?

An important thing is to keep any (radiating) computer HW away from the DAC circuitry - which will not be possible if the I2S lines are to be kept sufficiently short. It's a flawed architecture to begin with, which will never be salvaged by a dreamt-up software solution. It needed to be said - sorry.

//
 
An important thing is to keep any (radiating) computer HW away from the DAC circuitry - which will not be possible if the I2S lines are to be kept sufficiently short. It's a flawed architecture to begin with
//

I agree that I2S is a very bad interface. On the other hand, most (oversampling) DAC chips have the conversion stage clocked by their master clock, which decent implementations provide directly from a nearby clock crystal/chip, without PLLing the master clock from the I2S BCLK. The I2S signals then only need to be synchronous with MCLK, but they can be galvanically isolated, the DAC section can be properly shielded, etc.

A balanced serial signal would be way better for reaching a noise-reducing distance, no doubt (the classical examples: PATA -> SATA, SCSI -> SAS, PCI -> PCIe, RS232 -> RS485, USB, ethernet, LVDS, ...). I2S is a prehistoric thing, used only for low-speed audio chips (of which all the respectable ones fortunately have a separate MCLK input).
 
Then there are ADCs, which have an even greater problem with noise due to their required/expected large input impedance. A robust, reliable, non-radiating (at least balanced, better optical) and inexpensive (i.e. standardized, widely used, with inexpensive conversion chips) communication channel would certainly be handy.

But that is more complexity, higher BOM price, ... everything is about finding optimal compromises :)
 
Hi, TNT

If the sound doesn't change with the transport, then you don't need to try this software.

If anyone is confused and troubled by the sound changing with the transport settings, they will benefit from using this software.

Typically, what is your % CPU load when playing?


playing mpd, 44.1 kHz/16-bit (CPU load in % per core):

Code:
  cpu0   cpu1   cpu2   cpu3
====== ====== ====== ======
  0.00   0.00   0.00   2.16
  0.24   0.24   0.24   2.18
  2.19   0.26   3.16   3.16
  0.25   0.25   0.25   3.15
  0.00   0.00   0.28   2.22
  1.23   0.26   1.23   2.20
  0.26   0.26   0.26   3.16
  0.00   0.00   0.38   3.28
  1.23   0.26   0.26   2.20
  0.26   0.26   0.26   3.17
  0.28   0.28   0.28   3.19
  0.00   0.00   0.00   3.16
  0.24   0.24   0.24   2.18
  1.22   0.26   2.19   3.16
  0.25   0.25   0.25   3.16
  0.00   0.00   0.00   3.16
  1.23   0.26   5.11   3.17
  0.25   0.25   0.25   3.15
  0.00   0.00   0.00   3.16
  1.24   0.27   1.24   3.18
  0.26   0.26   0.26   3.17
  0.26   0.26   0.26   3.17
  0.00   0.00   2.17   2.17
  0.27   0.27   0.27   2.21
  0.27   0.27   0.27   3.18
  2.18   0.24   1.21   3.15
  0.00   0.00   0.00   3.31
  0.33   0.33   0.33   3.24
  0.26   0.26   3.17   3.17
  0.00   0.00   0.00   3.16
 
For those who are forced to run the DAC as a slave, there is a small advantage.

In DAC-slave mode, BCLK is generated by the PLL in the SoC, and because multi-stage noise shaping (MASH) is enabled on the fractional clock divider, the output is a mix of two frequencies.

In symphonic-mpd, measures are taken to prevent such BCLK oscillation from occurring. Of course, this is not simply a matter of turning MASH off, but of preventing any change in pitch.
This is a macroscopic change that can be observed with an oscilloscope, and there is no doubt about it.
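To see why MASH mixes two frequencies: a fractional divider cannot realize a non-integer ratio exactly, so MASH dithers between neighbouring integer divisors (a wider range at higher MASH orders). Assuming, purely for illustration, a 500 MHz PLL source and 44.1 kHz/16-bit/stereo, so BCLK = 44100 x 32 = 1.4112 MHz:

Code:
required divisor:   500 MHz / 1.4112 MHz = 354.31  (not an integer)
MASH alternates:    500 MHz / 354 = 1.4124 MHz
                    500 MHz / 355 = 1.4085 MHz
The long-term average is 1.4112 MHz, but each individual BCLK
period comes from one of the two neighbouring frequencies.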
 
The FSF has responded.
$300? OMG.

Hello, and thank you for contacting us!

> I'm developing a Linux distribution specialized in music playback,
> and I'm modifying source code licensed under GPL, such as Linux and
> mpd.
> We also run a membership-only club called the "symphonic-mpd
> Research and Development Club", which provides these products free
> of charge to our members and provides them with various kinds of
> feedback.
>
> Club members are free to download disk images from the membership
> site and install them on their own hardware.
>
> I would like to confirm with you that the club is under no
> obligation to disclose modified source code under the GPL to club
> members.
>
> The GPL states that use within an organization does not constitute
> distribution.
> However, for an individual who belongs to a club, it is possible to
> think of it as receiving a binary distribution from the club.
>
> What are the specific conditions for being recognized as an
> "organization" under the GPL?
>
> Also, what are the clear conditions under which "providing binaries
> to individuals in an organization" can be certified as "internal use
> within the organization" under the GPL?

Thank you for your inquiry about free software licensing.
The FSF has offered free software licensing support for many years gratis to developers of free software, and has a recognized engineering and legal expertise in this area.
As demand and expectation of our service has grown, we require additional funds to support our work.
Because of this, we now offer our services by paid consultation to nonfree software developers.

To answer your questions as outlined, we will require 60 minutes of consultation time, at a cost of $300. If you wish to proceed e-mail us to let us know so that we can arrange for you to make a payment.

If you do not wish to pay for this service, I suggest you carefully review the resources at <http://www.fsf.org/licensing/education> and refer to your legal counsel.
 
This project, like others, lacks the basic technical grounding to actually achieve anything valuable. I suppose there is a NW switch software product to match?
//

What is SOC?
//


Are you joking....??? :crazy:
"System on a chip", or the "CPU" of an SBC....

Maybe you know a lot about computers in general,
but maybe you don't know much about what we are doing here...
 