DAC word clock really needed? Active monitors vs. multiple USB DACs

Dear Members,

I'm designing my HTPC-based 5.1.2 system; the cabling is done and in the walls.
Now comes the weird part on digital side which I don't really understand:

- when I want to use multiple consumer USB DACs (e.g. Topping E30), some folks tell me their clocks need to be synchronized, or else I might face audible or inaudible errors during movies or multichannel music
(An 8ch DAC would be convenient but I'm interested in the theory now).

- on the other hand, we have plenty of 2-way active speakers where each speaker has its own built-in (usually) class-D amp + DSP combo with active filtering.
And of course their imagined block diagram begins with an analog XLR input (AD conversion) and then after the DSP, 2 DA conversions (woofer, tweeter).

The active speaker's DAC part might be on the same clock, a simple 2ch DAC for woofer/tweeter, but the boxes themselves aren't connected to each other at all, hence their DAC parts are NOT on the same clock.

Why is it so important to use one 8ch DAC (same word clock) or multiple stereo USB DACs with synced clocks (very expensive), while 2+ active speaker boxes each have separate DACs operating inside without knowing about the others, and we say it works and is good? Some clarification would be great.

So, in studio monitor usage, we seemingly don't care whether the built-in DACs of the 2 monitor speakers are in sync or run from the same master clock.
Meanwhile, when using 2 or more consumer DACs, we do care about master clock.

What's going on ?

A 2-way stereo studio monitor setup with analog XLR inputs has its built-in DACs, just like having 2 USB DACs connected to the PC, one driving the left-side woofer+tweeter pair and the other driving the other side.
Or is the DAC master clock only important for the woofer/tweeter sections within one box, for active amplification... but NOT important between the L/R boxes because they reproduce different signals anyway?
 
IIUC your active monitors have analog inputs, sampling the signal to digital for internal DSP, followed by DA conversion and classD amplification.

That means both monitors receive a continually synchronous (analog) signal. The rates at which their internal samplings run are not important, because they sample the already synchronous signal.

If you were to feed them a digital signal instead, it would have to be synchronized between the two monitors too. But the asynchronous USB audio protocol used by modern USB DACs is not clocked by the transmitter (being common for both monitors), but by the receiver. If the receivers do not have their clocks mutually synchronized, each receiver will fetch its data from the transmitter at a slightly different rate and the two incoming digital signals will drift apart.

The time drift will cause time shift of the two produced sounds, but also issues on the transmitter side as each receiver will end up requesting different parts of the common stream. The transmitter processes the stream in data chunks. Let's assume it fetches a new chunk when receiver/DAC A consumes the chunk. But the receiver/DAC B will eventually need data from an older or newer chunk M, which the transmitter does not have any more/yet.
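To put rough numbers on the drift described above, here is a minimal Python sketch. The 48 kHz rate, the 20 ppm mismatch between the two DAC crystals, and the 1024-frame chunk size are assumed values for illustration only, not measurements of any particular DAC or driver.

```python
def seconds_until_offset(sample_rate_hz, ppm_difference, offset_samples):
    """Time until two free-running DACs have consumed offset_samples
    different amounts of data from the same stream."""
    # Gap growth in samples per second between the two receivers.
    gap_per_second = sample_rate_hz * ppm_difference * 1e-6
    return offset_samples / gap_per_second


# Assumed example values (hypothetical, for illustration).
rate = 48_000   # stream sample rate in Hz
ppm = 20.0      # relative clock mismatch between DAC A and DAC B
chunk = 1024    # frames per chunk held by the transmitter

gap_per_second = rate * ppm * 1e-6
seconds_to_one_chunk = seconds_until_offset(rate, ppm, chunk)

print(f"gap grows by {gap_per_second:.2f} samples per second")
print(f"DAC B is a full chunk away from DAC A after "
      f"{seconds_to_one_chunk / 60:.1f} minutes")
```

With these assumed numbers the two DACs are a whole chunk apart after roughly 18 minutes, which is when a transmitter that only holds the chunk DAC A is consuming can no longer serve DAC B.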

The solutions are e.g.:

* keeping clocks of receiver A/B synchronous, making them request data at the same rate - multichannel DAC, word-clock synchronized DACs

* the transmitter adaptively resampling the stream for the "slaved" receiver B to a rate equal to the data consumption rate of receiver B. The resampling can be simple (dropping/duplicating samples) or complicated (proper interpolation). An example is the macOS aggregate audio device feature.

* using a protocol synchronous to the transmitter - USB adaptive (in older/cheaper DACs) or SPDIF or network streaming where the receiver re-generates the clock from the incoming USB or SPDIF or network data stream.
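The adaptive-resampling option in the list above can be sketched in a few lines of Python. This is only a toy linear interpolator under stated assumptions: `resample_linear` and the ratio value are hypothetical names and numbers, and real implementations (ASRC chips, OS-level aggregate devices) use much better filters and track the ratio continuously.

```python
def resample_linear(samples, ratio):
    """Resample by linear interpolation.

    ratio = output_rate / input_rate. A ratio slightly above 1.0 produces
    a few extra samples for a receiver whose clock runs slightly fast.
    """
    if len(samples) < 2:
        return [float(s) for s in samples]
    n_out = int((len(samples) - 1) * ratio) + 1
    out = []
    for i in range(n_out):
        pos = i / ratio          # position in the input stream
        k = int(pos)             # index of the input sample to the left
        frac = pos - k           # fractional distance to the next sample
        if k >= len(samples) - 1:
            out.append(float(samples[-1]))
        else:
            out.append(samples[k] * (1.0 - frac) + samples[k + 1] * frac)
    return out


# Assumed example: receiver B consumes data 150 ppm faster than the stream rate.
block = list(range(10_001))
stretched = resample_linear(block, 1.00015)
print(len(block), "->", len(stretched), "samples")
```

Dropping or duplicating a sample now and then is the cheap version of the same idea; interpolation spreads the correction over every sample instead of making one audible jump.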
 
I think I've run into this issue as I try to use 2 separate USB DACs combined together with an alsa "multi" plugin. Surely there is a solution for this problem in alsa?

For the 8 channels I want it is a lot cheaper to buy 4x 2ch SMSL SU-1 ($320 total) than it is to buy 1x 8ch Topping DM7 ($600).

I've wasted a ton of time trying to figure this out before arriving at the (perhaps incorrect) conclusion that alsa keeps crashing after a short time without explanation because the DAC clocks are out of sync. At this point I wish I had just spent the extra $300 rather than wasting all this time, but it seems like this ought to be a simple problem to fix with alsa.
 
Or maybe pulseaudio or pipewire can work around this clock sync issue if alsa can't? I've tried so many things with alsa, pulse, and pipewire, and combinations of them, in tandem with CamillaDSP, but I haven't yet found a way to make it work properly in every situation without crackling or intermittent silent crashes.

All I want is 2ch stereo input from applications split into 4, 6, or 8 channels via CamillaDSP, each channel processed, and then be sent out to multiple 2ch USB DACs. It sounds simple to do in theory but in practice I've spent a few weeks messing with it and I have not yet been successful. It is easy with 1 DAC, but not 2 :|

I have done a bunch of searching for a solution but everything I've tried has failed. Does anyone here have a working solution for this problem?
 

TNT

If one wants to use 2ch DACs for a multichannel system, one solution is to use one (1) RME Digiface USB (Toslink version). The RME will show up as an 8 channel sound card to a computer (a Pi doesn't work) and you now have 4 synchronised 2ch optical outs that you can hook up 4 2ch DACs to...

//
 
@phofman yeah your explanation helped me understand the problem better than anything else I've read. Thank you. That explains the various underrun errors I saw with CamillaDSP in some setups, and probably the pure-alsa setup crashing silently. Idk if it explains the camilla error about failing 100 times to write to buffer, but maybe.

I certainly have tried module-combine-sink with PA but I encountered many problems. You mentioning it again made me focus on it tonight to find & note the exact problems I was having, but I think I found solutions for them all one by one.

First problem was remixing. Despite remix=no PA was still doing some funny remixing. Disabling this globally stopped that nonsense. I'll probably just need to ensure applications do any surround to stereo downmixing because PA can't seem to do it right.
Next problem was the sample rate auto-detection results being inconsistent. Checking hw_params showed the DAC getting locked onto different sample rates after reboots causing some havoc. I checked it before and after any apps were even started, just fluxbox. Using rate= in the PA combine module did not set the rate of the DAC, but in daemon.conf by setting the default rate and alternative default rate I was able to lock it where I want it consistently. So far, anyway. It's strange, but at least I don't have any other devices to worry about, just these 2ch DACs.

Next problem was some stuttering and sometimes brief periods of so many stutters that 'crackling' is a better term. I'm running on an Orange Pi 5B which has plenty of muscle but the default PA settings didn't appear to be helping out here. I scrolled through the config file and found some things to change. I probably didn't do it right but I set high-priority=yes, nice-level=-11, realtime-scheduling=yes, and realtime-priority=5 and restarted PA. This appears to have stopped the stuttering and crackling.

Thanks for directing me back onto the right path. I tried so much stuff because I kept having problems, but it turns out that if I had just stuck with this method I could have saved a lot of time.

@TNT - that's a cool device but it looks like it, all by itself, costs more than a Topping DM7 8ch DAC.
 
For anyone that reads this in the future, further testing (with 4 speakers instead of just 1 that I was moving) revealed that I can't get camilladsp+pulseaudio to route 4 channels across two 2ch DACs. The DAC connections are treated like a hub: the same signal is sent to both, so I just get a doubling of the same 2ch. I tried (seemingly) every combination of configurations and none would give me 4 independent channels across two 2ch DACs with pulseaudio.

While doing this testing with 4 speakers hooked up I was also able to hear a very noticeable wandering of the center image. Originally I thought this was an intentional effect in the remastered Dark Side of the Moon because it sounded like a gentle rotation of the image just like the slowly turning prism in the video, but after listening to other material (like ads on youtube) I realized that it was a major problem introduced by the DACs. Even if I could adjust the channel mapping to get my 4 independent channels, the clock drift and resulting phase shift is unacceptable.
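For a feel of why the image wanders, here is a rough Python sketch that converts a time offset between two DACs into phase shift at a given frequency. The 20 ppm mismatch is an assumed figure, and the sketch assumes both DACs run free with no correction at all; a combining layer that resamples would keep the offset bounded rather than letting it grow like this.

```python
def time_offset_s(elapsed_s, ppm_difference):
    """Accumulated time offset between two free-running clocks."""
    return elapsed_s * ppm_difference * 1e-6


def phase_shift_degrees(freq_hz, offset_s):
    """Phase difference a time offset produces at a given frequency."""
    return 360.0 * freq_hz * offset_s


ppm = 20.0  # assumed clock mismatch, hypothetical value
for elapsed in (1, 10):
    offset = time_offset_s(elapsed, ppm)
    for freq in (100, 1_000, 10_000):
        shift = phase_shift_degrees(freq, offset)
        print(f"after {elapsed:>2} s: {freq:>6} Hz shifted by {shift:8.2f} degrees")
```

Under these assumptions the offset after one second is 20 microseconds, which is only 7.2 degrees at 1 kHz but already 72 degrees at 10 kHz, so the mids and highs go out of alignment long before the bass does.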

I'm going to see how hard it might be to open up the DACs and modify the hardware to sync their clocks, but it's pretty likely that I'll have to just buy an 8ch DAC or "audio interface". It's easy to use pure alsa to combine multiple DACs but having perfectly synced hardware clocks is a strict requirement.

After reading someone else's analysis of using software to mitigate the clock drift and also thinking about it myself, a software solution cannot be precise enough for me to avoid audible drifting in mid and upper frequencies.
 
What you are describing is a well-known problem with more than 2-channels. If you want to diy a >2channel dac, there is a multichannel USB board you could try at: https://www.diyinhk.com/shop/audio-...et-type_c_new_and_slim/159-esd_protection-2kv

Other than that, you would have a fair amount of work to do depending on what SQ you are hoping to achieve. Cleaning up conducted and radiated EMI/RFI noise from USB is in itself not a trivial problem IME. Then there is the problem of designing a high quality multi-channel dac. The easiest way would probably be to use the new ESS chips (e.g. ES9039PRO) which are reported to sound better than the last generation. Better yet are discrete resistor dacs designed without using a commercial dac chip. There is complexity and cost to deal with in that case, along with a learning curve.

It's just that there is no really high quality diy solution I know of now for people who want to multi-amp. There is the miniDSP stuff, but I would consider it consumer grade at best, not the best hi-fi people know how to do. The other option might be proaudio recording interfaces, but even they often rely on AK4493 dac chips (which are decent, but not the best AKM makes).

EDIT: the other common solution to staying in sync is to use ASRC chips at every speaker. Even though each speaker has its own clock, the ASRC converts the incoming audio in real time to that local clock. It may not be quite as precise as using one central dac and line level cables to each speaker, but it is the next closest thing. It's also what's used in big commercial sound reinforcement systems and in a lot of recording studios that use AES to pipe digital audio around to local dacs (USB won't work for that).
 
Hi,
my feedback vs word clock : that's a must.

I've an active setup : a pair of Dynaudio Core59 (+ sub 18s).
The Dynaudio Core series are the only actives on the market (as far as I know) to have a word clock input (I purchased them because they have this feature ; and 1st, because they sound great).

I tested various setups with these actives (from basic to "max") :
1. DAC => analogue XLR to each speaker
2. digital interface (Mutec MC3+) => AES to one speaker => AES thru from speaker 1 to speaker 2
3. digital interface (Mutec MC3+) => AES to one speaker => AES thru from speaker 1 to speaker 2
PLUS => word clock link from the Mutec MC3+ to each speaker

Results are clear clear clear :) =>
setup 1 => I had a nice passive hifi setup b4 => but this basic setup already killed my previous hifi gear...
Why are we so stupid to have a passive hifi gear in 2024 ?... so much money lost in many upgrades of a passive setup... where an active performs way better (better SQ I mean, of course).
FYI, I just use a fairly cheap but good DAC @ 350Eur, SMSL SU-8s.

setup 2 => big jump of SQ vs setup1 : smoother sound, more "natural", 3D is more precise etc.
=> an active speaker with digital input is the right choice vs "only analogue" input
=> but an AES input is recommended. A coax input is far less reliable due to loss (cable length must be kept short ; no such limitation with AES) ; and protocol of data transfer using coax is far less qualitative than AES.
=> warning : the SQ is affected by the quality of the digital interface ! As everyone knows, the internal power supply of the Mutecs is noisy => thus it has to be replaced by a linear PS (although my tiny homebrew switching PS performs better than a LPS...)

setup 3 => word clock link between each active & the digital interface (MC3+).
Both speakers are "synced" directly from the interface that delivers the "raw" digital audio signal => vs setup2, you have another jump of SQ :
  • more natural SQ (minor)
  • 3D scene / stability / precision (obvious on live recordings of classical music/instruments) => MAJOR improvement.

To conclude :
As "word clock" means "synchronization" :
  • a sync between the DAC/digital interface and the speakers brings a huge improvement (as expected... just common sense in fact)
  • a word clock sync between various devices b4 the speakers : not tested... although I plan to test it because I don't know what to expect from it...
 
Usually better practice to use a SPDIF receiver with optional very high quality ASRC, such as SRC4392. The internal SPDIF receiver includes a PLL, or the ASRC can use a local crystal reference clock for improved jitter attenuation. SRC is often done anyway in speakers that support digital inputs so that any local DSP can run at a fixed frequency.
 
You are a lucky man, theoretical expectations and common sense are reflected in what you perceive. Not everyone's hearing works like that.
Come on...
In audio forums it's always the same kind of stupidity : "SQ is psycho-bias". Someone is not in line with the conclusions of someone else => the culprit is : "psycho-bias". And of course "psycho-bias" is cooked this way / that way, inside-out, upside-down... : it always works ! :)
Come on... let's be cool & a bit honest with ourselves.

I do many experiments on the PCBs of my hifi gear & others.
When I plan to do a test by acting "there" on the PCB, and changing this-by-that etc... I expect the result => but I don't care about this bloody expectation ! :)
I'm a science guy !
=> so, what matters is the result ! Result via my ears, or someone else's ears, and electronic measurements etc...
=> many times, results didn't match my expectations ! So what ? => no brainer, I screwed up somewhere => let's find out ! After investigations, if I realised I screwed up regarding "a" point => I'm happy, because I've learnt something I didn't know earlier ! I'm still a dummy, but a bit less : I'm happy with that !
(you're not a science guy, are you ? :) )
 
It also means PLL, which is usually considered inferior in terms of jitter performance as compared to local crystal clocking at the dac.
Hi Markw4,
I don't understand your post...
I was saying that the word clock signal, in the case of my speakers, is important because it makes them work "in phase", a bit more precisely than just letting each speaker extract the clock from the digital signal.

vs your post : you're right ! The local oscillator/crystal can be precise => if the PLL circuit is polluted by this or that => final result is bad.
 
Modern dacs tend to be clocked primarily by a master clock (MCLK) versus a word clock (LRCK or WCK, etc.). It means a PLL is needed to derive a MCLK signal from the lower frequency LRCK (word clock). PLLs tend to be more jittery than crystals, even if there is a dedicated word clock signal to work from. Thus, it's usually better to use a local crystal and ASRC rather than a PLL derived MCLK.
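To show the size of the multiplication a PLL has to do here, a small Python sketch follows. The multipliers 128, 256 and 512 are common MCLK-to-sample-rate ratios, used here as assumed examples; which one a given dac actually needs depends on the chip and its configuration.

```python
def mclk_hz(word_clock_hz, multiplier):
    """Master clock frequency a PLL must synthesize from a word clock."""
    return word_clock_hz * multiplier


# Assumed example ratios; check the datasheet for a specific dac chip.
for fs in (44_100, 48_000):
    for mult in (128, 256, 512):
        print(f"fs = {fs} Hz x {mult} -> MCLK = {mclk_hz(fs, mult) / 1e6:.4f} MHz")
```

The 512x results, 22.5792 MHz and 24.576 MHz, are the familiar audio crystal frequencies, which is why a local crystal can supply MCLK directly while a word clock input has to go through a PLL first.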
 
thanks for reply Markw4 but this is all "old stuff" / "like in a 20yr-old book"

ON-topic => deeply into this & many other topics =>
why do we have NO measurements using a DSO / VNA / SA & so on all over this forum ?

it looks like everyone relies on "posts from others" or "specs"....

Quite weird given that audio e-stuff is :
  • as a basic, below 24M (if upsampling to 192/24)
  • below 125MHz if Ethernet is not involved
  • below 480MHz if USB2 not involved
...

so talking about "modern DAC" at a time where we have many cheap tools to trigger precisely signals faaaaar faster than any sound signal (digital/analogue) => that's !?
 
...as a basic, below 24M (if upsampling to 192/24)
False. Clock edge risetime sets the maximum frequency that can cause problems. With a modern clock you might have a 0.5ns risetime which takes at least a 1GHz scope with an active probe to measure.
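The link between rise time and bandwidth can be checked with the usual rule of thumb BW ≈ 0.35 / t_rise. The Python sketch below assumes a single-pole (or roughly Gaussian) response for that 0.35 factor and root-sum-square combination of rise times, so treat the results as ballpark figures rather than scope specifications.

```python
import math


def bandwidth_ghz(rise_time_ns):
    """Approximate -3 dB bandwidth for a 10-90% rise time (0.35 rule of thumb)."""
    return 0.35 / rise_time_ns


def scope_rise_time_ns(scope_bandwidth_ghz):
    """Approximate rise time of the scope itself, same rule of thumb."""
    return 0.35 / scope_bandwidth_ghz


def displayed_rise_time_ns(signal_rise_ns, scope_bandwidth_ghz):
    """Rise time shown on screen: signal and scope combined root-sum-square."""
    return math.hypot(signal_rise_ns, scope_rise_time_ns(scope_bandwidth_ghz))


edge = 0.5  # ns, the clock edge rise time quoted in the post
print(f"edge content extends to about {bandwidth_ghz(edge):.2f} GHz")
print(f"a 1 GHz scope would display about "
      f"{displayed_rise_time_ns(edge, 1.0):.2f} ns")
```

Under these assumptions a 0.5 ns edge has significant content up to about 0.7 GHz, and even a 1 GHz scope shows it slowed to roughly 0.61 ns, which is consistent with 1 GHz being a minimum rather than a comfortable choice.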

Not only that but cordless phones, cell phones, and wi-fi can cause radiated EMI/RFI interference in audio equipment which is happening at a few GHz. The idea that audio problems are limited to nothing higher than the fundamental sine wave frequency of a clock cycle is plainly wrong. The effects of very high RF frequencies often find their way down into band, which is well understood by IC manufacturers and other engineers.

Moreover, there have been measurements with SDR RF spectrum analyzer but they are not necessarily without aliasing problems. There have also been VNA measurements of speaker cables. In addition, I posted DSO measurements of timing margins in MarcelvdG's very modern RTZ DSD dac, which resulted in a design modification by Marcel. The timing margin was a fraction of a ns based on device specs and scope measurements. Scope sample rate was at 2.5GHz.
 
Hi Markw4,
thanks for reply, very interesting & I (y) all

You should post your post on ASR... but be careful because I posted a few posts saying that x00MHz & xGHz matters but I got an answer from the Guru that if a signal is above the ear-range (22kHz max), it has no effect on the SQ of the audio gear ( :oops: )

Reminder : I'm just a hobbyist, my level on the "never-ending learning curve vs electronics" is not at the bottom, but somewhere "above level-0" :) honestly I don't know my level given the pending issues :)

I understand your "view" on the risetime issue, but you "see" it through a "scope", and risetime is a bottleneck of any scope
and this limit is "high"
I mean =>
  • time-domain analysis using a scope leads to weird results when you deal with high-speed stuff => not meaningful
  • a freq-domain analysis overcomes the limits of a scope by a few orders of magnitude

so to say, using a scope / referring to a scope measurement to deal with high-speed signals => to me => makes no sense because you end up with meaningless results

...what a tricky subject :)