Synchronized streaming to multiple clients/loudspeakers

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
About 20 meters.

Code:
---------------------------------------------------------------------------------------
        Quality   Level       Channel      Encryption       Address
---------------------------------------------------------------------------------------
           72     -59dBm     6 (2.437GHz)   on  WPA2

I usually turn off auto channel selection and pick one not used by the neighbours.

I use an app on my phone to monitor channel usage and always set the channel manually. The 2.4 GHz band is crowded around here. I am using a higher channel for general use in my home and channel 100 in the 5 GHz band for my three-Pi isolated network. The middle part of the 5 GHz band (channels 100 to 144) is not frequently used and is less widely supported by hardware in general. The Pi 4B radio can do it. It would be nice if I could easily add an external antenna, but unfortunately that's not so easy on this hardware. As long as I keep all the Pis in the same room it works just fine.
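For reference, channel numbers map to centre frequencies by a simple rule, which is handy when reading scan output like the table above. This is a minimal sketch; the function name `wifi_channel_to_mhz` is made up for illustration, and the accepted 5 GHz channel range is an assumption that ignores which channels are actually permitted in a given country.

```python
def wifi_channel_to_mhz(channel: int) -> int:
    """Return the centre frequency in MHz for a 2.4 GHz or 5 GHz WiFi channel."""
    if 1 <= channel <= 13:
        # 2.4 GHz band: channels are spaced 5 MHz apart starting at 2412 MHz
        return 2407 + 5 * channel
    if channel == 14:
        # Channel 14 is a special case that does not follow the 5 MHz spacing
        return 2484
    if 32 <= channel <= 177:
        # 5 GHz band: centre frequency is 5000 MHz plus 5 MHz per channel number
        return 5000 + 5 * channel
    raise ValueError(f"unsupported channel number: {channel}")


if __name__ == "__main__":
    for ch in (6, 100, 144):
        print(f"channel {ch}: {wifi_channel_to_mhz(ch)} MHz")
```

Channel 6 comes out at 2437 MHz, matching the scan output above, while channels 100 to 144 span 5500 to 5720 MHz.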
 
I received the NETGEAR R7800 router today and used it to replace my older access point.

Now when I ping the Intel machines on the 5 GHz home network I am getting much lower latency:
Code:
ping 192.168.1.231
PING 192.168.1.231 (192.168.1.231) 56(84) bytes of data.
64 bytes from 192.168.1.231: icmp_seq=1 ttl=64 time=3.70 ms
64 bytes from 192.168.1.231: icmp_seq=2 ttl=64 time=2.15 ms
64 bytes from 192.168.1.231: icmp_seq=3 ttl=64 time=1.69 ms
64 bytes from 192.168.1.231: icmp_seq=4 ttl=64 time=2.10 ms
64 bytes from 192.168.1.231: icmp_seq=5 ttl=64 time=2.83 ms
64 bytes from 192.168.1.231: icmp_seq=6 ttl=64 time=3.37 ms
64 bytes from 192.168.1.231: icmp_seq=7 ttl=64 time=3.62 ms
64 bytes from 192.168.1.231: icmp_seq=8 ttl=64 time=1.95 ms
64 bytes from 192.168.1.231: icmp_seq=9 ttl=64 time=2.55 ms
^C
--- 192.168.1.231 ping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8012ms
rtt min/avg/max/mdev = 1.694/2.662/3.704/0.712 ms

So far, this seems to have fixed the audio problems I was experiencing in the past. Time to do some more extended listening tests.
 
Charlie, I believe the intention is to reduce the inter-speaker (channel-to-channel) delay, the smaller the better. If unreliable WiFi poses a problem, then buffering at each client/endpoint/renderer/presentation can be increased. By doing this, the overall delay from the source will be increased but the delay amongst the speakers will be reduced. Is my understanding correct?
 
The problem I was experiencing before replacing my router was likely due to unprioritized WiFi traffic, which resulted in large and highly variable latency (i.e. high jitter). It was probably (my guess here) the highly variable packet delivery times that were resulting in rebuffering and other effects. The timing info (i.e. the playback "clock") for when the audio should be rendered on each client is obtained using an NTP-like algorithm inside GStreamer. Based on my experience with NTP itself, high packet jitter likely gives rise to excessive playback position skewing that can cause the stereo image to jump around even if there are no audible gaps in playback.

As I mentioned, if I increased the buffering to 500msec or longer, the audio problems were largely reduced but not absent. This was using a system built with a couple of Intel Linux boxes that have good WiFi hardware, antennas, etc. This large RX buffer was able to accommodate much of the WiFi packet delivery variability, and worked for the most part, but the large delay from the buffering was definitely less than ideal.
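The trade-off between buffer size and delivery variability can be illustrated with a toy model. This is only a sketch under simple assumptions: each packet counts as "late" if its network delay exceeds the receive buffer, and the delay lists are invented numbers, not measurements from my system. The name `count_late_packets` is hypothetical.

```python
def count_late_packets(delays_ms, buffer_ms):
    """Count packets whose network delay exceeds the receive buffer length."""
    return sum(1 for delay in delays_ms if delay > buffer_ms)


# Invented one-way delivery delays in milliseconds (illustrative only)
steady_link = [2, 3, 2, 4, 3, 2, 3, 2]      # low jitter
bursty_link = [2, 3, 250, 4, 80, 2, 620, 3]  # occasional long stalls

if __name__ == "__main__":
    for name, delays in (("steady", steady_link), ("bursty", bursty_link)):
        for buffer_ms in (60, 500):
            late = count_late_packets(delays, buffer_ms)
            print(f"{name} link, {buffer_ms} ms buffer: {late} late packet(s)")
```

In this toy case a 60 ms buffer is enough for the steady link, while the bursty link still loses a packet even with 500 ms of buffering, which is roughly the "reduced but not absent" behaviour described above.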

A 500msec RX buffer was the status quo for me until recently when I set up the three-Pi system on an isolated network using a free 5 GHz channel. I was surprised that performance was very good, with no dropouts and with a stereo image that remained centered. I obtained this using a much lower RX buffer size of 60msec, and only using the Pi's built-in WiFi (it's not running through my home WiFi at all). This led me to look into the performance of my home's WiFi system, and that is when I discovered the latency issues that have now been solved by the new hardware. It's surprising how good the Pi's own networking can be, on such a low-cost and low-powered computing platform.

I believe the reason is that GStreamer generates the playback clock from send and return times using an algorithm similar to NTP. Those send and return times must therefore have sufficiently low jitter for the playback timing to remain stable enough to keep the stereo image centered.
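To show why jitter matters for this kind of clock, here is a minimal sketch of the textbook NTP offset and round-trip calculation. It is not GStreamer's actual code, which I have not reproduced here; the function names, the simulated delays and the 0.1 ms server hold time are all assumptions for illustration.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimate from four timestamps.

    t1: client send time (client clock)   t2: server receive time (server clock)
    t3: server send time (server clock)   t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    round_trip = (t4 - t1) - (t3 - t2)
    return offset, round_trip


def simulate_exchange(true_offset_ms, out_ms, back_ms, server_hold_ms=0.1):
    """Simulate one request/reply where the server clock leads by true_offset_ms."""
    t1 = 0.0
    t2 = t1 + out_ms + true_offset_ms            # arrival, read on the server clock
    t3 = t2 + server_hold_ms                     # reply leaves the server
    t4 = t1 + out_ms + server_hold_ms + back_ms  # reply arrives, client clock
    return ntp_offset_and_delay(t1, t2, t3, t4)


if __name__ == "__main__":
    # Equal delay in both directions: the estimate matches the true 5 ms offset
    print(simulate_exchange(5.0, out_ms=1.3, back_ms=1.3))
    # Outbound packet held up by 2 ms: the estimate is wrong by half the asymmetry
    print(simulate_exchange(5.0, out_ms=3.7, back_ms=1.7))
```

The estimate is only exact when the outbound and return delays are equal. Any difference between them shows up as an error of half that difference, so a link with several milliseconds of jitter produces clock estimates that wander by a similar amount unless they are filtered heavily.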

I have attached slide 15 from a presentation titled "Synchronised multi-device media playback with GStreamer", which illustrates GStreamer's NetClock mechanism. You can find the presentation on the web, and I direct you there for more info.
 

Attachments

  • Gstreamer GstNetClock illustration.png
Timewise, wifi is a very unreliable communication channel. A random mobile hotspot nearby can ruin the otherwise perfect values. I do not know of any better way but cable, sadly.

You can compare what I was able to achieve, using the same machines, via WiFi (using the new R7800 router with QoS/WMM) and via wired ethernet.

First, the WiFi:
Code:
ping 192.168.1.231
PING 192.168.1.231 (192.168.1.231) 56(84) bytes of data.
64 bytes from 192.168.1.231: icmp_seq=1 ttl=64 time=3.70 ms
64 bytes from 192.168.1.231: icmp_seq=2 ttl=64 time=2.15 ms
64 bytes from 192.168.1.231: icmp_seq=3 ttl=64 time=1.69 ms
64 bytes from 192.168.1.231: icmp_seq=4 ttl=64 time=2.10 ms
64 bytes from 192.168.1.231: icmp_seq=5 ttl=64 time=2.83 ms
64 bytes from 192.168.1.231: icmp_seq=6 ttl=64 time=3.37 ms
64 bytes from 192.168.1.231: icmp_seq=7 ttl=64 time=3.62 ms
64 bytes from 192.168.1.231: icmp_seq=8 ttl=64 time=1.95 ms
64 bytes from 192.168.1.231: icmp_seq=9 ttl=64 time=2.55 ms
^C
--- 192.168.1.231 ping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8012ms
rtt min/avg/max/mdev = 1.694/2.662/3.704/0.712 ms

Below, we have the wired ethernet performance:
Code:
ping XXX.0.0.231
PING XXX.0.0.231 (XXX.0.0.231) 56(84) bytes of data.
64 bytes from XXX.0.0.231: icmp_seq=1 ttl=64 time=0.733 ms
64 bytes from XXX.0.0.231: icmp_seq=2 ttl=64 time=0.746 ms
64 bytes from XXX.0.0.231: icmp_seq=3 ttl=64 time=0.728 ms
64 bytes from XXX.0.0.231: icmp_seq=4 ttl=64 time=0.752 ms
64 bytes from XXX.0.0.231: icmp_seq=5 ttl=64 time=0.758 ms
--- ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4056ms
rtt min/avg/max/mdev = 0.728/0.743/0.758/0.011 ms

The wired connection shows 2.662/0.743 = 3.6 times lower average round-trip time as well as 0.712/0.011 = 65 times less jitter. This is just a quick example, but it shows just how much better a wired connection can be.
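The comparison can be reproduced from the individual round-trip times in the two ping runs above. This is a small sketch; it assumes that the `mdev` figure reported by ping behaves like a population standard deviation, which matches the summary lines above to three decimal places. The helper name `summarize` is made up.

```python
from statistics import mean, pstdev

# Round-trip times in ms, copied from the two ping runs above
wifi_ms = [3.70, 2.15, 1.69, 2.10, 2.83, 3.37, 3.62, 1.95, 2.55]
wired_ms = [0.733, 0.746, 0.728, 0.752, 0.758]


def summarize(rtts_ms):
    """Return (min, avg, max, mdev), treating mdev as a population std deviation."""
    return min(rtts_ms), mean(rtts_ms), max(rtts_ms), pstdev(rtts_ms)


wifi = summarize(wifi_ms)
wired = summarize(wired_ms)
latency_ratio = wifi[1] / wired[1]  # how many times slower WiFi is on average
jitter_ratio = wifi[3] / wired[3]   # how many times more jitter WiFi shows

if __name__ == "__main__":
    print("wifi  min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f ms" % wifi)
    print("wired min/avg/max/mdev = %.3f/%.3f/%.3f/%.3f ms" % wired)
    print(f"latency ratio: {latency_ratio:.1f}, jitter ratio: {jitter_ratio:.0f}")
```

Working from the raw samples rather than the rounded summary values gives a latency ratio of roughly 3.6 and a jitter ratio in the low 60s, so the conclusion is the same either way.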

Still, when I compare the audio performance of the system when connected by the two methods, it is about the same, so in this case the increased performance of the wired ethernet connection over a good quality WiFi connection does not result in any better audio performance.
 
Your WiFi results are very good, no doubt. What I mean is that a random AP nearby can ruin them without any chance to prevent that.

Well, sure, I will concede that point. Someone can be malicious I guess, or it can even be accidental. I am willing to live with that possibility. It's the same for your home's WiFi - it's good practice to look for an unoccupied channel and use that. Hopefully you and all your neighbors will play nice with each other. Where I live only the 2.4 GHz band is crowded and there are plenty of open 5 GHz channels, especially in the range spanning channels 100 to 144, for which DFS/TPC is required for operation.

Anyway, the transport medium can be either WiFi or wired. The GStreamer-based application doesn't care. It will need to know the IP address of each client ahead of time (the system uses fixed IP addresses), and this will change between the wired and WiFi adapters. If you would rather connect wires to your loudspeakers for the peace of mind of it, that is perfectly acceptable.
 
You can try the WiSA modules. They connect over 5 GHz and scan for additional channels to switch to in case of problems. They claim 1 µs inter-channel delay. They only work in a single room, though. But a lot of new TVs and receivers will be supporting WiSA, so you can stream to the speakers easily.
There are USB dongles and modules with I2S output, so you can potentially skip the RPi altogether.
 
I am not interested in a hardware solution like that or Dante.

I want to use existing infrastructure (like the WiFi in our homes) and inexpensive general computing hardware like the Pi for performing data TX and RX as well as DSP processing.

I don't want to (as you say) "skip the R-Pi". Instead I am creating an R-Pi (or similar low cost computing hardware) based solution on purpose.
 
You mention the WiSA products like they are available to the public. I thought this was more or less a proprietary commercial product that required licensing. Is that not correct?

I mean, let's say I want to share my GStreamer stuff with someone. They can go out and buy Raspberry Pis on Amazon and get the OS and all software for free. It's very accessible to the DIYer.
 
Hello Charlie,

I really like your project. I know your goal is specifically to do this wirelessly, but I was wondering, what you would change, and what would it fix, if you were to do it wired? Be it Ethernet, or hdmi, or whatever.

I am wondering if something like this, could be used to do an in room multi-speaker setup, (5.2.4). Enable 5, 3 way speakers to be "active", all synced perfectly still. You wouldn't need a device with 15 channels, but 5 devices that could do 3 channels. Wireless wouldn't be required in that sort of situation.

I may be using the incorrect terms, I am still learning. I appreciate you sharing this project with us.
 
Hello Charlie,

I really like your project. I know your goal is specifically to do this wirelessly, but I was wondering, what you would change, and what would it fix, if you were to do it wired? Be it Ethernet, or hdmi, or whatever.

First of all, the best connection is a wire. You know, a wired interconnect or a speaker wire. If you can run some kind of wire, then there is no need to bother with what I am doing - just directly connect the speakers via speaker wire.

If there is no wire, then wireless is the only other option. The endpoints still need power, however, so unless you happen to have several AC receptacles nearby where you will locate the speakers you will be running power cable to the speakers. More wire!

In my case, I only listen to stereo audio. It's typically not difficult for me to set up the speakers near two AC receptacles. Also, I use this to send audio to various speaker systems within my home, in different rooms and on different levels. Some of these happen to be connected via wired ethernet, and others are wirelessly connected to the LAN.

I am wondering if something like this, could be used to do an in room multi-speaker setup, (5.2.4). Enable 5, 3 way speakers to be "active", all synced perfectly still. You wouldn't need a device with 15 channels, but 5 devices that could do 3 channels. Wireless wouldn't be required in that sort of situation.

I may be using the incorrect terms, I am still learning. I appreciate you sharing this project with us.

So, the answer to this second part of your post is *yes*. But it probably only makes sense if the room is very large and running wires (ethernet or speaker wire) is just not practical. Otherwise you can just get a DSP box, locate that with your other equipment, and have the speakers and amps connected by good old speaker wire.

I should point out that my GStreamer application can act like the aforementioned "DSP box" by implementing DSP in software and then sending the audio to a LOCAL DAC or other soundcard without doing any streaming. For instance, if your computer has a 7-channel PCI soundcard (or USB for that matter) you can use my application just like a miniDSP 4x10, etc. Keep in mind that it only works under Linux. Since WSL (Linux under Windows) has no audio subsystem available you can't use that unless you really want to work at it - you can install PulseAudio on both the Windows side and the Linux side and send audio between the two, but this seems like a bit of a kludge to me.
 
Ethernet over the power line could probably be another option. It gives the benefits of a wired network, eliminates the need for Wi-Fi and requires just the same mains wall outlet.
TP-Link AV1000 Powerline Starter Kit (TL-PA7017 KIT)
I have used it for many years without any issues.

The last time I looked at ethernet over power line adapters, the performance was not very good. This was several years ago now, so perhaps it has improved since that time. Can anyone point to some recent tests of reputable products of this type?

From my recent problems it seems that low jitter is a requirement for synchronized streaming over the LAN. But jitter is not something that I have seen as a spec for any kind of networking equipment. I just happened to get it from the results of pinging machines on my LAN.

If someone out there happens to own an ethernet over power line adapter and will collect and post some ping statistics I would be very interested to see the results.
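If you do collect results, the summary line at the end of the ping output is all that is needed. Here is a small sketch that pulls the four numbers out of that line. It assumes the Linux ping format shown in my earlier posts (`rtt min/avg/max/mdev = ... ms`); other systems print a differently worded line, and the function name `parse_ping_summary` is made up for this example.

```python
import re

# Matches the Linux ping summary line, e.g.
# "rtt min/avg/max/mdev = 1.694/2.662/3.704/0.712 ms"
_RTT_LINE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(text):
    """Return a dict of RTT statistics in ms, or None if no summary line is found."""
    match = _RTT_LINE.search(text)
    if match is None:
        return None
    rtt_min, rtt_avg, rtt_max, rtt_mdev = (float(g) for g in match.groups())
    return {"min_ms": rtt_min, "avg_ms": rtt_avg, "max_ms": rtt_max, "mdev_ms": rtt_mdev}


if __name__ == "__main__":
    sample = (
        "9 packets transmitted, 9 received, 0% packet loss, time 8012ms\n"
        "rtt min/avg/max/mdev = 1.694/2.662/3.704/0.712 ms\n"
    )
    print(parse_ping_summary(sample))
```

The `mdev_ms` value is the figure I have been loosely calling jitter in this thread.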
 
Hmmm, it is encouraging that the TP-Link system includes QoS traffic prioritization.

So that particular one might be worth a try, especially if/when all the devices are in the same room and/or on the same branch circuit.

From my recollection and some recent browsing on the topic, some problems arise when you try to install the adapters on two different branch circuits because the signal has to go all the way to the breaker box and back again. Also, people have brought up issues with extremely variable latency (high jitter) that was causing problems with gaming and I am sure would cause problems with audio synchronization similar to what I was experiencing until recently. In addition, reports of interference from fluorescent lighting, vacuum cleaners, microwave ovens, etc. did not seem encouraging. These kinds of issues scared me away from this technology in the past.

To implement a simple system of two synchronized loudspeakers fed from a computer audio source you would need one sender and two receiver adapters. This starts getting pricey.

On the other hand, I can implement the same thing with the built-in WiFi on the Pi 4 and it works well on a private network when all units are in the same room, or on a low-jitter WiFi network. This is at no additional cost, so there seems to be little added benefit to spending $150+ on powerline ethernet adapters. Unless I am missing something.
 
Charlie,

Thank you for the response. I think I am still missing something basic. Suppose, for example, I were to take two "DSP boxes" such as the "miniDSP 2x4 HD" and had one control the front right speaker doing an active 3-way crossover, and the other control the front left speaker doing an active 3-way crossover. Would the left and right speakers be in sync? If so, I guess I don't have the issue I am worried about.
 
If for example I was to take 2 "dsp boxes" such as a "miniDSP 2x4 HD" and had one control the front right speaker doing active 3 way crossover, and the other controlling the front left speaker doing active 3 way crossover. Would the left and right speaker be in sync?

Yes.

The plugin for the 2x4 HD can be configured for 3-way operation:
2x4 HD plugins : 2x4 HD1
 