Open source Active Wifi speaker project

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Hello,

I'm glad to share with you my new wireless speaker project. It's still in progress but the first version runs and sounds well.

Basically, it is:
- Raspberry Pi based audio player stacked with DSP and amplifier boards
- active filtering based on an ADAU1701 DSP
- PCM5102 I2S digital-to-analog converters
- Class-D amplification based on the TPA3118
- Low-latency (<1 ms) WiFi-connected speakers that share audio (no cable between left and right speakers)
- Scalable speaker system (you can use 2 for wideband stereo, 3 for a 2.1 system...)
- Easy DSP configurator software based on a text file. You don't need to have SigmaDSP connected to your board to change the DSP settings; all you need is to edit a file with your settings (see here for more info)
- Free and open source

Right now, it's limited to a stereo configuration, but soon I'll extend the speaker configuration to 4.0, 5.1 and even 7.1 for anyone wanting to build a wireless multichannel speaker setup.

I have built a prototype based on the electronic & software architecture described in more detail on the website; it looks like this.

[Photo: the prototype speaker]


By the way, if anybody wants to join the project to give a hand with hardware, software or website development, feel free to contact me.

Website link (for speaker details) : http://mydspi.v2ale.com
GitHub link (to share the hardware and software design) : GitHub - V2Ale/MyDSPi: MyDSPi is an electronic and software solution that works with Raspberry Pi boards to make active speakers
Thanks,
 
It's a very interesting project.

Actually, I have posted another message about constructing a WiFi active speaker using the Libre LS5B module, which allows direct streaming. I am wondering if direct streaming from various music services such as Spotify, etc. would be of any interest to readers of this forum?
 
I have worked on and implemented several projects with similar themes: two or more DSP processors in physically separated locations (e.g. left speaker, right speaker, subwoofer(s), and so on) and how to synchronize all of them.

I looked at your web site. I see you are using SnapCast. I don't have any experience with that. If you know, can you share the level of synchronicity that can be achieved using this software? I did not see that listed anywhere on the GitHub site. Getting systems as synchronized as possible is the main problem with "distributed" systems like this.

If you are interested, I will explain my own approach and what I learned. I use all public-domain software or wrote my own wrappers. I have a "server" and multiple (any number of) "clients". Systems consist of one or more clients, e.g. client1 is embedded in the right loudspeaker, client2 is the left loudspeaker, and the system is composed of client1+client2. I use gstreamer to stream PCM audio at a fixed bit depth and sample rate to all clients in a system. This "fixed" rate is preselected and cannot be changed while audio is playing, but can be reconfigured at any time.

Some software on the server is the "player", e.g. a streaming audio player. I use MPD for this purpose, because it can be controlled remotely using MPD clients. On the server, the audio output of the player is sent via an ALSA loopback to gstreamer, which streams it to the clients in the system(s) that you would like to have playing audio. On each client, another gstreamer instance is run to receive the audio. Its output goes to a local ALSA loopback, and then I use the program ecasound to implement a DSP crossover using LADSPA plugins that I wrote. It's very flexible and powerful. The output from ecasound is routed to a DAC with 2 or more channels.
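The server/client chain above might be sketched as command-line builders like this (the ALSA device names, port, and raw L16-over-RTP payloading here are illustrative assumptions, not the poster's actual pipelines):

```python
# Sketch of a gstreamer PCM streaming pair as described above.
# Device names, the port, and the use of raw L16 over RTP are assumptions
# for illustration only.

def server_cmd(client_host, port=5004, rate=48000):
    """Read the player's output from an ALSA loopback and stream it as RTP."""
    return (
        "gst-launch-1.0 "
        "alsasrc device=hw:Loopback,1 ! "
        f"audio/x-raw,format=S16LE,rate={rate},channels=2 ! "
        "audioconvert ! rtpL16pay ! "
        f"udpsink host={client_host} port={port}"
    )

def client_cmd(port=5004, rate=48000):
    """Receive the RTP stream and feed a local loopback for ecasound."""
    caps = (f"application/x-rtp,media=audio,clock-rate={rate},"
            "encoding-name=L16,channels=2")
    return (
        f"gst-launch-1.0 udpsrc port={port} ! {caps} ! "
        "rtpL16depay ! audioconvert ! alsasink device=hw:Loopback,0"
    )

print(server_cmd("192.168.1.21"))
print(client_cmd())
```

The fixed rate in the caps matches the "preselected" rate described above; changing it means restarting both pipelines.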

So, how to synchronize these systems? I use NTP with a local GPS based stratum 1 server that I built. This is able to achieve about 30-50 microseconds std deviation in the system time, which keeps the playback rates quite well synchronized. The same WiFi LAN connects all the clients to the host and the NTP server, and it is nothing special. I use a $20 router with good old 2.4GHz 802.11g.
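To put that clock error in playback terms, here is the conversion of a clock offset into samples (the 48 kHz rate is an assumed example):

```python
# Convert a clock offset between two clients into samples at the stream
# rate, to show what a given NTP error means for playback alignment.
def offset_in_samples(offset_s, rate=48000):
    return offset_s * rate

# The 30-50 microsecond std deviation quoted above:
print(offset_in_samples(30e-6))  # ~1.44 samples at 48 kHz
print(offset_in_samples(50e-6))  # ~2.4 samples at 48 kHz
```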

When the system is up and running it's all good. One drawback is that, e.g. if I lose power or something like that, all these systems need to start up again and then re-synchronize, and that can definitely take a while (many hours). Luckily, at my home this is a rare occurrence. Another drawback is that latency over the WiFi can be 50 ms to 100 ms or more. Since I only use the system for audio playback, this amount of latency is a non-issue for me.

What I have learned is that you can have separate streams for each client, e.g. each speaker, and if any small synchronization errors develop, the stereo image starts shifting to one side. It's a bit of an odd effect, but it goes away again when the sync improves. I would never think of using separate streams for, e.g., woofer, midrange, and tweeter: synchronization problems would cause the frequency response to vary, among other issues that would be much worse. So I wanted to caution against that approach for multi-way speakers.

Anyway, I would be interested in learning more about what streaming protocol you are using that achieves such a low latency. I recall there were some specialized hardware based streaming chips that were developed a few years ago but I never heard much about them. Maybe you are using that kind of platform?

-Charlie
 
I looked a little more into SnapCast. I found some information about the synchronicity of clients, or at least that is what I think it is. It's not too encouraging, and may not be good enough for your application.

In the SnapCast web page:
GitHub - badaix/snapcast: Synchronous multi-room audio player
Scroll down to read the README file. In the section "How does it work", the last sentence says:
Typically the deviation is smaller than 1ms.
To me this means that any two clients will be synchronized to around 1 ms or slightly better. This is not good enough for a pair of stereo loudspeakers, because with a 1 ms difference the stereo image will be obviously shifted to one side. As one speaker gets "ahead" of the other in time, the image shifts to that side. It's rather distracting. You need synchronicity of better than 100 microseconds to have a stable, centered image. Even 100 usec is not great, but less than about 50 usec is no problem. Believe me, I have struggled with this problem myself.

You mention 1 msec latency in your first post, but latency is something else: the time from when the audio is "played" on the source to when it comes out of the speaker. 1 msec latency would be very low even for systems connected via wired LAN (I use WiFi connections). I would expect at least 20-50 msec of latency due to buffering of what SnapCast calls the "chunked" data stream.

MrVins59, will you please comment on these issues?
 
If you move one of your speakers 3.4 cm backwards, will the stereo image be shifted to one side?

//

1msec = 0.343 meters. Try that.
3.4cm is 0.0343 meters, or 100usec.
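The arithmetic above, with the speed of sound taken as 343 m/s:

```python
# Timing offset between channels <-> equivalent acoustic path difference.
C = 343.0  # speed of sound in m/s at ~20 C

def offset_to_distance(offset_s):
    return C * offset_s

def distance_to_offset(meters):
    return meters / C

print(offset_to_distance(1e-3))    # 1 ms    -> ~0.343 m
print(distance_to_offset(0.0343))  # 3.43 cm -> ~100 us
```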

Remember the values I am quoting are the std deviation. Over time the system experiences higher and lower instantaneous synchronicity. For a normal distribution of sync times (and I found this to be the case for my systems) 95% of the time the instantaneous sync will be within 2 standard deviations.
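For a normally distributed sync error, the fraction lying within k standard deviations is erf(k/sqrt(2)); k = 2 gives the ~95% quoted above:

```python
import math

# Fraction of a normal distribution lying within k std deviations of the mean.
def within_k_sigma(k):
    return math.erf(k / math.sqrt(2))

print(round(within_k_sigma(1), 4))  # ~0.6827
print(round(within_k_sigma(2), 4))  # ~0.9545
```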
 
Member
Joined 2004
Paid Member
...higher and lower instantaneous synchronicity...

We used to call this variance, but it's now often called jitter.

The big 3 network quality of service metrics are throughput, latency and variance. Audio is always going to have latency due to buffering requirements, but for left/right audio synchronization the variance needs to be low.

One technique for audio synchronization is based on using time stamps, and that's what snapcast uses. According to the article on WiFi audio from TI, time-stamp approaches achieve synchronization in the millisecond range. Snapcast claims that the synchronization is typically 1ms, but Charlie is correct in having concerns about this being adequate for left/right paired audio.
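The time-stamp technique can be sketched like this (a generic illustration of the idea, not Snapcast's actual code; the buffer length is an assumed value):

```python
# Each chunk carries the server's capture timestamp. Every client adds the
# same fixed buffer delay and plays the chunk when its own (synchronized)
# clock reaches that deadline, so clients agree to within their clock error.
BUFFER_S = 0.5  # fixed playout buffer, an assumed value

def action(now, chunk_ts, chunk_len_s, buffer_s=BUFFER_S):
    """What a client does with a chunk stamped chunk_ts, given local time now."""
    deadline = chunk_ts + buffer_s
    if now < deadline:
        return ("wait", deadline - now)   # hold until the shared deadline
    if now < deadline + chunk_len_s:
        return ("trim", now - deadline)   # arrived late: drop the stale part
    return ("drop", 0.0)                  # whole chunk is already stale

print(action(now=100.40, chunk_ts=100.0, chunk_len_s=0.02))  # a "wait"
print(action(now=100.51, chunk_ts=100.0, chunk_len_s=0.02))  # a "trim"
```

Under this scheme, the playback misalignment between two clients is exactly their clock disagreement, which is why the quoted ~1 ms figure tracks the time-sync quality.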

However, the new generation of WiFi speakers are using mesh networking in which each node can be an access point. From what I understand of Sonos, Linkplay, Heos and other wireless speaker solutions that provide left/right synchronized audio, one speaker is the "master" and it serves as a "soft" Access Point that passes the audio to the other node with minimal delay. Heos claims "industry-leading microsecond audio synchronization". I don't know the specifics of how they achieve this, but I know that it works. My wife has two HEOS 1 speakers running in stereo and the image is perfectly stable. Linkplay provides the networking software for Dayton, Cobblestone, Audiocast, IdeaHome, and many others, and they also claim "perfect synchronization", although I can't find a spec for the amount of delay through their AP. I believe Sonos is in the 10usec range, but I can't find the spec on their website.

I'll be testing a pair of Linkplay modules in "stereo mode" in the next few weeks.

The Linkplay modules are available for around $40-$50. However, the Muzohifi.com web site is now dead, so you can probably get the Cobblestone version fairly cheap--it's $23 at Fry's. There's nothing wrong with the Muzo Cobblestone--it appears that they went under because other vendors are selling the same thing at a lower price. Linkplay has some newer modules that have Alexa integration and much faster CPUs (A76), but I haven't identified products using the new modules yet.
 
I would love to find (or create?) a DIY software solution, or equivalent, to the mesh networking that Neil described. Maybe if one network client serves as the local access point for the system (its master access point) and rebroadcasts locally to the other clients in the system... I will have to think about how to implement this!
 
Member
Joined 2004
Paid Member
All of the major vendors are selling mesh routers now. Costco is selling both the Netgear Orbi and the Linksys Velop--this is clearly the direction in which home networking is evolving. I don't know how much support these vendors have for synchronized audio in their products, but I know that they use a lot of open source code. In fact, you used to be able to download the source code for many of the Netgear products from their web site. I don't know whether that is still the case, but look around--you might be able to find some of the new router code.

Also, Linkplay has a development kit for Alexa integration that runs on a Pi board. I don't know whether there are NDAs or other agreements to prevent using the code for DIY applications, but it's worth looking into if you want access to their source code.
 
I was thinking more that one speaker is a Pi or similar SBC outfitted with two network adapters. One runs at 2.4G to receive the audio signal from the "source". The other runs at 5.8G and forwards the audio to the other nearby clients in the system, possibly as an ad-hoc network. There would be some compensation for the delay in the forwarding, but that should be pretty low. 5.8G is pretty good when the distance is not large, and it offers many sub-channels.
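The delay compensation mentioned here might amount to bookkeeping like this (all latency values are assumed examples for this hypothetical master/satellite split):

```python
# The "master" speaker forwards audio to the satellite over its second
# adapter and delays its OWN playback by the forwarding latency plus the
# satellite's jitter buffer, so both emit a given chunk at the same instant.
def emit_times(chunk_ts, forward_latency_s=2e-3, jitter_buffer_s=3e-3):
    satellite = chunk_ts + forward_latency_s + jitter_buffer_s
    master = satellite  # master inserts the whole delay locally
    return master, satellite

m, s = emit_times(10.0)
print(m == s)  # both speakers emit together
```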
 
Member
Joined 2004
Paid Member
I was thinking more that one speaker is a Pi or similar SBC outfitted with two network adapters.

That should work if the latency through the entire protocol stack is low enough. The Sonos/Linkplay/Heos et al. solutions probably implement the AP function at a lower point in the TCP/IP protocol stack to ensure low latency and less variance. So that's why getting that code might be advantageous.

But I'm guessing how the AP function is implemented, because I haven't found any technical discussions on how they do their left/right synchronization. In the article that I referenced, TI suggests that some solutions use 802.11v, which appears to be a MAC/PHY-layer protocol. It might be worth looking at that standard for ideas (I haven't done that myself and don't intend to).

As a side note, I had some trouble returning to this thread because I couldn't easily find the original post. Is the Multi-Way forum really the right place to discuss active speakers with DSP and Wi-Fi? According to the sub-title, Multi-Way is about "Conventional loudspeakers with crossovers". I suggested over 2 years ago that we needed some new categories to cover active speakers with DSP and WiFi, but got a lot of push-back. That's one of the reasons I set up the Audiodevelopers.com web site--there isn't a logical place on diyAudio to discuss this modern approach to speaker building. Meanwhile, the WiFi based speaker market grew by 62% in 2016 to 14 million units according to a recent report from Strategy Analytics. I'd still like to see diyAudio address this issue...
 
...Is the Multi-Way forum really the right place to discuss active speakers with DSP and Wi-Fi?...

According to what I have read, this forum will soon (but when?) be changing over to a TAG-based system for identifying how a thread relates to DIY audio--something similar to what Stackexchange uses, I am guessing. At Stackexchange you can create tags if they do not exist, and assign multiple tags to threads. Using this topic as an example, it could be tagged with:
loudspeaker
DSP
active
network audio
tcpip
etc.

I like that approach because it is more general and flexible and can accommodate things that the current system cannot. For example, I do DSP, but on a Linux box as a loudspeaker crossover. The speakers can use built-in amps, and they could be class AB, D, or both. So, do I start the thread under PC Based, Multiway-Loudspeakers, Solid State, Chip-Amp or Class-D?
 
Member
Joined 2004
Paid Member
According to what I have read, this forum will soon (but when?) be changing over to a TAG based system for identifying how the thread relates to DIY audio.

Yep, that's what was implied in that thread from June 2015. But I'm guessing that there will be massive push-back from a readership used to the existing organizational structure, so I'm doubtful that the Stackexchange approach will actually get implemented. Also, I think the existing forum structure works OK in many ways--it just needs to be updated. A lot has changed in audio in the past 10 years, and the diyAudio forum structure hasn't kept up.

Here's another example: I just added a page to my blog at Audiodevelopers about using the Analog Devices SuperBass algorithm. It's a psychoacoustic algorithm that is commonly used to extend the bass response of small WiFi and Bluetooth speakers. It's about audio, and it's now DIY, since anyone can download the Analog Devices tools, and the technologies to integrate these algorithms are readily available.

But a discussion of a psychoacoustic algorithm like SuperBass doesn't fit into any of the categories here at diyAudio, even though it is a widely used technology in all of those millions of WiFi and Bluetooth speakers sold last year. You could try posting in the Subwoofers forum, but most people here refuse to accept anything less than 8" as a subwoofer. You could post it in Digital Line Level since it is code that runs in an ADAU1701 DSP, but I haven't had any luck getting anyone in that forum interested in DSP software. It definitely doesn't belong here in Multi-Way.

It really belongs in an Active Speaker forum, along with the original post in this thread, where discussions about WiFi adapters and DSP algorithms would make sense. But since there isn't any Active Speaker forum at diyAudio, I was forced to start my own blog at Audiodevelopers.com. I would prefer to be a more active participant at diyAudio, but until there is an obvious place to contribute I don't feel welcome.
 
music soothes the savage beast
Joined 2004
Paid Member
I'm glad to share with you my new wireless speaker project. [...]

[Photo: prototype speaker]

What is the art object at bottom right? Just curious. Thanks.
 
Oops, I did not get notified of new messages and just discovered all this. I'm glad to see that some people are also interested in similar topics.

Regarding synchronicity between speakers, the exchanges are very interesting. Thanks for sharing your experience and knowledge.
From my point of view, the value in itself is not that important; what counts is how stable the value is over time. From memory, the measurements I did with an oscilloscope were all within about 1 ms at most, but I do not remember the typical difference, and that would be interesting to measure. For this project, I do not aim for a time difference as low as hundreds of microseconds, because my speakers are not placed perfectly anyway; I just have to find a good-enough position for stereo that matches the WAF ;). In my case, since the delay seems pretty constant, I make up the left/right balance (due to delay, listening position..) with a simple gain that Snapcast allows you to set easily, and it works fine.
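The per-client gain tweak described here comes down to a dB-to-linear scale factor on one channel; a generic sketch (the exact control Snapcast exposes is not shown here):

```python
# Apply a trim, in dB, to one channel to rebalance the stereo image.
def db_to_linear(db):
    return 10 ** (db / 20.0)

def apply_balance(left, right, right_trim_db):
    """Scale the right channel's samples by right_trim_db dB."""
    g = db_to_linear(right_trim_db)
    return left, [s * g for s in right]

left, right = apply_balance([1.0, 0.5], [1.0, 0.5], -6.0)
print(round(right[0], 3))  # ~0.501 (about half amplitude)
```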

Lastly, to answer the question about the art object at the bottom right of the picture: it's Tom Dixon's Etch tea light holder in copper.
 
In my case, since the delay seems pretty constant, I make up the left/right balance (due to delay, listening position..) with a simple gain that Snapcast allows you to set easily, and it works fine.

+1 It does work fine for me too!

I must say that I was terribly biased against the idea of using a pair of L/R snapclients, because I also found that ~1 ms delay variance (roughly 34 cm :eek:) unacceptable...

But after giving it a try I shamefully have to admit that it sounds good enough for me... :p

Btw, I am now working on a WiFi subwoofer project based on this idea.
 
After a few hours of more careful listening I have to admit I am not so satisfied...:eek:

The problem is with the soundstage, which constantly changes. This is especially obvious with very simple musical programs; with a classical guitar performance, for instance, the phantom image constantly changes in size and position.

It does not sound too disturbing at the beginning, and I don't find that it sounds like an unnatural artifact, because atmospheric changes at a live event outdoors or in a large hall produce similar changes to the soundstage, and in a manner of speaking a rock-solid studio soundstage is indeed artificial (too good to be realistic...). But I have to admit that it's an artifact resulting from the wandering sync between L and R channels allowed by Snapcast, not present in the recordings. :(
 
...I have to admit that it's an artifact resulting from the wandering sync between L and R channels allowed by Snapcast, not present in the recordings.:(

This is exactly the problem that you cannot overcome with e.g. SnapCast, AirPlay, DLNA, etc. They just can't get the synchronicity low enough.

The new version of my streaming audio controller (actually even the old version) can do this much better, at least 10x better. The new version allows the user to do LADSPA (IIR) filtering in the playback chain on the clients, so you can implement a DSP crossover as well; no need to pass the audio off to another program.

Send me a PM if you want to give it a try. Your 2-speaker system might be a good test bed.
 