UltraNOT; creating a more modern P16 monitor system

First of all, this is a project that is _way_ outside my skills, so I have a good dose of blissful ignorance. But sometimes that helps?

Behringer makes a P16 personal monitor system that carries two sets of 8-channel, AES3-based digital audio over Ethernet cable. The units have a very capable physical interface for creating a personal mix of the audio. However, I have a few wishes:
1. I don't need 16 buttons and a variety of knobs; I want to control it wirelessly through a touchscreen.
2. I want a smaller package.
3. I want it cheaper :D.

It occurred to me that it _might_ be technically feasible to even use an ESP32 to read in AES 'frames' into a set of DMA FIFO buffers, do matrix operations from the software fader settings, and then pass the audio out through an I2S interface to a headphone.

Because this is for personal monitoring, latency is critical.
At this point, I honestly don't know whether the optimized ESP-DSP libraries will be able to keep up with 16 channels of 48 kHz audio being mixed down to a stereo stream.
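For a rough sense of the mixing workload described above, here is a minimal C sketch of a 16-to-2 mix-down. The Q15 gain format, the names `mix_frame`, `sat24`, and `macs_per_second`, and the assumption that samples arrive as sign-extended 24-bit values in an `int32_t` are all illustrative choices, not anything taken from the P16 or from ESP-DSP; a real build would probably lean on the vendor's optimized routines instead.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CH   16      /* two 8-channel streams, as described above */
#define GAIN_ONE 32768   /* unity gain in Q15 (assumed fixed-point format) */

/* Clamp a wide accumulator to the signed 24-bit sample range. */
static int32_t sat24(int64_t v)
{
    if (v > 8388607)  return 8388607;
    if (v < -8388608) return -8388608;
    return (int32_t)v;
}

/* Mix one frame: 16 signed 24-bit inputs down to one stereo pair.
 * gain_l / gain_r hold the per-channel fader levels in Q15. */
static void mix_frame(const int32_t in[NUM_CH],
                      const int32_t gain_l[NUM_CH],
                      const int32_t gain_r[NUM_CH],
                      int32_t *out_l, int32_t *out_r)
{
    assert(out_l != 0 && out_r != 0);

    int64_t acc_l = 0, acc_r = 0;
    for (int ch = 0; ch < NUM_CH; ch++) {
        acc_l += (int64_t)in[ch] * gain_l[ch];
        acc_r += (int64_t)in[ch] * gain_r[ch];
    }
    /* Divide (rather than shift) so negative sums behave portably. */
    *out_l = sat24(acc_l / GAIN_ONE);
    *out_r = sat24(acc_r / GAIN_ONE);
}

/* Multiply-accumulates per second needed for the whole stereo mix. */
static long macs_per_second(long sample_rate_hz)
{
    return sample_rate_hz * NUM_CH * 2;
}
```

At 48 kHz this works out to 16 x 2 x 48,000 = 1,536,000 multiply-accumulates per second, before any buffering or I/O overhead is counted.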

Does anyone have any advice? Is there an easier way to do this?
 
This thread has some relevant information on the Ultranet protocol: Suggestions please for 16-channel 24-bit digital audio recorder

The ESP32 doesn't appear to have an AES3-type interface. MCUs with interfaces that might work would be the RT105X (NXP) and the STM32H7X0 (ST). However, neither of those has built-in wireless.

I'd suggest the audio mixing itself would be trivial compared to managing control from a wireless remote touchscreen.
 
As a high-level and potentially very wrong approach, I was thinking I would need something (that I hoped existed) to translate the AES stream into something the SPI interface on the ESP could pull into DMA buffers. The ESP would then do the mixing and output the stereo mix via I2S DMA buffers.

Thanks for the MCU suggestions. It seems wise to start into this with something a bit more powerful with AES capabilities built in.
 
Use a pair of CS8416 S/PDIF receivers in AES3 direct mode (see the datasheet). Use the differential receiver in each, so connect the outputs from the MagJack or similar to the CS8416 inputs. Then use both of the I2S interfaces on one ESP32 to receive the data using DMA in 32-bit mode so you can read the preambles, sync both streams to the Z preamble, and output the data via SPI to another ESP32, which then does the mixing.

There will be some latency. If you use a 64-sample-deep DMA, there will be about 128 samples of latency: 32 samples (half a buffer) from the receiver, 32 from SPI out, 32 from SPI in, and finally 32 samples from the I2S DMA to the output DAC.

You may need a more powerful chip to avoid the extra SPI hop, use smaller DMA buffers, and do the mixing inside a single chip. The ESP32 could do the mixing easily (it's just multiplying and adding the data), but it doesn't have a third I2S interface, and its I2S is only half-duplex. I would simply use something like the STM32 NUCLEO-H743ZI2 module, which should have enough inputs and outputs, and some ESP module for wireless.
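To put the buffer arithmetic above into time units, here is a small C sketch. It assumes each double-buffered DMA stage adds half its buffer depth, as in the estimate above, and that the sample rate is 48 kHz; the function names `chain_latency_samples` and `samples_to_us` are made up for illustration.

```c
#include <stdint.h>

/* Latency in samples for a chain of double-buffered DMA stages,
 * assuming each stage contributes half of its buffer depth. */
static uint32_t chain_latency_samples(uint32_t dma_depth, uint32_t stages)
{
    return (dma_depth / 2u) * stages;
}

/* Convert a sample count to whole microseconds (rounded down). */
static uint32_t samples_to_us(uint32_t samples, uint32_t rate_hz)
{
    return (uint32_t)(((uint64_t)samples * 1000000u) / rate_hz);
}
```

With a 64-sample buffer and the four stages listed (receiver, SPI out, SPI in, I2S out), `chain_latency_samples(64, 4)` gives 128 samples, which is roughly 2.7 ms at 48 kHz. Dropping the SPI hop leaves two stages, or 64 samples, roughly 1.3 ms. Converter and wireless-control delays are not included in these figures.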
 

Thanks so much for popping in, Mhelin. This is kind of where I am leaning as well; thanks for the development board suggestion.
If I find time/help and get this working, it'd certainly be worth a few circuit board spins to tidy it up as a product.
 
@mhelin, it's turning out that the NUCLEO-H743ZI2 (with the STM32H743ZI) is backordered until maybe April 2022.
What do you think about the STM32F411CEU6? It's a Cortex-M4 instead of an M7, and only 100 MHz instead of 480. It still has up to 5 SPI/I2S interfaces, so I think I could have the 2 in and 1 out, plus an SPI for interfacing with the ESP. I can order it on the "blackpill" dev board from Adafruit right now.
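As a back-of-the-envelope comparison of the two clock speeds, the sketch below computes how many core cycles are available per audio frame. It assumes 48 kHz audio and ignores interrupt, DMA, and memory wait-state overhead, so treat the numbers as upper bounds; `cycles_per_frame` is a made-up helper name.

```c
#include <stdint.h>

/* Core clock cycles available per audio sample frame (rounded down). */
static uint32_t cycles_per_frame(uint32_t core_hz, uint32_t sample_rate_hz)
{
    return core_hz / sample_rate_hz;
}
```

At 100 MHz that is about 2,083 cycles per frame, versus 10,000 at 480 MHz. A 16-channel stereo mix needs 32 multiply-accumulates per frame, so the raw arithmetic looks like a small fraction of either budget; whether the I/O handling also fits is something only a test on the board would show.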