First post! I joined here as I think I'm wearing out my welcome on the electronics forum
🙂
There may be more experts in here who can nudge me away from peril or my own stupidity and in a more correct direction.
I set myself some requirements for how I personally want my audio served up.
A desktop system which will provide:
- Multiple USB digital inputs.
- A single 3.5mm/RCA analogue 'aux' in.
- Multiple 3.5mm/RCA analogue outs.
- Option of a single digital output, TOSLINK or S/PDIF, TBC.
- 2 (or more) internal 'buses' with EQ and minimal processing (mix/balance/EQ).
- Nice to have: knobs, buttons and an LCD screen UI.
Without building anything I can find cheap rubbish for about £150 that will work for 75% of those requirements, but will end up fizzling and scratching and failing after a month. I can also find studio grade monitor/cue routing boxes for 1U racks with multiple inputs and outputs and EQ + 4 or more headphone amps. However these cost £3k+.
I'm not looking for "professional grade audio" or even a "professional grade mixer". Just something that allows me to maintain pure digital audio until the last possible second, when it's offloaded at the best quality reasonably achievable to an amplifier. In the case of the headphone amp, that will hopefully be a PCB trace away from the DAC... if I can't find a beefy enough I2S headphone amp with more than 500mW of power.
To that end, unless required to "cast" up in bit width for calculations, all audio will remain 16 bit. Unless there is a good reason, such as word alignment, buffer alignment or some other overriding reason, all audio will be 48K stereo.
This is cheating. Yes. It's the only reason I consider it achievable. If I wanted to do this to support all the possible bit widths, sample frequencies and all manner of formats and endpoints ... I would be better off just using a massive DSP chip and becoming long term friends with its datasheet by the bed side!
By standardising all streams to a single 48k 16 bit stereo format, and cutting the buffers to the standard 1ms from USB and I2S, I can treat audio buffers as cookie cutter items. All of them will be 192 bytes containing 96 samples: 48 left, 48 right. <- this is subject to a trade-off between the efficiency/stability of processing larger buffers versus the latency (and increased synchronisation requirements) of the same.
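The buffer sizing above can be written down as compile-time constants. This is only a sketch: the macro names, `audio_buf_t` and `buffer_bytes` are made up for illustration, and the interleaved L,R,L,R layout is an assumption (the post only says 48 left and 48 right).

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ     48000u
#define CHANNELS           2u
#define BYTES_PER_SAMPLE   2u      /* 16-bit audio */
#define BUFFER_PERIOD_US   1000u   /* 1 ms per buffer */

/* 48 stereo frames per 1 ms buffer at 48 kHz. */
#define FRAMES_PER_BUFFER  (SAMPLE_RATE_HZ * BUFFER_PERIOD_US / 1000000u)
/* 96 individual samples: 48 left + 48 right. */
#define SAMPLES_PER_BUFFER (FRAMES_PER_BUFFER * CHANNELS)

/* Hypothetical "cookie cutter" buffer, assumed interleaved L,R,L,R... */
typedef struct {
    int16_t samples[SAMPLES_PER_BUFFER];
} audio_buf_t;

/* Fail the build if the buffer ever stops being 192 bytes. */
_Static_assert(sizeof(audio_buf_t) == 192, "audio_buf_t must be 192 bytes");

/* Buffer size in bytes for other periods, to explore the
 * latency versus efficiency trade-off (e.g. 4 ms gives 768 bytes). */
static inline uint32_t buffer_bytes(uint32_t rate_hz, uint32_t period_us)
{
    uint64_t frames = ((uint64_t)rate_hz * period_us) / 1000000u;
    return (uint32_t)(frames * CHANNELS * BYTES_PER_SAMPLE);
}
```

Keeping the size behind a `_Static_assert` means any later change to the rate or period shows up at build time rather than as a DMA overrun.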
Ideally, running a SINGLE I2S clock on the project will prevent buffer creep/clock skew on all but the USB endpoints, which will need realignment from time to time. (The present test setup loops the buffer in about 30 minutes with a 24.576MHz clock on a breadboard. Not ideal, as that includes about a minute where both DMAs are reading/writing the same frame!)
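To put a number on that buffer creep, the time it takes the read and write pointers to lap each other can be turned into a clock mismatch in parts per million. A rough sketch follows; the function names are hypothetical, and the ring length has to be supplied by you because the post does not state it.

```c
/* Clock mismatch in ppm, given that the two pointers drifted apart by
 * `drift_frames` stereo frames over `seconds` at a nominal `rate_hz`. */
double drift_ppm(double drift_frames, double rate_hz, double seconds)
{
    return drift_frames / (rate_hz * seconds) * 1e6;
}

/* The inverse: how long until the pointers drift by `drift_frames`
 * for a given mismatch in ppm. */
double seconds_to_drift(double drift_frames, double rate_hz, double ppm)
{
    return drift_frames / (rate_hz * ppm * 1e-6);
}
```

As an illustration only: a hypothetical 960-frame (20ms) ring that wraps in 30 minutes would imply `drift_ppm(960, 48000, 1800)`, roughly 11ppm between the two clocks.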
The current outline architecture really hinges on one question: do I want to continue to use an MCU as the USB endpoint? I have had far, far too many issues with the driver code surrounding USB Audio Class on STM32. I still haven't solved the fact that it will not respond to the incoming audio stream about 4 times out of 5. I have to continually reset the MCU to get it to pick up the stream. Obviously a driver timing issue there: it's missing an important packet, or it's receiving it when it's not ready for it and the PC does not send another. I expect forcing an endpoint reset on power up or on USB cable insertion may help sync them.
The other option is using a hardware IC like a PCM2706. That has the issue that "consumer audio" ICs tend not to like 24.576MHz clocks. They tend in fact to have a max I2S MCLK of 12MHz. So until I can get my hands on one (shipping and IC supply issues) and test a few prototypes out, I can't know if it will be worth seeking an alternative slower clock or continuing with the MCU approach. The ideal here would be for the PCM2706 to be running with an external I2S master clock, and for it to do the USB syncing or ask the host to respond to ITS clock. That would completely free me from doing any reclocking / reframing of audio streams - a huge bonus.
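For the MCU route, "asking the host to respond to ITS clock" is what an asynchronous USB audio endpoint does with an explicit feedback endpoint. On full-speed USB the feedback value is the number of samples per 1ms frame in 10.14 fixed point, sent as 3 bytes, least significant byte first. The sketch below shows only that arithmetic. The function names are hypothetical, it assumes you can count how many samples the I2S clock consumed over a known number of USB frames, and it says nothing about whether a PCM2706 can work this way.

```c
#include <stdint.h>

/* Full-speed feedback value in 10.14 fixed point:
 * (samples per channel consumed) / (USB 1 ms frames elapsed), scaled by 2^14. */
uint32_t uac_fs_feedback(uint32_t samples, uint32_t usb_frames)
{
    return (uint32_t)(((uint64_t)samples << 14) / usb_frames);
}

/* Pack the 24-bit value little-endian into the 3-byte feedback packet. */
void uac_fs_feedback_pack(uint32_t fb, uint8_t out[3])
{
    out[0] = (uint8_t)(fb & 0xFFu);
    out[1] = (uint8_t)((fb >> 8) & 0xFFu);
    out[2] = (uint8_t)((fb >> 16) & 0xFFu);
}
```

With a perfect 48kHz clock, 48000 samples over 1000 frames gives 0x0C0000 (48.0). A slightly fast clock produces a slightly larger value, and the host adjusts how many samples it sends per frame to match.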
The decision above knocks on to the internal bus architecture. I2S is great and all, but it's slow, as it forces you down to the sample frequency domain for transmission. I mean, that kinda is the point of it. However, the I2S ports are really just SPI ports with a few additional signals. For the internal bus I am free to use SPI at a MUCH faster rate than the I2S. A 48K@16bit stereo I2S stream is about 1.5Mbit/s (48,000 x 16 x 2 = 1.536Mbit/s). The native hardware SPI bus on even the small STM32F411 runs at 50Mbit/s (well, 48Mbit/s if you need USB 48MHz clocks too). A single 1ms frame of I2S audio can be transmitted and received in less than 250us on that bus.
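The bandwidth figures above are easy to check with a few lines of arithmetic. This is a sketch with hypothetical function names. It counts raw bits on the wire only and ignores chip select handling, DMA setup and any packet header you might add.

```c
#include <stdint.h>

/* Payload bit rate of a PCM stream, e.g. 48000 Hz x 16 bit x 2 channels. */
uint32_t stream_bits_per_second(uint32_t rate_hz, uint32_t bits, uint32_t channels)
{
    return rate_hz * bits * channels;
}

/* Raw time in nanoseconds to clock `bytes` over a serial bus
 * running at `bus_bits_per_second`. */
uint32_t wire_time_ns(uint32_t bytes, uint32_t bus_bits_per_second)
{
    return (uint32_t)(((uint64_t)bytes * 8u * 1000000000u) / bus_bits_per_second);
}
```

`stream_bits_per_second(48000, 16, 2)` works out to 1,536,000 bit/s. One 192-byte buffer takes 30.72us of wire time at 50Mbit/s and 32us at 48Mbit/s, which leaves a lot of headroom inside each 1ms period for several streams plus overhead.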
High level: multiple input endpoints can "dump" their audio packets (pre-gained) to one or two internal buses. These buses are mastered and processed by a more beefy MCU such as an STM32H7. Processing will include down-mixing (simply mixing two streams together) and parametric EQ per bus, not per channel, unless I have LOTS of free horsepower, which I doubt. I won't be writing perfectly optimised filter code and will probably be re-calculating biquad coefficients on the fly.
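The down-mix step is the one place where 16 bit samples need to be "cast" up: adding two int16 values can overflow, so the sum is done in 32 bits and clamped back. A minimal sketch, with hypothetical names, assuming both inputs are already gained and have the same length:

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate back into the 16-bit sample range. */
static inline int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Mix two pre-gained streams sample by sample.
 * n is the number of int16 samples (96 for one 1 ms stereo buffer). */
void mix_buffers(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = sat16((int32_t)a[i] + (int32_t)b[i]);
    }
}
```

Clamping avoids the nasty wrap-around crackle when both inputs are loud, although it still distorts, so leaving headroom in the input gains is the real fix. On Cortex-M parts, CMSIS-DSP's `arm_add_q15` does a saturating block add and may be worth comparing against this loop.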
Finally, of course, output routing. Each internal bus can be assigned to any output.... or more practically the buses will send their audio out regardless and each output endpoint is free to pick up that bus or not.
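The "each output picks up a bus or not" idea amounts to a small routing table consulted once per buffer period. A sketch under assumed sizes (2 buses, 4 outputs, 96 samples per buffer, taken from the figures in this post); `BUS_NONE` and `route_outputs` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_BUSES    2u
#define NUM_OUTPUTS  4u
#define BUF_SAMPLES  96u
#define BUS_NONE     0xFFu   /* output is not listening to any bus */

/* source_bus[o] names the bus that output o listens to.
 * buses   : NUM_BUSES   consecutive buffers of BUF_SAMPLES samples.
 * outputs : NUM_OUTPUTS consecutive buffers of BUF_SAMPLES samples. */
void route_outputs(const uint8_t source_bus[NUM_OUTPUTS],
                   const int16_t *buses, int16_t *outputs)
{
    for (size_t o = 0; o < NUM_OUTPUTS; o++) {
        int16_t *dst = outputs + o * BUF_SAMPLES;
        if (source_bus[o] < NUM_BUSES) {
            memcpy(dst, buses + source_bus[o] * BUF_SAMPLES,
                   BUF_SAMPLES * sizeof(int16_t));
        } else {
            /* Unrouted outputs play silence. */
            memset(dst, 0, BUF_SAMPLES * sizeof(int16_t));
        }
    }
}
```

In the real design each output endpoint would make this choice itself as the bus data goes past, but the table is the same either way.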
4 x Input - i2s -> 2x Bus mixer STM32F411 - 50Mbit SPI -> 2x Bus processors STM32H7 -> 4 x i2s output.
If the PCM2706 avenue hits a dead end, I can drop the I2S bus and just use the SPI bus there. The exception would be the aux analogue in, which will have to have its I2S stream rebussed to SPI.
I already have breadboard prototypes of mixing an ADC stream with a USB stream, including buffer alignments. I have a list of prototypes to put together and test. Each sub component or bus topology has a set of prototypes to test it works as expected or not as the case might be. Each helps me decide which path to take.
The hardest part stems from my lack of formal mathematics training/study. I just don't speak maths. That creates an issue when you come to researching EQs and filters, where the main populace in that field are electrical engineers who insist on calculating everything from scratch every time and forcing you to sit through them explaining it in derivatives each time. I can find open source libraries I can pilfer/borrow, it's just a matter of finding the easiest to port to ARM DSP biquads.