Can low jitter be achieved with STM32 microcontroller

JMF11 · 2016-11-06 9:59 am

Update:

Versions running on STM32 (in particular the STM32F4 Discovery) are available on GitHub:
- Async USB with feedback, 16-bit/48 kHz. This is not perfect, as the feedback sending is still a bit strange, but I can feed my STM32 without buffer underruns or overruns,
- DSP section that can split the incoming stereo stream into several streams and apply biquads and a scaling factor to each. I could drive one of my LXmini with this for real (only one, as the Disco board has only one DAC),
- I2S output
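
For readers unfamiliar with the feedback endpoint mentioned in the first item: in USB Audio Class 1 at full speed, the device tells the host how many samples it wants per 1 ms frame, as a 10.14 fixed-point number sent in 3 bytes. Below is a minimal sketch of that encoding in portable C. The function names and the buffer-fill correction (1 Hz per sample of error) are illustrative assumptions, not code taken from the repository.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a sample rate in Hz as a full-speed feedback value:
 * samples per 1 ms frame, in unsigned 10.14 fixed point. */
uint32_t feedback_10_14(uint32_t rate_hz)
{
    assert(rate_hz < 1024000u);  /* integer part must fit in 10 bits */
    /* (rate / 1000) * 2^14, done in 64 bits to keep the fraction */
    return (uint32_t)((((uint64_t)rate_hz << 14) + 500u) / 1000u);
}

/* Split the 24-bit value into the 3 bytes sent on the wire, LSB first. */
void feedback_to_bytes(uint32_t fb, uint8_t out[3])
{
    out[0] = (uint8_t)(fb & 0xFFu);
    out[1] = (uint8_t)((fb >> 8) & 0xFFu);
    out[2] = (uint8_t)((fb >> 16) & 0xFFu);
}

/* Hypothetical correction: request fewer samples when the ring buffer is
 * fuller than the target level, more when it is emptier.
 * The gain of 1 Hz per sample of error is an arbitrary illustrative value. */
uint32_t feedback_for_fill(uint32_t nominal_hz, int32_t fill, int32_t target)
{
    int32_t rate = (int32_t)nominal_hz + (target - fill);
    return feedback_10_14((uint32_t)rate);
}
```

At 48 kHz the nominal value is 48 samples per frame, i.e. 48 << 14 = 0x0C0000, sent as the bytes 0x00, 0x00, 0x0C.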

I have also adapted my software to the Nucleo F746ZG to benefit from:
- the 2x SAI (potential for 4 stereo digital out = 8 speakers)
- the increased CPU power and DSP ability.

Software adaptation for the common part was easy thanks to the HAL libraries. This demonstrates the ability of the platform and software to scale up with new STM32 devices, which is promising.

I am now trying to use the SPDIF output. I get sound, but at too low a sample rate (my amp detects 24 kHz where I expect 48 kHz). I could connect the Nucleo board directly to my FX-Audio D802 without specific hardware, and this seems to work fine.

I must admit that I am starting to run out of steam... Getting this far took a lot of effort, learning and time, much more than expected. I had underestimated the learning curve for such a development. I had also hoped that this would raise the interest of other members of the community and gain some momentum.

If interested, please jump in! I'm still convinced that the product has the potential and the features needed for multichannel applications, with a lot of flexibility and at low cost.

Best regards,

JM

- I have found my bug described above, but it took some time... frustrating

mhelin · 2016-11-06 12:53 pm

In some cases (e.g. with Apple or Linux, or some other USB hosts, maybe Windows 10) you can drop the feedback endpoint and instead introduce an audio input endpoint - feeding empty packets if you have no ADC - which the driver stack (in USB 2.0 audio class drivers) can then use as an implicit feedback channel.

JMF11 · 2016-11-06 5:56 pm

The feedback mechanism seems pretty efficient, and not so difficult to use. It is just a specificity of the STM32 that makes it a bit more difficult: you have to manage the odd/even frame parity bit without knowing when the host will poll the feedback data, and the transfer can block if the parity is wrong.

JMF

JMF11 · 2016-11-08 9:33 pm

Cool, tonight, I could have sound on the SPDIF output of the SAI.

Still some tuning to do to get enough sync between the 2 time domains, and then I should have the basis for the final step.

JMF

JMF11 · 2016-11-11 2:50 pm

Works! Stm32 USB Async+DSP+4xdigital channels SPDIF

Nice,

I could solve some of the last issues and now I have a fully working chain on STM32:
- input from asynchronous USB, which allows relying on the Nucleo board clock (only one time domain),
- DSP processing to apply filtering and EQ with biquads and scaling,
- output to 2 digital SPDIF lines (4 channels; could be I2S),
- to drive 2 FX-Audio D802 full digital amps (STA326).
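
As an illustration of the "biquads and scaling" step, here is a minimal portable sketch of one second-order section in Direct Form I followed by a gain. It only shows the underlying computation. The type and function names are made up for this example, and the repository itself may well rely on library routines instead.

```c
#include <assert.h>
#include <stddef.h>

/* One second-order section, coefficients normalised so that a0 = 1:
 * y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2] */
typedef struct {
    float b0, b1, b2, a1, a2;  /* coefficients */
    float x1, x2, y1, y2;      /* state: previous inputs and outputs */
} biquad_t;

/* Process one sample through the section and update its state. */
float biquad_step(biquad_t *q, float x)
{
    assert(q != NULL);
    float y = q->b0 * x + q->b1 * q->x1 + q->b2 * q->x2
            - q->a1 * q->y1 - q->a2 * q->y2;
    q->x2 = q->x1;
    q->x1 = x;
    q->y2 = q->y1;
    q->y1 = y;
    return y;
}

/* Filter a block in place, then apply a gain (the "scaling factor"). */
void biquad_block(biquad_t *q, float *buf, size_t n, float gain)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++)
        buf[i] = gain * biquad_step(q, buf[i]);
}
```

For a two-way crossover, each output channel would get its own `biquad_t` instances (one per section, per channel) fed from the same input samples. Note that the CMSIS-DSP biquad functions use the opposite sign convention for a1 and a2, so coefficients need to be negated when moving between the two.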

The code can certainly be improved, but it is available at https://github.com/jmf13/F7USBAudio
Related information wiki (to be completed): https://github.com/jmf13/Const_DSP_I2S_DAC/wiki

Now it is time for listening tests.

JMF

JMF11 · 2016-11-20 7:46 pm

In fact, it is "almost" there... I have some audible glitches. I recorded them but can't find the source. It seems to be a weird bug somewhere. I have worked 3 days on the topic without finding the problem.

What a pity to work alone on that... Some different perspective would for sure help.

Anybody willing to jump in?

JMF11

steph_tsf · 2016-11-21 1:28 am

JMF11 said:
What a pity to work alone on that... Some different perspective would for sure help. Anybody willing to jump in?

I'd like to jump in, but currently I don't know which Development Toolchain is best suited to such a realtime digital audio processing project, which requires fine control of the STM32F3, STM32F4, STM32F7, STM32H7 hardware resources.

For instance, Atollic TrueStudio Pro 7.0.0 ($59 per month licence or $2,795 perpetual licence) supports the " Real-time data access tracing", the "Exception and Interrupt tracing" and the "Instruction tracing".

Unfortunately, the Atollic TrueStudio Free 7.0.0 ($0 licence) supports none of this.

What about the STM Nucleo-64 and STM Nucleo-144 boards?
Are they all supported by Atollic?
Why is the STM32F746ZG Nucleo not yet supported?
Does the built-in STM Nucleo ST-LINK/V2-1 debugger - programmer with SWD connector, support the " Real-time data access tracing", the "Exception and Interrupt tracing" and the "Instruction tracing"?

What about developing a few "capes" or "shields" hosting the required digital audio peripherals (I2S-in, SPDIF-in, I2S-out, SPDIF-out) and some simple user interface (rotary encoder, buttons, LEDs, alphanumeric display)? How to support such "capes" or "shields"? Is it required to create drivers?

What about developing a few all-in-one boards hosting a STM32F3, STM32F4, STM32F7, or STM32H7 plus the required digital audio peripherals (I2S-in, SPDIF-in, I2S-out, SPDIF-out) and some simple user interface (rotary encoder, buttons, LEDs, alphanumeric display)? How to support such all-in-one boards? Is it required to create BSPs (Board Support Package)?

In case the built-in STM Nucleo ST-LINK/V2-1 debugger - programmer with SWD connector, doesn't support the " Real-time data access tracing", the "Exception and Interrupt tracing" and the "Instruction tracing", what decent hardware probe is required for supporting most of the required features? Here is a list of some, however I don't know if they are suited :

ST-LINK V2 by STM costing $25
ST-LINK V2 ISOL by STM costing $85
CoLinkEx by CooCox costing $30
ULINK2 by ARMKeil (eBay) costing $20
ULINKpro D by ARMKeil (eBay) costing $825
ULINKpro by ARMKeil (eBay) costing $1,500
I-jet Trace for ARM Cortex-M by IAR costing $479
J-Link Base by Segger costing $298
J-Link Plus by Segger costing $598
J-Trace for Cortex-M by Segger costing $1,248

So, in a nutshell, what Development Toolchain is best suited?

Once a decision is made, is it still feasible to base on the project GitHub, and possibly, contribute to the project GitHub?

Realtime Audio DSP projects built from scratch that I managed to complete in the mid nineties in less than one month of work, using the DSP56002EVM board with Motorola’s DSP56000 cross assembler and Domain Technologies debug software, and a 2-channel 20 MHz oscilloscope for viewing GPIO toggles, appear to require many months of work when based on today's ARM Cortex-M4 chips and today's development toolchains.

Possibly, nowadays there are different hardware, different compilers and different debug software, best suited to Realtime Audio DSP projects. I'm thinking about PIC32MZ and XMOS, as possible entry points to be evaluated.

Regarding USB Audio, I think it is best to rely on ready-made converters like:
- miniDSP miniStreamer : stereo USB 2.0 High speed and Audio class 2.0 <-> 1 x stereo I2S 192 kHz costing $35
- miniDSP USBStreamer : 8-channel USB 2.0 High speed and Audio class 2.0 <-> 4 x stereo I2S 192 kHz costing $95
Worth noting, they are both based on XMOS hardware.

Any practical advice welcome.

Cheers,
Steph

JMF11 · 2016-11-21 10:17 am

steph_tsf said:
I'd like to jump in, but currently I don't know which Development Toolchain is best suited to such a realtime digital audio processing project, which requires fine control of the STM32F3, STM32F4, STM32F7, STM32H7 hardware resources.

For instance, Atollic TrueStudio Pro 7.0.0 ($59 per month licence or $2,795 perpetual licence) supports the " Real-time data access tracing", the "Exception and Interrupt tracing" and the "Instruction tracing".

Unfortunately, the Atollic TrueStudio Free 7.0.0 ($0 licence) supports none of this.

What about the STM Nucleo-64 and STM Nucleo-144 boards?
Are they all supported by Atollic?
Why is the STM32F746ZG Nucleo not yet supported?
Does the built-in STM Nucleo ST-LINK/V2-1 debugger - programmer with SWD connector, support the " Real-time data access tracing", the "Exception and Interrupt tracing" and the "Instruction tracing"?

Hi Steph, nice to see you onboard again. I currently work with the free ST toolchain: SW4STM32 - System Workbench for STM32

It does the job and supports all processors and boards. It interfaces with CubeMX, which you know well. Maybe it lacks some advanced features, but it lets you go quite far. I believe that the dedicated (free) tool StmStudio from ST allows " Real-time data access tracing".

I don't know about "Exception and Interrupt tracing" and "Instruction tracing", which look like very advanced features.

All in all, I believe that what is provided by ST is pretty decent for serious development. I would not hold back from starting with the STM32 boards because of the choice of toolchain.

steph_tsf said:
What about developing a few "capes" or "shields" hosting the required digital audio peripherals (I2S-in, SPDIF-in, I2S-out, SPDIF-out) and some simple user interface (rotary encoder, buttons, LEDs, alphanumeric display)? How to support such "capes" or "shields"? Is it required to create drivers?

I2S-in, SPDIF-in, I2S-out, SPDIF-out, buttons, LEDs: this is already available on Nucleo or Discovery boards. It seems that the Nucleo boards can even accept some Arduino capes on their connectors!

For the HMI, Discovery boards with an included screen offer a lot of potential.

The elements that I identified for an audio cape would be an external clock and connectors for the digital audio signals.

steph_tsf said:
What about developing a few all-in-one boards hosting a STM32F3, STM32F4, STM32F7, or STM32H7 plus the required digital audio peripherals (I2S-in, SPDIF-in, I2S-out, SPDIF-out) and some simple user interface (rotary encoder, buttons, LEDs, alphanumeric display)? How to support such all-in-one boards? Is it required to create BSPs (Board Support Package)?

This is certainly very interesting, but beyond what I know how to do.

steph_tsf said:
In case the built-in STM Nucleo ST-LINK/V2-1 debugger - programmer with SWD connector, doesn't support the " Real-time data access tracing", the "Exception and Interrupt tracing" and the "Instruction tracing", what decent hardware probe is required for supporting most of the required features? Here is a list of some, however I don't know if they are suited :

ST-LINK V2 by STM costing $25
ST-LINK V2 ISOL by STM costing $85
CoLinkEx by CooCox costing $30
ULINK2 by ARMKeil (eBay) costing $20
ULINKpro D by ARMKeil (eBay) costing $825
ULINKpro by ARMKeil (eBay) costing $1,500
I-jet Trace for ARM Cortex-M by IAR costing $479
J-Link Base by Segger costing $298
J-Link Plus by Segger costing $598
J-Trace for Cortex-M by Segger costing $1,248

So, in a nutshell, what Development Toolchain is best suited?

Once a decision is made, is it still feasible to base on the project GitHub, and possibly, contribute to the project GitHub?
As discussed above, for me the standard free toolchain is good enough for the majority of serious hobbyists. Personally, such tools seem to me beyond the means of serious hobbyists (who don't earn a living with those things). Can't those advanced features be compensated for by an oscilloscope for viewing GPIO toggles, or a simple logic analyser?

steph_tsf said:
Realtime Audio DSP projects built from scratch that I managed to complete in the mid nineties in less than one month of work, using the DSP56002EVM board with Motorola’s DSP56000 cross assembler and Domain Technologies debug software, and a 2-channel 20 MHz oscilloscope for viewing GPIO toggles, appear to require many months of work when based on today's ARM Cortex-M4 chips and today's development toolchains.

For sure, on my side I was slowed down mainly by my lack of knowledge and very rusty programming skills.

steph_tsf said:
Possibly, nowadays there are different hardware, different compilers and different debug software, best suited to Realtime Audio DSP projects. I'm thinking about PIC32MZ and XMOS, as possible entry points to be evaluated.

I had looked at that but had not identified any affordable development board and toolchain relying on those components and suitable for multi-channel digital outputs. Any suggestion is welcome. It is another story for a professional product developed free of hobby constraints, in a lab, with dedicated board manufacturing.

steph_tsf said:
Regarding USB Audio, I think it is best to rely on ready-made converters like:
- miniDSP miniStreamer : stereo USB 2.0 High speed and Audio class 2.0 <-> 1 x stereo I2S 192 kHz costing $35
- miniDSP USBStreamer : 8-channel USB 2.0 High speed and Audio class 2.0 <-> 4 x stereo I2S 192 kHz costing $95
Worth noting, they are both based on XMOS hardware.

Good products. However, I like the potential of having this + DSP + digital out integrated on a single 20€ generic board.

Best regards,

JMF

JMF11 · 2016-11-22 6:49 pm

Hi,
Good news: my glitch issue is now solved thanks to the help of hamster_nz in the thread "stm32 SAI / SPDIF_TX issue".

By mistake, I was writing random values into the SPDIF Channel Status bits and toggling the "pre-emphasis" bit, which made the de-emphasis filter switch on and off.
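
To show how this kind of bug can arise (one plausible mechanism, not necessarily the one in my code): in SPDIF mode each 32-bit word written to the transmitter carries the audio sample in its low 24 bits, with the Validity, User and Channel Status flags just above. The sketch below assumes V, U and CS sit at bits 24, 25 and 26, which should be checked against the reference manual, and uses the consumer channel status layout where bit 3 of byte 0 flags 50/15 µs pre-emphasis. All names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPDIF_V_BIT   (1u << 24)  /* validity       (assumed bit position) */
#define SPDIF_U_BIT   (1u << 25)  /* user data      (assumed bit position) */
#define SPDIF_CS_BIT  (1u << 26)  /* channel status (assumed bit position) */

/* Consumer channel status, byte 0, bit 3: pre-emphasis flag. */
#define CS_PREEMPHASIS_BYTE0  (1u << 3)

/* Return channel status bit number `frame` of a 192-bit (24-byte) block.
 * The block repeats every 192 frames; bits are taken LSB first. */
uint32_t cs_bit(const uint8_t status[24], unsigned frame)
{
    assert(status != NULL);
    unsigned n = frame % 192u;
    return (uint32_t)(status[n / 8u] >> (n % 8u)) & 1u;
}

/* Build one transmitter word from a signed 16-bit sample.
 * The mask matters: without it, sign extension of a negative sample
 * sets bits 24..31 and therefore corrupts V, U and CS. */
uint32_t spdif_word(int16_t sample, const uint8_t status[24], unsigned frame)
{
    uint32_t w = ((uint32_t)(int32_t)sample << 8) & 0x00FFFFFFu;
    if (cs_bit(status, frame))
        w |= SPDIF_CS_BIT;
    return w;
}
```

With an all-zero status block no flag bit is ever set, whatever the sample value, so a receiver has no reason to switch its de-emphasis filter.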

So now the chain USB Async => DSP => 4 digital channels reproduces the music perfectly.

Really happy with the result!

JMF

JMF11 · 2016-11-25 9:47 pm

Listening tests

After some days of listening, I'm really happy. This works really, really well !

The STM32 does the job perfectly and the SPDIF output works great!

Very happy.

JMF

Sh0velman · 2016-11-25 10:20 pm

Great!

Have you played with FIR filtering yet? That's really the question for me when it comes to an STM32 based solution.

JMF11 · 2016-11-26 6:29 pm

The first filter I implemented was a FIR filter, starting from the example developed in "Realtime-Audio-DSP Tutorial with the ARM STM32F4-Discovery Board" on Christoph's Homepage.

As you can see there, FIR filters rely on structures and function calls similar to those of IIR filters, and the functions already exist in the STM32 libraries.

I believe that the template I built is straightforward to adapt to FIR filters. And the STM32F7 chips deliver significant power.
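
To give an idea of what a FIR filter costs per sample, here is a minimal portable sketch of a direct-form FIR with a circular history buffer. It is an illustration only: the names are made up for this example, and an optimized library routine would normally be used on the target.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const float *taps;  /* h[0..ntaps-1] */
    float *hist;        /* circular history, ntaps entries, zero-initialised */
    size_t ntaps;
    size_t pos;         /* index where the next input sample is written */
} fir_t;

/* Process one sample: y[n] = sum over k of h[k] * x[n-k]. */
float fir_step(fir_t *f, float x)
{
    assert(f != NULL && f->ntaps > 0);
    f->hist[f->pos] = x;
    float acc = 0.0f;
    size_t idx = f->pos;
    for (size_t k = 0; k < f->ntaps; k++) {
        acc += f->taps[k] * f->hist[idx];           /* h[k] * x[n-k] */
        idx = (idx == 0) ? f->ntaps - 1 : idx - 1;  /* one step back in time */
    }
    f->pos = (f->pos + 1) % f->ntaps;
    return acc;
}
```

The inner loop runs `ntaps` multiply-accumulates per output sample and per channel. As a back-of-the-envelope figure, 4 channels of 1024 taps at 48 kHz would need 48000 x 1024 x 4, about 197 million multiply-accumulates per second, which shows why the tap count matters far more for FIR than the section count does for biquads.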

However, I don't plan FIR filters in my application, so I don't plan practical tests.

JMF

Sh0velman · 2016-11-26 6:33 pm

Right, I'm aware they are capable of it, but the real world practical performance of FIR filtering on this platform is a total unknown at this point.

Well if you get the opportunity to do any benchmarking to that effect, please do post your results.

Also, if it isn't too much of an inconvenience, do you think you could diagram the total solution on the hardware and software side and post it? I'd be very interested to see the entire thing end to end to put your results into context and get inspiration for my own potential project (I have an STM32F767ZI NUCLEO board waiting for some attention.)

JMF11 · 2016-11-26 9:30 pm

There are comparisons made by DSPconcept (Audioweaver) between different MCU/DSP and that include the Cortex M7: https://www.dspconcepts.com/sites/default/files/pd8_beckmann.pdf (and see capture below).

About the diagram of my system:

Hardware part:
PC/USB => USB/Nucleo STM32F746ZG using 1xSAI peripheral (of 2 available)/2xSPDIF protocol => 2x(SPDIF/FX-Audio D802) => 2xLXmini

One amp drives one speaker: one channel for the woofer and one for the tweeter.

So from a hardware perspective: 1xPC, 1xNucleo, 2xamplifiers.

The FX-Audio D802 is a "full digital amp" based on the STA326. The chip has I2S input. The amp has USB and SPDIF inputs.

Software part:
PC: Foobar outputs the music to the STM32, which appears as a USB Audio Class 1 device. It could be MPD on Linux. Nothing special here, as the STM32 is automatically recognized as an audio device. No driver is needed.

Stm32:
- programmed "bare metal" (no OS), relying on the ST Hardware Abstraction Layer (HAL) for portability.
- USB fills a ring buffer.
- the main loop (low priority) prepares output buffers by pulling data from the ring buffer (filled by USB) and applying the DSP functions for crossover and EQ, going in my case from one stereo stream to two, to drive the 4 channels.
- DMA streams feed the SAI with the output buffers in a ping-pong arrangement.
- To simplify a bit: one buffer is handed to the DMA to play, and as soon as it has been handed over, the main task prepares the next buffer by pulling data from the ring buffer (filled by USB) and applying the DSP functions for crossover and EQ.
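
The buffering scheme described in this list can be sketched in portable C as follows. This is a simplified illustration and not the code from the repository: buffer sizes, names and the callback hook are assumptions, the DSP stage is left out, and samples are treated as a flat mono stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 1024u  /* samples; power of two (illustrative value) */
#define HALF_LEN  64u    /* samples per ping-pong half (illustrative value) */

typedef struct {
    int16_t data[RING_SIZE];
    volatile uint32_t head;  /* total samples written (USB side) */
    volatile uint32_t tail;  /* total samples read (main loop side) */
} ring_t;

uint32_t ring_fill(const ring_t *r) { return r->head - r->tail; }

/* USB receive path: store up to n samples; a short count means overrun. */
size_t ring_write(ring_t *r, const int16_t *src, size_t n)
{
    size_t done = 0;
    while (done < n && ring_fill(r) < RING_SIZE) {
        r->data[r->head % RING_SIZE] = src[done++];
        r->head = r->head + 1u;
    }
    return done;
}

/* Main loop: fetch up to n samples; a short count means underrun. */
size_t ring_read(ring_t *r, int16_t *dst, size_t n)
{
    size_t done = 0;
    while (done < n && ring_fill(r) > 0u) {
        dst[done++] = r->data[r->tail % RING_SIZE];
        r->tail = r->tail + 1u;
    }
    return done;
}

/* Ping-pong output: the DMA plays one half while the other is refilled. */
typedef struct {
    int16_t buf[2][HALF_LEN];
    volatile int fill_half;  /* 0 or 1 = half to refill, -1 = nothing to do */
} pingpong_t;

/* Hypothetical hook, meant to be called from the DMA half-transfer and
 * transfer-complete callbacks with the half that has just been played. */
void pingpong_on_dma_done(pingpong_t *p, int half_just_played)
{
    assert(half_just_played == 0 || half_just_played == 1);
    p->fill_half = half_just_played;
}

/* Main loop: refill the free half, padding with silence on underrun.
 * The DSP (crossover, EQ) would be applied here before the DMA replays it. */
int pingpong_service(pingpong_t *p, ring_t *r)
{
    int h = p->fill_half;
    if (h < 0)
        return 0;
    size_t got = ring_read(r, p->buf[h], HALF_LEN);
    for (size_t i = got; i < HALF_LEN; i++)
        p->buf[h][i] = 0;
    p->fill_half = -1;
    return 1;
}
```

Using free-running `head` and `tail` counters with a power-of-two size means the writer only touches `head` and the reader only touches `tail`, which keeps the interrupt side and the main loop from modifying the same variable.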

My application handles 16-bit/48 kHz audio, but higher sampling rates or data widths are possible.

I use SPDIF protocol because of my amps inputs, but I2S output is easy by just modifying the SAI parameters.

Does this help? Do any points need clarification?

JMF

Sh0velman · 2016-11-26 9:37 pm

That does help, and I'm familiar with those benchmarks, but that is utilizing the Audio Weaver software, so there may be more/less overhead and those are still theoretical values derived from synthetic benchmarks.

I suppose I might need to just try it myself, haha.

Do you have a git or other public SVN you are willing to share your source from?

Thanks again for the info.

JMF11 · 2016-11-26 9:50 pm

All the code is there: https://github.com/jmf13/F7USBAudio

For sure the code can be improved and cleaned, which I will do progressively. But it is good enough to listen to music.

I have been using OpenStm32, the free software environment provided by ST and based on Eclipse.

Normally, to play with FIR filters, you should only need to play with DSP.c and dsp.h

Don't hesitate to ask if you need clarifications on the code.

JMF

Sh0velman · 2016-11-27 3:31 pm

Do you know of a reliable way to measure the load on the MCU?

I suppose you could just add filters/channels until you start getting output buffer underruns, but it would be preferable to get some kind of % busy stat...

JMF11 · 2016-11-27 9:32 pm

I plan to do it the following way in the coming weeks, if I manage to implement it.

I need to toggle a GPIO at the beginning and end of the DMA transfer, and another one at the beginning and end of the buffer preparation.

Capturing the GPIO states with a logic analyser will allow comparing the respective times, and their ratio will indicate the load.

Another way would be to rely on timers, or even on the SysTick, which may be sufficient for a first assessment. Maybe I will give this one a try first.
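
The timer variant boils down to simple arithmetic, sketched below in portable C. The function names are illustrative, and the figures in the usage note (216 MHz core clock, 64-frame buffers) are assumptions for the example, not measurements. On Cortex-M4/M7 parts the DWT cycle counter is a common source for the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles available while the DMA plays one buffer of `frames` frames. */
uint32_t cycles_per_buffer(uint32_t cpu_hz, uint32_t sample_rate_hz,
                           uint32_t frames)
{
    assert(sample_rate_hz != 0u);
    return (uint32_t)(((uint64_t)cpu_hz * frames) / sample_rate_hz);
}

/* Load in tenths of a percent, from two readings of a free-running 32-bit
 * counter taken at the start and at the end of the buffer preparation.
 * Unsigned subtraction stays correct across one counter wrap. */
uint32_t load_permille(uint32_t t_start, uint32_t t_end,
                       uint32_t budget_cycles)
{
    assert(budget_cycles != 0u);
    uint32_t busy = t_end - t_start;
    return (uint32_t)(((uint64_t)busy * 1000u) / budget_cycles);
}
```

For example, at 216 MHz and 48 kHz a 64-frame buffer leaves 288000 cycles. If preparing the buffer takes 72000 cycles, `load_permille` returns 250, i.e. 25 % load.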

The professional way seems to be called "profiling", but I haven't found a how-to for doing it with the framework provided by ST.

JMF

JMF11 · 2017-01-15 7:48 am

Hi,

A small update to say that I have added a delay line function to the set-up to complement the already available functions. So now the framework is complete: Async USB, biquad filtering and EQ, delay lines for time alignment, digital output.

I now have to push it to GitHub.

In fact I ported the fractional delay code from Charlie Laub (mTap) to be more precise than whole-sample delays, but it does not fit on the Nucleo board from a CPU load perspective. However, my code is not optimized at all.
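
For readers who want to see what a fractional delay looks like in its simplest form, here is a portable sketch using linear interpolation between two neighbouring samples of a circular buffer. Linear interpolation is chosen here only because it is short to write down. It is not the mTap algorithm mentioned above, and the names and buffer length are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define DLY_LEN 256u  /* delay-line length in samples (illustrative value) */

typedef struct {
    float buf[DLY_LEN];  /* must start zeroed */
    size_t wr;           /* next write index */
} delay_t;

/* Push one sample and return the input delayed by `delay` samples
 * (0 <= delay <= DLY_LEN - 2), linearly interpolating between the two
 * stored samples on either side of the requested position. */
float delay_step(delay_t *d, float x, float delay)
{
    assert(delay >= 0.0f && delay <= (float)(DLY_LEN - 2u));
    d->buf[d->wr] = x;
    size_t whole = (size_t)delay;       /* integer part */
    float frac = delay - (float)whole;  /* fractional part, 0 <= frac < 1 */
    size_t i0 = (d->wr + DLY_LEN - whole) % DLY_LEN;  /* x[n - whole]     */
    size_t i1 = (i0 + DLY_LEN - 1u) % DLY_LEN;        /* x[n - whole - 1] */
    float y = (1.0f - frac) * d->buf[i0] + frac * d->buf[i1];
    d->wr = (d->wr + 1u) % DLY_LEN;
    return y;
}
```

With `frac` equal to zero this reduces to a plain integer delay. Linear interpolation acts as a mild low-pass filter at high frequencies, which is one reason more elaborate interpolators exist.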

I listen daily to my LXmini with this setup, and the results are very, very, very good! And I am very happy!

Best regards,

JMF

steph_tsf · 2017-01-16 6:05 pm

JMF11 said:
In fact I ported the fractional delay code from Charlie Laub (mTap) to be more precise than samples delays, but it does not fit from CPU load perspective in the Nucleo board.

Come on, a one-tap delay at the 44100 Hz sampling frequency corresponds to 340 m/s (the speed of sound) divided by 44100 = 7.7 millimeters. You know you can implement multiples of this 7.7 millimeter delay using a plain circular buffer. Such an integer delay function appears sufficient to me.
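
The arithmetic above can be written as two small helper functions, sketched here in portable C with integer math only. The function names are illustrative, and 340 m/s is simply the value used in this post.

```c
#include <assert.h>
#include <stdint.h>

#define SPEED_OF_SOUND_MM_PER_S 340000u  /* 340 m/s, as used in the post */

/* Acoustic path length, in micrometres, covered during n samples. */
uint32_t delay_um(uint32_t n_samples, uint32_t fs_hz)
{
    assert(fs_hz != 0u);
    return (uint32_t)(((uint64_t)n_samples * SPEED_OF_SOUND_MM_PER_S * 1000u)
                      / fs_hz);
}

/* Nearest whole number of samples for a path length in micrometres. */
uint32_t samples_for_um(uint32_t um, uint32_t fs_hz)
{
    uint64_t num = (uint64_t)um * fs_hz;
    uint64_t den = (uint64_t)SPEED_OF_SOUND_MM_PER_S * 1000u;
    return (uint32_t)((num + den / 2u) / den);  /* round to nearest */
}
```

At 44100 Hz one sample is about 7.7 mm, so 9, 10 and 11 samples give roughly 69.4 mm, 77.1 mm and 84.8 mm. Rounding to the nearest sample leaves a worst-case error of half a step, a little under 4 mm.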

What is your motivation for implementing a fractional delay function? I guess you realize that any fractional delay function is bound to introduce some kind of approximation, as it is equivalent to oversampling followed by subsampling. Do you think your remarkable STM32F4 audio system will be viewed as incomplete if it only supports an integer delay, forcing you to choose between a 69.3 mm delay, a 77.0 mm delay, or an 84.7 mm delay (as an example)? Multiway array speakers, possibly? Please tell us more.

As for me, personal opinion only, I would prefer that you upload your courageous STM32F4 application to GitHub, as hardware that is able to emulate:
1) a 24-bit 96 kHz stereo DAC operating as "master" (it outputs the LRCK signal and BITCLK signal to the audio source),
2) a 24-bit 96 kHz stereo DAC operating as "slave" (it inputs the LRCK signal and BITCLK signal from the audio source),
3) a 24-bit 96 kHz SPDIF receiver, and
4) a 12 Mbit/s (full-speed) USB 24-bit 96 kHz "pseudo-async" soundcard.

Where are you at the moment ?
1) nowhere
2) nowhere
3) nowhere
4) restricted to 16-bit 44.1 or 48 kHz ? or better ?

Have you got the required answers for connecting an Fs * 256 quartz oscillator to the STM32F4 I2S_CLOCK input? Are you okay with CubeMX for configuring the STM32F4 audio system and clocks?

Regards,
Steph
