Nick's audio test system (AK5572 ADC 192kHz 32bit stereo balanced input)

IMO since the EP IN is async, the actual ratio of packet sizes is determined by the ADC clock and will differ from the nominal calculation.

A bit off-topic: the MS docs say this about the UAC2 driver:

The size of isochronous packets created by the device must be within the limits specified in FMT-2.0 section 2.3.1.1. This means that the deviation of actual packet size from nominal size must not exceed +/- one audio slot (audio slot = channel count samples).

IIUC the nominal size for 88k2 at bInterval=1 is 11 or 12. Would that mean the allowed actual packet size is from 10 to 13?

Also, how does the maximum allowed packet size variation relate to the allowed explicit feedback values, whose range is quite large for the MS UAC2 driver? What if the device requests e.g. 95% of the nominal rate, which may require fewer than "nominal - 1" slots in the packets?

I would say the Linux driver does not care about the incoming packet size, see https://elixir.bootlin.com/linux/latest/source/sound/usb/pcm.c#L1180 . In the end the sample rate is just a number; whatever samples come in will be used.
 
As I understand it the maximum packet size variation of +/- one audio slot means that e.g. for 48k the nominal is 6, min is 5 and max is 7. For 88k2 the nominal is 11.025 so min is 11 and max is 12.
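To make the arithmetic in this reading concrete, here is a minimal Python sketch. It assumes a high-speed endpoint (one packet every 2^(bInterval-1) microframes of 125 us) and takes "within one audio slot of nominal" literally, i.e. every whole slot count between nominal - 1 and nominal + 1. The function names are made up for illustration, and this is only one interpretation of the FMT-2.0 wording, not a statement of what any driver actually enforces.

```python
from fractions import Fraction
from math import ceil, floor

def nominal_slots(rate_hz, b_interval=1):
    """Nominal audio slots per packet on a high-speed endpoint.

    A packet is sent every 2**(b_interval - 1) microframes, and there
    are 8000 microframes per second.
    """
    return Fraction(rate_hz * 2 ** (b_interval - 1), 8000)

def slot_limits(rate_hz, b_interval=1):
    """Whole slot counts lying within +/- one slot of the nominal size."""
    n = nominal_slots(rate_hz, b_interval)
    return ceil(n - 1), floor(n + 1)

if __name__ == "__main__":
    for rate in (44100, 48000, 88200):
        print(rate, float(nominal_slots(rate)), slot_limits(rate))
```

With bInterval=1 this reproduces the numbers above: 5 to 7 for 48k and 11 to 12 for 88k2.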

Regarding input I2S, the impact of USB packet size is most apparent at 44k1, where a microframe should contain either 5 or 6 samples. So if the device does not vary the USB packet size, the actual data rate seen by the host is either 40k or 48k. Whether or not that matters of course depends on the host application.
 
As I understand it the maximum packet size variation of +/- one audio slot means that e.g. for 48k the nominal is 6, min is 5 and max is 7. For 88k2 the nominal is 11.025 so min is 11 and max is 12.
This seems quite little for the async range to me.
Regarding input I2S the impact of USB packet size is most apparent at 44k1 where microframe should contain either 5 or 6 samples. So if device does not vary the USB packet size the actual data rate seen by host is either 40k or 48k. Whether or not that matters is of course depending on the host application.
IMO every async EP IN varies the packet size, unless its input sample clock runs synchronously with the USB clock. Whatever gets out from the ADC/SPDIF receiver in between the USB frames is sent in the next packet (or next + 1 if the timing does not make it to next)

I will test both capture and playback ranges for win and linux the next time I have my usb gadget running. It's trivial to change the async rate in the gadget, both directions have a separate control for their actual rate pitch (IN pitch controls the actual number of samples in the IN packet, OUT pitch controls the explicit feedback value).
 
This seems quite little for the async range to me.
This is what Windows UAC2 driver works with. Both out and in as I understand.
Whatever gets out from the ADC/SPDIF receiver in between the USB frames is sent in the next packet (or next + 1 if the timing does not make it to next)
Basically so but the USB packet size has to be within the limits of FMT-2.0. Even though there is no async feedback for input UAC I have implemented a similar USB packet size calculation based on balancing ring buffer write and read pointers.
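The actual firmware for this is not shown in the thread, so the following is only a guess at the general idea, written as a Python simulation rather than MCU code. It assumes a ring buffer whose fill level (write pointer minus read pointer, in audio slots) is compared against a target, and picks the larger allowed packet when the buffer runs full and the smaller one when it runs empty. All names, the target of 64 slots and the bang-bang rule are assumptions for illustration.

```python
def next_packet_slots(fill, target, lo, hi):
    """Pick the slot count of the next IN packet from the ring-buffer fill.

    fill:   slots currently buffered (write pointer - read pointer)
    target: fill level we try to hold
    lo, hi: smallest / largest packet the host is assumed to accept
    """
    if fill > target:
        n = hi                # buffer running full: drain faster
    elif fill < target:
        n = lo                # buffer running empty: drain slower
    else:
        n = (lo + hi) // 2
    return min(n, fill)       # never send more than is buffered

def simulate(rate_hz, microframes, lo, hi, target=64):
    """Feed the buffer at rate_hz and drain it once per 125 us microframe."""
    fill, acc, sizes = target, 0, []
    for _ in range(microframes):
        acc += rate_hz        # the ADC side writes rate/8000 slots per microframe
        fill += acc // 8000
        acc %= 8000
        n = next_packet_slots(fill, target, lo, hi)
        fill -= n
        sizes.append(n)
    return sizes, fill

if __name__ == "__main__":
    sizes, fill = simulate(88200, 8000, lo=11, hi=12)
    print(sorted(set(sizes)), sum(sizes), fill)
```

In this simulation the packet sizes stay at 11 or 12 for an 88k2 input while the fill level settles near the target, so the long-term packet rate follows the I2S clock without leaving the allowed range.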

Another reason for having a buffer also for input UAC is the bit depth. In my case the internal sample representation is always 32-bits but it is converted to 16/24-bits if host uses that.
 
E.g. for incoming I2S at 88k2 the Windows UAC2 driver accepts 11 or 12 samples in a microframe. So for every 40 packets the host needs 39 packets of 11 samples and 1 packet of 12 samples. I believe the Linux driver works in the same fashion.
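The 39-to-1 split falls out of a simple fractional accumulator. The sketch below is Python for illustration only (the function name is made up, and it assumes the sample clock is exactly nominal and locked to the 8 kHz microframe rate, which a real async source is not).

```python
from collections import Counter

def packet_sizes(rate_hz, packets, packet_rate_hz=8000):
    """Yield the sample count of each packet for an exactly nominal rate.

    The remainder of rate / packet_rate is carried from packet to packet,
    so an extra sample is emitted whenever the carry reaches a whole sample.
    """
    acc = 0
    for _ in range(packets):
        acc += rate_hz
        n, acc = divmod(acc, packet_rate_hz)
        yield n

if __name__ == "__main__":
    print(Counter(packet_sizes(88200, 40)))
    print(Counter(packet_sizes(44100, 80)))
```

For 88k2 this gives 39 packets of 11 and one of 12 per 40 microframes; for 44k1 it gives a mix of 5 and 6 sample packets that repeats every 80 microframes.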

Interesting. IIRC the USB packet rate is 8kHz, so essentially the audio rate is an average over whole packets. However, the USB specification itself doesn't limit the packet size; it is the relation between the audio rate and the USB carrying rate for a “stream” that could lead to such a limit.

That could/would complicate DMA, in that the DMA triggers would have to be issued individually. A simple count of normal frames between large frames would work. The same process could be done with the CPU.
 
I just tested placing dma buffers in DTCM or SRAM with/without D-Cache. Every alternative worked. I seem to remember reading somewhere that DTCM vs. DMA is a STM32H7 issue only.

USB OTG dma does not work in any setting.

I think that's because of the matrix bridges and the H7 having three domains (ie three matrices).

I did note there are some parameters that can only be set before initialisation. I'm also leaving this for now so I can have a check around the HCDMA. Note that someone found a bug in the firmware that causes crashing: https://community.st.com/s/question...-dma-corrupts-memory-after-the-receive-buffer . From experience, though, I think it's either an address alignment or a size alignment problem; I suspect it's transmitting more than expected, so probably a memory corruption.

I'm not going to have a late one tonight, but I should have some time soon hopefully to work through this.
 
As I understand it the maximum packet size variation of +/- one audio slot means that e.g. for 48k the nominal is 6, min is 5 and max is 7. For 88k2 the nominal is 11.025 so min is 11 and max is 12.
Just tested the win10 driver with a simple WASAPI exclusive capture written in Rust, stored to wav and checked in the Audacity spectrogram.

44.1kHz, bInterval=4 i.e. 1ms frame.

USB Packets captured in wireshark in my linux PC (the win10 running in a virtual machine there), processed with tshark:

Code:
tshark -r a.pcapng -Y 'usb.src == "1.14.3" and usb.transfer_type == 0x00 and usb.endpoint_address == 0x83' -T fields -e usb.iso.iso_len | tr ',' '\n' | sort | uniq -c


async 100% rate - capture in win10 bitperfect
6667 packets of 352 bytes
740 packets of 360 bytes

async 99% rate - bitperfect
3164 packets of 344 bytes
6113 packets of 352 bytes

async 98% rate - bitperfect
7284 packets of 344 bytes
2031 packets of 352 bytes

async 97% rate - some samples corrupted
1534 packets of 336 bytes
5342 packets of 344 bytes

360 bytes = 45 audio frames
352 bytes = 44 audio frames
344 bytes = 43 audio frames
336 bytes = 42 audio frames

I.e. for 44.1kHz the nominal frame counts are 44 and 45, with 43 accepted, while 42 already causes problems. IMO the allowed minimum is the lower nominal value - 1, which does make sense for a usable async control.
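As a cross-check, the packet histograms can be turned back into an effective sample rate. This is a small Python sketch using only the counts listed above; it assumes 8 bytes per audio frame (which is what 352 bytes = 44 frames implies) and 1000 packets per second (bInterval=4). The function name is made up for illustration.

```python
BYTES_PER_FRAME = 8        # inferred from "352 bytes = 44 audio frames"
PACKETS_PER_SECOND = 1000  # bInterval=4, i.e. one packet per 1 ms

def effective_rate(hist):
    """Average sample rate in Hz from a {packet_bytes: packet_count} histogram."""
    packets = sum(hist.values())
    frames = sum(size // BYTES_PER_FRAME * count for size, count in hist.items())
    return frames / packets * PACKETS_PER_SECOND

if __name__ == "__main__":
    runs = {
        "100%": {352: 6667, 360: 740},
        "99%": {344: 3164, 352: 6113},
        "98%": {344: 7284, 352: 2031},
        "97%": {336: 1534, 344: 5342},
    }
    for name, hist in runs.items():
        rate = effective_rate(hist)
        print(name, round(rate), f"{rate / 44100:.4f}")
```

The ratios come out at the configured pitch values to within a few ppm, so the captures are consistent with the gadget settings.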

With the Linux USB-audio driver any number of samples works bitperfect, tested down to 75% rate.

I did not test the upper limit as the gadget module async control has a hard-coded upper limit at 100.5% (since that limit determines the max packet size which consumes the overall available USB bandwidth) and I did not want to recompile the kernel module.

I do not think the problems with 42 audio frames (n - 2) were caused by dropped packets. It looks more as if the driver somehow kept track of time and filled in "time-missing" samples with zeros. All the discontinuities were like this, every 100ms, which does not correspond to the 42-frame packet occurrences (screenshots attached as 3.png, 2.png, 1.png).
 
The behavior of the input stream really seems like the driver or wasapi layer monitors the incoming stream rate and if it falls below some limit from its view point (measured probably against system time or maybe some hardware timer if available), it periodically (seems every 100ms) adds zeros to the stream to make up for the "missing" samples to raise the rate up to the minimum level. The lower the async rate, the longer the sequence of zeros inserted. Probably a reasonable behavior which can avoid potential timing problems in applications which can rightly expect a close-to-nominal-rate data stream. Falling deep below nominal rate is a faulty condition anyway.
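One way to look for such inserted zeros in a capture without reading the spectrogram by eye is to search one channel for runs of exactly-zero samples and check their spacing. The Python sketch below is only an illustration of that idea, not the tool used for the tests above; the function name and the minimum run length are assumptions, and it presumes a test signal that is not itself silent.

```python
def zero_runs(samples, min_len=2):
    """Return (start_index, length) for every run of >= min_len zero samples.

    Single zeros are ignored because a normal test tone can cross zero.
    """
    runs, start = [], None
    for i, s in enumerate(samples):
        if s == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples) - start))
    return runs

if __name__ == "__main__":
    demo = [5, 0, 0, 0, 7, 0, 9, 0, 0]
    print(zero_runs(demo))
```

If the 100ms hypothesis holds, the run starts in a 44.1kHz capture should sit roughly 4410 samples apart, and the run lengths should grow as the async rate is lowered.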

To test the hypothesis I set the rate to generate just a few of the n-2 packets:

async 97.5% rate - bitperfect
32 packets @ 336 bytes (n-2)
12640 packets @ 344 bytes (n-1)

The result was a bitperfect capture in win10. That IMO confirms the hypothesis that the driver accepts larger variation of the packet size, but stuffs zeros if the overall rate drops below some minimal level.

IMO that is good because if the device side for some reason sends a shorter packet (e.g. a short-time drop in measured feedback value in the gadget, or some glitch in the number of new samples in the SAI buffer should it be transferred directly), it will likely not ruin the stream, provided the longer-time rate is reasonable.

However, no tests done for EP OUT, for now.

Based on this it does make sense to try to keep the data rate close to nominal also in input UAC2

That works if the SAI incoming rate is close to nominal too. If not (off-nominal external I2S clock), not much can be done. But again - it's a faulty condition anyway...

Here the USB gadget has life easier because the gadget is what determines the pace at which the dedicated alsa playback device (which feeds the EP IN side of the gadget) runs, controllable from userspace with the pitch "knob".
 
So I managed to get a little time today with the new code base and I've got the project back to the point where it compiles and starts (although no USB yet). This combines both SAI, buffer and USB code into one cleaner project. Now I can concentrate purely on the USB connectivity.

I've designed a better clock for the ADC using a Crystek crystal, a 3-output TI clock buffer fan-out IC and two ADM low noise power regulators. I have the concept built (it pretty much follows the documented application design). The idea is to have a 24.576MHz clock source with low close-in phase noise that I can connect via SMB: one output will connect to the ADC, the second will connect to the DAC when available and the third will connect to the current MCLK input BNC as an external source. The buffer will ensure that (a) the crystal sees a controlled impedance and (b) one output will not influence the others, to keep the noise to a minimum. Long term clock stability is less of an issue, so this seems the better route for lowering the noise floor of the ADC (currently I note about 8 bits of ADC output noise on the ADC bitstream), and replacing the external clock and the SMPS power supplies with something quieter should really help.

I'll also be ordering a few bits for the headphone amp - including voltage doubler components and ordering the RTZ DAC backordered components.
 
Code:
pi@raspberrypi:~ $ more /proc/asound/cards
 0 [Headphones     ]: bcm2835_headpho - bcm2835 Headphones  bcm2835 Headphones
 1 [vc4hdmi0       ]: vc4-hdmi - vc4-hdmi-0 vc4-hdmi-0
 2 [vc4hdmi1       ]: vc4-hdmi - vc4-hdmi-1  vc4-hdmi-1
 3 [IQaudIODAC     ]: IQaudIODAC - IQaudIODAC IQaudIODAC
 4 [Class          ]: USB-Audio - DIY USB Audio 2.0 Class Nick's DIY Audio DIY USB Audio 2.0 Class at usb-0000:01:00.0-1.1, high speed
(reformatted to take less space)

So I had a bit of a session last night with the aim of finishing the base part of the new version of the driver.

As you can see it's now enumerating, although the ALSA is complaining in the logs and exits with code 99 - I'm pretty sure this is due to the endpoint not being opened at this moment in time. The logs show there's no USB enumeration issue, just that ALSA then decides it doesn't like the sound card offered to it by linux.

I did find a couple of issues with the existing USB descriptor code, specifically memory alignment (go figure, the example code doesn't seem to do that properly), so I'm using inline assembler alignment code to ensure we have alignment.

Next up is the endpoint - that should simply take the available data from the buffer. So that's looking like this weekend perhaps. I'll keep in mind the rate of samples per packet.
 
Code:
pi@raspberrypi:~ $ arecord -L
null Discard all samples (playback) or generate zero samples (capture)
default:CARD=STM32USBi2sBrid Default Audio Device
sysdefault:CARD=STM32USBi2sBrid Default Audio Device

Getting there - turns out the auto generator has mauled a portion of the code I thought was safe at some point. Hence I'm having to put together the control handling code again. This time well away from the auto generator.

alsactl no longer decides that it can't access the controls, so that's good. However, I remember that the first time around I had to report the supported sample rates, yet I've not seen that request yet.. time to dig out Wireshark.
 
Slowly progressing.
1. Found an issue that the alsa mixer requires the audio function usb entity identifier to be unique across the function and not the entity (as per usb spec).
2. Sorting out the control request responses at the moment. Seems happy enough..
3. I think I have to check the UAC2 side, as there still seems to be an issue during alsactl: the device is still not added to the list of recording devices. It does have an info file under the /proc/asound/card4/pcm.. directory.
 
Finally. I found the usb-devices command that gives:
Code:
T:  Bus=01 Lev=02 Prnt=02 Port=02 Cnt=01 Dev#=  8 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0483 ProdID=5740 Rev=02.00
S:  Manufacturer=Nick's DIY Audio
S:  Product=STM32USBi2sBridge
S:  SerialNumber=
C:  #Ifs= 2 Cfg#= 1 Atr=c0 MxPwr=100mA
I:  If#=0x0 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=20 Driver=snd-usb-audio
I:  If#=0x1 Alt= 0 #EPs= 0 Cls=01(audio) Sub=02 Prot=20 Driver=(none)

So that would explain it..
 
Code:
Nov 23 23:04:52 raspberrypi kernel: [ 4101.938187] usb 1-1.3: parse_audio_format_rates_v2v3(): unable to retrieve number of sample rates (clock 1)

Slowly but surely I'm sorting the issues out.. this one sounds like the GET isn't working or a rate in a format table that's missing..
 
I thought I'd have a look inside this old M-Audio Firewire interface. Sure enough it has a combined ADC and DAC (AK4628), but it also has a 24.576MHz oscillator (NSK) plus what seems to be a PLL/clock doubler (TI 66FTL4K LCV000A). Add to that SMD caps, resistors, regulators, encoders, etc.. I believe this doesn't work due to the firmware not receiving a Firewire driver signal, so I'm eyeing this up as a donor for components in future. It also has a decent metal case, but next up is to try a spoofed SPDIF signal to see if it will work without a computer.

 
So that's now being recognised. I had a couple of issues that were a pain to find. One was that I'd not used MIN() with wLength on the returned length in the request, which led to what appeared to be a timeout; it wasn't reported as a data overflow until I looked at the Wireshark URB response, which indicated the overflow.
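For anyone hitting the same thing, here is the idea as a small Python sketch (the real code is C on the STM32 and is not shown here, so all names are hypothetical). It builds a UAC2 clock-source RANGE reply for the sampling frequency control, which is a 16-bit wNumSubRanges followed by dMIN, dMAX and dRES as 32-bit little-endian values per sub-range, and then clamps the reply to the wLength the host asked for. As far as I can tell the Linux driver first asks for just the 2-byte count and then for the full block, so an unclamped reply overflows the first request.

```python
import struct

def range_block(rates_hz):
    """UAC2 RANGE parameter block: wNumSubRanges, then dMIN/dMAX/dRES triplets.

    Each discrete rate is reported as a sub-range with MIN == MAX and RES == 0.
    """
    block = struct.pack("<H", len(rates_hz))
    for rate in rates_hz:
        block += struct.pack("<III", rate, rate, 0)
    return block

def control_reply(data, w_length):
    """Never return more bytes than the host requested in wLength."""
    return data[:min(len(data), w_length)]

if __name__ == "__main__":
    block = range_block([192000])
    print(len(block), control_reply(block, 2).hex(), control_reply(block, 64).hex())
```

The same clamp applies to every GET request on the control endpoint, not only to RANGE.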

Code:
pi@raspberrypi:~ $ arecord -l
**** List of CAPTURE Hardware Devices ****
card 4: STM32USBi2sBrid [STM32USBi2sBridge], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

Next up.. endpoints and data transfer.

I'm surprised it even shows up in the stream info:
Code:
Nick's DIY Audio STM32USBi2sBridge at usb-0000:01:00.0-1.1, high speed : USB Audio
Capture:
  Status: Stop
  Interface 1
    Altset 1
    Format: S32_LE
    Channels: 2
    Endpoint: 0x82 (2 IN) (SYNC)
    Rates: 192000 - 192000 (continuous)
    Data packet interval: 125 us
    Bits: 24
    Channel map: FL FR

There are issues, though. One I can see is that it indicates a sync endpoint.. it's iso.. but it's synced to SOF, I suspect.
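The label in the stream info comes from the synchronisation type bits of the endpoint descriptor's bmAttributes field. Per the USB 2.0 endpoint descriptor layout, bits 1..0 are the transfer type and bits 3..2 are the synchronisation type for isochronous endpoints. A small Python decoder for checking a descriptor byte (the function name is made up; the actual bmAttributes value in this firmware is not shown in the thread, so the values below are only examples):

```python
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")
SYNC_TYPES = ("NONE", "ASYNC", "ADAPTIVE", "SYNC")

def decode_bm_attributes(bm_attributes):
    """Split an endpoint descriptor bmAttributes byte into its type fields."""
    transfer = TRANSFER_TYPES[bm_attributes & 0x03]     # bits 1..0
    sync = SYNC_TYPES[(bm_attributes >> 2) & 0x03]      # bits 3..2
    return transfer, sync

if __name__ == "__main__":
    for value in (0x05, 0x09, 0x0D):
        print(hex(value), decode_bm_attributes(value))
```

An isochronous endpoint that shows up as SYNC would have the sync bits set to 3 (e.g. 0x0D), whereas an asynchronous IN endpoint would report 0x05.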
 