ADCs and DACs for audio instrumentation applications

Forgive me if I am thinking about this in completely the wrong way.

When acting as a master, the XMOS chip presumably takes in the two main oscillators, one at 24.576 MHz and one at 22.5792 MHz. It then selects whichever oscillator suits the sample rate chosen in Windows and divides it down to generate all of the relevant clocks: the LR clock, the bit clock and the master clock.

In the case of an ADC, these clocks are fed to it and it outputs an I2S data stream, shifting 64 bits per frame: 32 for the left channel and 32 for the right. The XMOS receives this data and sends it on to the PC, with everything designed to work that way.

In the case of the SAR ADC, from the looks of things you would feed the LR clock into the part. Each rising edge of the clock enables the part and initiates the hardware sampling of the input voltage. A short time later the ADC has finished its conversion and the internally stored value is ready to be read; what you don't want to do is try to access that value during the conversion. Reading the data from the SAR ADC uses what is basically a standard SPI interface; it doesn't have to be synchronous with any of the other sampling clocks and can happen as fast as the ADC can be read, at a rate of at least 64 MHz according to the LTC2378 datasheet.
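To see whether this readout scheme even fits in a sample period, here is a rough timing-budget sketch using the figures above (192 kHz frame rate, ~700 ns conversion wait, 64 MHz SPI clock, 20-bit LTC2378-20 results). The function name and the exact numbers are illustrative, not from any particular design:

```c
#include <assert.h>

/* Nanoseconds left over in one LR-clock period after the conversion
   wait and the sequential SPI readout of both channels. Figures are
   assumptions based on the discussion above. */
static double spare_ns(double fs_hz, double acq_ns,
                       double sck_hz, int bits_per_adc)
{
    double frame_ns = 1e9 / fs_hz;                        /* one sample period   */
    double read_ns = 2.0 * bits_per_adc * (1e9 / sck_hz); /* both ADCs, in turn  */
    return frame_ns - acq_ns - read_ns;
}
```

With these numbers, spare_ns(192e3, 700.0, 64e6, 20) comes out to roughly 3.9 us of slack per frame, so the sequential left-then-right readout is comfortable even at 192 kHz.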

So what you want is for every rising edge of the LR clock to trigger the left- and right-channel SAR ADCs to sample the input voltage, wait ~700 ns, then use a SPI module to read the data from SAR Left and then SAR Right, ideally as fast as possible, convert the data into the correct number format, and store the resulting 64-bit frame in RAM, either as one 64-bit number or as two 32-bit ones, one for the left channel and one for the right.
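The number-format step above could look something like this minimal sketch: the raw 20-bit two's-complement codes (assuming the LTC2378-20 mentioned earlier) are left-justified into 32-bit I2S slots and packed as one 64-bit frame. The function name and packing order are my own illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Pack two raw 20-bit two's-complement SAR results into a 64-bit I2S
   frame, left channel in the upper 32-bit slot. MSB-justifying the
   20-bit code into the 32-bit slot preserves the two's-complement sign. */
static uint64_t pack_frame(uint32_t raw_left, uint32_t raw_right)
{
    uint32_t l = (raw_left  & 0xFFFFFu) << 12; /* 20 bits -> MSB-justified 32 */
    uint32_t r = (raw_right & 0xFFFFFu) << 12;
    return ((uint64_t)l << 32) | r;
}
```

The lower 12 bits of each slot stay zero, which is what an I2S receiver expects for a source with fewer bits than the slot width.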

Then, on the rising edge of the next LR clock, send the data stored in RAM out of the part, clocked by the original bit clock. If a SPI module can accept an external clock, you could use the bit clock to shift the data out.

Why is this giving you so much of a headache?

Microchip have a bunch of micros with Data Converter Interface (DCI) modules that can be configured to work as an I2S device. You can provide an external bit clock and frame (LR) clock, to which the device will output data. These are their 16-bit devices, and they let you choose how many words make up a frame, allowing 32/64-bit transmission. Alternatively, you could go for one of their 32-bit parts, which allow you to configure the SPI module to work with an externally provided bit and frame clock, again configurable for 32/64-bit operation.

Presumably you could feed the LR clock into both the Microchip DCI module and the SAR ADCs. The rising edge triggers the ADCs to sample and also triggers an interrupt in the Microchip part. That interrupt starts a timer that counts for 700 ns, and the timer's completion triggers a second interrupt, in which one of the SPI modules transfers the samples from the SAR ADCs into the Microchip part, processes them, then writes them into the DCI module's FIFO ready for transmission. If no data is received during the SPI transfer, the part can simply fill the FIFO with zeros. You could easily buffer more samples before sending them to the DCI FIFO if needed, too.
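The 700 ns timer above has to be expressed in timer ticks of whatever peripheral clock the part runs. A small sketch, rounding up so the timer can never fire before the conversion is done; the 100 MHz and 72 MHz clock figures below are assumed examples, not tied to a specific part:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a delay in nanoseconds into timer ticks at a given peripheral
   clock, rounding up so the timer never expires early. */
static uint32_t delay_ticks(uint32_t delay_ns, uint32_t clk_hz)
{
    uint64_t ticks = ((uint64_t)delay_ns * clk_hz + 999999999ULL)
                     / 1000000000ULL;
    return (uint32_t)ticks;
}
```

At an assumed 100 MHz peripheral clock, 700 ns is exactly 70 ticks; at 72 MHz it rounds up from 50.4 to 51 ticks.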

That sounds fairly simple? The Microchip parts aren't expensive, and you can buy an in-circuit programmer (PICkit 4) for a relatively small amount of money. The development platform works entirely in C, and they also have a graphical design flow for their 32-bit parts, called Harmony I believe, that apparently doesn't even require programming knowledge. I've only ever used their 16-bit parts so far, but I can't see why this wouldn't work, unless the parts aren't fast enough. The basic development platform and compilers are entirely free too, so maybe look at that?
 
Your description is entirely correct, and I believe it involves what is called "double buffering" (which is in turn a poor man's FIFO). Two issues stand out:

1. For one, the need for RAM to accommodate the buffer(s) takes the project from the CPLD realm to the FPGA realm. It's not the cost of the FPGA chip, but the cost of the infrastructure for in-system programming that I find impractical.

2. You are describing a software-based solution, which was the first thing I thought about. Unfortunately, nobody has tried to code this into the XMOS framework (I am sure it's possible, but it's not for the faint of heart). Otherwise, perhaps I'm misreading the information on the Internet, but the consensus seems to be that a Microchip or former-Atmel 32-bit chip doesn't have the muscle to process audio at fs = 192 kHz or higher, at least without coding the timing-critical parts in assembler (something I'm not at all excited to get into). The one family that would work (again, if I'm reading correctly) is the STM32, but that's not something I am familiar with (or have the time and mood to get into). In fact, the quickest way would be to take your previous suggestion, code the d*mn thing on a RPi4 and be done with it.

Ultimately, I am taking this "project" as an opportunity to learn something new, and I have to admit the Verilog approach looks exciting and fun (even if difficult). If I were hunting a "product", there would be essentially nothing to code in the Xilinx or Altera FPGA spaces other than a few lines of glue Verilog and the usual constraints; all the building blocks are already included as IP components, and even audio I2S-SPI interfaces supporting both master and slave modes are available. But for DIY audio purposes I would have a very hard time justifying the cost of those IP components.
 
Having enough processor power to shift the data around in a Microchip part would be the bottleneck. Some of the Microchip parts include direct memory access paths to the on-board peripherals to reduce the amount of processing required when shifting data, so maybe that would be useful.

Some of the Microchip parts can operate fairly quickly; I'd have to have a look to see whether the faster 32-bit parts have the interface hardware to do what needs to be done.
 
Yeah, you've got access to 250-300 MHz 32-bit parts; I'd imagine they have the processing power.

As you are obviously aware, a 300 MHz core cannot execute code straight from flash, as most Microchip and Atmel MCUs do. Since program RAM is not in the MCU, here you go again: you need an external RAM chip to accommodate the firmware image originally stored in flash, and for DMA access as well. Once again, far from a "small CPLD"... although nothing that can't be done (and not necessarily expensive, either). If anybody would like to embark on such a project, I'm all ears. Perhaps the source code could eventually be ported to the XMOS core(s)... although somebody with more embedded programming experience than myself could focus straight away on an XMOS software solution...
 
I'm confused; the chips have both flash storage and RAM built in, along with the DMA. The internal peripherals use the DMA to move data directly to and from the RAM, freeing up processing cycles that would otherwise be used to shift the data around. Why do you need to add extra RAM? It's not as if the program for doing this would be large; in fact it would be extremely small.

Also, is there some reason why a SAR ADC has been chosen? The ADS127L01 from TI looks quite nice.
 
Why do you need to add extra RAM? It's not as if the program for doing this would be large; in fact it would be extremely small.

Again, I'm not an embedded MCU guru, but the small amount of RAM available on the chip is data memory only. High-speed MCUs use flash essentially as an SSD: the software image is stored there in a compressed format, and the MCU bootloader is in charge of taking this image from flash, decompressing it, relocating addresses into the external RAM, passing a few parameters, and then handing execution control to the RAM boot address. To my knowledge, that's how all high-speed MCUs work (including the Broadcom chip in the RPi).

Regarding the ADS127L01, it is indeed a great chip. I have played with it, the performance is very good, and as a plus it has the master/slave Frame mode, which looks very close to left-justified I2S.

The big hurdle I was not able to get over is providing the required frame clock, which needs to be exactly double the fs frequency that the XMOS provides as LRCK. How to build an ultra-low-jitter frequency doubler (starting from the LRCK) is something I was not able to figure out, except perhaps by synthesizing a PLL, which is IMO a big gamble in terms of performance. Secondly, while in Frame mode the ADS127L01 is no longer software controllable (hardware pins only), and implementing stereo I2S still requires the glue of a CPLD (although a much simpler one, since the buffering/FIFO is on the chip).
 
I'll have to look more into the RAM onboard the microchip parts and see how everything works.

What I do know is that the amount of RAM available on the devices I've been suggesting is more than enough to store the entire program in uncompressed form if you wanted to.

I haven't been aware of any compression taking place either, when monitoring the memory required by programs that store small image files as arrays: an uncompressed 1 kB bitmap array takes 1 kB of compiled storage space. Unless you meant something else; if the compiler were using compression, I would assume it would compress all of the stored data as well as the program to be executed.

Of course, there are also the many compiler options available to maximise the speed the program will run at, at the expense of memory used.

I do find it a little odd that TI can come up with a part like the ADS127L01 but can't make an audio ADC based on it with a little modification. Oh well.
 
Unless you meant something else; if the compiler were using compression, I would assume it would compress all of the stored data as well as the program to be executed.

It's not the compiler compressing, but the deployment tool, which compresses the linked relocatable binary file before writing it to flash. Of course, local data is also compressed and included in the flash file. That's only to save flash space and to minimize the amount of transfer from flash to RAM before executing the code (that is, to speed up boot time).

No surprise at the lack of an ADS127L01-based audio ADC; there's simply no market for that.
 
Couldn't you also just invert the LR clock?

Feed the standard polarity LR to one ADC and feed the inverted LR clock to a second ADC.

Feed the standard 64x bit clock into both ADCs. The first ADC would clock out its 32 bits on the rising edge of the normal LR clock, then the second ADC would clock out its 32 bits on the rising edge of the inverted LR clock.

Although I'm not sure how the ADC would react to having the bit clock present after the data has been read out. Maybe it would just clock out nonsense or zeros; either way you'd ignore them until the next LR clock comes along. You could ask TI's engineering forum (E2E) how it would behave.

Essentially you'd be sampling one ADC for one half of the LR clock and the other for the other half. They wouldn't be phase aligned, though.

Alternatively, you trigger both ADCs on the rising edge of the LR clock, take the data from both at the same time but delay the propagation of the data from the second ADC by 32 bit clocks. The first ADC's data occupies the first 32 bits, then the delayed data from the second ADC occupies the second 32 bits. This way they are in phase at the sample time, but again, how would the ADCs react to having a bit clock present with no new data to present? Would they just clock out nonsense?
 
It's not the compiler compressing, but the deployment tool, which compresses the linked relocatable binary file before writing it to flash. Of course, local data is also compressed and included in the flash file. That's only to save flash space and to minimize the amount of transfer from flash to RAM before executing the code (that is, to speed up boot time).

No surprise at the lack of an ADS127L01-based audio ADC; there's simply no market for that.

Surely decompression would take time though...
 
Surely decompression would take time though...

There's an optimum... imagine the extreme case of 1 MB of statically pre-allocated memory, which would compress to a few bytes. I suspect the compression process (usually not well documented, although there are exceptions) is also a countermeasure against inquiring minds looking to disassemble the code.
 
I have never worked with FPGAs/CPLDs, but it feels like simultaneous burst reading from two SPIs while maintaining a continuous data flow to I2S requires multiple threads/processing units to keep pace reliably, merging their output in some synchronized way. How about something like this:

2 x STM32F103 @ 72 MHz (a $1 chip or $2 Blue Pill board), each reading one ADC's SPI via DMA, triggered by the ADC's "data ready" pin, copying the 24 bits read to another location (double buffering) and outputting them to 24 GPIOs when the next L/R edge comes.

All the GPIOs connected to 6 x fast 8-bit parallel-to-serial 74HC165s, with the serial output clocked by the I2S clock generated by the USB audio controller => 2 x 24-bit I2S serial data.

The L/R I2S clock starts both ADC conversions at the same time. One STM32 outputs its data on L, the other on R, when the serial-register read pointer starts reading the other channel's data.

24 bits at 512 kHz is about 12 Mbps; an STM32F103 @ 72 MHz should manage 18 MHz DMA SPI, and a simple assembler loop transferring the 24 bits from RAM to the GPIOs could IMO handle 512 kHz cycles. The 74HC165s may have a problem at 24 MHz with a 3.3 V supply; a 5 V supply should handle it OK, provided the 3.3 V outputs from the STM32s reach the logic threshold. Maybe some faster CPLD could be used instead, I don't know. Perhaps something like this (very inexpensive) could work, at least for 256 kSps.
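The bandwidth arithmetic above can be sanity-checked in a couple of lines; the function names are just illustration of the figures quoted in the post:

```c
#include <assert.h>

/* Raw payload rate: bits per sample times sample rate.
   24 bits * 512 kHz = 12,288,000 bps, under the 18 MHz SPI clock. */
static unsigned long payload_bps(unsigned bits, unsigned long fs_hz)
{
    return (unsigned long)bits * fs_hz;
}

/* Core-cycle budget the transfer loop gets per sample:
   72 MHz / 512 kHz = ~140 cycles per sample period. */
static unsigned long cycles_per_sample(unsigned long core_hz,
                                       unsigned long fs_hz)
{
    return core_hz / fs_hz;
}
```

So the SPI link has roughly 50% headroom, and the RAM-to-GPIO loop gets about 140 core cycles per sample, which supports the claim that a tight assembler loop could keep up at 512 kHz.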
 
I find it rather amusing how everybody is trying to bring the problem (SPI-to-I2S conversion) into their own domain of expertise - that's obviously the logical way to approach it :D. The not-so-amusing conclusion is that we sorely miss XMOS (hardware and software framework) expertise, since a software solution running on the XMOS platform would be the most natural approach. I'll try to look into that as soon as I'm done experimenting with the CPLDs and FPGAs.

I am still waiting for somebody to come up with clear examples of where having separate sampling processes for the two channels (that is, a phase shift between channels) impacts real-world audio measurements. Even if you look at an instrument such as a VNA: since such a phase shift is a (constant) systematic "error" term, it can be trivially eliminated by a calibration process. The requirement to have both channels strictly in phase is equivalent to a requirement to have the test cables of a RF/microwave VNA strictly identical, a rather stupid thing. BTW, I measured the phase shift between channels in my Rohde & Schwarz UPD, Advantest R9211, Agilent 3562A, 89410A and 35665A, and they all have various out-of-the-box time-domain phase shifts between channels; none is spot on. All allow using one of the channels as a reference, after which both channels are brought exactly in phase with the generator during (for example) a "calibration" magnitude/phase sweep.
 
By the way, I did a bit of looking around, and someone documented hundreds of products they'd brought to market that provide asynchronous USB 1.0 to I2S at up to 96 kHz, 2 channels, on a rather basic 50 MHz-maximum PIC32. Others suggested that the fast 32-bit PICs could probably handle 1 MHz I2S data rates for a simple 2-channel I2S data-in/data-out application, which is essentially what you want to do here. This is certainly good to know for future ideas using non-audio ADCs, and as a potential solution to the problem.

You are quite correct, though, that if one were au fait with the XMOS parts/code, it would seem trivial to have it interface directly with any number of SPI ADCs.

For instrumentation purposes you probably can easily calibrate out a phase mismatch; it would just be nice if you didn't have to.

I don't know about there being no market for an audio rework of the TI delta-sigma ADC. It's very low power and offers extremely low THD. If a rework/re-optimisation could drop the noise floor by 6-10 dB without impacting the THD performance too badly... put two in one chip, modify the interface a little, and you'd have an audio ADC with better performance than anything TI currently have, with seemingly much lower power consumption. The low power consumption alone would see it finding its way into multichannel mixing consoles in recording studios.

Maybe dropping the noise a bit while keeping the THD performance is actually quite a difficult task.
 
I don't know about there being no market for an audio rework of the TI delta-sigma ADC. It's very low power and offers extremely low THD. If a rework/re-optimisation could drop the noise floor by 6-10 dB without impacting the THD performance too badly... put two in one chip, modify the interface a little, and you'd have an audio ADC with better performance than anything TI currently have, with seemingly much lower power consumption. The low power consumption alone would see it finding its way into multichannel mixing consoles in recording studios.

Maybe dropping the noise a bit while keeping the THD performance is actually quite a difficult task.

Well, maybe, or maybe not... remember that the very low noise floor of the "audio" ADCs in the audio band is the result of noise shaping, not of some special design or process. And this has the side effect of an annoying rise of noise at higher frequencies, which triggered this whole story of looking into SAR ADCs.

BTW, it just occurred to me that by using a FIFO there's no need to force the ADS127L01 into slave mode. The ADC could very well work in master mode, generate its own clocks and write into the FIFO at its own (asynchronous) pace, while the XMOS reads at its own pace, as long as the data rates on the read and write sides are equal.
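The FIFO idea can be sketched as a minimal single-producer/single-consumer ring buffer: the ADC side pushes frames at its own pace and the XMOS side pops at its own pace, with the FIFO absorbing the phase difference as long as the average rates match. This is a software illustration only; a real cross-clock-domain FIFO would be hardware with gray-coded pointers, and all names here are hypothetical:

```c
#include <stdint.h>

#define FIFO_DEPTH 16 /* must be a power of two for the index masking */

typedef struct {
    uint64_t buf[FIFO_DEPTH];
    volatile unsigned head; /* write index, advanced by the producer */
    volatile unsigned tail; /* read index, advanced by the consumer  */
} fifo_t;

/* Producer side (the ADC): returns 0 when full, so the caller can
   drop the frame or stall. Unsigned wrap-around keeps head - tail
   equal to the occupancy for power-of-two depths. */
static int fifo_push(fifo_t *f, uint64_t frame)
{
    if (f->head - f->tail == FIFO_DEPTH)
        return 0;
    f->buf[f->head & (FIFO_DEPTH - 1)] = frame;
    f->head++;
    return 1;
}

/* Consumer side (the XMOS): returns 0 when empty, in which case the
   I2S output would carry silence. */
static int fifo_pop(fifo_t *f, uint64_t *frame)
{
    if (f->head == f->tail)
        return 0;
    *frame = f->buf[f->tail & (FIFO_DEPTH - 1)];
    f->tail++;
    return 1;
}
```

With one writer and one reader this needs no locking, which is exactly why the scheme suits two independent clock domains.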
 
I find it rather amusing how everybody is trying to bring the problem (SPI-to-I2S conversion) into their own domain of expertise

IMO it's not about SPI-to-I2S conversion but about merging two concurrently running processes (if the two separate ADCs are to start at the same time rather than be interleaved) into one continuous output. I mimicked the way audio has worked in PCs since the beginning: the soundcard reads a memory buffer via DMA at the pace of its clock. The buffer is split into (usually) two segments, and an interrupt is issued when reading segment A is finished (and reading segment B starts), to tell the CPU to refill segment A. In the analogy, these interrupts are the L/R clock, and there are two CPUs, each responsible for refilling one half of the buffer while the serial shift register is scanning the other half. That gives the corresponding MCU a time window to bring in fresh data. If it does not keep up, the register will read old data, just as in a PC when the CPU doesn't make it in time and the user hears old samples. If the CPU stalls, a looped sequence keeps playing.

The two MCUs would cache one period of samples, receiving SPI samples for time N and outputting samples for time N-1 (MCU1 to segment A at R, MCU2 to segment B at L). That is the software-and-hardware combination, just like in the PC.
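The two-segment scheme described above can be sketched as a tiny ping-pong buffer: the reader always scans the segment the writer is not refilling, and the "segment finished" interrupt swaps the roles. The names and segment size are illustrative, not from any real driver:

```c
#include <stdint.h>

#define SEG_LEN 8 /* samples per segment; illustrative size */

typedef struct {
    int32_t seg[2][SEG_LEN];
    int fill; /* index of the segment the MCU is currently refilling */
} pingpong_t;

/* Called from the "segment finished" interrupt: swap roles and return
   which segment the MCU should refill next. */
static int pingpong_swap(pingpong_t *p)
{
    p->fill ^= 1;
    return p->fill;
}

/* The reader (the DMA/shift-register side) always scans the segment
   the writer is NOT filling. */
static const int32_t *pingpong_read_seg(const pingpong_t *p)
{
    return p->seg[p->fill ^ 1];
}
```

The key property is that writer and reader never touch the same segment between two interrupts, which is what gives the MCU its whole-segment time window.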

The SPI input is just a format; it would work the same with a sequenced parallel format (some 24-bit Analog Devices ADCs use two transmissions of 16-bit parallel data). The two independent outputs would be merged, in a window-timed manner, into a single stream the same way.

If you have a device capable of multithreading, it can be coded directly in software without using two single-threaded MCUs. Or a very fast single-threaded MCU, but the timing is much stricter then.
 
FWIW, first-generation audio ADCs shared one ADC between both channels and switched the input between them. There is no reason you could not do this in a measurement instrument and calibrate out any phase difference, if your software supports it. It would greatly simplify the conversion, I suspect, and save on ADCs, or enable the use of an even more capable ADC. The input switching will be a bit of a challenge.
 
After so much digging into this, I have to declare myself a total moron. I just found that the STM32F7xx series natively supports SPI and I2S, Rx/Tx, master/slave, has DMA and FIFOs built in, and the chips cost $5.

A $50 board is on the way; I suspect no more than a few lines of code are required to complete the conversion. I2S is half duplex only, but who cares in this case?

See https://www.st.com/resource/en/refe...d-32-bit-mcus-stmicroelectronics.pdf#page1086 Chapter 32, page 1088 and the following...

So much for CPLDs and FPGAs, but learning a little Verilog was worth the trouble.