16x Digital interpolation filter - drive PCM56, PCM58, AD1865 and so on up to 768 kHz

Hello

Over the last month I've been working on a custom digital interpolation FIR filter implemented on an FPGA. This filter does pretty much the same thing as the well-known DF1706, SM5847, PMD100, and so on. However, its ability to reconstruct and attenuate a signal is way beyond those :)

The filter contains 8192 coefficients and interpolates the data by a factor of 16 (e.g. 44.1 kHz to 705.6 kHz and 48 kHz to 768 kHz). It has several FIFOs built in since its core runs asynchronously (at 225 MHz), while MCLK is used only to clock data out of the FIFO and to create the LE (latch enable) signal for the DAC.
In fact, this filter always outputs 705.6 kHz or 768 kHz (an integer multiple of the input rate) and accepts data up to 768 kHz / 32 bits. To do that it works like a sample rate converter, so it interpolates and decimates at the same time. For 44.1 kHz and 48 kHz it does not decimate at all, but for anything higher it still interpolates the data 16 times, so e.g. a 96 kHz input is interpolated up to 1536 kHz and then decimated back to 768 kHz. The same goes for a 768 kHz input, which is interpolated to 12.288 MHz (in a mathematical sense of course) and decimated back to 768 kHz.
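
The rate arithmetic in the paragraph above can be sketched in a few lines. This is only an illustration of the numbers quoted in the post, not the FPGA logic itself; the helper name `rate_plan` and the assumption that every integer multiple of 44.1 kHz or 48 kHz up to 16x is accepted are mine.

```python
def rate_plan(fs_in_hz: int, interp: int = 16) -> dict:
    """Hypothetical helper: work out the rates described in the post."""
    # Find the base rate family (44.1 kHz or 48 kHz) of the input.
    for base_hz in (44_100, 48_000):
        if fs_in_hz % base_hz == 0:
            break
    else:
        raise ValueError("input rate must be a multiple of 44.1 kHz or 48 kHz")

    decim = fs_in_hz // base_hz      # 1 for 44.1/48 kHz, 2 for 88.2/96 kHz, ...
    if decim > interp:
        raise ValueError("input rates above 16x the base rate are not covered")

    fs_up_hz = fs_in_hz * interp     # rate after 16x interpolation
    fs_out_hz = fs_up_hz // decim    # rate after decimation
    return {"fs_up_hz": fs_up_hz, "decim": decim, "fs_out_hz": fs_out_hz}


for fs in (44_100, 48_000, 96_000, 768_000):
    print(fs, rate_plan(fs))
```

For 96 kHz this gives 1536 kHz after interpolation and 768 kHz at the output, and for 768 kHz it gives 12.288 MHz and 768 kHz, matching the figures above.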

It should be noted that along with the large number of coefficients (8192) this filter incorporates multiply-accumulate units that are 32x35 bits wide. It means that the input data word is fully accepted up to 32 bits (without any truncation) and the coefficients are quantized to 35 bits, so the math behind each output sample is done with very high accuracy ;)
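
To get a feel for what 32x35-bit MACs imply, here is a rough worst-case bit-growth estimate. The post does not state the accumulator width, so this is only a textbook bound under my own assumptions: signed operands, and 8192 / 16 = 512 taps contributing to each output sample in a 16-phase polyphase structure. The function name is hypothetical.

```python
def worst_case_accumulator_bits(data_bits: int, coeff_bits: int, taps_per_output: int) -> int:
    """Upper bound on the bits needed to sum the products without overflow."""
    product_bits = data_bits + coeff_bits                 # full-width signed product
    growth_bits = (taps_per_output - 1).bit_length()      # ceil(log2(taps_per_output))
    return product_bits + growth_bits


TAPS = 8192
PHASES = 16
taps_per_output = TAPS // PHASES                          # 512 taps per output sample

print(worst_case_accumulator_bits(32, 35, taps_per_output))
```

Real coefficient sets sum to far less than the worst case, so a practical design can usually get away with fewer guard bits than this bound suggests.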

In the title of this thread I mentioned that you can drive PCM56, PCM58, PCM63, AD1862, AD1865 and so on up to 768 kHz. How is that possible? The filter contains a self-reconfigurable oscillator which sets its frequency depending on the output word length (16, 18, 20 or 24 bits). It means that the output bit clock (CLK) runs fully asynchronously from the LE (latch enable) signal, which is generated by dividing the provided MCLK signal, so data is latched in almost all DACs without any extra jitter introduced by the oscillator itself. In fact, the filter contains 3 sets of FIFOs (6 FIFOs for both channels) - one at its input (I2S) before going to the core, another one at the core output and the last one for the final oscillator to clock data into the DAC. Besides all of that, this technique introduces a quiet zone after the latch signal goes down, so it should give the DAC some time to settle its output before the next sample is clocked in ;)

Depending on the word length the following frequencies are created on the CLK output:

16 bits - 14 MHz
18 bits - 15.5 MHz
20 bits - 17 MHz
24 bits - 20 MHz

It means that certain DACs such as PCM56 and AD1865 will be running at the edge with a 768 kHz stream, but they work just fine according to my tests. The LE (latch enable) signal is always 705.6 kHz or 768 kHz depending on the input (a multiple of either 44.1 kHz or 48 kHz).
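
A quick timing budget shows why the listed CLK frequencies leave room for a quiet zone. This sketch assumes each word is shifted out as one contiguous burst of exactly N clock cycles per LE period, which the post does not spell out; the dictionary and function names are mine.

```python
# CLK frequency per output word length, taken from the list above (MHz).
CLK_MHZ = {16: 14.0, 18: 15.5, 20: 17.0, 24: 20.0}


def idle_time_ns(bits: int, le_rate_hz: int = 768_000) -> float:
    """Time left in one LE period after shifting `bits` bits at the listed CLK."""
    period_ns = 1e9 / le_rate_hz            # one output sample period
    shift_ns = bits * 1e3 / CLK_MHZ[bits]   # bits / MHz gives us, times 1e3 gives ns
    return period_ns - shift_ns


for bits in CLK_MHZ:
    print(f"{bits} bits: {idle_time_ns(bits):.1f} ns idle per sample at 768 kHz")
```

Under that assumption every word length fits into the 1302 ns sample period with roughly 100 to 160 ns to spare, which is the time available for the quiet zone.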

0 dBFS @ 20 kHz:

c31dd8ff94a96c13da976ee3ecbd5909_1535181782.jpg


-60 dBFS @ 1 kHz:

7ac73355b2ac80274d27050b355aefb5_1535040822.jpg


Jitter test @ 48 kHz with LSB toggled @ 250 Hz:

41acfdd52a5ac594ec90a6479865c14f_1535181784.jpg


Filter attenuation (linear phase) - white noise @ 48 kHz:

309a53ea9e8ee3adf81fa9430f601058_1535181783.jpg


All measurements were performed using PCM58.

The filter has an I2S input with MCLK, BCLK, LRCK and DATA signals. However, MCLK can be fed from the same source as BCLK (if no MCLK is available), assuming that BCLK has a rate of 32x, 64x, 96x, 128x Fs or similar, since the filter has to determine its frequency to know how to divide it in order to create the LE (latch enable) signal. Any exotic BCLK rates will not work as MCLK, so keep that in mind. The jitter and synchronization of the FIFOs depend purely on the MCLK signal, so in the long term it needs to be synchronous with LRCK (in almost all cases it is, since BCLK and LRCK should be derived from a divided MCLK clock).

Following frequencies are supported as MCLK:

49.152 MHz
45.1584 MHz
36.864 MHz *
33.8688 MHz *
24.576 MHz
22.5792 MHz
18.432 MHz *
16.9344 MHz *
12.288 MHz
11.2896 MHz
9.2160 MHz *
8.4672 MHz *
6.144 MHz
5.6448 MHz
4.608 MHz *
4.2336 MHz *
3.072 MHz
2.8224 MHz
1.536 MHz
1.4112 MHz
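
Every frequency in the list above divides down to 705.6 kHz or 768 kHz by an integer, which is what makes a simple divider for LE possible. The check below is only arithmetic on the listed values; how the FPGA actually detects the MCLK frequency is not described in the post, and `le_divider` is a hypothetical name.

```python
MCLK_HZ = [
    49_152_000, 45_158_400, 36_864_000, 33_868_800, 24_576_000,
    22_579_200, 18_432_000, 16_934_400, 12_288_000, 11_289_600,
    9_216_000, 8_467_200, 6_144_000, 5_644_800, 4_608_000,
    4_233_600, 3_072_000, 2_822_400, 1_536_000, 1_411_200,
]


def le_divider(mclk_hz: int) -> tuple[int, int]:
    """Return (LE rate in Hz, integer divide ratio) for a supported MCLK."""
    for le_hz in (768_000, 705_600):
        if mclk_hz % le_hz == 0:
            return le_hz, mclk_hz // le_hz
    raise ValueError(f"{mclk_hz} Hz is not an integer multiple of 705.6 or 768 kHz")


for mclk in MCLK_HZ:
    le_hz, ratio = le_divider(mclk)
    print(f"{mclk / 1e6:8.4f} MHz -> LE {le_hz / 1e3:.1f} kHz (divide by {ratio})")
```

The resulting ratios are 64, 48, 32, 24, 16, 12, 8, 6, 4 and 2, once for each clock family.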

Those marked with an asterisk are usually used in CD players, and it should be possible to use the filter within a CD player once each input line is attenuated with a resistor of 170 Ohm or so (in order not to damage the FPGA and its I/O pins due to 5V logic levels). The filter can be powered by an external supply (5V or higher) or by providing a direct 3.3V power supply (it is up to the user).

Following outputs are provided by the filter:

CLK - clock for data
LE - latch enable signal
SD_L - serial data for the left channel (and its inversion with the line above it, so you can create a differential DAC for XLR outputs)
SD_R - serial data for the right channel (and its inversion with the line above it, so you can create a differential DAC for XLR outputs)
3.3V - main power supply
GND - ground

It should be noted that the filter has a TPDF (triangular probability density function) dithering algorithm as well. It can be turned on or off by a jumper depending on your preferences.
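
For readers unfamiliar with TPDF dither, here is a minimal software sketch of the general idea: add triangular noise spanning +/-1 output LSB before rounding to the shorter word. It is not the filter's actual implementation (which is not described in the post); the function names and the 32-bit to 16-bit example are assumptions.

```python
import math
import random


def tpdf_noise(rng: random.Random) -> float:
    # Difference of two uniform values gives a triangular PDF on (-1, +1).
    return rng.random() - rng.random()


def requantize(sample: int, in_bits: int, out_bits: int, rng: random.Random) -> int:
    """Reduce a signed `in_bits` sample to `out_bits` with TPDF dither."""
    lsb = 1 << (in_bits - out_bits)                   # one output LSB in input units
    dithered = sample / lsb + tpdf_noise(rng)         # add +/-1 LSB triangular noise
    q = math.floor(dithered + 0.5)                    # round to nearest output code
    lo, hi = -(1 << (out_bits - 1)), (1 << (out_bits - 1)) - 1
    return max(lo, min(hi, q))                        # clamp to the output range


rng = random.Random(1)
quarter_lsb = 1 << 14                                 # 0.25 LSB of a 16-bit output
codes = [requantize(quarter_lsb, 32, 16, rng) for _ in range(20_000)]
print(sum(codes) / len(codes))                        # averages close to 0.25
```

Without dither the 0.25 LSB input would always round to 0; with dither the average of the output codes preserves it, at the cost of a small amount of added noise.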

The price for a fully assembled and ready to use board will be 250 EUR.

Final revision of 16x interpolation version:

YutcsgA.jpg


AksM4Ki.jpg


zvz3lc7.png


ROM jumper selects which type of filter should be loaded during boot process. There are two filters available, one is linear phase and the other one is a minimum phase :) Both of them have 8192 taps.

The filter can be directly powered by an external USB to I2S converter, but it draws about 300 mA of extra current, so keep that in mind. Also, it is a 4-layer PCB. Dimensions are 50 mm x 47 mm.

For more information and orders please visit:

Digital Interpolation Filter (FIR) - KuSy Audio

I am rarely available on forums nowadays, so a friend of mine will take care of your questions.
 
Very impressive. I would not know where to start designing something this sophisticated but I prefer the sound of NOS balanced TDA1541 and AD1865. (just bought a pair of AD1865 off Ebay to make a balanced dac for use with IanCanada's I2S to Pcm board.)

I do not believe you should rule out this digital filter based on your previous experience of generic digital filters ;) This one was created without sacrificing anything like those generic ones do (e.g. size of an accumulator which is way less than required multiply result width, thus resulting in heavy rounding errors). Not to mention the MAC units of 32x35 within this filter.

Having to choose between generic digital filters is like having to choose whether you want your leg or your hand to be cut out (losing digital information due to lack of precision). That is no longer the case :)
 
This is how I did test PCM56 with a 768 kHz stream:

An externally hosted image should be here but it was not working when we last tested it.


An externally hosted image should be here but it was not working when we last tested it.


An externally hosted image should be here but it was not working when we last tested it.


This is what a reconstructed 20 kHz sine wave looks like with a simple 2nd-order Sallen-Key low pass filter:

An externally hosted image should be here but it was not working when we last tested it.


And this is what 10 kHz (far from 20 kHz) looks like on a NOS zero-order hold DAC (one of my old projects):

An externally hosted image should be here but it was not working when we last tested it.


Obviously a DAC on a universal board is not a proper DAC, but it was only for test purposes :) Anyway, as you can see, the frequency LRCK is operating at is always constant; it depends on the provided MCLK signal. This is how latching works - it is always constant and as jitter-free as possible (depending on what kind of MCLK signal you provide).

I did study all possible approaches to DACs. That includes NOS as zero-order hold and "NOS" as first-order hold done in hardware (the same approach as in the well-known thread Building the ultimate NOS DAC using TDA1541A). There is a reason why this custom digital filter was created and it certainly wasn't done on a whim ;)
 
Two rather basic features that appear to be missing from almost all digital filter chips on the market are headroom for intersample overshoots and filter responses that actually prevent imaging from the Nyquist frequency onwards, not just imaging above 0.55 fs. What is your take on that?

Besides, did you include an optional apodizing filter for those who believe in apodization?
 
There is an internal attenuation of the signal by 1 dB. Also, take a look at the first post - I did include a white noise test in there. As you can see, for a 44.1 kHz input the filter starts attenuating at 20 kHz, resulting in way over 140 dB of attenuation by 20.5 kHz. That is far below the Nyquist frequency of 22.05 kHz.

Regarding apodizing - I assume you mean the window function to reduce pre-ringing. The current set of coefficients was created using a Kaiser window. However, I do plan to implement a minimum phase filter as well, but I don't believe it will be switchable, so it will be just a fixed set of coefficients programmed into the device according to someone's preferences.
 
Anyway, speaking of minimum phase filter and users who do not like pre-ringing as mentioned. Below is a conversion from a L=16384 taps linear phase filter to minimum phase filter with L=8192:

An externally hosted image should be here but it was not working when we last tested it.


White noise @ 44.1 kHz (a bit cut out, sorry about that):

An externally hosted image should be here but it was not working when we last tested it.


It's pretty much the same as on the chart.

20 kHz:

An externally hosted image should be here but it was not working when we last tested it.


That's a PCM58.

All measurements are pretty much the same as with the linear phase filter. However, certainly it will sound different ;)
 
I'm looking forward to feedback from buyers, ideally using an AD1865 and / or TDA1541 based DAC. (My I/V conversion is done using transformers designed for dual balanced DACs in NOS mode.)
 
As usual I support any attempt at making something new and a bit more advanced, even if it does not always work as intended so count me in for at least one :) I have several DAC boards from Mark Levinson based on PCM1702 and PCM1704 so it would be fun to try. When do you plan the second run?
 
In a few days I expect to order a new batch of PCBs ;) There were a few minor changes like moving the 50 MHz clock from the bottom to the top - just a visual thing - and a fix in the footprint of an SPI memory.

The question is which filter to program into the device - whether it should be linear phase or minimum phase. Both of them attenuate the signal by more than 100 dB at 20.5 kHz (about 110 dB for the minimum phase filter and way beyond 140 dB for the linear phase filter), so you no longer have to choose between good attenuation and no pre-ringing ;)

I suppose I will leave it up to you when you order it. Both filters will sound different due to their nature.
 
Great! There are recordings that need more than 1 dB of headroom to prevent intersample clipping although these are relatively rare, see

Audio That Goes to 11 - Benchmark Media Systems, Inc.

Intersample Overs in CD Recordings - Benchmark Media Systems, Inc.

Regarding apodizing, I meant a filter with little pre-ringing and a rather smooth response that has a stopband that begins just below the start of the transition bands of the other filters in the record-reproduce chain, say around 0.45 fs. This will limit the bandwidth a bit, but that may not matter too much for high sample rate recordings.

The idea is that when all other filters have a flat response and linear phase up to 0.45 fs and no aliasing below 0.45 fs, a single apodizing filter somewhere in the chain will determine the impulse response of the entire chain and filter away all aliasing products. Peter Craven wrote an interesting article about this: Peter G. Craven, "Antialias filters and system transient response at high sample rates", Journal of the Audio Engineering Society, vol. 52, no. 3, March 2004, pages 216...242.

An apodizing filter could be a minimum-phase filter, but there are other options, an asymmetrical Wilkinson filter for example, or even a short symmetrical FIR filter with a large transition band.
 
Lovely stuff ;) I will gladly have a read myself. However, it should be mentioned that we are talking about the coefficients of the filter. They are changeable and I don't really mind loading a custom set of coefficients for someone as long as they are valid (8192 taps and limited bandwidth for sixteen polyphases). Anyway, according to the article you provided, the issue with intersample clipping can be solved by attenuating the signal by 3.5 dB. It can be easily compensated for in the analog domain. They even mention so:

"We also drive the ESS D/A converter chips at -3.5 dB so that no clipping will occur inside the ES9018 and ES9028PRO converter chips."

That's again a thing you can do with filter coefficients. It's not like you need to attenuate the signal itself. You just need to properly scale filter coefficients to achieve a lower total gain from the filter. It's even better that way because you do not lose precision of the signal itself.
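
As a small illustration of scaling the coefficients rather than the signal, the sketch below applies a 3.5 dB gain reduction to a coefficient list. The tiny coefficient set is made up for the example and `scale_for_headroom` is a hypothetical name; a real set would then be re-quantized to the 35-bit coefficient format.

```python
def scale_for_headroom(coeffs: list[float], headroom_db: float) -> list[float]:
    """Scale every coefficient so the filter's overall gain drops by headroom_db."""
    gain = 10 ** (-headroom_db / 20)      # 3.5 dB corresponds to about 0.668
    return [c * gain for c in coeffs]


# Made-up toy coefficients, only to show the effect on the DC gain.
coeffs = [0.05, 0.2, 0.5, 0.2, 0.05]
scaled = scale_for_headroom(coeffs, 3.5)

print(sum(coeffs), sum(scaled))           # DC gain before and after scaling
```

Because the filter is linear, scaling all coefficients by one constant lowers the gain at every frequency by the same amount without touching the input samples.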

Getting back on track - below are some photos from testing a minimum phase filter:

Square wave:

DFIxg9X.png


Zoomed in rising edge of the wave:

B3LBSBM.png


Zoomed in filter coefficients:

An externally hosted image should be here but it was not working when we last tested it.


Reconstruction of a 20 kHz wave:

An externally hosted image should be here but it was not working when we last tested it.


Both filters work quite well, whether linear phase or minimum phase. The number of coefficients is enough to reconstruct a wave and filter everything out.
 
This is a very interesting project! :) I have two questions. One is about the configuration of the FPGA. I guess there is an EEPROM on the solder side that is configured through JTAG. How do you intend to do the configuration? Do you accept reconfiguration by a user?

The second is why you need a 225 MHz clock to calculate 8192 taps at a 48 kHz sample rate. Your 32x35 multiplier is probably the most effective size that can be implemented with four DSPs of the XC6SLX09. But I'm sure you can do such a convolution with a 197 MHz (48000*4096) clock. A higher clock ends up in higher power consumption. Is there any advantage to using a 225 MHz clock?
 
It sure might be nice to have a similar 8x oversampling interpolation filter with I2S output for use with other types of dacs like ESS Sabre parts. Even better would be to include an option to convert PCM input to DSD out. :)

The primary use of this digital filter is with R-2R DACs. In general it is achievable to make this filter 8x with bandwidth limited to 384 kHz, but I don't find that nearly as useful as it is with R-2R DACs ;)

That's a very good question. On the bottom there is a serial Flash memory for the configuration. If someone wants to update the bitstream due to some changes after the assembled PCB was sent, I encourage them to do so, since they wouldn't have to send it back for an update from my side. Regarding the speed of the device, in general that's a good assumption considering the fact that you use two 32x35 MAC units (8x DSP48A1 slices) to compute that, so only 4096 multiplies are required. However, you should keep in mind that in order to achieve that speed you need to pipeline, and you need to pipeline a lot with a Spartan-6. It turns out that due to latency about 210 MHz is required to compute everything. 225 MHz was chosen to avoid exotic values of M and D of the PLL.
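
The clock figures in this exchange can be turned into a cycle budget per input sample. The sketch only restates the numbers from the two posts (4096 multiplies per sample period at 48 kHz, about 210 MHz required, 225 MHz chosen); reading the gap as pipeline overhead in cycles is my own interpretation.

```python
FS_HZ = 48_000            # input sample rate used in the question
MAC_CYCLES = 4096         # multiplies per sample period quoted in both posts


def cycles_per_sample(clock_hz: float) -> float:
    """Clock cycles available during one 48 kHz sample period."""
    return clock_hz / FS_HZ


min_clock_hz = FS_HZ * MAC_CYCLES                                 # the "197 MHz" figure
overhead_cycles = cycles_per_sample(210e6) - MAC_CYCLES           # implied by ~210 MHz
margin_cycles = cycles_per_sample(225e6) - cycles_per_sample(210e6)

print(f"minimum clock: {min_clock_hz / 1e6:.3f} MHz")
print(f"cycles beyond the 4096 MACs at 210 MHz: {overhead_cycles:.0f}")
print(f"extra cycles gained by running at 225 MHz: {margin_cycles:.1f}")
```

So the quoted 210 MHz corresponds to roughly 280 cycles per sample on top of the raw multiplies, and 225 MHz adds a further margin of about 310 cycles.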

Anyway, in general it would be possible to have L=16384 FIR symmetric filter (linear phase filter since minimum phase filter is non symmetrical) within XC6SLX9, but that's not doable with a polyphase implementation since you do lose the symmetry within polyphases. As far as I know it is only possible to re-gain symmetry under some conditions which simply do not apply here due to interpolation / decimation factors.
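
The loss of symmetry inside the polyphase branches is easy to see on a toy example. The sketch below uses a made-up 16-tap symmetric filter split into 4 phases instead of 8192 taps and 16 phases: no single branch is symmetric, although each branch is the mirror image of another one.

```python
def polyphase(h: list[int], phases: int) -> list[list[int]]:
    """Split h into `phases` sub-filters: branch k holds h[k], h[k + phases], ..."""
    return [h[k::phases] for k in range(phases)]


# Toy symmetric (linear phase) filter, h[n] == h[N - 1 - n].
half = [1, 2, 3, 4, 5, 6, 7, 8]
h = half + half[::-1]

branches = polyphase(h, 4)
for k, branch in enumerate(branches):
    symmetric = branch == branch[::-1]
    mirrors = branch[::-1] == branches[len(branches) - 1 - k]
    print(k, branch, "symmetric:", symmetric, "mirror of opposite branch:", mirrors)
```

A direct-form symmetric filter can pair up taps and halve the multiplies, but a branch such as [1, 5, 8, 4] has no equal taps to pair, which is why the trick does not carry over directly to a polyphase structure.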
 
All right, a new batch of PCBs went into the production ;)

What could have been done with this filter has been achieved - I did put two sets of coefficients within the filter! Both of them have 8192 taps, but one is linear phase and the other one is minimum phase. I did not know whether I would be able to achieve that, since meeting timing requirements is a pain in the ***, so I didn't mention anything, but it has been done nonetheless ;) There is nothing more we can fit into the device - it is running at 97% capacity of the block RAM resources.
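
As a back-of-the-envelope check on the block RAM figure, the sketch below compares the raw storage for two coefficient sets against the total block RAM of an XC6SLX9. The 32 blocks of 18 Kb each are my assumption from the Spartan-6 family documentation, and the real design also needs sample memory and FIFOs, so this is not a statement of how the memory was actually allocated.

```python
TAPS = 8192
COEFF_BITS = 35
COEFF_SETS = 2                        # linear phase + minimum phase

# Assumption: XC6SLX9 provides 32 block RAMs of 18 Kb (18 * 1024 bits) each.
BRAM_BLOCKS = 32
BRAM_BITS_PER_BLOCK = 18 * 1024

needed_bits = TAPS * COEFF_BITS * COEFF_SETS
available_bits = BRAM_BLOCKS * BRAM_BITS_PER_BLOCK

print(f"coefficients: {needed_bits} bits of {available_bits} bits "
      f"({100 * needed_bits / available_bits:.1f} %)")
```

Under these assumptions the two coefficient sets alone account for most of the device's block RAM, which fits with the remark that nothing more can be squeezed in.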

I did add a "ROM" jumper on the PCB which selects which filter will be loaded during boot process (power-up or reset by a switch). That is whether it will be a linear phase filter or a minimum phase filter. Also, there are two extra LEDs indicating which filter was loaded during configuration.
 
Hi, 3lite. I highly appreciate your achievement. I usually don't use the platform PROMs supplied by Xilinx because they are very expensive. Even the small XCF01s (1Mbytes) for the XC6SLX09 ($18) probably costs $7. If you use an XC6SLX25 ($40), the PROM for it costs an incredible $30 or so!!! :confused: Instead, a cheap SPI flash is available for less than $1. Another advantage of a cheap SPI flash is that it can act as a data buffer for several sets of coefficients. You can store many coefficient sets in an unused area and upload them into the coefficient memory inside the FPGA. You can program SPI flashes with a cheap PROM writer.

Like you, I am often uncertain whether I can fit all functions into the device or not. The most difficult thing about designing for an FPGA is "shrinking," IMHO. A two-channel x16 oversampling filter in an XC6SLX09 deserves a gold medal in my experience. I'm sure the multipliers and block RAMs are almost fully used. I imagine fast carry adder usage and timing constraints were a tough challenge. Anyway, you did a great job. :)