Hi JWR,

First, a quick point:

You mention converting to a 196kHz sample-rate. This isn't standard. I'd put it down to you mistyping 192kHz (which is the standard) but you did it several times, so it seems this is your actual aim. I'd recommend you double-check this!

Hmmm, I'm a little confused - convolving by a sinc function is about as textbook as it gets. However, the implementation isn't usually described the way you have. As you're probably aware, the sinc function performs a perfect band-limited interpolation between the samples, which can be evaluated at any point and so allows you to resample the stream at whatever frequency you wish. The problem of course is that the sinc function has infinite support and must be windowed in some fashion. While you can get arbitrarily close by using a longer window, you can never reach the exactly resampled result.
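To make that concrete, here is a minimal Python sketch of windowed-sinc interpolation at an arbitrary fractional position. The Hann window, the half-width of 32 samples and the function names are my own assumptions for illustration, not anything taken from your code.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def hann(x, half_width):
    """Hann window centred on 0, zero for |x| >= half_width (assumed choice)."""
    if abs(x) >= half_width:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * x / half_width))

def interpolate(samples, t, half_width=32):
    """Estimate the band-limited signal at fractional sample position t."""
    # Only samples within half_width of t can contribute.
    first = max(0, math.floor(t) - half_width + 1)
    last = min(len(samples) - 1, math.floor(t) + half_width)
    total = 0.0
    for n in range(first, last + 1):
        d = t - n  # distance from sample n, in input samples
        total += samples[n] * sinc(d) * hann(d, half_width)
    return total
```

Evaluating `interpolate(samples, m * 44100 / 192000)` for each output index `m` would give a direct (but slow) resampler; a larger `half_width` trades speed for accuracy.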

In addition to length, there is a variety of windowing functions that are applicable. The rectangular window is optimal in terms of mean-squared error (i.e. error power) with respect to window length, but does not perform too well in the time-domain (i.e. pre- and post-echo). Some work has been done on optimising this sort of calculation against criteria other than MSE - I can dig out some papers if you are interested in this.

You report you are calculating "millions" of points - exactly how much of the sinc function are you using?

[ My guess is that you are using a weighted sinc function for every sample in the track - this is far more than necessary. I do not remember offhand, but the sinc function will be more than 120dB down in level after probably a few hundred cycles, so going beyond this isn't going to increase your accuracy much. It is a judgement call though, so take it as far as you wish. ]

Now, in my view the key to moving forward is to realize that your current method is exactly equivalent to an FIR filter interpolation using a sinc kernel. For this sort of interpolation, as you noted, you require a periodicity. However, there *is* a periodicity in the conversion from 44.1kHz to 192kHz - which is 640/147. So, the conversion can be considered as an upsampling by an integer factor of 640 times, followed by a downsampling by an integer factor of 147 times. If you consider that in its raw form, you have a long FIR filter running at 640Fs! But closer inspection shows that the vast majority of input samples are zeros and don't contribute anything to the result, so you can get there much more cheaply by just not calculating for them. The standard structure to do this is called a polyphase filter, and is quite straightforward. It is documented all over the place. I would recommend you look at this approach, as it combines the arbitrary precision of the sinc implementation with the cheapness and ease of a fixed filter kernel.
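For what it's worth, the polyphase idea can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions (a Hann-windowed sinc kernel, a half-width of 16 samples, and the hypothetical names `build_phases` and `resample`); a real converter would use a properly designed kernel.

```python
import math
from fractions import Fraction

def windowed_sinc(x, half_width):
    """Hann-windowed sinc (assumed kernel), zero for |x| >= half_width."""
    if abs(x) >= half_width:
        return 0.0
    s = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    return s * 0.5 * (1.0 + math.cos(math.pi * x / half_width))

def build_phases(up, down, half_width=16):
    """Split the kernel into `up` sub-filters, one per output phase."""
    scale = max(up, down)          # cutoff sits at the lower Nyquist
    gain = up / scale              # makes up for the zero-stuffing loss
    reach = math.ceil(half_width * scale / up)
    phases = [
        [gain * windowed_sinc((p + i * up) / scale, half_width)
         for i in range(-reach, reach + 1)]
        for p in range(up)
    ]
    return phases, reach

def resample(x, up, down, half_width=16):
    """Resample x by up/down, touching only the non-zero input samples."""
    phases, reach = build_phases(up, down, half_width)
    y = []
    for m in range((len(x) * up) // down):
        # Output m sits at position m*down on the upsampled grid.
        n0, p = divmod(m * down, up)
        acc = 0.0
        for i, tap in enumerate(phases[p], start=-reach):
            n = n0 - i
            if 0 <= n < len(x):
                acc += tap * x[n]
        y.append(acc)
    return y

# 192000/44100 reduces to 640/147, the ratio discussed above.
ratio = Fraction(192000, 44100)
up, down = ratio.numerator, ratio.denominator
```

Calling `resample(track, up, down)` never builds the 640Fs stream: each output sample picks one of the 640 precomputed sub-filters and runs it over a few dozen real input samples.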

I appreciate the fun of deviating from the mainstream, but in this case, there's a reason that polyphase filters are the norm!

In any case, the trick now becomes designing the "best" filter kernel for the job, and there's plenty of fun to be had there. Nobody quite knows what to aim for in the properties of these filters, so it's still an active research topic.