True 32 bits of SPDIF interface (new IC specification) ??

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
there's no difference in quality going from, say, 44.1->88.2 versus 44.1->96K
Actually, not quite. The output samples contain the antialiasing filter's impulse response in addition to the input. As most ASRCs use brickwall linear phase antialiasing, the output contains a substantial sinc function contribution from the antialiasing. The result is that integer multiple resampling sounds better: it samples close to the sinc's zeros and therefore contains less of the error inserted by antialiasing. This is easy to mitigate by employing a more balanced tradeoff between time domain and frequency domain behavior in the antialiasing filter, which is what makes the parts here interesting.

Probably the simplest way to experiment with this tradeoff is using SoX; send in an impulse and plot the outputs of different antialiasing settings. When I was ABX testing this I found I needed just about the slowest rolloff, minimum phase filter SoX offered in addition to the right kind of sample rate shifts to be unable to distinguish between original and resampled output. 44.1 -> 48 is, in particular, abominable; IMO it's a crime against music.
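For anyone without a resampler at hand, the sinc-zero argument can be roughed out numerically. The sketch below evaluates an ideal sinc at the output sample instants for an impulse at t = 0. It assumes an infinitely long brickwall filter cut at the input Nyquist, which no real ASRC or SoX setting matches exactly, and `sinc` and `resampled_impulse` are made-up helper names rather than anything from SoX.

```python
import math

def sinc(t):
    """Normalised sinc, sin(pi*t)/(pi*t); zero at every nonzero integer t."""
    if t == 0:
        return 1.0
    return math.sin(math.pi * t) / (math.pi * t)

def resampled_impulse(in_rate, out_rate, n_out=8):
    """Ideal-sinc reconstruction of a unit impulse at t = 0, read at the
    output sample instants n = -n_out .. n_out (t in input sample periods)."""
    step = in_rate / out_rate
    return [sinc(n * step) for n in range(-n_out, n_out + 1)]

for out_rate in (88200, 48000):
    taps = resampled_impulse(44100, out_rate)
    print(out_rate, [round(v, 4) for v in taps])
```

At 88.2k every second output sample falls on a zero of the sinc (the original input instants) and the ones in between carry interpolated values. At 48k none of the nearby output instants fall on a zero. This says nothing about audibility, it only shows where the samples land.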

Any good engineer will tell you to sample at a multiple of your finished product's rate. Fewer errors . . .
Yup. Also, 24+ bit files mitigate errors like phase quantization---first time I did the math and found 16 bits wasn't enough to handle well scaled data moving through an allpass filter with sufficient accuracy to avoid audible phase errors I stared at it in disbelief.

I'd record in 176.4 if my interface supported it. But it doesn't, so 88.2 it is. Impulse resampling in ASRC is enough of a problem I wouldn't at all mind hardware which split the signal and recorded concurrently at a 44.1 integer multiple for redbook audio and a 96 integer multiple for the high definition formats around DVDs.

I would think it comes back to not having your DAW have to sample rate convert on the fly as it reads audio files from your hard drive, especially if you are doing a high track count (only PT-HD loads complete audio files into RAM for your mixing session).
That's some of it. Another consideration: if you're manipulating low frequencies with EQ patches and such, up to 48 bit fixed point DSP is needed to prevent numerical errors creeping into 16 bit output. 64 bit is overkill from a fixed point precision standpoint (it's used because it's more efficient at the hardware level than 48 bit on nearly all processors), but it takes 64 bit floating point to match 48 bit fixed point.
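The float versus fixed comparison comes down to significand width, which is easy to check. A rough sketch, assuming Python's `float` is an IEEE 754 double (true on essentially every current platform); `fits_exactly` is a hypothetical helper name.

```python
import sys

def fits_exactly(n):
    """True if integer n survives a round trip through a 64 bit float."""
    return int(float(n)) == n

# A double carries a 53 bit significand, so any 48 bit fixed point
# word fits without rounding.
print(sys.float_info.mant_dig)        # significand bits of a Python float
print(fits_exactly(2**47 - 1))        # largest 48 bit signed value
print(fits_exactly(-(2**47)))         # most negative 48 bit signed value

# A full 64 bit fixed point word does not fit in a 64 bit float.
print(fits_exactly(2**63 - 1))
```

So a 64 bit float covers 48 bit fixed point with a few bits to spare, while a 32 bit float, with its 24 bit significand, falls well short.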
 
Actually, not quite. The output samples contain the antialiasing filter's impulse response in addition to the input. As most ASRCs use brickwall linear phase antialiasing, the output contains a substantial sinc function contribution from the antialiasing. The result is that integer multiple resampling sounds better: it samples close to the sinc's zeros and therefore contains less of the error inserted by antialiasing.
ASRCs don't necessarily have to use a linear phase FIR, you can make one with any impulse response you want. In most applications linear phase is desirable - especially if the input is riding close to 0dBFS, as altering the relative group delay of different frequencies can cause peaking.

If you can go 44.1KHz to 88.2KHz, or 44.1KHz to 176.4KHz without any "antialiasing error", there's no reason you can't upsample 44.1KHz by a factor of 320 to get a 14.112MHz sample rate without any of the "error". Now take every 147th output sample of that 14.x MHz and throw away the rest - you've got a 96KHz audio stream. Where's the error?
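The 320 and 147 figures are just the reduced fraction of the two rates, which can be checked in a couple of lines. This is only the arithmetic, not a resampler; `resample_ratio` is a made-up helper name.

```python
from fractions import Fraction

def resample_ratio(in_rate, out_rate):
    """Return (up, down) so that out_rate == in_rate * up / down in lowest terms."""
    ratio = Fraction(out_rate, in_rate)
    return ratio.numerator, ratio.denominator

up, down = resample_ratio(44100, 96000)
intermediate_rate = 44100 * up            # rate after upsampling by `up`
print(up, down, intermediate_rate, intermediate_rate // down)

# The same calculation for the 44.1 -> 48 case discussed earlier.
print(resample_ratio(44100, 48000))
```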

The effect of any properly implemented resampling system is the same as filtering the input samples with a continuous time filter with the 22K/22.05K (or whatever) response, then sampling the continuous time result at the output sampling rate.

I do this stuff for a living, got plenty of ASRC related design experience under my belt.
 
I do this stuff for a living, got plenty of ASRC related design experience under my belt.
Cool. So do you know an ASRC implementation which yields samples of ...000010000... out when presented with an input of ...000010000...? Or which comes reasonably close whilst maintaining reasonable frequency domain characteristics?

Not exactly short on DSP experience here either; post 23 reads like there's an assumption of alias free downsampling. Pretty neat trick if one can do it.
 
All you need for alias free downsampling is to ensure your interpolation filter removes frequency content above the Nyquist frequency of the output rate.

Getting 0001000 out of an interpolator from 0001000 means your filter is 1 tap and you're zero stuffing the input, causing everything below Nyquist to alias to higher frequencies. That's a pretty crappy interpolator.
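A small numerical sketch of the zero stuffing point, using a plain DFT so it needs nothing beyond the standard library. The 16 sample test tone at bin 3 and the helper names `zero_stuff` and `dft_magnitudes` are arbitrary choices for illustration.

```python
import cmath
import math

def zero_stuff(x, factor):
    """Insert factor-1 zeros after every sample (a 1 tap 'interpolator')."""
    out = []
    for sample in x:
        out.append(sample)
        out.extend([0.0] * (factor - 1))
    return out

def dft_magnitudes(x):
    """Magnitude of the plain O(n^2) DFT of x."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n))) for m in range(n)]

# An impulse goes through untouched, which is the 0001000 -> 0001000 case.
print(zero_stuff([0, 0, 1, 0, 0], 2))

# A tone at bin 3 of a 16 sample block, zero stuffed to twice the rate.
N = 16
tone = [math.cos(2 * math.pi * 3 * k / N) for k in range(N)]
mags = dft_magnitudes(zero_stuff(tone, 2))
peaks = [m for m, v in enumerate(mags) if v > 1.0]
print(peaks)
```

In the 32 point spectrum, bin 3 is the original tone and bin 13 is its copy above the old Nyquist (bin 8); bins 19 and 29 are the negative frequency mirrors. Without a filter after the zero stuffing, that copy stays in the output.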
 
Audio data doesn't meet the alias free criterion, particularly in the 44.1 space. Hence the problem. I'm glad you agree an impulse preserving interpolator is a poor choice. The corollary of these two conditions is a rate converted output cannot be error free.

In mastering the practical question is what configuration minimizes the error. The answer most everyone's converged on is SRC rather than ASRC. In interconnect---which is what this thread is about---the question's whether ASRC or the jitter shaping of an elastic buffer PLL yields lower error. The PLL is difficult to beat as it's a couple hundred dB down.
 
Pro Tools 11 is using a 64 bit engine now, so the plugins and DSP are in 64 bit AAX format. PT11 offers to save your 24 bit recording as 32 bit float by default. I would think it comes back to not having your DAW have to sample rate convert on the fly as it reads audio files from your hard drive, especially if you are doing a high track count (only PT-HD loads complete audio files into RAM for your mixing session).

24 vs 32 bit has nothing to do with sample rate. Yes, it makes sense to use 32 bits as an internal processing format, but it doesn't make any sense to use it as distribution or playback format, as a 32 bit float only has the same precision as a 24 bit integer anyway. A 32-bit float has a 24-bit significand (that determines the precision) and an 8-bit exponent that is only used if your data is not normalized.
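The 24 bit significand claim is easy to check by rounding values to 32 bit float and back. A minimal sketch using only the standard library; `to_float32` is a hypothetical helper built on `struct`.

```python
import struct

def to_float32(x):
    """Round x to the nearest IEEE 754 single and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

# The extremes of a 24 bit signed sample survive unchanged.
print(to_float32(2**23 - 1) == 2**23 - 1)
print(to_float32(-(2**23)) == -(2**23))

# One bit more than the significand can hold gets rounded away.
print(to_float32(2**24 + 1) == 2**24 + 1)

# The exponent only moves the 24 bit window up or down in level:
# the same 24 bit pattern scaled 40 bits down is still exact.
quiet = (2**24 - 1) * 2.0**-40
print(to_float32(quiet) == quiet)
```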

It is not about headroom and audio noise floors. It is about sample and DSP accuracy, and reducing the workload on your audio engine. Many plugins sound/work much better at higher resolution, even though the final product is going to be stone age 16bit/44.1k format.
We agree. You want headroom/increased accuracy for processing, but once your signal is processed, there is nothing to be gained from 8 empty exponent bits in every sample.

Cirrus makes 32 bit AD chips now.
Someone is selling 32 bit AD/DA hardware.
That is what doesn't make sense. What SNR do they get?

What is the maximum dynamic range you have ever seen in any of the material you have recorded?
 
DSD / DoP

===============================================

CT-7301C Audio SRC Bridge : True 32 bits of SPDIF

===============================================

Key Features:

DSD / DoP

this feature means :

. DSD support: 1x/2x/4x modes (8x with DoP-to-PCM).

. DoP support: 1x/2x/4x modes over I2S.

. DoP support: 1x/2x modes over SPDIF.

. DSD / DoP to PCM converter.


ComTrue Inc. - Contact

ComTrue Inc. - Downloads
 
24 bits is an almost impossible ask in a 20 kHz bandwidth, just on the thermal noise performance required; 21 bits or so is about as good as anything actually gets, and even 20 bits needs work to pull off.
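To put a rough number on the thermal noise point, here is a back-of-envelope sketch. It assumes the ideal quantiser figure of 6.02N + 1.76 dB for a full scale sine, a 2 Vrms full scale level, 300 K and a 20 kHz noise bandwidth. All four are illustrative assumptions rather than figures for any particular converter, and the function names are made up.

```python
BOLTZMANN = 1.380649e-23  # J/K

def ideal_dynamic_range_db(bits):
    """Ideal quantiser SNR for a full scale sine: 6.02*N + 1.76 dB."""
    return 6.02 * bits + 1.76

def noise_floor_vrms(bits, full_scale_vrms=2.0):
    """RMS noise allowed if the analogue side is to match `bits` of range."""
    return full_scale_vrms / 10 ** (ideal_dynamic_range_db(bits) / 20)

def max_source_resistance(bits, full_scale_vrms=2.0,
                          bandwidth_hz=20_000.0, temp_k=300.0):
    """Resistance whose Johnson noise, v^2 = 4kTRB, alone uses up the budget."""
    v = noise_floor_vrms(bits, full_scale_vrms)
    return v * v / (4 * BOLTZMANN * temp_k * bandwidth_hz)

for bits in (16, 20, 21, 24):
    print(bits,
          round(ideal_dynamic_range_db(bits), 2), "dB,",
          round(noise_floor_vrms(bits) * 1e9, 1), "nV rms,",
          round(max_source_resistance(bits), 1), "ohm")
```

Under those assumptions the total equivalent noise resistance allowed at 24 bits comes out at a few tens of ohms, before any amplifier or reference noise is counted, while 20 bits leaves a few kilohms.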

Further I have never been in a room good enough to provide 21 + bits of dynamic range with the peak level being anything I would care to be exposed to.

32 bits is total wishful thinking for an ADC or DAC in an audio bandwidth (And I would hate to try it even in an instrumentation bandwidth, even dissimilar metal junctions would have you at this level of precision), further there can be no such thing as a 32 bit spdif interface because the spdif standard is only specified to 24 bits....

You could make an interface sort of like spdif that did 32 bits, but it would not, and could not be spdif (And also rather pointless, no hardware exists to convert that to analogue with sufficient precision).

Regards, Dan.
 
Further I have never been in a room good enough to provide 21 + bits of dynamic range with the peak level being anything I would care to be exposed to.

I don't think the room is the best focus. There's a tendency to think of digital running at '0 dB', potentially true only for the type of modern music production that makes terms like 'fidelity' irrelevant. The gif is a fresh snap of mildly processed cold voice taken from a live 44.1/24 digital broadcast system. Peaks are around -10 dBFS. A live acoustic event would require much more headroom.
The FFT filtering lowers the display level, but the median for a controlled, compressed and limited audio source is still -20 dBFS. Properly recorded classical probably spends most of its time in the -40 dBFS range.
 

Attachment: processed_voice.gif
I don't think the room is the best focus. There's a tendency to think of digital running at '0 dB', potentially true only for the type of modern music production that makes terms like 'fidelity' irrelevant. The gif is a fresh snap of mildly processed cold voice taken from a live 44.1/24 digital broadcast system. Peaks are around -10 dBFS.

That might be true for a voice broadcast. Any music recording, even classical, will be normalized so that peaks hit close to full scale.
 
That might be true for a voice broadcast. Any music recording, even classical, will be normalized so that peaks hit close to full scale.

Normalization is a post AD process, nearly impossible to achieve ahead of the recorder. The peak to average ratio of the processed voice used for that graph is also much lower than would be expected from live acoustic performances. -40 dBFS might be optimistic.
 
further there can be no such thing as a 32 bit spdif interface because the spdif standard is only specified to 24 bits....

As far as I know, not only 32 bit but 384 kHz is also problematic.
If I play 24/192 music, the ES9018 plays it well, but when I try 24/384 I get nothing. I checked the Musiland's output with a scope and it does seem to be sending the 24/384 music, as the square wave period is roughly half that of 24/192.
So I assume it wouldn't be a problem to have spdif with 32/384, but at the moment DACs don't seem to accept it because they are not configured for it, although they would be able to play it.
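The halving seen on the scope is what the framing arithmetic predicts. The sketch below assumes standard IEC 60958 framing (two 32 bit subframes per sample period, biphase mark coded so each bit takes two cells) and assumes the Musiland keeps that framing at 384 kHz, which is a guess. The function names are made up.

```python
SUBFRAME_BITS = 32        # time slots per subframe, of which 24 carry audio
SUBFRAMES_PER_FRAME = 2   # one subframe per channel

def spdif_bit_rate(sample_rate_hz):
    """Line bit rate in bit/s: one frame per sample period."""
    return sample_rate_hz * SUBFRAME_BITS * SUBFRAMES_PER_FRAME

def biphase_cell_ns(sample_rate_hz):
    """Shortest interval between transitions after biphase mark coding, in ns."""
    return 1e9 / (2 * spdif_bit_rate(sample_rate_hz))

for fs in (44100, 192000, 384000):
    print(fs, spdif_bit_rate(fs) / 1e6, "Mbit/s,",
          round(biphase_cell_ns(fs), 2), "ns per cell")
```

The subframe has room for only 24 audio bits, so carrying 32 bit samples would need a different frame layout, not just a faster clock.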
 
Normal broadcast alignment levels are generally around -20dBFS quasi peak, plus minus a few depending which country you are in, so yea a classical orchestra playing pp might be 50 or 60dB below full scale...

So What?

Even with ~20 bits you have ~110++ dB dynamic range so that 50dB below full scale is still 60dB over the noise floor of the electronics, but probably only 30dB over the clothing rustle, breathing and aircon noise.....

At 110dB or thereabouts the microphones are running out of dynamic range, at least for the usual small diaphragm condensers used for this purpose, which puts a limit on what the ADC needs to do even if you ignore the room.

Regards, Dan.
 