Asynchronous Sample Rate Conversion

Actually I think a SUMMARY POST is in order :) Key points :

- Asynchronous Sample Rate Conversion is really an ANALOG problem, because there is no RATIONAL relationship between input & output sample points, in time.

- We can create a good analog signal DIGITALLY through massive interpolation of the input data, followed by a simple holding function.

- To save computation, interpolation with an FIR filter makes sense. And we consider an FIR filter to consist of several sub-filters, which we call polyphase filters.

- Each output word is computed by a simple convolution of stored input data with the correct polyphase. The whole trick of ASRC is simply selecting the right polyphase when an output sample is required (a small sketch follows this list).

- We can achieve the precision required in polyphase selection, even with reasonably available clocks, through an AVERAGING operation I've called a Polyphase Locked Loop. It's the low-pass dynamics of this PPLL that dictate how the ASRC responds to jitter.
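
To make the last two points concrete, here is a rough Python sketch of how ONE output word could be computed. The tap count, phase count and helper names are illustrative only (not any particular chip); a real ASRC uses a far longer prototype filter and many more phases, and the fractional position would come from the PPLL rather than being handed in directly.

```python
import numpy as np
from scipy.signal import firwin

# Illustrative numbers only: 64 taps per phase, 256 phases.
TAPS_PER_PHASE = 64
N_PHASES = 256

# Prototype anti-image low-pass, cut off at the ORIGINAL Nyquist band, then
# split into N_PHASES polyphase sub-filters of TAPS_PER_PHASE taps each.
proto = firwin(TAPS_PER_PHASE * N_PHASES, 1.0 / N_PHASES) * N_PHASES
polyphase = proto.reshape(TAPS_PER_PHASE, N_PHASES).T   # shape (N_PHASES, TAPS_PER_PHASE)

def output_word(history, frac):
    """One output sample: convolve the 64 stored input words with the ONE
    polyphase selected by 'frac' -- the output edge's fractional position
    (0..1) between input samples, which is the quantity the PPLL estimates."""
    phase = int(frac * N_PHASES) % N_PHASES
    return float(np.dot(polyphase[phase], history))

# Usage: 64 most recent input words; the output clock edge lands 37% of the
# way between two input sample instants.
history = np.random.randn(TAPS_PER_PHASE)
print(output_word(history, 0.37))
```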

That's it :) Jitter is up next.
 
There is "sort of" a sentiment on this forum that asrc will irreversible build incoming jitter into its output samples.

My experience of the 8420 does not support this, in fact quite the opposite, and I for one am very keen to understand if, and possibly how, incoming jitter affects the rate-converted samples.
 
Fantastic thread, thanks!

I realize it's too soon for this topic (or perhaps not for this thread at all) but something you said really piqued my curiosity:

"the Asynchronous Sample Rate Converter in fact DOES convert the signal to analog ... but it does it's job COMPLETELY in the digital domain !"

What does this imply for altering the bit rate?
 
A 8 said:
There is "sort of" a sentiment on this forum that asrc will irreversible build incoming jitter into its output samples.

No kidding. I think this was started by salesmen of jitter-related snake oil. Here's what they have to say:
If you use an ASRC as a jitter attenuation device, the jitter at the input of the ASRC will be distributed into the output signal-data, and what was simple clock-jitter at the beginning is now forever glued to your digital audio signal; it has become something comparable to sampling jitter.
 
ScottG - I don't think I understand your question ... may I please ask you to elaborate?

A 8 & Prune - allow me to begin the discussion of jitter, by first casting ASRC in the WORST possible light. The above quote is fundamentally correct :(

A couple simple, logical points :

1. Does the ASRC completely ELIMINATE all forms of jitter from the incoming (e.g. S/PDIF) stream? NO. It substantially, heavily FILTERS the jitter ... but does not eliminate it altogether.

2. Where do we "find" the residual jitter? Well, it certainly isn't on the output clock ... that's ultra-clean (by definition), derived from a local crystal oscillator. Only leaves one place ... the output data. So must we conclude that, after being heavily filtered, the incoming TIMING jitter is somehow "mapped" or imposed upon the DATA? The answer is yes.

HOWEVER (actually, there will be a lot of "however"s in the upcoming posts) ... where I must take issue with the above quote is with its use of the phrase "simple clock jitter". There's nothing "simple" about clock jitter, if the intention is to imply that clock jitter is "less harmful" or "easier to eliminate" than what ASRC does with jitter.

Want to eliminate jitter completely? Slave the source (transport) to the DAC. Want to eliminate jitter and remain compatible with S/PDIF? Good luck ... it WON'T happen with PLL-based clock recovery, and it WON'T happen with ASRC. However, ASRC is still superior to PLL-based clock recovery, for reasons we shall soon see :)

Let me offer a quote of my own (actually I didn't invent it, but it is very relevant here) as food for thought :

The RIGHT sample, at the WRONG time ... is the WRONG sample.

In other words, errors in TIME are no more "friendly", no more "benign", than errors in DATA.
 
sorry I wasn't clear

It appears there are two basic processes an ASRC can perform: one is sample rate conversion (e.g. at sampling frequencies from 44.1 kHz up to 192 kHz), the other is bit rate conversion (from 16 to 24 bits). (Of course it can "down sample" too.) Typically both processes are termed "upsampling", and in the past, articles on this topic have said that "no new info. is created".

others still have maintained that no "additional" new info. is created when upsampling (essentially maintaining that the ASRC performs destructive testing and a complete reconstruction of the signal - hence that the entire signal is "new" but it doesn't contain any additional info.). (If I'm understanding correctly what you have written so far - this destructive testing and reconstruction is correct?)

still others have maintained that while converting the sample rate upward in freq. does not alter the signal (fundamentally) any more than an oversampler would (i.e. not at all, other than to move the distortion higher in freq.), the bit rate conversion does in fact add more info. to the signal. Typically, though, the basis for this is simply the subjective description of the author claiming a more "analog-like" sound.

This is perhaps as controversial as, or more controversial than, the "encoding of jitter" issue that Prune mentioned.

But like the jitter issue - I'm interested in what the correct response is, and why. (In this instance you mentioned that the signal is converted to analog in the digital domain - this sounds to me as if there may be some basis in fact for the #3 response, but it's just a lay-person's guess.)
 
First off, thank you Werewolf! This has been a truly interesting reading session. Posts like this are so great, because information like this can generally only be found in textbooks or other materials requiring you to buy them (at least, information that has been digested for the 'learner').

Second off, please excuse any obvious stupidities in the following verbiage, as I am rather new to this. I only discovered this site, and DIY audio for that matter, a couple of months ago (I used to think that only Sony and Bose and others had the magical voodoo that was needed to make quality sound equipment, hah!). Since then I have been trawling the forum, inhaling any and all information I can glean from all different parts of this wonderful forum. This is actually even my very first post.

My question: I may be being stupid here, but why is it necessary to store the 64 most recent input samples? My understanding of this thread was that whenever the output clock "asks for" an output, the circuitry picks the most recent one of the 1 million points that we "add" between each input clock. So it would seem to me that as long as the output rate was faster than the input rate (you said earlier that it has to be), the most recent interpolated sample we need would always be between the most recent input sample and the second most recent input sample. Is it only by convention that 64 are kept instead of 2? Or is my understanding of the process fundamentally flawed?

To help with the above question, and perhaps to serve as even another round of interpretation/simplification for the less EE-savvy (like me), I've made some drawings displaying how I envision the process happening (forgive my MS Paint skills :) )

[Externally hosted image no longer available.]


The above is the first step in the process (conceptually), whereby we'd like to add a million samples, equally spaced in both time and amplitude, in between each actual input sample in order to "fill in the gaps".
--

As for the output process, I painted up a ludicrous drawing of how it could work in the perfect world, where we could have perfect clocks of any speed (the 2 characters Digit and Al represent the electronics involved, it just seemed like the right way to do it).

[Externally hosted image no longer available.]


To do this, we would need (as you said) a clock that could "check" 1 million times per input sample, or 44,100,000,000 Hz (~44 billion). And we would also have to have computed every single one of the interpolated points (both impossible for us).

As I understand it, the way you said to get around both, is to store 1 million different "formulas" to compute just the interpolated sample we need at a specific 1/millionth of an input-sample time unit. And to decide which of these time-units we need, we look at the output rate, and can get awfully close to the actual one we need (using feedback to make sure we don't drift too far off).
This last part is illustrated again by Digit and Al.

[Externally hosted image no longer available.]


The flaw in my above explanation is, obviously, the fact you said we need to keep the 64 latest input samples (and the subsequent 64 coefficients that make up each "formula"). So I guess my question is really, where have I messed up in my above understanding?

A previous poster commented that all of this was purely for theory purposes. I might be completely naive to the ways of the world in saying this, but it seems like this would be possible for the DIYer to implement on their own through one of the various kinds of programmable logic out there. I would think that something of this nature could be made with Xilinx FPGAs, which are easy to obtain (and cheap), and can be easily programmed from a PC. Although, like I said, I may be quite naive to certain issues surrounding this kind of technology; I have only taken two EE courses here at UW Madison so far.

Sorry for the long post everyone :eek:
-Dave
 
dmosinee:
First off I should point out that I don't really know what I'm talking about. :D But I think the point where your understanding is going awry is this: when "Al" asks "Digit" (I love the characters and drawings!) for the next sample, he doesn't know exactly how far between the two samples he is, as this would imply having a clock running at the full interpolation speed (something like 44 GHz, right?) which of course we don't have. So the trick then becomes figuring this out. As for how this is actually done, I have no idea...
Doug
 
You guys are GREAT! Thanks for all the feedback. Allow me to address the questions, then "on with the show" :)

Scott G - I don't think that destructive testing & reconstruction is a good way to view the interpolation process. Interpolation is simply a digital technique used to "fill in" digital samples between the original samples. A very accurate way to view interpolation is that the samples generated are essentially the same samples that would have been generated, if you had simply sampled the original analog signal FASTER ... or at a higher sampling rate ... with ONE very important qualification : the signal content is STILL bandlimited to half of the ORIGINAL sample rate. To further explain, let's examine three cases:

Case 1 : Sample an analog signal at 48kHz. Just before sampling, we need to apply an anti-alias low-pass filter that removes any analog content in the signal above 24kHz. This "band-limiting" to 24kHz prevents any aliasing when we sample.

Case 2 : Sample an analog signal at 192kHz. Just before sampling, we apply an anti-alias low-pass filter that removes any analog content in the signal above 96kHz. So the signal still has "information" all the way up to 96kHz.

Case 3 : Sample an analog signal at 192kHz ... but before sampling, apply an anti-alias low-pass filter that removes any analog content in the signal above 24kHz (not 96kHz).

Using digital signal processing to digitally interpolate the CASE 1 samples, by a factor of 4, will (almost) exactly re-create the samples of CASE 3 ... but NOT CASE 2. :) :) This is a VERY key point of interpolation. What's the bottom line? Very simple, and don't ever let anyone tell you otherwise : Interpolation (or upsampling, or oversampling) does NOT add any information to the signal. It's impossible. The same holds true for increasing "bit rates" (or word sizes) ... information is NOT added to the signal. Even with well-dithered recordings ... no information is EVER added after the original sampling & quantization steps that create the original recording.

Why then do we ever bother with fancy DSP like interpolation? In a nutshell, it allows us to do certain filtering processes ... such as anti-image filtering, asynchronous sample rate conversion, etc. ... in the DIGITAL domain, rather than the ANALOG domain. And sometimes (not always), DSP really IS better than analog! :)
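
For the skeptical, here is a quick numerical experiment along the lines of the three cases above (a rough Python sketch; the tone frequencies and filter lengths are made up, and a dense grid of samples stands in for the "analog" signal). Interpolating the Case 1 samples by 4 lands you on Case 3, not Case 2 ... the 40 kHz content that was filtered away before the original 48 kHz sampling never comes back:

```python
import numpy as np
from scipy.signal import firwin, filtfilt, resample_poly

# Dense grid standing in for the "analog" signal: tones at 1, 15 and 40 kHz.
fs_hi = 1_536_000                                   # 32 x 48 kHz
t = np.arange(0, 0.05, 1.0 / fs_hi)
analog = np.sin(2*np.pi*1e3*t) + 0.3*np.sin(2*np.pi*15e3*t) + 0.3*np.sin(2*np.pi*40e3*t)

lp24 = firwin(2001, 24e3, fs=fs_hi)                 # 24 kHz anti-alias filter
band_limited = filtfilt(lp24, 1.0, analog)          # zero-phase, so everything stays aligned

case1 = band_limited[::32]                          # band-limit to 24 kHz, sample at 48 kHz
case3 = band_limited[::8]                           # band-limit to 24 kHz, sample at 192 kHz
case2 = analog[::8]                                 # sample at 192 kHz, band-limited only to 96 kHz

case1_x4 = resample_poly(case1, 4, 1)               # digital interpolation of Case 1 by 4

n = min(len(case1_x4), len(case3))
sl = slice(1000, n - 1000)                          # ignore filter edge effects
print("vs Case 3:", np.max(np.abs(case1_x4[sl] - case3[sl])))   # small
print("vs Case 2:", np.max(np.abs(case1_x4[sl] - case2[sl])))   # large: the 40 kHz tone was never captured
```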

A 8 - I'm not surprised at your observation. ASRC really is a better way to address (not eliminate) the jitter issue, than PLL-based clock recovery schemes. And what is the ONLY possible difference between digital transports? Why, jitter of course :)

dmosinee - WOW !! I really want to thank you for the effort you put into your post! If I knew how to make drawings like that in my posts, this whole thread would be a lot easier to understand :) The truth is, I'm pretty much a moron when it comes to computer technology ... but I struggle through :( to advance & share my understanding of our REAL passion : music :)

Anyway, allow me to address your main question. Why do we need to store, and use, 64 input data points in our interpolation ... when we know that any Fs_out clock edge demanding a sample will, of course, lie between only "2" Fs_in clock edges?

The answer lies in the interpolation process itself, shown in your very FIRST figure. Forget for a moment that we're going to interpolate by such a huge factor N, and forget that we're ultimately going to "asynchronously decimate". For the moment, let's just imagine that we only need to do the most simple interpolation possible : we just need to interpolate the input data stream by a factor of 2 ... meaning, we simply have to "fill in" a SINGLE sample exactly between the incoming sample points. To do this accurately for audio, how many input sample points can possibly be needed for this simple computation? The answer is .... 64 :)

Here's why. If you were given this simple interpolation problem ... interpolate by 2 ... how would you proceed? Well, you have to fill in a sample between each input sample. So as a first attempt, you might just draw a straight line between adjacent samples, and pick that middle point ... essentially the "average" of the two adjacent samples. This form of interpolation is called, not surprisingly, "linear interpolation" ... we use two (adjacent) samples to define a straight line, and our interpolated point lies on that line as well. The problem is, it's not very "accurate" at all ... meaning that, the original audio signal was most probably NOT always following a straight line between each sample!

So let's get a bit more clever, and increase the "order" of our "interpolating polynomial" ... instead of linear, let's go to a cubic. Our hope will be that the resulting "curve fit" will be somehow more accurate ... and in fact it will, because it allows for the signal to do something a bit more interesting between samples than follow a straight line :) But in order to "curve fit" a cubic polynomial, we need FOUR (4) input data points (two before the interpolated sample, and two after). And of course, we could continue this process with higher order polynomials, until we achieve the desired accuracy.

Now in reality we don't really use such polynomial curve fitting to do our interpolation (polynomials aren't particularly good interpolators) ... but they do suggest a very valid trend : the more input data points I use, the more accurate my interpolation will be, no matter HOW MANY samples I need to provide in between the original data points. And remember ... "accuracy" in the interpolation process is really just a measure of how closely the interpolated signal matches what I would have generated, by simply sampling the original signal faster :) (with the qualification mentioned above, in my answer to Scott G)
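
To see that trend with actual numbers, here's a tiny Python sketch (a 10 kHz tone sampled at 48 kHz, chosen arbitrarily) comparing the error at the halfway point for a 2-point straight line, a 4-point cubic, and a 64-point windowed-sinc of the flavour an ASRC polyphase really is. Each step up in the number of surrounding samples buys a big drop in error:

```python
import numpy as np

fs, f0 = 48_000.0, 10_000.0                     # made-up test tone

def s(n):
    """Samples of the tone; n may be an integer or an array of integers."""
    return np.sin(2 * np.pi * f0 * n / fs)

true_mid = np.sin(2 * np.pi * f0 * 0.5 / fs)    # exact value halfway between samples 0 and 1

linear = 0.5 * (s(0) + s(1))                                  # 2 points: straight line
cubic = (-s(-1) + 9 * s(0) + 9 * s(1) - s(2)) / 16.0          # 4 points: Lagrange cubic at the midpoint
k = np.arange(-31, 33)                                        # 64 points straddling the midpoint
sinc64 = np.sum(np.sinc(0.5 - k) * np.hamming(64) * s(k))     # 64 points: windowed-sinc

for name, est in (("2-point linear", linear), ("4-point cubic", cubic), ("64-point sinc", sinc64)):
    print(f"{name:15s} error = {abs(est - true_mid):.2e}")
```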

It turns out that, for high quality digital audio, I need to use 64 input samples for ANY accurate interpolation process.

Oh, by the way, how do I KNOW that all this fancy DSP interpolation REALLY can generate something ARBITRARILY CLOSE to the original analog signal, just sampled faster? Very simple my friends : it's a direct consequence of the Nyquist Theorem :)

That's not a Nyquist "conjecture" by the way, or a Nyquist "approximation" ... it's a THEOREM :) There is no stronger statement in the fields of mathematics or engineering. And it's irrefutable.

dmosinee, the short answer to your question is this : It takes a surprisingly large number of input samples, to accurately interpolate or "predict" what the signal is doing BETWEEN any two samples. But we know, as sure as 2+2=4, that such "prediction" is NOT guesswork. The Nyquist Theorem tells us that, given enough computation, the samples alone are all we need to determine EXACTLY what the original signal was doing BETWEEN those samples. But you need a lot more than the adjacent samples to do it ... just like you need a lot more than two data points, to accurately predict a TREND in any process :)

Make sense?
 
JITTER

Let's first review the Phase Locked Loop (PLL) clock recovery scheme, the long-time standard for receiving S/PDIF data.

PLL's typically consist of a few components in a feedback loop :
VCO (Voltage Controlled Oscillator), PD (Phase Detector) and LF (Loop Filter). The PD compares the phase of incoming data/clock to the VCO clock. The phase difference drives the VCO input through the Loop Filter, thereby adjusting its frequency & phase to "lock" to the incoming data.

Classical linear analysis can be applied to the PLL in the "phase" domain. We won't go through all the details here (the reader is certainly welcome to investigate :) ), but there are TWO (2) noteworthy conclusions :

1. The PLL is a LOW-PASS filter of incoming jitter (jitter = phase noise). Imagine an "impulsive" phase hit on the input clock (engineers like impulses as test cases ... because they tell you everything you need to know about linear systems) ... meaning one input clock edge is not where it's "supposed" to be. The loop will respond to this phase hit rather slowly ... the output clock will be disturbed, and it will take several cycles for the output clock to settle back to a uniform, steady state. ONE input phase "glitch", causing the output clock to be disturbed for SEVERAL cycles?
Yes ... and that's EXACTLY what you want. You see, by LOW-PASS FILTERING the incoming jitter, the PLL is actually REDUCING the TOTAL jitter. This is easily seen as a consequence of Parseval's Relation, which equates total frequency domain energy to time domain energy. The bottom line is this : The LOWER the cutoff frequency of the PLL (acting as a low-pass filter), the LONGER the transient response of the PLL will be (meaning that it will take more output clock cycles to settle down), but the total input jitter energy will be MORE reduced.
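
A toy phase-domain simulation makes the point (a first-order loop model in Python with made-up numbers; real PLLs are second order or higher, but the trend is the same). Note how a lower bandwidth disturbs MORE output cycles, yet the TOTAL output jitter energy drops:

```python
import numpy as np

def pll_output_phase(theta_in, k):
    """First-order phase-domain PLL model (sketch): each cycle the output phase
    moves a fraction k of the way toward the input phase. Small k = low bandwidth."""
    theta_out = np.zeros_like(theta_in)
    for n in range(1, len(theta_in)):
        theta_out[n] = theta_out[n-1] + k * (theta_in[n-1] - theta_out[n-1])
    return theta_out

theta_in = np.zeros(5000)
theta_in[10] = 1.0                      # ONE impulsive phase hit on the input clock

for k in (0.5, 0.05, 0.005):            # progressively LOWER loop bandwidth
    out = pll_output_phase(theta_in, k)
    print(f"k = {k:5.3f}   cycles visibly disturbed: {np.sum(out > 1e-3):5d}   "
          f"total output jitter energy: {np.sum(out**2):.4f}   (input energy = 1.0)")
```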

So, in order to attenuate INPUT jitter a lot, we want a LOW bandwidth for the PLL :)

But there's a problem .....

2. Our second conclusion is the "Achilles Heel" of the PLL. The VCO inside the PLL is definitely NOT an ultra-low phase noise oscillator ... nowhere near the super-high Q, low noise of a crystal oscillator, for example. Why not? Simple ... it can't be ultra-high Q, because it must be voltage-controllable over some range of frequencies so that it can "lock" to incoming data. So the VCO has higher phase noise than a crystal ... how does the PLL "respond" to its own VCO phase noise? Glad you asked :) It can be shown that the PLL is a HIGH-PASS filter of its own VCO phase noise. And guess what ... the LOWER you make the cutoff frequency, the HIGHER the PLL output noise will be from its OWN oscillator :(

So we have CONFLICTING requirements. On the one hand, you want a LOW bandwidth PLL for clock recovery, to attenuate jitter on the INPUT data stream. But on the other hand, you want a HIGH bandwidth PLL to attenuate jitter from its OWN VCO. The ultimate solution is an optimization, or compromise, between these conflicting requirements.
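
Here's a back-of-the-envelope Python sketch of that compromise, using the same first-order loop model and completely made-up noise levels (a flat-ish incoming jitter spectrum and 1/f VCO flicker noise). Only the shape of the trade-off matters, not the absolute numbers ... with these figures the optimum sits somewhere in the middle, and pushing the bandwidth lower lets the loop's own VCO noise take over:

```python
import numpy as np

f = np.linspace(1.0, 100_000.0, 200_000)       # Hz
df = f[1] - f[0]

input_psd = np.full_like(f, 1e-9)              # incoming jitter spectrum (made-up level)
vco_psd = 1e-7 / f                             # VCO flicker (1/f) phase noise (made-up level)

def total_output_jitter(fc):
    h_lp = 1.0 / (1.0 + (f / fc) ** 2)         # |H|^2 seen by INPUT jitter (low-pass)
    h_hp = 1.0 - h_lp                          # |H|^2 seen by the loop's OWN VCO noise (high-pass)
    return np.sum(input_psd * h_lp + vco_psd * h_hp) * df

for fc in (10_000, 1_000, 100, 10, 3):
    print(f"loop bandwidth {fc:6d} Hz -> output jitter power {total_output_jitter(fc):.2e}")
```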

A couple more points. First, that high-pass filter action on the loop's own VCO phase noise is not to be taken lightly. Many of these VCO's are implemented in CMOS technology, where flicker noise (1/f noise) can get pretty large at LOW frequencies ... so as you lower the loop bandwidth, you may "pick up" a surprising amount of VCO jitter. Also, PLL's have another potential problem as you lower the bandwidth ... you may exceed the so-called "jitter accommodation" of the PLL (a large signal limit on the jitter that the PLL can track), which will cause the PLL to "cycle-slip". Depends a lot on phase detector topology and other loop dynamics ... but it just needs to be mentioned as another PLL concern that may prevent the loop from operating with a very low bandwidth.

Finally, the PLL typically has analog loop filter components outside the IC. These pins can be particularly sensitive to outside noise interference (electrical or magnetic coupling), which will be yet ANOTHER source of jitter for the PLL (extreme case is injection locking).

So there you have it. Ideally, the PLL would be a simple LOW-PASS filter of incoming jitter, attenuating input jitter MORE effectively as you LOWER the loop bandwidth. But there are OTHER sources of jitter to worry about with PLL's ... primarily the VCO's own phase noise. This source of jitter, plus the PLL's jitter accommodation, tend to place lower limits on the loop's bandwidth ... and ultimately compromise the PLL's ability to attenuate incoming jitter :) :)

Exercise for the reader : research recommended loop bandwidth for the ubiquitous Crystal CS8412.

And remember : CORRECT audio samples are inexorably linked to PRECISE moments in time. So "perfect" audio data tied to a "jittery" timebase, is no better than "jittered" audio data tied to a perfect timebase. The effects are in fact the same, and in either case, lower jitter is (of course) the goal.

Anybody see where I'm going with this? May be able to wrap up the entire tutorial in one more post. :D
 
Like everyone else, I'd like to thank Werewolf for explaining ASRC far better than I've seen anywhere else.

But back to PLLs. Why can't you have a fast-tracking PLL (in the CS8414) to re-generate a clock from the SPDIF data, and then feed this clock as input to a slow PLL which will provide a clean clock to re-clock the data into the DAC? There will still be some edge movement on the clock and data but it will be at such a low frequency as not to matter.

Provided that there is not so much jitter on the incoming SPDIF bit-stream that there is more than +/- a half bit period of edge uncertainty then won't this ensure 100% clean data into the DAC?
 
I also have one question related to PLL's.

There are a number of problems with input receiving, one of them being the required pull for a given sampling rate, another being the fact that one may use several sampling rates.

Historically, jitter reduction on decoupled systems used PLL's, ideally using pullable crystal oscillators, meaning you needed several crystals and complex schemes to come up with VCXO-based PLL's.

Then someone invented digital delay line ("memory based") PLL's, controlled by (again, ideally) a pullable crystal oscillator. More of a digital implementation of a PLL, if you wish (reference: Erland Unruh, search the web. I also believe it was implemented in the "Genesis Digital Lens" commercial audio system 10 years or so ago).

Now we are using ASRC's which this thread is mostly about.

My guess is that a "digital lens" type of approach is potentially better than item 1 above and likely a lot of fun to implement. Does anybody know if this delay type scheme is implemented on chip anywhere?

Also, there is now the opportunity to use a fixed crystal and "pull" it using something like an Analog Devices DDS unit http://www.analog.com/Analog_Root/p...D117%26level2%3D137%26level3%3D%252D1,00.html (parallel input units also available), which would be "pulled" with a digital binary input, i.e. the control signal is noiseless-ish but not step-less. Is this the way to go? I note these units have PLL's which likely will operate at high frequencies, somewhat decoupled from the control side of things, which would be a digital input based for example on the "fullness" of a shift register or DRAM bank.

My thinking is that a minimally jittered master clock generated from a DDS unit will potentially be a very good solution competing with ASRC?

Any thoughts on this?
 
werewolf said:
JITTER
Let's first review the Phase Locked Loop (PLL) clock recovery scheme, the long-time standard for receiving S/PDIF data.


Werewolf, you appear to be running the risk of merging two separate issues: extracting the embedded clock from the SPDIF datastream, and reclocking or 'cleaning' the extracted clock. I can see no obvious way of extracting the clock from the SPDIF datastream using an ASRC, so why malign the PLL for doing a task the ASRC can't?

ray.
 
now things are getting INTERESTING !!

Ouroboros & rfbrw - I think I can address your comments & questions together ...

rfbrw - you are absolutely correct, in that Asynchronous Sample Rate Conversion, by itself, will NOT "extract" the clock from the incoming S/PDIF data stream ... that's a function for a PLL (often integrated within the ASRC device, but a SEPARATE function to be sure). So, in order to really compare apples-to-apples, we have to compare the following two systems :

1. S/PDIF --> Fast PLL ---> ASRC

In this system, the "fast" PLL recovers the clock from the data stream. It's relatively fast, so it "tracks" without any jitter accommodation problems. But since it's fast, we haven't heavily filtered the incoming jitter. So the data, and recovered clock, are sent to the ASRC. What the ASRC does to the jitter ... well, that's the next post :) But I think some of you guys already know the answer ;)

2. S/PDIF --> Fast PLL --> Slow PLL

This is Ouroboros' question. It certainly works ... but the second, slow PLL still suffers from the same problem of VCO jitter. The lower I make the bandwidth of the second PLL, in order to filter input jitter, the more the output clock suffers from its own VCO phase noise. So cascading PLL's doesn't really solve the problem, it just defers it to later stages ... anywhere I have a low bandwidth PLL, I am "exposed" to the internal VCO phase noise.
Make sense?

Now Petter raises an interesting question. What if that second "slow" PLL can indeed use a crystal oscillator for its VCO?
That would certainly be the best you could hope for, in terms of VCO phase noise. There are only two ways to do this, as far as I know : first, you can design a crystal-based VCO that you can indeed "pull", in an analog fashion, over a very limited range ... may in fact work well enough for certain limited applications. The other option is really a category of clever digital techniques, that essentially VARY the feedback clock divider inside the PLL ... skipping cycles now & then, if you will ... so that the PLL can really frequency-lock to the incoming data (or clock). Now, this "variable divider" is built according to a NOISE SHAPING algorithm, so that the phase noise you incur from slipping cycles is DISTRIBUTED in the frequency domain so that most of the resulting noise power resides in frequency bands you don't care about. Sometimes called "fractional-N" PLL's. I'm not particularly familiar with the "digital lens" technology (other than the marketing) ... but I'll see what I can find out. Cool ???
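
For the curious, the "variable divider" idea is easy to sketch in Python (illustrative only): a first-order accumulator dithers the feedback divider between N and N+1 so that its AVERAGE hits the fractional ratio, and the cycle-slip noise ends up pushed toward high frequencies where the loop filter can strip it off. Real fractional-N parts use higher-order noise shapers, but the idea is the same:

```python
import numpy as np

def fractional_divider(n_int, frac, cycles):
    """First-order noise-shaped (delta-sigma) fractional-N divider sketch:
    each reference cycle, the accumulator's carry decides N or N+1."""
    acc, seq = 0.0, np.empty(cycles)
    for i in range(cycles):
        acc += frac
        if acc >= 1.0:
            acc -= 1.0
            seq[i] = n_int + 1       # "slip" a cycle this time
        else:
            seq[i] = n_int
    return seq

seq = fractional_divider(256, 0.3712, 200_000)
print("average division ratio:", seq.mean())            # ~256.3712

# The divider error is noise-SHAPED: little energy near DC, rising toward
# high frequency (which the loop filter then removes).
err = seq - seq.mean()
psd = np.abs(np.fft.rfft(err)) ** 2 / len(err)
print("error power near DC      :", psd[1:1000].mean())
print("error power at high freq :", psd[-1000:].mean())
```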

But let's take a step back, and ask a philosophical question. Let's assume that we want to be compatible with S/PDIF data. So our first receiver is a fast PLL, like both examples above ... which extracts the clock from the incoming data stream, and is certainly wide-band enough, and pull-able enough, to track any incoming data stream. The question is ... what would we like our SECOND block to do? Ideally, we would want it to low-pass filter the clock jitter of the first PLL (which had at least two contributions ... unfiltered incoming jitter, and VCO phase noise) with as LOW a corner frequency as possible (because Parseval tells us that this will most effectively reduce jitter energy, no matter what domain we observe). Furthermore, we would like this SECOND block to ADD no jitter of its own :) Just filter the incoming jitter, thank you, without adding any more. Agreed?

Well that almost takes us to the conclusion ;)
 
How does the ASRC respond to clock jitter? Well, the Polyphase Locked Loop (PPLL) inside the ASRC, used to select the right polyphase when an Fs_out edge demands an output sample, responds VERY similarly to the way a Phase Locked Loop (PLL) responds. An impulsive phase "hit" on the input clock will disturb the PPLL counter, and this disturbance will decay with the time constant of the feedback loop. And yes, such a glitch will "disturb" MORE polyphase calculations as the loop bandwidth is DECREASED (because, once again, the settling time will be longer). But, just like the PLL case, this is exactly what you want for lowest output jitter energy.
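
To put some (made-up) numbers on that, here is a toy Python sketch of the averaging: the PPLL's estimate of "where between input samples does this output edge land?" is a heavily filtered version of a coarse measurement, so ONE jittered input edge nudges the polyphase choice for many output samples, each by only a tiny amount ... and, as with the PLL, a lower loop bandwidth disturbs MORE samples but by LESS:

```python
import numpy as np

N_PHASES = 1_000_000
ratio = 44_100.0 / 48_000.0                  # Fs_in / Fs_out (example rates)

measured = np.full(20_000, ratio)            # coarse per-sample ratio measurements
measured[100] += 0.002                       # ONE jittered input edge -> one bad measurement

for k in (0.1, 0.01):                        # loop gain: small k = low PPLL bandwidth
    est = np.empty_like(measured)
    est[0] = measured[0]
    for n in range(1, len(measured)):
        est[n] = est[n-1] + k * (measured[n-1] - est[n-1])
    off_by = np.round((est - ratio) * N_PHASES)   # how far each polyphase choice is knocked off
    print(f"k = {k}: {int(np.sum(off_by != 0))} polyphase choices disturbed, "
          f"worst one off by {int(np.max(np.abs(off_by)))} of {N_PHASES}")
```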

Of course, there certainly seems to be one fundamental difference with the ASRC. In the PLL case, it was the output CLOCK edges that were disturbed in a transient fashion by an input phase glitch. However, with the ASRC, the transient behavior of the PPLL affects output DATA calculations ... which is worse, jitter showing up as PHASE noise on the output clock, or jitter showing up as AMPLITUDE noise on the output data? Here's the answer : if the jitter is SMALL enough, there's no difference at all :)

I've implied as much with a few statements like :

- The RIGHT sample at the WRONG time, is the WRONG sample
- CORRECT data is tightly linked to PRECISE moments in time

But now I've come right out and said it :) Pristine samples with a jittery clock are no better, no worse ... in fact, no different ... than jittered samples with a pristine clock. PROVIDED that the jitter is small, and of comparable magnitude in the two cases.

Here's a little simple math to suggest (not prove) the validity of this statement : Imagine a jittered sinewave given by :

Asin(wt+p), where A=amplitude, w=frequency, t=time, p=phase noise. Now a simple trig identity can be invoked to show :

Asin(wt+p) = Asin(wt)cos(p) + Acos(wt)sin(p)

Now if the phase noise, p, is small ... then cos(p) ~ 1 and sin(p) ~ p, so the following approximation is valid :

Asin(wt+p) ~ Asin(wt) + pAcos(wt)

Where we see that SMALL phase noise modulation has become indistinguishable from amplitude noise modulation.

No, it's not a complete proof ... but any complex signal can be described as a summation of sinusoids. So the concept holds for more complex cases.
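
A quick numerical check (Python, with an arbitrary small phase-noise level) shows just how good the approximation is:

```python
import numpy as np

A, w = 1.0, 2 * np.pi * 1000.0
t = np.linspace(0.0, 0.01, 48_000)
p = 1e-4 * np.random.randn(len(t))                          # small phase (timing) noise, radians

jittered_clock = A * np.sin(w * t + p)                      # perfect data, jittery timebase
jittered_data = A * np.sin(w * t) + p * A * np.cos(w * t)   # perfect timebase, "jittered" data

# The two differ only by terms of order p^2 -- utterly negligible for small p.
print("worst-case difference:", np.max(np.abs(jittered_clock - jittered_data)))
```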

So I submit that if we filter the jitter enough, with a low enough bandwidth loop ... it won't matter at all if the residual jitter is mapped to the data, or the clock. The observable effects will be the same. So, what's the best block to follow our fast, S/PDIF clock recovery PLL? The ASRC filters the incoming jitter with a ridiculously LOW bandwidth of THREE (3) Hertz. YES, the tiny residual jitter is mapped into data amplitude ... with no observable difference to tiny clock jitter. AND, the ASRC adds no jitter of its own ... ZERO. Because the output clock is derived from the cleanest crystal oscillator you can build :)

How would a slow PLL stack up? Well, it would behave just as well ... IF you can build a PLL with a 3 Hz bandwidth, AND add no additional jitter in the process. That's a tough job indeed :(

So that about wraps it up :) I believe that ASRC technology is a step forward in digital audio jitter management. Is it the ultimate solution for all cases? NO ... I think we would all agree that slaving the source or transport to the DAC clock, possibly with some RAM or FIFO in between ... is the ultimate jitter hammer. But, for S/PDIF compatibility, a strong case for ASRC can be made.

Thank you, my new friends, for sticking with this. I am of course open to further debate & discussion ... but for now, class is dismissed :)
 