Highest resolution without quantization noise

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Right, now remove all the ultrasonic energy and you will get a visually perfect sine wave.....
The blocks **ARE** the ultrasonic components, filter them out and you get your sinewave back.
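
A rough numerical sketch of that claim, assuming an ideal sample-and-hold output and an ideal brick-wall filter applied via FFT. The 16x simulation grid (standing in for continuous time) and the helper names are illustrative assumptions, not anything from the thread:

```python
import numpy as np

FS = 44_100          # sample rate in Hz
F0 = 1_000           # test tone in Hz
HOLD = 16            # simulation points per sample (stands in for analogue time)
N_SAMPLES = 441      # exactly 10 cycles of 1 kHz at 44.1 kHz


def staircase_and_filtered():
    n = np.arange(N_SAMPLES)
    samples = np.sin(2 * np.pi * F0 * n / FS)
    # Sample-and-hold: each sample value is held for HOLD simulation points.
    stairs = np.repeat(samples, HOLD)

    # Ideal brick-wall low-pass: zero every FFT bin above FS / 2.
    spectrum = np.fft.rfft(stairs)
    freqs = np.fft.rfftfreq(stairs.size, d=1.0 / (FS * HOLD))
    spectrum[freqs > FS / 2] = 0.0
    filtered = np.fft.irfft(spectrum, n=stairs.size)
    return stairs, filtered


def best_fit_sine_residual(signal):
    """RMS of what is left after removing the best-fit 1 kHz sinusoid."""
    m = np.arange(signal.size)
    phase = 2 * np.pi * F0 * m / (FS * HOLD)
    basis = np.column_stack([np.sin(phase), np.cos(phase)])
    coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    return np.sqrt(np.mean((signal - basis @ coeffs) ** 2))


stairs, filtered = staircase_and_filtered()
print("residual before filtering:", best_fit_sine_residual(stairs))
print("residual after filtering: ", best_fit_sine_residual(filtered))
```

In this idealised setup the staircase differs from a pure 1 kHz sine by a few percent, and what remains after the filter is a pure 1 kHz sine to within rounding error.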

There are two distinct processes going on (At least conceptually), sampling (take the audio value 44100 times a second) and quantisation (Turning the value into a number with a finite precision).

Any sampled system MUST feature filters sufficient to limit the bandwidth to strictly less than Fs/2 if the resulting stream of samples is to be unambiguously reconstructed (reconstruction must also feature such a filter, to remove the images centred on N*Fs).
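
To illustrate why the band limit matters, here is a small sketch (the tone frequencies are chosen purely for illustration): a 30 kHz tone sampled at 44.1 kHz produces the same samples as a 14.1 kHz tone, so without an input filter the two cannot be told apart afterwards.

```python
import numpy as np

FS = 44_100  # sample rate in Hz


def sample_cosine(freq_hz, count=64):
    """Return `count` samples of a unit cosine at `freq_hz`, taken at FS."""
    n = np.arange(count)
    return np.cos(2 * np.pi * freq_hz * n / FS)


above_nyquist = sample_cosine(30_000)   # above FS / 2 = 22.05 kHz
alias = sample_cosine(FS - 30_000)      # 14.1 kHz, inside the audio band

# The two sample streams agree to within floating-point rounding.
print(np.max(np.abs(above_nyquist - alias)))
```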

If you try to run the system with a sample rate only just greater than twice the bandwidth (the usual NOS case) then the filter required to reconstruct the signal becomes a nightmare, as it has to pass (in this case) 20K while attenuating 22.05K by lots of dB, and all this without causing phase errors in band.
If you take the set of values and intersperse a 0 value between each one, then use a magic filter (done in the digital domain in reality) to make sure that the new Fs/4 and up is sufficiently attenuated, then the analogue filter only needs to attenuate lots at 44.1 and up (the rate is now 88.2, but there is no information above 22.05 due to the upsampling filter).
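
The zero-interspersing step can be sketched as follows. This is a minimal illustration, assuming a Blackman-windowed sinc as the "magic filter"; the filter length and test tone are arbitrary choices, not values from the thread:

```python
import numpy as np

FS = 44_100      # original sample rate in Hz
F0 = 1_000       # test tone in Hz
HALF_LEN = 64    # filter reaches 64 taps either side of centre (illustrative)

n = np.arange(882)
original = np.sin(2 * np.pi * F0 * n / FS)

# Step 1: intersperse a zero after every sample; the rate is now 2 * FS.
stuffed = np.zeros(2 * original.size)
stuffed[::2] = original

# Step 2: windowed-sinc low-pass with its cutoff at the old FS / 2.
# The sinc(k / 2) taps sum to about 2, making up for the inserted zeros.
k = np.arange(-HALF_LEN, HALF_LEN + 1)
taps = np.sinc(k / 2) * np.blackman(k.size)
upsampled = np.convolve(stuffed, taps, mode="same")

# Reference: the same tone sampled directly at 2 * FS.
m = np.arange(stuffed.size)
reference = np.sin(2 * np.pi * F0 * m / (2 * FS))

# Ignore the ends, where the filter runs off the edge of the data.
inner = slice(2 * HALF_LEN, -2 * HALF_LEN)
print("max error:", np.max(np.abs(upsampled[inner] - reference[inner])))
```

The original samples pass through unchanged and the filter fills in the values between them, so the analogue filter that follows has a much wider transition band to work with.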

Almost all of this can be done without going digital at any point (Granted the upsampler would be hard, but a sampled system is certainly possible in the analogue domain).

Now the other part of the issue is quantisation, which MUST include dither to work correctly.
Consider a 2 bit quantiser (because it is easy to reason about), it has 4 possible levels, call them 0,1,2,3 now without dither applying say a value of 0.1 results in an output of 0,0,0,0,0,0,0,0... so the thing is inherently non linear.
However if we sum noise having an amplitude distribution such that the probability of the noise being greater than a given value is equal to 1 - that value, then 0.1 + the noise will exceed 1.0 exactly 10% of the time (at random, but 10% of the time), and an input of 0.5 would make the quantiser toggle between 0 & 1 at random but 50% in each state, 1.2 would make the thing toggle between 1 & 2, spending 20% of its outputs in state 2, and so on.

The effect of adding that noise is that the thing has gone from being non linear (and thus distorted) to linear with noise, and the noise is at the LSB level.
Thus in a correctly dithered quantiser it DOES NOT MAKE SENSE to speak of resolution, because the thing is LINEAR. It makes sense to speak of dynamic range, but there are no pixels, just signal that fades cleanly down into (and below) the broadband noise floor.
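
The 2 bit example can be tried numerically. A minimal sketch, assuming a truncating quantiser and noise uniform on 0 to 1 (which has the probability property described above); the function name and sample count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable


def quantise(value, count, dither):
    """Quantise `value` `count` times with a 2 bit (levels 0..3) truncating quantiser."""
    noise = rng.uniform(0.0, 1.0, size=count) if dither else np.zeros(count)
    return np.clip(np.floor(value + noise), 0, 3).astype(int)


for level in (0.1, 0.5, 1.2):
    plain = quantise(level, 100_000, dither=False)
    dithered = quantise(level, 100_000, dither=True)
    print(level, "undithered mean:", plain.mean(), "dithered mean:", dithered.mean())
```

Without dither the average output is stuck at the level below the input; with dither the average tracks the input, at the cost of noise around the LSB level.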

Combine the two pieces and you do not have silly stairstep things, you have signal and you have noise at the DAC output, and that is it.
Do it wrong and all bets are of course off.

Regards, Dan.
 
Kastor, Have a view of these two informative videos on digital audio for your answers

- Xiph.org: Video This information puts to rest the myth of stair cased or blocky sine waves.

- Computer Audiophile - Fun With Digital Audio – Bit Perfect Audibility Testing Here you can download audio files at various bit depths and hear first hand where your audibility limit is.

- Artifact Audibility Comparisons

|
Hello Mitch,

Firstly, thank you for the links.

______

The third link there looks like a lot of files which contain noise at various amplitudes, with various content, such as music, speech and so forth.

If we were strictly analyzing noise, perhaps those files there would be useful.

I can think of a similar analysis of noise performed by Axiom, here is the link

Experimental Study : Distortion - Axiom Audio

Yes, it says distortion, but they clearly indicate they are listening to noise.

Dither is another form of noise. I have heard dither myself, for example with the Russian Neutron player application for smartphones and a very sensitive IEM, there is an alternative to "dither", in fact it's simply a little very low level noise you can hear within the music, to attain a theoretically different result.

I noted Neutron player once earlier in this thread, since it includes a 32-bit or even 64-bit audio rendering technique.

What is the relevance of 32-bit audio software in a smartphone? Can anyone hear the difference? If you look at the reviews, I'm sure you'll find comments like "this music player sounds amazing", which, I take it no one in this thread is interested in entertaining without strict tests of the frequency response of the player and blind tests and so forth.

Neutron Music Player

Returning to your third link, it's published by someone called Ethan Winer.

I saw another article by him once, you know the "x parameters of audio".

That article is highly inaccurate, thus his so called 'science' is highly suspect, I wrote a few quick thoughts on it here http://www.diyaudio.com/forums/lounge/200865-sound-quality-vs-measurements-1385.html#post3996220

______

Your first link to the Xiph site.

It says my browser can not support the video.

I'm looking at the text version instead Videos/A Digital Media Primer For Geeks - XiphWiki
https://wiki.xiph.org/Digital_Show_and_Tell/Episode_02

In the second link

"For now, the waveform display shows our digitized sine wave as a stairstep pattern, one step for each sample.

When we look at the output signal that's been converted from digital back to analog, we see that it's exactly like the original sine wave. No stairsteps.

So where'd the stairsteps go?

anyone who looks up digital-to-analog converter or digital-to-analog conversion is probably going to see a diagram of a stairstep waveform somewhere, but that's not a finished conversion, and it's not the signal that comes out."

Ok, so his explanation which puts the stairstep myth to rest is "that's not the finished version".

Looking at what he writes after that, it states.

"When we digitize a signal, first we sample it. 1- The sampling step is perfect; it loses nothing. But then we quantize it, and quantization adds noise. The number of bits determines how much noise and so the level of the noise floor.

What does this 2 - dithered quantization noise sound like?"


The numbers 1 and 2 by me.

1 - The way he says "the sampling step is perfect" is pretty vague, perfect in reference to what? To the number of steps? To Shannon-Nyquist? To waveforms in real life?

It's not perfect.

2 - Dithered quantization noise.

I see, so it's dither which puts the staircased and blocky sinewaves to rest?

Well, here is a photo of a cat......

225px-Dithering_example_undithered.png



Here is that cat again, now in 16 colours!!

225px-Dithering_example_undithered_16color.png



Looks pretty blocky and staircased to me. Poor cat!!

Now let's apply some dither......

225px-Dithering_example_dithered_16color.png



As you can see, the cat's neck for instance looks smoother now.

Ok, so "dither" puts "the blockiness myth" to rest.


Except, dither and quantization noise are both...... noise.

This thread title says, "without noise".

......

______


Your second link, to the Computer Audiophile article.



Well, I have to say that looks like quite an in-depth and well executed study.

I think it's executed by you as well, in that case, well done!

:snowman2:

However, it looks like you are using a modern sound-card and listening to the noise floor.

For instance

"Speaking of 14 bits:

14bit difference.wav

This is considerably down in level as perceived by my ears. At regular listening level, I can’t hear the noise at all. I have to turn the volume up near maximum to hear the noise."


In the number #1 post of this thread, I write


"A 24-bit resolution has -144.49 dB noise, due to quantization error in the ADC.

This is considered lower than the human hearing limit, thus it's estimated we can hear around 22-bit in ideal conditions.

However, that is noise.

If we remove noise from the equation, what is the highest resolution we can hear of a sine-wave, or any kind of wave?

12-bit? 32-bit? 50-bit? 100-bit? Where is the limit?

Thank you"



If you removed noise from the equation, you would not hear any difference in the 14-bit file, this is correct, yes?


Your test there would need to be ditherless and non-interpolating / non-upsampling / non-oversampling in order to correctly address my question.

It needs to be using an R2R DAC as well.


If I were to use a 24-bit R2R DAC and re-replicate your test. I would arrive at different results.

Are we on the same page?


I don't have a 24-bit R2R DAC at the moment, in the future I think I will, however it will most likely be DIY so it may take me a while until I can arrive at the time to run the tests.

There is a discrete 28-bit R2R DAC thread on DIYaudio at the moment, it's using upsampling.

Therein lies a question as well.

If we can only hear up to 14-bit in normal conditions and 22-bit in very, very ideal conditions, why would a 28-bit DAC need upsampling?


I hope my answer has clarified the topic now!

Have a nice day.
|
 
Thus in a correctly dithered quantiser it DOES NOT MAKE SENSE to speak of resolution

Yes, however, I'm not talking about correctly dithered quantizers.

Please check my post directly above.


Right, now remove all the ultrasonic energy and you will get a visually perfect sine wave.....
The blocks **ARE** the ultrasonic components, filter them out and you get your sinewave back.

This should be pretty easy to answer.

Is the ultrasonic energy purely ultrasonic, or is it reflected back, thus sonic?

If I measure the 1 kHz sine of a NOS DAC and compare that with a very, very fine 0.00005% THD sine, then the 1 kHz blocky / stair-cased sine of the NOS will have pretty high THD, right?

Thus, is that ultrasonic THD or 20 - 20 THD? It's a 1 kHz sine!
 
Now the other part of the issue is quantisation, which MUST include dither to work correctly.
Consider a 2 bit quantiser (because it is easy to reason about), it has 4 possible levels, call them 0,1,2,3 now without dither applying say a value of 0.1 results in an output of 0,0,0,0,0,0,0,0... so the thing is inherently non linear.

The effect of adding that noise is that the thing has gone from being non linear (and thus distorted) to linear with noise, and the noise is at the LSB level.
|
Analog-to-digital converter - Wikipedia, the free encyclopedia


"An audio signal of very low level (with respect to the bit depth of the ADC) sampled without dither sounds extremely distorted and unpleasant.

Without dither the low level may cause the least significant bit to "stick" at 0 or 1.

With dithering, the true level of the audio may be calculated by averaging the actual quantized sample with a series of other samples [the dither] that are recorded over time.

A virtually identical process, also called dither or dithering, is often used when quantizing photographic images - This analogous process may help to visualize the effect of dither on an analogue audio signal that is converted to digital."

--- see my cat example ---

"Dithering is also used in integrating systems such as electricity meters. Since the values are added together, the dithering produces results that are more exact than the LSB of the analog-to-digital converter.

Note that dither can only increase the resolution of a sampler, it cannot improve the linearity, and thus accuracy does not necessarily improve."


In post #7, I write

"If audio does not have any idea where the resolution limit of fine detail is, then we may as well just make 64-bit ADC, 64-bit DAC and get rid of the dither and reconstruction filter altogether."


The thread topic isn't really that dither is imperfect or reconstruction filters are imperfect.

I have no specific position in dither or reconstruction filter accuracy


--- We can certainly record with a 64-bit ADC without dither, then play that media in a 64-bit R2R DAC, without a reconstruction filter ---

--- Then, one may ask, at which XX-bit is the human hearing limit, pertaining to this example illustrated above? ---


If you can not hear any noise at all, then how do you hear the bit-depth, in the example I've illustrated?

Please don't answer thermal noise or flicker noise.

It's not a trick question.
|
 
Kastor L said:
Does this mean the 24-bit file will result in lower THD?
THD is not a useful concept in this context. The difference between a perfect sine wave and the unfiltered output from a perfect DAC is ultrasonic images, not harmonics. Images are not distortion. There will also be some quantisation noise; more bits mean less quantisation noise.

Note that a full amplitude sine wave will look just as 'blocky' with 24 bits as it does with 16 bits. A low amplitude sine wave will look more 'blocky' with 16 bits. The issue is the ratio of the LSB to signal amplitude, not the total number of bits.
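
The LSB-to-amplitude ratio can be put into numbers. A small sketch, where the -80 dBFS level is just an assumed example of a "low amplitude" signal:

```python
def steps_peak_to_peak(bits, level_dbfs):
    """Quantiser steps spanned peak to peak by a sine at `level_dbfs` (0 = full scale)."""
    full_scale_steps = 2 ** bits
    amplitude_ratio = 10 ** (level_dbfs / 20)
    return full_scale_steps * amplitude_ratio


for bits in (16, 24):
    for level in (0, -80):
        print(f"{bits} bits, {level} dBFS: {steps_peak_to_peak(bits, level):.1f} steps")
```

At full scale both word lengths give tens of thousands of steps or more, so any visible steps come from the sample rate; at -80 dBFS a 16 bit signal spans only a handful of steps while a 24 bit one still spans well over a thousand.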

Is the ultrasonic energy purely ultrasonic, or is it reflected back, thus sonic?

If I measure the 1 kHz sine of a NOS DAC and compare that with a very, very fine 0.00005% THD sine, then the 1 kHz blocky / stair-cased sine of the NOS will have pretty high THD, right?

Thus, is that ultrasonic THD or 20 - 20 THD? It's a 1 kHz sine!
The 'blockiness' (in the time domain) is the images (in the frequency domain). With a perfect million-bit DAC you still get blockiness. Higher sample rate means less blockiness (time domain) and smaller images (frequency domain, because of the sinc frequency response of a zero-order-hold, i.e. sample-and-hold, DAC).
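
A quick way to see how the images shrink with sample rate, assuming an ideal sample-and-hold output whose frequency response is sinc(f / Fs). The function name and the 176.4 kHz comparison rate are illustrative assumptions:

```python
import numpy as np


def first_image_db(tone_hz, fs_hz):
    """Level of the first image (at fs - tone) relative to the tone, for an ideal hold."""
    tone_gain = abs(np.sinc(tone_hz / fs_hz))
    image_gain = abs(np.sinc((fs_hz - tone_hz) / fs_hz))
    return 20 * np.log10(image_gain / tone_gain)


for fs in (44_100, 176_400):
    print(fs, "Hz:", round(first_image_db(1_000, fs), 1), "dB")
```

For a 1 kHz tone the first image sits at Fs - 1 kHz in both cases, but it is lower in level at the higher rate, and it is ultrasonic in either case.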
 
THD is not a useful concept in this context. The difference between a perfect sine wave and the unfiltered output from a perfect DAC is ultrasonic images, not harmonics. Images are not distortion.

Let's just wait for D. Mills' answer first.


He states, unequivocally

Right, now remove all the ultrasonic energy and you will get a visually perfect sine.
The effect of adding that noise is that the thing has gone from being non linear (and thus distorted) to linear with noise, and the noise is at the LSB level.

Thus in a correctly dithered quantiser it DOES NOT MAKE SENSE to speak of resolution, because the thing is LINEAR, it makes sense to speak of dynamic range, but there are no pixels, just signal that fades cleanly down into (and below) the broadband noise floor.


Wikipedia states

"An audio signal of very low level (with respect to the bit depth of the ADC) sampled without dither sounds extremely distorted [non-linear] and unpleasant."

"Note that dither can only increase the resolution of a sampler, it cannot improve the linearity, and thus accuracy does not necessarily improve."


So which one is it? Is it linear or non-linear?

 
When you yank two statements out of context, you're just going to confuse yourself further.

Linearity is not resolution. Resolution is not noise. Three separate issues.

I'm serious, you have to study some basic signal processing texts and do some experiments. Otherwise, you'll continue to conflate different terms, draw poor analogies, and not become even the slightest bit more enlightened. Asking half-understood and self-contradictory questions might be fun for you, but it wastes your time and everyone else's.

Start with a couple simple questions: what would the noise voltage for a converter with a 2V full scale be if it's 64 bits? What's the equivalent noise resistance? Same question for 24 bits.
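
One way to do those sums, as a sketch under stated assumptions: the 2 V full scale is treated as the total converter range, quantisation noise is taken as one LSB divided by sqrt(12), and the equivalent resistance is the one whose thermal noise matches it at 300 K over a 20 kHz bandwidth. None of these constants come from the thread itself.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
TEMPERATURE = 300.0        # kelvin, assumed room temperature
BANDWIDTH = 20_000.0       # Hz, assumed audio bandwidth


def quantisation_noise_volts(bits, full_scale_volts=2.0):
    """RMS quantisation noise of an ideal converter: one LSB divided by sqrt(12)."""
    lsb = full_scale_volts / 2 ** bits
    return lsb / math.sqrt(12)


def equivalent_noise_resistance(noise_volts):
    """Resistance whose thermal noise over BANDWIDTH equals `noise_volts` RMS."""
    return noise_volts ** 2 / (4 * BOLTZMANN * TEMPERATURE * BANDWIDTH)


for bits in (24, 64):
    v = quantisation_noise_volts(bits)
    print(f"{bits} bits: {v:.3e} V rms, {equivalent_noise_resistance(v):.3e} ohm")
```

Under these assumptions the 24 bit case already corresponds to a resistance of only a few ohms, and the 64 bit case to a resistance far smaller than anything physically realisable.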
 
A linear system as a concept and the linearity of real world converters are two separate things.

What I suspect that wiki article is trying to say is that dither cannot improve problems with the differential linearity of a real world converter (for example, if the quantising levels are NOT all equally spaced you introduce non linearity into the transfer function which dither cannot fix), but if you don't dither then even a quantiser with perfectly equal step sizes will be non linear.

Now I **REALLY** don't like the word 'Resolution' when applied to audio, it mainly says that the author has not really thought things through. Visual analogies DO NOT WORK for this stuff, because pictures are (usually) massively undersampled; picture resolution is if anything closer to sample rate than word length (and both picture resolution and audio bandwidth have fairly easily tested upper limits, Google 'Circles of confusion').

Now I don't really know where this discussion of dither in the context of a simple minded R2R dac came from. It has application to sigma delta and to ADC stages, but it is then on the recording and you don't need to worry about it for replay unless doing the upsample thing to allow a shorter wordlength to be used for the DAC (all real converters managing over about 16 ENOB in an audio bandwidth).

I would guess attovolts in the 64 bit case (and nanoOhms), but a converter managing 32 (never mind 64) ENOB is clearly nonsense (and I have **NEVER** encountered a 24 bit unit where the bottom few bits were not just noise by the time you got to the outputs).

Grab a copy of AOE, "The Scientist and Engineer's Guide to Digital Signal Processing", and maybe a few books on discrete mathematics, signal processing and the Fourier transform if you want to understand this stuff; for a layman's overview Dan Lavry has some good stuff on his website.

Finally do the sums SY suggests, they will reveal just how off the wall and physically stupid the concept of a 64 bit converter is (and just how **HARD** 24 bits is to pull off in real life).

Regards, Dan.
 
|
Hello Mitch,

Firstly, thank you for the links.

....

Your second link, to the Computer Audiophile article.



Well, I have to say that looks like quite an in-depth and well executed study.

I think it's executed by you as well, in that case, well done!

:snowman2:

However, it looks like you are using a modern sound-card and listening to the noise floor.

For instance

"Speaking of 14 bits:

14bit difference.wav

This is considerably down in level as perceived by my ears. At regular listening level, I can’t hear the noise at all. I have to turn the volume up near maximum to hear the noise."


In the number #1 post of this thread, I write


"A 24-bit resolution has -144.49 dB noise, due to quantization error in the ADC.

This is considered lower than the human hearing limit, thus it's estimated we can hear around 22-bit in ideal conditions.

However, that is noise.

If we remove noise from the equation, what is the highest resolution we can hear of a sine-wave, or any kind of wave?

12-bit? 32-bit? 50-bit? 100-bit? Where is the limit?

Thank you"


...

If we can only hear up to 14-bit in normal conditions and 22-bit in very, very ideal conditions, why would a 28-bit DAC need upsampling?


I hope my answer has clarified the topic now!

Have a nice day.
|

Hi Kastor, wrt the Xiph videos, if you get a chance to download a different browser and view the videos, it would be worth your while as the visuals will assist. If you are into books, and in case you don't have this one, I recommend: Principles of Digital Audio, Sixth Edition (Digital Video/Audio): Ken Pohlmann: 9780071663465: Amazon.com: Books

Wrt "thus it's estimated we can hear around 22-bit in ideal conditions."

As far as I have been able to research over the years, there is no peer reviewed accepted empirical evidence to support that statement.

Re: "If we can only hear up to 14-bit in normal conditions and 22-bit in very, very ideal conditions, why would a 28-bit DAC need upsampling?"

The answer for upsampling is that we don't. If you mean oversampling, that's another topic and covered in great detail in the Pohlmann book.

Here are a few more links, including another article I wrote, that demonstrates 16/44 is more than enough resolution for our ears/brain:

24/192 Music Downloads are Very Silly Indeed

16/44 vs 24/192 Experiment - Blogs - Computer Audiophile

Archimago's Musings: 24-Bit vs. 16-Bit Audio Test - Part III: SUBJECTIVE COMMENTS & FINAL THOUGHTS

Dozens of ABX listening tests with a wide variety of perceptual encoders: Hydrogenaudio Forums -> Listening Tests

The general consensus is that there is little to no empirical evidence to show that the human ear can discern any higher resolution than 16/44. If you spend time on the Hydrogenaudio site and listen to the downloadable tests, with the different perceptual encoders, under proper ABX test conditions, it is surprising how low a resolution one has to go to actually discern an audible difference.

The point is that the limiting resolution factor here is our ears/brain and not the technology, irrespective of quantization noise.

Speaking of tech, you may want to try this objective test with your AD DA converter to determine transparency: Gearslutz.com - View Single Post - Evaluating AD/DA loops by means of Audio Diffmaker The Lynx Hilo I use is near the top of the list.

Good luck with your experiments.
 
I have edited the topic title now to correctly reflect the actual question.


What is the highest perceivable bit-depth resolution, for instance within 44.1 kHz or 192 kHz PCM material, if there is truly zero noise?


Even if this is theoretical, it's still a perfectly valid scientific / mathematical question.
 
Kastor said:
There is a discrete 28-bit R2R DAC thread on DIYaudio at the moment, it's using upsampling.

Therein lies a question as well.

If we can only hear up to 14-bit in normal conditions and 22-bit in very, very ideal conditions, why would a 28-bit DAC need upsampling?

The answer for upsampling is that we don't. If you mean oversampling, that's another topic and covered in great detail in the Pohlmann book.


If we entertain that we don't, then this developer's statement is incorrect

Should I NOS-DAC ?

"Hey NOS lovers !?! do we think this is bad ??
Well, IMHO nobody should think this is bad, but NOS die-hards of course never will. They have all the in-band distortion out of the way (sorry guys, this NEEDS upsampling) and THD+N is now not 30% but 0.0018% (in my case) and ...
NO RINGING."


He claims that upsampling has reduced the THD from 30% to 0.0018% in his DAC.

Either your statement is incorrect, that we don't need upsampling in an R2R, or his statement is incorrect.

His statement may also provide an explanation why I can hear a difference when I 4x upsample via software my 16-bit R2R DAC, thus I find his statement interesting.

It provides a theoretical connection between the terminology THD and resolution as well.
 
Mitch said:
The general consensus being that there is very little - to none - empirical evidence to show that the human ear can discern any higher resolution than 16/44.


|

Technically, I can think of four cases when it comes to redbook versus SACD / DSD.
______

The first, is if the 1-bit style of DSD has any theoretical impact or not, in any technical fashion.

See the link and pictures

Mother of Tone - Conversion Techniques

______

The second is that SACDs tend to sound very high quality, due to the studio equipment, microphones, et cetera.


See post #22 / 178 in this link
SA-CD.net - Forum

"You guys really got to read the paper. Especially the part where we go on about how really, really good hi-rez sounds, almost uniformly."


The explanation for this is, I think, that these studios with the high-end faith type individuals in them which are using DSD, tend to use a lot of fancy, unverified, high-end style equipment.

If you shoot wildly, eventually you'll hit correctly -

______


The third, is that if there is a difference then it seems very, very tiny, at least via direct perception.

Personally, I don't care about 192 kHz sonics very much, I have very little interest in this issue, since if there is a difference, then I consider it a slight difference, which I haven't personally perceived, at least not very much if I have, hence it's all a pretty low priority on my list.



Personally, here is the only ABX style test I've seen which shows a positive result

http://www.hydrogenaud.io/forums/index.php?showtopic=17118

______


The fourth, there are some studies which show a positive result, however not via direct listening perception.


This is direct perception, they found no difference

http://www.nhk.or.jp/strl/publica/labnote/lab486.html#1

This is indirect perception, they detected a difference

http://jn.physiology.org/content/83/6/3548.full

Tsutomu Oohashi, Emi Nishina, Manabu Honda, Yoshiharu Yonekura, Yoshitaka Fuwamoto, Norie Kawai, Tadao Maekawa, Satoshi Nakamura, Hidenao Fukuyama, Hiroshi Shibasaki
Journal of Neurophysiology, published 1 June 2000, Vol. 83, no. 6, pp. 3548-3558
|
 
Now I don't really know where this discussion of dither in the context of a simple minded R2R dac came from

According to this link Mother of Tone - Conversion Techniques

This is Sigma-Delta

3-bit_green.jpg


This is 16-bit R2R

16-bit_green.jpg


This entire time, my question has been: what is the highest resolution of an R2R waveform like that which we can perceive?


If we introduce dither, that changes everything! It nullifies the entire question.

If it's DSD, then I suspect the human hearing limit is at 1-bit.

If we suppose that a theoretical 2-bit DSD format sounds identical to us, then in that sense the highest perceivable audio bit depth resolution to us, with or without noise, strictly speaking, is 1-bit.

When we start introducing different parameters into the question, like......

- introducing sampling at a maximum of 44,100 times per second
- introducing dither
- introducing shaped dither

Et cetera.

Then the answer will vary wildly!
 
He claims that upsampling has reduced the THD from 30% to 0.0018% in his DAC.

Presumably the 30% figure comes at HF when a NOS DAC is used without a reconstruction filter. The 0.0018% figure is presumably with some kind of band-limiting filter in circuit - either in the DAC or in the THD analyser (or both).

Either your statement is incorrect, that we don't need upsampling in an R2R, or his statement is incorrect.

He's correct - upsampling isn't a necessity, it's a convenience to ease the design of the reconstruction filter.
 
Presumably the 30% figure comes at HF when a NOS DAC is used without a reconstruction filter. The 0.0018% figure is presumably with some kind of band-limiting filter in circuit - either in the DAC or in the THD analyser (or both).

He's correct - upsampling isn't a necessity, it's a convenience to ease the design of the reconstruction filter.

XXHighEnd said:
It needed 16x digital upsampling to do it right.

And so, this DAC is perfect in the time domain, and is fairly accurate in the frequency domain.
Fairly ?

Yes, and this is the last subject of this sequence;
Eliminating the stepping distortion sufficiently is different from what we call a "reconstruction filter". Now what ?

When the (in-audio band) frequency gets too high, too few samples exist to form a normal sine. So, think of it : a 10000 Hz tone has 4 samples available to create its original sine. Uh-oh ... that is squares again !
Now, this is only partly solved by the upsampling filter.

"upsampling isn't a necessity, its a convenience to ease the design of the reconstruction filter."

It seems like he doesn't have a reconstruction filter. It's just your usual NOS R2R, upsampled 64x, with THD reduced from 30% to 0.0018% purely by reducing the stepping distortion, i.e. removing the stair-case distortion, within the audio band.

That's what he says at least, right?

Haha, what do you think?
 