What's wrong with Class-D?

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Founder of XSA-Labs
Joined 2012
Paid Member
Unfortunately, the music signal is not a sine wave.

Any arbitrary signal can be completely and accurately represented by a series of superimposed sine waves. Music is a subset of arbitrary signals, and is therefore in fact a superposition of sine waves. Some signals take more terms than others, and that is where compression comes in. For audio signals, based on psychoacoustics research at the Fraunhofer Institute, the human ear cannot distinguish more than 32 Fourier coefficients.
 
Hi,
Psychoacoustics is vast and largely unexplored; it is considered a new field.
32 coefficients of an FFT is an enormous number ... our ear detects a 0.5% change in the geometry of the envelope (its shape), and therefore in the harmonic content (and therefore in the FFT).
Not to be confused with a 0.5% change in dB.

A simple experiment (sorry if it's not very scientific):
With a microphone connected to an FFT analyzer (even a simple sound card in a computer), tap two fingers on the table and watch the FFT. Now change the angle of the fingers very slightly, so that the fingernail strikes instead. To the ear the sound has changed a lot; now look at the FFT.
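The finger-tap experiment can be mimicked numerically (a toy model; the 180 Hz "thud" and 3.2 kHz "click" frequencies are invented for illustration): two transients with the same amplitude envelope but different spectral content produce very different FFTs:

```python
import numpy as np

fs = 44100
t = np.arange(int(0.05 * fs)) / fs            # a 50 ms transient
env = np.exp(-t / 0.01)                       # identical decay envelope for both

thud = env * np.sin(2 * np.pi * 180 * t)      # fingertip: low-frequency thud
click = env * np.sin(2 * np.pi * 3200 * t)    # fingernail: bright click

def spectrum(x):
    """Magnitude spectrum of a real signal."""
    return np.abs(np.fft.rfft(x))

freqs = np.fft.rfftfreq(len(t), 1 / fs)
print("thud  spectrum peaks near", freqs[np.argmax(spectrum(thud))], "Hz")
print("click spectrum peaks near", freqs[np.argmax(spectrum(click))], "Hz")
```

Same envelope, wildly different spectra, which is exactly what the microphone-plus-FFT experiment shows.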
 
Founder of XSA-Labs
Joined 2012
Paid Member
I know what you are saying, but the 32-coefficient figure is just published scientific research. 32 is simply the absolute limit of what is required to recreate the most complex passage of music such that the human ear could discern it if the coefficients were reduced. The fact that it is 32 has more to do with it being double 16, and it is the basis for the MP3 compression algorithm, which was developed by the same place (the Fraunhofer Institute at the University of Erlangen). The song "Tom's Diner" by Suzanne Vega was the test song used to develop MP3; Vega is the 'mother' of MP3.
 
By its working principle, Class D cuts away very low-level signals. On one hand this gives a better signal-to-noise ratio, especially subjectively; on the other hand it cuts away not only noise but low-level information too. So it is up to the listener's taste and previous listening experience which one he or she prefers.
 
I also understand what you mean.
I do not agree on the concept of sound recognition. It seems absurd that in 2013 I have found only theories and scientific accounts far removed from the real mechanism we use to recognize sounds.
We do not recognize a frequency;
We recognize only the "relative" amplitude of a sound;
And yet we unmistakably recognize a snare drum with coil springs, and it is impossible to change or disguise it by adding a tone filter. This means, conversely, that we decode the geometric structure of the sound. It is enough to change that geometry by 0.5%, and the snare becomes something else.

regards
 
Founder of XSA-Labs
Joined 2012
Paid Member
AP2,
I don't know where you are getting your info about human hearing. It is well known that the human cochlea has nerves positioned and arranged much like a mechanical real-time spectrum analyzer, with cilia of various lengths sensitive to different frequencies; these cilia are positioned along a fluid-filled channel such that different frequencies excite different portions, where the nerves representing the various frequency bands are located. In fact, human hearing is very much a FREQUENCY and phase processor first and an amplitude processor second. Amplitude perception, as you know, is logarithmic, and its sensitivity is less important. Frequency enables the pattern recognition and matching our brains are so good at, so that the source of a signal can be instantly recognized from its spectral content (bird, lion, baby's cry, etc.), while the phase information, coupled with the asymmetric shape of the outer ear, permits 3D spatialization. The 0.5% discernible amplitude change is not as important as the spectral content, which can be well below 0.5% in amplitude and still be recognizable. The limit of human hearing perception is primarily frequency dependent and only secondarily amplitude dependent.
 
^ This. Plenty of information available, including testing: "This Is Your Brain on Music" by Levitin, "The Psychology of Music" by Deutsch, "Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms" by Toole.

The ear/brain is exceptional at discerning frequency.
 
Hi,
I have no doubt about the first part (the mechanics of the ear); this is amply described everywhere.
I agree, and I know how the human system detects the position of a sound source:
we are able to perceive a displacement angle of 2 degrees at a distance of 3 m, for example.
This is a process our brain performs like trigonometry (so it is obvious that phase is important).
My previous post refers to the recognition of a sound (e.g. snare, bass chord, trumpet, piano); I excluded the baby's cry, as it does not contain a complex sound geometry.
So my question is: can someone describe what kind of process our brain uses to recognize a snare? (Put the other way around: what can we alter in the sound of a snare drum so that we no longer recognize it?)
This has a direct bearing on the subject of Class D, and on amplifiers in general.
 
bwaslo,
Wouldn't Class D have a process similar to Class AB rather than Class A? Many still seem to think that Class D follows rules more akin to digital, but you and I know that is not how it works. It is just a more efficient amplification method using high-speed switching as a multiplication process. Yes, we have to deal with RF produced in the process and with out-of-band high-frequency modulation in the conversion, but this all happens in the analog domain. I only see the lower noise floor as the limit on low-level detail, not anything inherent in the process.
 
Class D doesn't do anything different at near zero signal as it does at higher levels. The outputs are always switching between max minus and max plus, the only thing that changes near zero signal is how long they stay at either state between each switch, hence no crossover distortion.

There is a mechanism at a mid-high level where the output filter inductor stops being able to supply its stored energy which changes the gain some at that transition. See some of Bruno Putzeys' presentation on the subject. But this is far above the usual Class A/B crossover region, so it operates more like a "Very Rich A/B" if anything.

But again, the mechanism of the output devices in Class D is to just switch when told to, which they do very well. In Class A or Class A/B, the output devices do a rather mediocre job of linearly converting input voltage to output current, and in class B or weak class A/B a rather bad job of it down where it likely matters most.

Whether all of that is audible is perhaps another matter.
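The point about near-zero behavior can be checked with a toy simulation (a sketch under idealized assumptions: ideal comparator, ideal switches, and crude per-period averaging standing in for a real LC filter): the demodulated PWM tracks the input equally well at large and near-zero amplitudes, with no crossover-like discontinuity:

```python
import numpy as np

fs = 80_000_000        # simulation rate (200 samples per carrier period)
f_sw = 400_000         # switching (carrier) frequency
f_sig = 1_000          # audio test tone
t = np.arange(fs // 100) / fs   # 10 ms of time

def pwm_demod_error(amp):
    """Generate naturally sampled PWM of a sine with the given amplitude,
    demodulate it crudely by averaging over each carrier period, and
    return the worst-case deviation from the per-period signal average."""
    x = amp * np.sin(2 * np.pi * f_sig * t)
    carrier = 2 * np.abs(2 * ((t * f_sw) % 1) - 1) - 1   # triangle, -1..+1
    pwm = np.where(x >= carrier, 1.0, -1.0)   # output is always at a rail
    spp = fs // f_sw                          # samples per switching period
    avg = pwm.reshape(-1, spp).mean(axis=1)
    ref = x.reshape(-1, spp).mean(axis=1)
    return float(np.max(np.abs(avg - ref)))

# Error stays small and comparable at large and near-zero amplitudes:
print("amp 0.9 :", pwm_demod_error(0.9))
print("amp 0.01:", pwm_demod_error(0.01))
```

The residual error here is simulation quantization, not crossover distortion; it does not grow as the signal shrinks toward zero, unlike the Class B crossover region.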
 
Why does the carrier frequency need to be 50x the audio frequency to sound good? Is this to allow more headroom for the PWM to reach sufficient modulation depth? 50x seems excessive... especially when THD measurements show less than 0.1% distortion at 20 kHz with a 400 kHz carrier.

I really don't know. Digital music is also a sampled representation of the signal, and it is sampled at only 44.1 kHz (or at most 192 kHz), so the reason is not connected with the representation of the signal (sampled vs. analog). I don't count myself an "audiophoole", and I also do not believe that a tube amp MUST sound better than silicon... But there is some difference, in my opinion.

Anyway, sound quality or not, the frequency will go up. It makes components smaller and cheaper, not only better sounding. RF radiation also goes down, and is lowest with the "interleaved" technique, which makes the equivalent switching frequency the product of the number of interleaved channels and the individual switching frequency. This technique has already been used for years in switching regulators for the same reasons (faster load-change response and lower RF leakage).
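The interleaving arithmetic can be illustrated with a toy model (idealized pulse trains at an arbitrary fixed duty cycle): with N phases offset by 1/N of the switching period, the ripple components below N·f_sw cancel, so the dominant residual appears at N times the per-phase switching frequency:

```python
import numpy as np

fs = 40_000_000         # simulation rate
f_sw = 400_000          # per-phase switching frequency (100 samples/period)
n_phases = 4
duty = 0.3              # arbitrary fixed operating point
t = np.arange((fs // f_sw) * 50) / fs    # 50 switching periods

def phase_pwm(offset_s):
    """Ideal 0/1 pulse train with the given time offset."""
    r = ((t - offset_s) * f_sw) % 1
    return np.where(r < duty, 1.0, 0.0)

# Sum the phases, each offset by 1/N of the switching period.
step = 1 / (n_phases * f_sw)
summed = sum(phase_pwm(k * step) for k in range(n_phases)) / n_phases

spec = np.abs(np.fft.rfft(summed - summed.mean()))
freqs = np.fft.rfftfreq(len(t), 1 / fs)
peak = freqs[np.argmax(spec)]
print("dominant ripple component at", peak / 1e6, "MHz")
```

With four 400 kHz phases the strongest ripple lands at 1.6 MHz, which is exactly the "equivalent switching frequency" being described.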
 
Founder of XSA-Labs
Joined 2012
Paid Member
So my question is: can someone describe what kind of process our brain uses to recognize a snare? (Put the other way around: what can we alter in the sound of a snare drum so that we no longer recognize it?)
This has a direct bearing on the subject of Class D, and on amplifiers in general.

This is the area of psychoacoustics research. It is of course still an active field, but it is becoming accepted that human recognition of sounds comes from pattern matching against stored frequency-domain spectrograms in our hard-wired memory. Comparing these stored spectrograms with the real-time spectrograms produced by the ear, much as we do pattern matching for visual patterns, is probably how it works. In addition, the time-dependent phase information of the sound spectrograms is used to improve the speed and accuracy of matching. The decay and attack rates of the sound are important too, i.e., the 'waterfall plot' generated by our brains and ears. Researchers have identified critical frequency bands in the 1 kHz to 5 kHz range where most of this information is used by the brain.
Your question of what we could do to distort a snare drum so that it no longer sounds like a snare drum relates to a well-known phenomenon whereby adding a small signal very close in frequency to the actual signal fools the auditory nerves into perceiving no signal there at all. This can be exploited to great effect for noise reduction (Dolby Labs made billions exploiting this concept).
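The masking idea can be sketched with a deliberately simplified model (the 10 dB offset and the dB-per-octave slopes below are rough illustrative numbers of my own choosing, not a published masking curve): a quiet tone close in frequency to a loud one falls under the masking threshold, while the same tone far away does not:

```python
import numpy as np

def masked_threshold_db(masker_hz, masker_db, probe_hz):
    """Toy simultaneous-masking model (illustrative numbers only):
    the threshold starts 10 dB below the masker level and falls off
    27 dB/octave below the masker and 15 dB/octave above it."""
    octaves = np.log2(probe_hz / masker_hz)
    slope = 27.0 if octaves < 0 else 15.0
    return masker_db - 10.0 - slope * abs(octaves)

# A 60 dB tone at 1 kHz masks a 30 dB tone at 1.1 kHz...
print("1.1 kHz threshold:", round(masked_threshold_db(1000, 60, 1100), 1), "dB")
# ...but not the same 30 dB tone at 4 kHz, two octaves away.
print("4.0 kHz threshold:", round(masked_threshold_db(1000, 60, 4000), 1), "dB")
```

This is the basic mechanism perceptual codecs like MP3 exploit: signal components below the masked threshold can be discarded or coarsely quantized without an audible change.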
 
It seems to me, that in principle, Class D is capable of excellent sound quality. As are Class A and AB. And this is with real world components. The issue is therefore, only one of implementation. Good engineering and the involvement of skilled and passionate people will prevail. As a result of the conceptual benefits of Class D, I believe it will dominate. Class AB replaced much of Class A, and Class D will replace much of Class AB. Even if this happens, Class A and AB will survive but they will be niche. These niches though, will contract in the commercial world when their designers retire, except from time to time where young folk demand retro.

So, what's wrong with Class D ? - all the same things as was wrong with the other classes, the things that engineers must wrestle with and define tradeoffs between when they design amplifiers. In other words, nothing is wrong with Class D.
 
Founder of XSA-Labs
Joined 2012
Paid Member
Niches will always be there, as can be seen with tube amps, which are purely a boutique product. The transition to Class D is more transformative, a paradigm shift compared to the change from Class A to Class AB, or even from tubes to solid state, because the means by which the high-current signal is generated has gone from continuous analog to a discrete time series switched mode. With proper design and filtering, the realized output should be indistinguishable from high-quality Class A analog amplification, but with close to 90% efficiency and lower distortion. A Class D amp can be designed as a transconductance (current) amp like the F1, but without being a space heater. Switching noise in the 400 kHz to 1.2 MHz range is inaudible to humans. I do not see why switching rates can't go to GHz; cell phone components and wireless switches routinely do that. The expensive inductor used to filter out the 400 kHz carrier (currently one of the most expensive components in a Class D BOM) will become tiny and very cheap. It will get there, and the devices will pump out 200 watts per channel from a package the size of a postage stamp.
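The inductor-shrinking argument follows directly from the LC corner-frequency formula f_c = 1/(2π√(LC)). A sketch, assuming a fixed 1 µF filter capacitor and the common rule of thumb of placing the corner about a decade below the switching frequency (both values are assumptions for illustration):

```python
import math

def lc_corner_hz(L, C):
    """Corner frequency of a second-order LC low-pass filter."""
    return 1 / (2 * math.pi * math.sqrt(L * C))

def inductor_for(f_corner_hz, C):
    """Inductance giving the requested corner with a fixed capacitor."""
    return 1 / ((2 * math.pi * f_corner_hz) ** 2 * C)

C = 1e-6   # fixed 1 uF output capacitor (an assumed, typical value)
for f_sw in (400e3, 4e6, 40e6):
    L = inductor_for(f_sw / 10, C)   # put the corner ~1 decade below f_sw
    print(f"f_sw = {f_sw / 1e6:5.1f} MHz -> L = {L * 1e6:9.4f} uH")
```

Each tenfold increase in switching frequency cuts the required inductance by a factor of 100, which is why higher switching rates promise far smaller and cheaper output filters.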
 
xrk971,
Your ideas are sound. The real drivers of the shift to Class D will be size reduction and, at some point, legal requirements for power efficiency; Class D, along with transitions to Class G and H, is only a legal opinion away. At some point we will be forced into this implementation by simple rules saying you cannot produce a Class AB amp, and most certainly not a Class A, the most inefficient of them all. In the near future only DIY builders will even understand what these older circuits are; just as vacuum tubes have long been an esoteric product, the same will happen to Class AB before you know it.
 
Nope, tube gear is not just a boutique product. There are no microwave ovens powered by solid-state transistors; on the contrary, most (all) of them use a tube (the magnetron). A beam of electrons in a vacuum is about as close to an ideal current source as one could imagine.

I am an audiophile and cannot see anything wrong with that. Most stereos sound terribly artificial; only a few setups I've auditioned perform as well as the real thing sounds. Yes, they are pricey and bulky and have a poor WAF, so no wonder most people are happy with a simple, inexpensive, small-footprint stereo. Especially considering that modern consumer electronics are getting very close to what only a multi-grand high-end stereo delivered in the past. But there is still a difference.

Life is a trade-off. Always. Class D is not a free lunch either.
 
xrk971, one small correction (or maybe just a clarification). Class D isn't really a "discrete time" setup like sampled devices such as D/A converters that provide a value at certain spaced instants only. In Class D, the times at which switching can occur are continuous not discrete. In other words, the widths of "level high" time and "level low time" aren't limited to changing only at certain discrete spaced time instants, but can be at any time increment at all between zero and the switching frequency's period.
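This distinction can be made concrete with a trivial sketch: for a sawtooth carrier, the comparator edge instant within a period is a continuous function of the input level, so an arbitrarily small change in the signal moves the edge by an arbitrarily small amount, with no time grid involved (the 400 kHz carrier here is just an example value):

```python
# Naturally sampled PWM with a sawtooth carrier rising 0 -> 1 over each
# period T: the comparator output flips at the instant the carrier ramp
# crosses the (slowly varying) input level x.
T = 1 / 400e3   # one 400 kHz carrier period, 2.5 us

def edge_time(x):
    """Edge instant within a period for input level 0 <= x <= 1.
    It is x * T exactly: a continuous quantity, not tied to any clock."""
    return x * T

for x in (0.5, 0.5001, 0.5000001):
    print(f"x = {x:.7f} -> edge at {edge_time(x) * 1e9:.5f} ns")
```

Contrast this with a D/A converter, where output values can only change on the ticks of the sample clock; in PWM the pulse width itself carries the signal in continuous time.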
 
suntechnik,
Some of the worst-sounding, most distorted systems are single-ended tube amplifiers that can barely drive anything but a horn system. So you have to qualify what your requirements are and what you consider accurate sound. I do not consider a 7-watt tube amplifier a truly usable device in most applications; I do not listen to chamber music at 86 dB and call it a day. I am sure I would be happy with a push-pull tube stage or a few other high-power tube circuits, but those are so expensive that they again become an esoteric product. If you want or need a higher-powered tube amp, you will be spending precious resources to get there.
 