Do measurements of drivers really matter for sound?

Hah! It's Arie Kaizer. I met him in Hamburg at AES 1981 (?) over a couple of pints. It took me an hour before I realised his "wretched pistons" were actually "rigid pistons" :LOL: He was the first to attempt FEA of cone behaviour (p38) while we were the first to do Laser scans of diaphragms to confirm these simulations .. hence our mutual interest.

This is the best explanation of Wigner stuff I've seen but also illustrates its shortcomings.

Anyone wanna explain how his Fig 67 (p37) Wigner, tells you more useful stuff than his Fig 66 (p36) CSD aka 'waterfall' aka KEFplot?

His brother used (?) to be editor of Elektor and did several good articles on Ambisonics

Arie's papers touch on some less well known (completely ignored?) but really important stuff which I hope to pontificate on when hifijim starts a clean thread :)
I did in post #410... Apart from requiring double sample rates (or the analytic signal as the input as in this paper), there are no "shortcomings". With an appropriate window function, the Wigner Distribution depicts the actual sound pressure produced at a microphone (or at an ear).

By contrast, the other methods do not show the signal information at the microphone/ear, but rather a smeared approximation of it. As I said in my last post, there is nothing wrong with using a CSD (for example) to identify/depict a driver resonance if that is the limit of the information you seek, but any attempt to use the display as a model of what we hear is flawed by its approximate nature.

Notably too, the Wigner Distribution includes what is commonly referred to as "clutter", when in actual fact this is part of the information to which our hearing is exposed. It is not then appropriate in such an application to refer to the information as clutter, when, with a suitably designed window, this information is likely of relevance subjectively.
 
The AES paper about anechoic and in-room bla bla that I just accidentally opened contains an error in its premise, as it attributes the error of in-room measurements to the reflections. But the reflections do cause the perception of a room!
Etc. Etc
Exactly! This is an example of information that is routinely and purposefully discarded in measurements - and that (wrt many of my other posts here) has been dismissed as "clutter" in Wigner Distributions. Making nice looking pictures is not the task of an audio engineer, especially if we wish to better model subjective reports.
 
If you do not mind my asking: how have you established the Wigner relates to any subjective ratings?
I have not, but there is clearly a gap between measurements and judgements of sound quality. I have merely suggested that using the information in the Wigner Distribution is a means to partly fill the gap. As I see it, deliberately ignoring information presented to the ears does not appear a good way forward. It would certainly not be typical of nature to do so.
 
Here is a 'typical' spectrogram of RIR (dB scaled) which is a convolution of anechoic IR of spk and the room (here, my living room with half of foam off):
Can you post the actual IR & RIR as maybe a CSV or Excel file? Anyone have a copy of ARTA (or other suitable software) and could do a 'waterfall' of Michael's IR & RIR?

I would have done this myself 12 months ago, but my last XP machine died and all my own serious software needs XP to run :(
View attachment 1283655
It is kind of hard to get anechoic measurements for DIY. I know what it is but not sure why mention them if you can't get them?
Various people have suggested how to do quasi-anechoic measurements with REW and other stuff so it's certainly within the scope of DIY today. Indeed I think it's in the documentation for all these systems. I was going to write my own stuff for easy measurements for da DIY community but HD crashes and beach bum concerns put paid to this hi falutin aim.

Instead, I use observable data and pick the top of the ridge on wavelet sgram as spk:
View attachment 1283656
I'm not sure your RIR spectrogram 2ms or your arete wavelet display actually help us to see important 'defects' in the sound of a speaker compared to a naive 'waterfall' which I hope someone here will dream up with your IR & RIR data :)

The convolution methods like chirp, MLS and perfect sequences are much less computationally intensive but to pronounce them 'The Best' could be a bit contradictory to the current state of science as we know it.

Particularly, Chirp errs at low frequencies where both noise and non-linear distortions are at their worst.
Which of Angelo's papers on his method have you read?

In case it isn't obvious, Angelo's method is a Matched Filter measurement for both the fundamental and also each harmonic so is BY DEFINITION the BEST (estimate when contaminated with stochastic noise bla bla). I think this is made explicit in some of his later papers (in conjunction with some of his students) or at least supported by more hi-faluting maths :cool: https://en.wikipedia.org/wiki/Matched_filter

Angelo's log sweep method IS less accurate at LF but this is the case with ALL measurement methods. In fact his log sweep psd is better matched to typical noise spectrums than practically all other test signals. I've dabbled with even better test signals to deal with LF but that would lose the nice Matched Filter characteristics of his method. Far better just to use his method with a longer sweep.
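For anyone who wants to try the log sweep idea numerically, here is a minimal Python sketch, assuming NumPy only. It is not Angelo's code: the function names, the 20 Hz to 3.5 kHz range, the 8 kHz sample rate and the two-tap test system are all made up for illustration. It generates an exponential sweep, builds the usual inverse filter (the time-reversed sweep tilted by 6 dB/octave), and recovers the impulse response of a simulated system.

```python
import numpy as np

def exp_sweep(f1, f2, duration, fs):
    """Exponential (log) sine sweep from f1 to f2 Hz, plus its time axis."""
    t = np.arange(int(round(duration * fs))) / fs
    rate = np.log(f2 / f1)
    phase = 2.0 * np.pi * f1 * duration / rate * (np.exp(t * rate / duration) - 1.0)
    return t, np.sin(phase)

def inverse_filter(t, sweep, f1, f2, duration):
    """Time-reversed sweep, tilted 6 dB/octave to undo the sweep's pink spectrum."""
    rate = np.log(f2 / f1)
    return sweep[::-1] * np.exp(-t * rate / duration)

def fft_convolve(a, b):
    """Linear convolution via zero-padded FFTs."""
    n = len(a) + len(b) - 1
    return np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

def measure_ir(system_ir, f1=20.0, f2=3500.0, duration=1.0, fs=8000):
    """Simulate a sweep measurement of a known (hypothetical) linear system."""
    t, sweep = exp_sweep(f1, f2, duration, fs)
    recorded = fft_convolve(sweep, system_ir)  # what the mic would pick up
    estimate = fft_convolve(recorded, inverse_filter(t, sweep, f1, f2, duration))
    return estimate[len(sweep) - 1:]           # the linear response starts here

# Hypothetical test system: direct sound at sample 100, half-level echo 40 samples later.
system = np.zeros(400)
system[100], system[140] = 1.0, 0.5
ir = measure_ir(system)
print("strongest arrival at sample", int(np.argmax(np.abs(ir))))
```

The `duration` argument is the knob for the "use a longer sweep" advice: a longer sweep puts more energy into every band, low frequencies included. The sketch has no noise and no distortion in it, so it says nothing about which method is "best".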

U mus xcuse my ignorance of hi-falutin names as I no longer have da MatLab Signal Processing Toolbox so kunt pre10 to no wat de meen :)
 
I have not, but there is clearly a gap between measurements and judgements of sound quality. I have merely suggested that using the information in the Wigner Distribution is a means to partly fill the gap. As I see it, deliberately ignoring information presented to the ears does not appear a good way forward.
When mikets42 posts his IR & RIR in CSV or Excel format could you do a Wigner on it so we can see if it is better than his Wavelet display or the 'vanilla waterfall' that hopefully someone else will do with the same data?

Then you can tell us faults that da Wigner has highlighted, Mike can tell us what da Wavelet spectrogram shows and I'll attempt to read the tea leaves in da KEFplot (waterfall)

Of course we can't really complete da exercise for which we would need to do DBLTs on Mike's speakers in his room :) But it would be interesting to see what each method/display highlights
========================
An easier task might be for you to explain how Arie Kaizer's Fig 67 (p37) Wigner tells you more useful stuff than his Fig 66 (p36) CSD aka 'waterfall' aka KEFplot?

https://pure.tue.nl/ws/files/3397667/240847.pdf

It seems to me that, in this case at least, the Wigner is the 'smeared' measurement. You can derive several other measurements from da KEFplot like the anechoic response bla bla
 
No, wait...
The same attributes of the sound aren't found in an anechoic room.
That means that the part of information (as you call it) of ambient(ambience) isn't there.
It follows that the definition of sound is not well...defined
The same attributes are there, just there is additional information convolved with that of the loudspeaker/drivers too. The sound is well-defined at the microphone/ear, just its analysis needs improving if we are to understand better different subjective impressions of different drivers, for example.
 
Do I understand you correctly when I read you can measure low frequencies in a normal listening environment (that is, without the usual gating) and derive nonlinear distortions in the low frequency range?
There was no problem doing this even in da late 70s. It's just that the response & distortion are that for the speaker in da room :)
 
It seems to me that, in this case at least, the Wigner is the 'smeared' measurement.
That is not correct. The Wigner Distribution is not a measurement either - like cumulative spectra, it is only a method of displaying measured data. All cumulative spectra are smeared versions of the Wigner Distribution. Anything appearing in such spectra that is not evident in the Wigner Distribution is an added artefact of the spectra, not extra information gained from the measurement.
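To make the object under discussion concrete, here is a minimal sketch of a discrete Wigner-Ville distribution in Python, assuming NumPy only. It is not taken from Kaizer's paper or from anyone's software in this thread; the function names and the Gaussian test burst are made up for illustration, and no smoothing window is applied. It uses the analytic signal as input, which is the alternative to doubling the sample rate mentioned earlier in the thread.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative frequencies, double the positive."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def wigner_ville(x):
    """Discrete Wigner-Ville distribution.

    Row t is a time sample; column k corresponds to frequency k * fs / (2 * n).
    """
    z = analytic_signal(x)
    n = len(z)
    wvd = np.zeros((n, n))
    for t in range(n):
        max_lag = min(t, n - 1 - t)            # stay inside the record
        lags = np.arange(-max_lag, max_lag + 1)
        kernel = np.zeros(n, dtype=complex)
        kernel[lags % n] = z[t + lags] * np.conj(z[t - lags])
        wvd[t] = np.fft.fft(kernel).real       # kernel is Hermitian, so the FFT is real
    return wvd

# Hypothetical test signal: a Gaussian-shaped 125 Hz burst centred at 0.128 s.
fs, n = 1000, 256
t = np.arange(n) / fs
burst = np.exp(-0.5 * ((t - 0.128) / 0.02) ** 2) * np.cos(2 * np.pi * 125.0 * t)
wvd = wigner_ville(burst)
t_idx, k = np.unravel_index(np.argmax(wvd), wvd.shape)
print("peak at", t[t_idx], "s and", k * fs / (2 * n), "Hz")
```

For a single burst like this the distribution is one clean blob. With two or more components it also shows cross terms between them, which is the "clutter" argued about above.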
 
By contrast, the other methods do not show the signal information at the microphone/ear, but rather a smeared approximation of it.
I have been wondering about this smearing that you speak of. I would like to get your opinion of something...

Here are two burst decay plots as processed by ARTA. These are far field measurements of a 5" driver, installed in a cabinet. The gate window is about 5 ms. Both plots are made from the same impulse response, which is in turn made from a 4-sample average of Periodic Noise.

ARTA offers two options when presenting a burst decay plot. One is "Prefer Frequency Resolution", and the other is "Prefer Time Resolution". I use both plots to gain insight into speaker behavior, but I have often wondered about the math behind the plot. I am guessing that both plots are limited, but limited in different ways. Is this related to the smearing that you talk about?

1709997840851.png


1709997865846.png


Thanks !

j.
 
I have been wondering about this smearing that you speak of. I would like to get your opinion of something...

Here are two burst decay plots as processed by ARTA. These are far field measurements of a 5" driver, installed in a cabinet. The gate window is about 5 ms.

ARTA offers two options when presenting a burst decay plot. One is "Prefer Frequency Resolution", and the other is "Prefer Time Resolution". I use both plots to gain insight into speaker behavior, but I have often wondered about the math behind the plot. I am guessing that both plots are limited, but limited in different ways. Is this related to the smearing that you talk about?
There are two related but distinct answers to this question...

A Wigner Distribution is limited in its resolution by Heisenberg's Uncertainty Principle. The non-linear windowing I mentioned previously is intended to trade-off time resolution for frequency resolution determined by this principle in order to better approximate how our hearing works as we move from a transient model to a steady-state "spectrum analyser" type one.

This is different from the additional smearing of the Wigner Distribution that occurs in producing decay (or attack) spectra, spectrograms or the like. You can imagine every such display as starting with a Wigner Distribution and then filtering out information. If the ARTA plot is a Cumulative Decay Spectrum, then by its (mathematical) definition it will have discarded information that would have been evident in a Wigner Distribution.

The question to be answered is what relevance can be given to the discarded information...
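For comparison, here is a minimal sketch of one common way to build a cumulative spectral decay ('waterfall') from an impulse response, assuming NumPy only. ARTA and MLSSA use their own windows and scaling, so treat the half-Hann taper, the slice spacing and the function name as assumptions. Each slice simply drops the first part of the impulse response and keeps only the magnitude of what is left, which is where information gets discarded.

```python
import numpy as np

def cumulative_spectral_decay(ir, fs, n_slices=20, step=8, n_fft=1024):
    """Magnitude spectra of ir with its first i * step samples discarded.

    Assumes len(ir) <= n_fft and n_slices * step < len(ir).
    """
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    slices = np.empty((n_slices, len(freqs)))
    for i in range(n_slices):
        segment = ir[i * step:]
        # Half-Hann taper so the segment fades out smoothly at its end.
        taper = np.hanning(2 * len(segment))[len(segment):]
        slices[i] = np.abs(np.fft.rfft(segment * taper, n_fft))
    return freqs, slices

# Hypothetical impulse response: a single 1 kHz resonance decaying with a 5 ms time constant.
fs = 8000
t = np.arange(512) / fs
ir = np.exp(-t / 0.005) * np.sin(2 * np.pi * 1000.0 * t)
freqs, slices = cumulative_spectral_decay(ir, fs)
peaks_db = 20 * np.log10(slices.max(axis=1))
print("ridge frequency:", freqs[np.argmax(slices[0])], "Hz")
print("ridge level, first and last slice:", peaks_db[0], peaks_db[-1], "dB")
```

The ridge of a resonance shows up clearly, which is what the display is good at. Whether the discarded phase and onset information matters subjectively is exactly the open question here, and the sketch does not answer it.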
 
Further wrt the plots you have shown: You can see the greater frequency resolution in the top plot where three, possibly four resonances are observed above 5kHz and within about the first 4 periods. In the lower plot, these peaks are smoothed into maybe two apparently broader resonances, but where features in the resonant decay at 5kHz appear more evident.

Which display is correct? They actually are both "correct", but the analysis/display method smears different aspects of the information presented. And that is just the decay, what about the attack spectra information too?

The question is then which display better models what you are hearing? I suggest they are both unduly compromised in this quest.
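The trade-off between the two displays can be reproduced with a toy example. This is a minimal Python sketch assuming NumPy only; it is not ARTA's burst decay algorithm (which I have not seen), and the two tone frequencies, the window lengths and the `count_peaks` helper are made up for illustration. Two components 600 Hz apart are analysed through a long and a short Hann window.

```python
import numpy as np

def count_peaks(x, fs, window_s, n_fft=16384):
    """Count spectral peaks within 6 dB of the maximum for a given analysis window length."""
    n = int(round(window_s * fs))
    segment = x[:n] * np.hanning(n)
    mag = np.abs(np.fft.rfft(segment, n_fft))
    mid = mag[1:-1]
    is_peak = (mid > mag[:-2]) & (mid >= mag[2:]) & (mid > 0.5 * mag.max())
    return int(np.count_nonzero(is_peak))

# Hypothetical signal: two equal components at 5.0 kHz and 5.6 kHz.
fs = 48000
t = np.arange(int(0.020 * fs)) / fs
x = np.sin(2 * np.pi * 5000.0 * t) + np.sin(2 * np.pi * 5600.0 * t)

print("10 ms window:", count_peaks(x, fs, 0.010), "peak(s)")
print(" 1 ms window:", count_peaks(x, fs, 0.001), "peak(s)")
```

The long window separates the two components but can only say that they were present somewhere within 10 ms. The short window pins events down to about 1 ms but merges the two into one broad hump. Both results come from the same data.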
 
Do I understand you correctly when I read you can measure low frequencies in a normal listening environment (that is, without the usual gating) and derive nonlinear distortions in the low frequency range?
I think that we need to design loudspeakers for the position where they will be used, and the surrounding walls become parts of the speaker. 8ms gating is kind of meaningless for f < 200Hz. A big problem with low freq distortions is that they are frequently not harmonic. Whenever they are - yes.
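The arithmetic behind the gating remark is short enough to write down. A minimal Python sketch, using the rule of thumb that a gate of length T gives a frequency resolution of roughly 1/T (the function names are made up for illustration):

```python
def gate_resolution_hz(gate_s):
    """Rule-of-thumb frequency resolution (and lowest trustworthy frequency) of a gate."""
    return 1.0 / gate_s

def cycles_in_gate(freq_hz, gate_s):
    """How many cycles of freq_hz fit inside the gate."""
    return freq_hz * gate_s

for gate_ms in (5.0, 8.0):
    gate_s = gate_ms / 1000.0
    print(gate_ms, "ms gate:", gate_resolution_hz(gate_s), "Hz resolution,",
          cycles_in_gate(200.0, gate_s), "cycles of 200 Hz inside the gate")
```

An 8 ms gate gives about 125 Hz resolution, so 200 Hz completes only 1.6 cycles inside the gate and anything much below that is barely sampled at all.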
 
Can you post the actual IR & RIR as maybe a CSV or Excel file? Anyone have a copy of ARTA (or other suitable software) and could do a 'waterfall' of Michael's IR & RIR?

Here is the RIR, attached as zipped .wav

Of course, I know what orthogonal Matched Filters are. I don't remember which of Angelo's papers I read but AFAIK, the first open publication on the topic is the 1960 'Theory and Design of Chirp Radars', which refers to classified publications from 1947 by Darlington.

With all due respect, I disagree... but, please, don't allow me to punish you with a 1000-page lecture on adaptive signal processing.
 

Attachment: rir.zip (7.9 KB)
I just found this 1996 Linkwitz interview, from around when I recall the word wavelet showing up in the electronic design trade magazines. The whole thing is interesting for this thread, but page 3 is relevant to my interests here:
https://www.stereophile.com/interviews/503/index.html
@benb : thank you very much !

Just to prevent the loss in broken links:

"Linkwitz: From a practical standpoint, the advantage of using a shaped tone burst (one that rises and decays gradually in a sinusoidal envelope) is that all of the burst energy is concentrated into a very narrow frequency band. This is quite different from tone bursts used in the past, where you had a rectangular burst covering a fairly wide frequency band. I chose a spectrum width of a third of an octave for this stimulus—which is a 5-cycle burst—because this corresponds closely to how we hear. A third-octave is about the width of the critical band of hearing. Also, because the burst is so short in duration, you mask out the effect of reflections, so it becomes a sort of poor man's approach to anechoic measurements. As long as you measure the peak of the burst before the first reflection, you've essentially captured an anechoic-like response giving you some of the benefits of Time Delay Spectrometry or Maximum Length Sequence (MLSSA) techniques without the expense.

Now, the shaped tone burst can be used in several ways. For instance, one can just use a microphone to measure the peak amplitude that the burst reaches after you apply it to a speaker, which will give you an approximation of the frequency response. Likewise, after the decay of the 5-cycle burst, there shouldn't be any output from the speaker. In reality, however, if there is stored energy in the drivers or cabinet, the speaker keeps on ringing. Therefore, the shaped tone burst is very useful for identifying the sources of resonant storage. In any event, I do get extremely good correlation between the frequency response measurements derived from the shaped tone burst test and what we hear, as well as specific information about cabinet and driver resonances.

The real benefit of this type of test is that it concentrates the energy into a constant narrow frequency band so that it is a third-octave in width at 100Hz or 1kHz or 10kHz. Therefore, it is much narrower on an absolute basis at 100Hz than at, say, 10kHz. In other words, the tone burst test has a constant resolution on an octave basis. This is important when you compare it to FFT analysis, where you get good resolution at high frequencies but very little information at low frequencies. The shaped tone burst test works on a logarithmic scale so we can get good resolution all the way down to the lowest frequencies. I use this type of test signal to look at the decay of the burst, which gives me the same type of information that you would be looking for in a spectral decay or waterfall plot that MLSSA can generate.

I also have MLSSA, so I do generate the spectral-decay plots as well, but I have to say, I have not found the waterfall plots very useful except for maybe above 1kHz. Below 1kHz there are so many artifacts in the typical spectral-decay waterfall plot that it is useless. Anyway, it's simply a lot easier to get the same, and even much more, information out of the shaped tone burst response. Extending the time record for the FFT in order to get useful low frequency data is generally not practical; using a narrow burst signal makes it so direct and easy. Plus, you can change the frequency of the tone burst on the fly, while you watch the dynamic changes on an oscilloscope, as the tail of the burst stretches out—in effect allowing you to see directly when you're close to a resonance!

I guess I'm beginning to sound a little like a missionary for the shaped tone burst test, but I really do believe it is an extremely powerful technique that is too infrequently employed. Many people are just not aware of how it differs from traditional tone-burst stimuli. Today it is particularly easy to generate the required burst signals since you can buy an arbitrary waveform generator fairly inexpensively. Also, it would be very easy to include a series of 5-cycle-wide bursts at various frequencies on a test CD; then with an oscilloscope, or perhaps one of the PC-based software test systems, the audiophile would be equipped with a powerful tool for evaluating his system and speakers.

One final attribute of the shaped tone burst that I find very important is that it's a particularly safe signal with which to test the maximum output of components. For instance, if you use a burst rate of 1Hz with a 5-cycle burst you'll have a very low duty-cycle, so even if you require 100 watts to clip your tweeter, the short duration of the burst—it's essentially like a frequency specific pulse—will prevent you from overheating the voice-coil and damaging the driver."
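A shaped burst of this kind is easy to generate. Here is a minimal Python sketch assuming NumPy only; the raised-cosine envelope is my assumption for the "sinusoidal envelope" in the interview, and the function names are made up for illustration. It also checks two of the claims: that the bandwidth is a constant fraction of the centre frequency, and that a 1 Hz burst rate gives a very low duty cycle.

```python
import numpy as np

def shaped_burst(f0, fs, cycles=5):
    """Tone burst of `cycles` periods of f0 under a raised-cosine envelope (assumed shape)."""
    n = int(round(cycles * fs / f0))
    k = np.arange(n)
    envelope = 0.5 * (1.0 - np.cos(2.0 * np.pi * k / n))
    return envelope * np.sin(2.0 * np.pi * f0 * k / fs)

def bandwidth_3db(burst, fs, n_fft=1 << 16):
    """Width in Hz of the spectral main lobe at the half-power points."""
    mag = np.abs(np.fft.rfft(burst, n_fft))
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    above = freqs[mag >= mag.max() / np.sqrt(2.0)]
    return above.max() - above.min()

def duty_cycle(f0, cycles, burst_rate_hz):
    """Fraction of the time the burst is on when repeated burst_rate_hz times per second."""
    return cycles / f0 * burst_rate_hz

fs = 48000
for f0 in (100.0, 1000.0, 10000.0):
    bw = bandwidth_3db(shaped_burst(f0, fs), fs)
    print(f0, "Hz burst: -3 dB bandwidth / centre frequency =", bw / f0)
print("duty cycle, 1 kHz burst at 1 Hz rate:", duty_cycle(1000.0, 5, 1.0))
```

The ratio comes out the same at every centre frequency, which is the "constant resolution on an octave basis" point. The exact fraction of an octave depends on the envelope shape and on where the width is measured, so the sketch should not be read as confirming the third-octave figure.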

@mikets42
A row of 32 shaped bursts instead of music as stimulus?


20240309 terz .png
 
Hi picowallspeaker,
The 'like/don't like' factor...
This has got to be accepted
Physics has to stop at a certain point
Now it's all mixed up.
No, it isn't mixed up at all. If you want to confuse the issue, invoke a "like/don't like" factor. For a start, you'll not get groups of people to agree. If you don't like the answer you're getting, I guess that confuses the issue for you.

We are closing in on what we should measure and how much it matters. That is still being defined, which is how it should be and how the scientific method works. One very sad fact for you is this: physics never stops being part of the equation. If you trip and fall, no matter how much you wish you were exempt from the laws of physics - you're falling. Too bad. So I'm sorry, you can't stop at some point and ignore the laws of physics. It won't go well for you, and later you will find that something you thought sounded good, doesn't. Not compared to something truly better.
But the reflections do cause the perception of a room!
Absolutely! Guess what? They can be, and are, measured / calculated when designing any public space, theater or auditorium. All you can do is produce a product with known characteristics. It is up to the installer / user to install it so it delivers the desired results. A manufacturer will never know the exact characteristics of a listening space; this is not their job to be honest. They can design for an average volume and dimensions that are close enough for the speaker to perform well. That's it. If you have a problem room, that would be your problem alone.
 
@mikets42
A row of 32 shaped bursts instead of music as stimulus?


Absolutely! I use all kinds of tone bursts, usually with exponentially growing fronts and tails, or Gaussian shaped. It is also nice to sweep the frequency inside the burst, as in the 1960 Chirp Radar reference. I have quite a collection of bursts at frequencies spaced 10 - 15 - 22 - 33 - 47 - 68. The interpretation of the results is not simple, however.

Yes, I do prefer testing on real music. That's by far the best way to realize that I have been a complete idiot.
 
I think that we need to design loudspeakers for the position where they will be used, and the surrounding walls become parts of the speaker. 8ms gating is kind of meaningless for f < 200Hz. A big problem with low freq distortions is that they are frequently not harmonic. Whenever they are - yes.
So what are you measuring then? Must be the 'reverberant' field below 200Hz. Or rather room modes. Do you (try to) express distortion figures as a % of the reference level below 200Hz? If so, could you explain the relevance of such figures to speaker design? Or system design, the room included? I would expect the results to end up very dependent on the mike position.
 