'Flat' is not correct for a stereo system?

Hi,


Graph of the number of periods for frequency detection from Rossing 1989:

The pitch detection is irrelevant in terms of 'flat' where detection is about perceived loudness.



I can see why the experts say steady state is what you hear in the bass.

If that were true we wouldn't be able to listen to music, only sinusoids.

It is to be noted that natural sounds do not have a constant envelope, i.e. they are not steady state.

- Elias
 
Detection is actually about detection--minimal perceived stimulus--but that's a different story. Look at our pitch detection time at bass frequencies--that's the point you are missing. Perceived flat vs. flat is something else, and I never said they are equivalent--in fact I say they are not, but that's so obvious it's ridiculous to even mention it again. Instruments (which I consider natural sounds and crucial for MUSIC to take place) have an onset, steady state, and decay--if just the steady state is played, our ability to detect a difference between instruments is greatly reduced, though the spectrograms will still look wildly different. That has little to do with bass in a small room.

Dan
 
Indeed, if the onset and decay are removed, most instruments sound almost the same. But actually this is very relevant to bass reproduction in small rooms! A system that is capable of reproducing the onset and decay is an accurate system. The steady state in between is not so important. The onset and decay are relatively short in duration; they resemble transient signals. Bass transient? 😀 Yes! It's the temporal finesse that separates excellent from average bass. If the signal is steady state there are no separate notes, no music.

- Elias


Instruments (which I consider natural sounds and crucial for MUSIC to take place) have an onset, steady state, and decay--if just the steady state is played, our ability to detect a difference between instruments is greatly reduced, though the spectrograms will still look wildly different. That has little to do with bass in a small room.

Dan
 
Well, at least you learned an instrument has a steady state. That's a good starting point.😀

How is it relevant for bass reproduction in a small room? What transients exist there? To what degree of accuracy can we perceive them? What speed of decay is necessary to reproduce them and what evidence do you have for your answer?

Spell this out for me and everyone. Your belief is right in line with the typical audiophile's, and it seemed intuitively important before I learned anything about acoustics, instruments, and psychoacoustics, but the deeper I dig into the situation, the more it seems it's another silly audiophile belief. If you have better knowledge here I'd appreciate a breakdown. The best illustration I can think of that anyone can do is to leave your subs on, play any music you like, and turn off your main speakers. Move around the room, get close to the subs... What transients do you hear? Can you tell what instrument is being played if you didn't know what it was? The attack and decay seem completely removed. To me it could pretty much be electronica regardless of what's actually being played. Sometimes there are longer pitched blurbs, sometimes shorter, and they can be rough or not, but for me to determine which instrument is playing is well beyond my ability. Please avoid the superior hearing argument so often expressed by believers. Maybe I just haven't heard a subwoofer or room that's 'fast' enough, or maybe it's just a common audiophile misperception? Please elaborate your position. I've read your argument dozens of times over the years, but not one shred of evidence has surfaced as of yet--visual or auditory.

Thanks,

Dan
 
A complex signal is composed of many frequencies. For a bass note, the transient part is carried by the upper frequencies, not the lower ones.

To be clear I am not saying that the sound is heard in pure steady state, only that the gating time exceeds the measurement window so it IS steady state as far as any measurement is concerned. The ear is still going to detect its transients and envelope characteristics.
 
Dr. Geddes, what you are saying sounds like everything I have read and experienced, but it is radically contrary to the popular belief for some reason. Elias seems to strongly hold the popular belief and I would love for a believer to stake their claim, but none have yet that I know of. Well, at least beyond the superior hearing/gearing argument. I mean, what is the integration time going to be on something like a 50 Hz tone? It's going to be a large number. I haven't seen any literature on that, but it's clear to me that it's beyond what Haas found with speech because we can't even perceive the pitch by then. I can most certainly perceive bass tones, roughness, and smoothness with just a sub on, but it ain't music. The bass notes essentially exist around us before we ever know they are there--well, maybe we can feel them before we hear them, but that's just a guess. That might account for the highish Q bass bumps in headphones.

Any case, I think you are absolutely correct even though I don't know how to come up with better proof.

Dan
 
If one filters out most of the signal, what is there to be perceived? 🙄 Why listen to the subwoofer alone? Except to realise that the steady state is not as important for bass perception as you describe! Bass is not in the subwoofer, but goes all the way to 200-300Hz even to 400Hz+. I have come to the understanding that the quality of bass is not in the fundamental tone but in the accurate reproduction of the temporal coherence of the harmonics that define the onset and decay of the bass instruments.

Within the context of this thread, 'flat', there is a need to consider temporal flatness and how it is reproduced.


An example of a poor system, monopole box in a room:
12C_2m5_50Hz-2kHz_50ms_10dB_normalised-Bark-wavelet.png



And an example of a good system, dipole line array in the same room:
ARN-linja_2m5_50Hz-2kHz_50ms_10dB_normalised-Bark-wavelet.png



- Elias
 
One more time...

Papers by Kates and Salmi cover the most appropriate integration time for measurements. They both conclude that the subjective time window varies with frequency. It is long for low frequencies, approaching steady state measurements; short at HF, letting little but the direct sound through; and just long enough at mid frequencies to include the floor bounce. Add in critical band smoothing and you are nearly there with a subjectively correct measuring approach.
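As a rough illustration of a frequency-dependent window of the kind described above, here is a minimal Python sketch. It is not the method from the Kates or Salmi papers: the rule "window length = a fixed number of cycles", the value of 4 cycles, the rectangular window, and the toy impulse response (direct sound plus one reflection at 10 ms) are all assumptions chosen only to show the idea.

```python
import cmath
import math

SAMPLE_RATE = 48000  # Hz, assumed for this toy example


def window_seconds(freq_hz, cycles=4.0, max_seconds=0.3):
    """Hypothetical rule: keep a fixed number of cycles, so LF windows are long."""
    return min(cycles / freq_hz, max_seconds)


def windowed_magnitude(ir, freq_hz, cycles=4.0):
    """Single-frequency DFT of the impulse response, truncated by a
    rectangular window whose length depends on the analysis frequency."""
    n = min(len(ir), round(window_seconds(freq_hz, cycles) * SAMPLE_RATE) + 1)
    acc = 0j
    for k in range(n):
        acc += ir[k] * cmath.exp(-2j * math.pi * freq_hz * k / SAMPLE_RATE)
    return abs(acc)


# Toy impulse response: direct sound at t = 0, one reflection 10 ms later.
toy_ir = [0.0] * 2400
toy_ir[0] = 1.0
toy_ir[480] = 0.5

for f in (50.0, 100.0, 5000.0):
    print(f, "Hz window:", window_seconds(f) * 1000.0, "ms",
          "magnitude:", windowed_magnitude(toy_ir, f))
```

With these assumed numbers the 5 kHz window (0.8 ms) closes before the reflection arrives, so only the direct sound is measured, while the 50 Hz and 100 Hz windows are long enough that the reflection adds to or cancels the direct sound.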

I can't find those particular papers on the web but Ken Kantor gives a good summary in his Magic Speaker paper:

http://www.kenkantor.com/publications/magic_speaker/magic_speaker.pdf

"Tonal colorations become the major problem until times greater than 20 msec are reached, when reflections begin to effect perceived ambience.... all indicate that reflections around 2 msec. are the worst offendors. From this we can infer that the floor reflection seen in figure 1 will cause tonal coloration to an extent underestimated by conventional measurement techniques, a conclusion reached also by Kates."

David S.
 
Papers by Kates and Salmi cover the most appropriate integration time for measurements. They both conclude that the subjective time window varies with frequency. It is long for low frequencies, approaching steady state measurements; short at HF, letting little but the direct sound through; and just long enough at mid frequencies to include the floor bounce. Add in critical band smoothing and you are nearly there with a subjectively correct measuring approach.
It's good to see that this has been studied in detail, but is the result that the integration time is longer at lower frequencies really that surprising if we stop to think about it?

You can't "detect" a frequency (especially when the detectors in the ear are essentially a huge bunch of moderately high Q bandpass filters which generate nerve impulses when they "ring") until the waveform has had a chance to form. (thanks to the time-frequency uncertainty principle)

At a frequency like 5 kHz, one period is 0.2 ms, so in theory the fastest you could identify the pitch of that tone is about 0.2 ms. Of course it's likely that the ear/brain isn't that fast, so detection speed at high frequencies is probably limited to perhaps several ms by the specific implementation of the hearing mechanism and processing in the brain.

At this frequency you can get 50 complete cycles in (or some cycles and some gaps) before a 10ms room boundary reflection would arrive - so provided the brain can process the signal in the right way the reflection can be separated from the direct signal - as in fact it is.

But then consider a 50Hz tone, the period is 20ms long, so you can't even really say that a 50Hz tone exists or identify it as 50Hz until a complete cycle has passed - which takes 20ms, so by definition the integration and detection time of a 50Hz sine wave can't be any faster than this, no matter how advanced the ear is, and it's probably considerably slower if it's implemented as high Q bandpass filters.

By the time a single 50 Hz cycle has formed, the wavefront has travelled several metres and is already "contaminated" with room response, meaning that before the ear can even identify the formation or existence of a tone it has already bounced around the room quite a bit (worse still below 50 Hz), making it impossible to separate the direct and reflected signals, even though the time delay of reflections is the same as it is at higher frequencies.
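The arithmetic in the last few paragraphs can be checked with a short Python sketch. The speed of sound (343 m/s) and the 10 ms reflection delay are assumed round figures; the function names are just for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 degrees C


def period_ms(freq_hz):
    """Duration of one full cycle in milliseconds."""
    return 1000.0 / freq_hz


def cycles_before_reflection(freq_hz, delay_ms):
    """How many cycles fit in before a reflection with the given delay arrives."""
    return freq_hz * delay_ms / 1000.0


def distance_per_cycle_m(freq_hz):
    """Distance the wavefront travels while one cycle forms (the wavelength)."""
    return SPEED_OF_SOUND_M_S / freq_hz


for f in (5000.0, 400.0, 50.0):
    print(f, "Hz:",
          "period", period_ms(f), "ms,",
          "cycles before a 10 ms reflection", cycles_before_reflection(f, 10.0), ",",
          "distance per cycle", round(distance_per_cycle_m(f), 2), "m")
```

At 5 kHz about 50 cycles fit in before a 10 ms reflection, while at 50 Hz only half a cycle has formed by then and the wavefront travels nearly 7 m during one full period.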

The transition frequency range in the lower midrange where we start to be able to identify the direct signal independent of reflections, rather than it all merging together into a "room response" like it does at bass frequencies, is probably just when the period of the note has become short enough compared to the time delay of early reflections (and processing time of the ear/brain) that the ear has a chance to clearly detect the starting and stopping of frequencies in that region in the direct signal before a reflection arrives.

(If the ear used equal Q bandpass filters at all frequencies - which it does over most of the range except bass, then the higher frequency filters will have a shorter ringing time and thus better time resolution than those at lower frequencies. Perhaps also why the filters are wider at bass frequencies - an attempt to use wider lower Q filters to improve the response time at the expense of frequency accuracy)
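To put rough numbers on the equal-Q idea: for a simple second-order resonator the ringing envelope decays with a time constant of about Q / (pi * f0), so at constant Q the ring time scales inversely with frequency. The Python sketch below assumes that textbook resonator model, and the Q of 5 is an arbitrary illustrative value, not a measured property of the ear.

```python
import math


def ring_time_constant_ms(center_hz, q):
    """Envelope decay time constant (to 1/e) of a second-order resonator, in ms."""
    return 1000.0 * q / (math.pi * center_hz)


ASSUMED_Q = 5.0  # illustrative value only, not an auditory measurement

for f in (50.0, 500.0, 5000.0):
    print(f, "Hz:", round(ring_time_constant_ms(f, ASSUMED_Q), 3), "ms")
```

Whatever Q is chosen, the filter centred at 50 Hz rings 100 times longer than the one at 5 kHz, which is the time resolution penalty described above.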

Also a good example of why you can't have "fast bass" at low frequencies: not only can a low frequency tone not start and stop quickly, you can't perceive its appearance and disappearance quickly enough. This perception of speed in bass is all done by the harmonics of the bass note.

It's quite apparent listening to electronic music which uses a "sine bass" (basically a sine wave bass note) between 40-80Hz or so that no matter what you do the attack sounds "soft", as without the harmonics you can't perceive a sharp transient attack that most normal bass instruments with 2nd and 3rd harmonics have... 🙂
 
I'd say listen to the subwoofer alone so you can know the effects of the range in discussion of course. Plus it makes it easy for everyone to understand as subwoofers are common and the experiment is fast and easy to do.

Uh oh, coherence rears its ugly head again. lol

Elias, I started to think we might be discussing different terms, b/c fast bass is sort of ridiculous IMO--fast harmonics may have some relevance and I'd say that would be more worth discussing. What you seem to be talking about is what I've thought of as midbass. I wouldn't make a strong argument against 'fast' midbass anyway--nor for it.

In the length of your window, we can barely hear a 50 Hz pitch! 😉 IOW, those pictures don't mean much at all for what I call bass. Well "nothing" might be more accurate. My dipoles modeled to 50Hz, but sure didn't sound like it. Maybe your picture explains why.😀

Bass is not in the subwoofer, but goes all the way to 200-300Hz even to 400Hz+
It's about 20 ms to hear a pitch at 400 Hz FWIW, and I don't know the integration time, but you can bet it's longer. It only takes 2.5 ms for the full 400 Hz wave to form. Maybe we should have a definition of terms. Heck, you even go into the lower registers of the midrange IMO, and in most people's opinion I believe. I've always thought of bass as below 80-100 Hz, midbass as 80 or 100-300 Hz, midrange as 300-3000 Hz, and treble as above that. Oh, and sub bass as 5-30 Hz or so. Of course those are all rough figures. There are too many definitions of those terms in books and on the web.

Please define what you mean by 'temporal flatness' as that's a new term for me. We'll do one thing at a time if it is alright with you.

I'd rather read Dr. Geddes' or DDF's or speaker Dave's position on this stuff--like a "bass perception in a small room disambiguation dissertation". IOW, someone who knows this stuff deeply. Where is the line drawn beyond which you can no longer use an EQ to smooth the perceived bass? Is there a definite figure or a blurry line? It's definitely blurry to me where that should be, but the Schroeder frequency will do for me, in so far as that is where the room is sort of behaving as the source.

Edit: Nice post Dave and DBM

Dan
 
Hello David,

"integration time for measurements" What measurements? Loudness?

"appropriate" In terms of what?


Are those times to be used for FFT windowing? If yes, then regardless of what they propose, it only works if the room decay is much longer than their proposed time at the particular frequency. Loudness integration can take up to 200 ms if the sound is present for that long (Zwicker). Within 200 ms a typical domestic room has already decayed by tens of dB, so it will not support the loudness of very short duration sounds. Now if one inputs a constant envelope signal into the speaker, it will be much louder than the windowed FFT of the room impulse response predicts.
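The "tens of dB within 200 ms" figure can be sanity-checked with a small Python sketch. It assumes an idealised room whose level falls linearly in dB at 60 dB per RT60; the RT60 values below are assumed examples for a domestic room, not measurements from this thread.

```python
def decay_db(elapsed_s, rt60_s):
    """Level drop after elapsed_s in a room with an ideal linear-in-dB decay."""
    return 60.0 * elapsed_s / rt60_s


INTEGRATION_TIME_S = 0.2  # the 200 ms loudness integration time mentioned above

for rt60 in (0.3, 0.4, 0.6):  # assumed example reverberation times in seconds
    print("RT60", rt60, "s:", round(decay_db(INTEGRATION_TIME_S, rt60), 1), "dB")
```

Under this simplified model, a room with an RT60 of 0.4 s has dropped by 30 dB by the end of the 200 ms integration time.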

So what's the point of those times?


- Elias


Papers by Kates and Salmi cover the most appropriate integration time for measurements. They both conclude that the subjective time window varies with frequency. It is long for low frequencies, approaching steady state measurements; short at HF, letting little but the direct sound through; and just long enough at mid frequencies to include the floor bounce. Add in critical band smoothing and you are nearly there with a subjectively correct measuring approach.

I can't find those particular papers on the web but Ken Kantor gives a good summary in his Magic Speaker paper:

http://www.kenkantor.com/publications/magic_speaker/magic_speaker.pdf

"Tonal colorations become the major problem until times greater than 20 msec are reached, when reflections begin to effect perceived ambience.... all indicate that reflections around 2 msec. are the worst offendors. From this we can infer that the floor reflection seen in figure 1 will cause tonal coloration to an extent underestimated by conventional measurement techniques, a conclusion reached also by Kates."

David S.
 
Hello David,

"integration time for measurements" What measurements? Loudness?

"appropriate" In terms of what?

Are those times to be used for FFT windowing? If yes, then regardless of what they propose, it only works if the room decay is much longer than their proposed time at the particular frequency. Loudness integration can take up to 200 ms if the sound is present for that long (Zwicker). Within 200 ms a typical domestic room has already decayed by tens of dB, so it will not support the loudness of very short duration sounds. Now if one inputs a constant envelope signal into the speaker, it will be much louder than the windowed FFT of the room impulse response predicts.

So what's the point of those times?

Absolutely, the times are appropriate for windowing of the impulse response. Loudness in terms of perceived loudness vs. frequency. The intention is to create a measuring system that parallels human perception. This isn't tied to the acoustics of a particular room size or acoustic, but should lead to a perceptual curve method that holds true in any size room.

The essence of this thread is "what curve sounds flat?". Many are proposing a slight downtilt to the steady state room response in a domestic room. People who work in auditorium design or cinema propose even more rolloff. SMPTE suggests a wide variety of rolloff curves for various seating areas in cinema. Why?

It only makes sense if the ear keys in on the early sound and ignores the later duller sound of the larger venues.

David S.
 
I see a major problem in this FFT windowing approach. If one measures a loudspeaker outdoors without reflections, then no matter how long the window is, the loudness will not increase. Whereas an in-room measurement produces much more signal with the same window length. How can it not be tied to room size and acoustics?

I have a better solution! 😎 😀


- Elias

Absolutely, the times are appropriate for windowing of the impulse response. Loudness in terms of perceived loudness vs. frequency. The intention is to create a measuring system that parallels human perception. This isn't tied to the acoustics of a particular room size or acoustic, but should lead to a perceptual curve method that holds true in any size room.
 
I see a major problem in this FFT windowing approach. If one measures a loudspeaker outdoors without reflections, then no matter how long the window is, the loudness will not increase. Whereas an in-room measurement produces much more signal with the same window length. How can it not be tied to room size and acoustics?
I'm not saying the measured result is independent of the room, but the model doesn't need to vary with, or be tied to, room size. (People were suggesting a gating time somehow connected to room size.)

Loudness of the system will increase in the live room, especially at low frequencies where the perceptual window is long enough to essentially measure the steady state response of the room.

David S.
 
In FFT windowing the problem remains of the signal duration. As it was brought up loudness depends on signal duration, but this impulse response windowing does not help if there is little or no reflections (dead room, outside). How can an impulse sound loud outdoors regardless of windowing? 🙄 But then if one inputs a long tone from an instrument, it sounds loud! This is not predicted by the FFT windowing method.

Instead we need the analysis signal to match the integration time, not the window.

My proposal: On impulse response use a wavelet that matches the integration time.
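A minimal Python sketch of the general idea follows. It is only an illustration under assumptions: the Gaussian-enveloped cosine burst, the two burst durations, the low sample rate, and the toy impulse responses (a bare impulse versus an impulse plus one reflection at 20 ms) are all hypothetical and are not the wavelet actually used for the plots in this thread.

```python
import math

SAMPLE_RATE = 2000  # Hz; a low rate is enough for a bass-only toy example


def gabor_burst(freq_hz, duration_s):
    """Gaussian-enveloped cosine whose envelope spans roughly duration_s."""
    n = round(duration_s * SAMPLE_RATE)
    sigma = duration_s / 6.0  # envelope is about +/- 3 sigma wide
    centre = (n - 1) / (2.0 * SAMPLE_RATE)
    return [math.exp(-0.5 * ((k / SAMPLE_RATE - centre) / sigma) ** 2)
            * math.cos(2.0 * math.pi * freq_hz * (k / SAMPLE_RATE - centre))
            for k in range(n)]


def convolve(a, b):
    """Plain time-domain convolution, skipping zero samples in a."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        if x == 0.0:
            continue
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def peak_level(ir, freq_hz, duration_s):
    """Peak of the impulse response analysed with a burst of the given duration."""
    return max(abs(v) for v in convolve(ir, gabor_burst(freq_hz, duration_s)))


# Toy impulse responses: "outdoors" (direct sound only) and "room"
# (direct sound plus one reflection 20 ms later, i.e. one period at 50 Hz).
outdoors_ir = [1.0] + [0.0] * 99
room_ir = list(outdoors_ir)
room_ir[40] = 0.6

for duration in (0.02, 0.12):
    print("burst", duration, "s:",
          "outdoors", round(peak_level(outdoors_ir, 50.0, duration), 3),
          "room", round(peak_level(room_ir, 50.0, duration), 3))
```

In this toy case the short burst gives the same peak outdoors and in the room, because the reflection arrives after the burst has ended, while the long burst overlaps with its own reflection and reads higher in the room.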


- Elias


I'm not saying the measured result is independent of the room, but the model doesn't need to vary with, or be tied to, room size. (People were suggesting a gating time somehow connected to room size.)

Loudness of the system will increase in the live room, especially at low frequencies where the perceptual window is long enough to essentially measure the steady state response of the room.

David S.
 
In FFT windowing the problem remains of the signal duration.

Problem? I'm speaking of a window to be applied to impulse response testing of frequency response, not of any other stimulus. If the room and loudspeaker make the response longer, then the window determines what energy is measured and what is discarded.

As it was brought up loudness depends on signal duration, but this impulse response windowing does not help if there is little or no reflections (dead room, outside).

Help? If there are no reflections then the window has no effect, the impulse is collected full strength. Isn't this what we want?

Not sure what point you are trying to make.

David S.
 
Well, at least you learned an instrument has a steady state. That's a good starting point.😀

How is it relevant for bass reproduction in a small room? What transients exist there? To what degree of accuracy can we perceive them? What speed of decay is necessary to reproduce them and what evidence do you have for your answer?

Spell this out for me and everyone. Your belief is right in line with the typical audiophile's, and it seemed intuitively important before I learned anything about acoustics, instruments, and psychoacoustics, but the deeper I dig into the situation, the more it seems it's another silly audiophile belief. If you have better knowledge here I'd appreciate a breakdown. The best illustration I can think of that anyone can do is to leave your subs on, play any music you like, and turn off your main speakers. Move around the room, get close to the subs... What transients do you hear? Can you tell what instrument is being played if you didn't know what it was? The attack and decay seem completely removed. To me it could pretty much be electronica regardless of what's actually being played. Sometimes there are longer pitched blurbs, sometimes shorter, and they can be rough or not, but for me to determine which instrument is playing is well beyond my ability. Please avoid the superior hearing argument so often expressed by believers. Maybe I just haven't heard a subwoofer or room that's 'fast' enough, or maybe it's just a common audiophile misperception? Please elaborate your position. I've read your argument dozens of times over the years, but not one shred of evidence has surfaced as of yet--visual or auditory.

Thanks,

Dan

I'm not a scientist like some of the others here and I certainly can't answer all your excellent questions... but I humbly submit the following at the risk of exposing my ignorance and inferior scientific understanding. 😱

Perhaps "Speed" and "transient bass response" are difficult concepts to understand/explain because there just aren't enough masters/thesis research projects that go far enough to explain these things..... but it seems to me that SL explained some of this fairly well. (At the very least this must be related)

### From Frontiers ###

"The step response of the 3-way looks even further removed from the ideal and it is easy to understand why people like to think this should be accompanied by audible defects. The response is the sum of woofer, midrange and tweeter outputs (c)"
trans-c.gif


The tweeter response, barely visible on the 60 ms time scale, really shows up in a display of the first 5 ms (d).
trans-d.gif


"Note that the tweeter response has settled in 1 ms. The midrange has its second zero crossing at 4 ms. The woofer has barely started to move after 2 ms. A time window of 5 ms or less is frequently used in speaker reviews to comment about driver polarities and waveform preservation (e). This covers less than 10% of the transient response time for many speakers."

### END QUOTE from SL ###

So, considering the above and thinking of the "Kick" of the large drum and the sound pressure wave that emanates, it's difficult for me to believe that we can't hear/integrate the experience & sound until after many 10s of ms have passed such that several positive and negative pressure waves have passed my ear.

Certainly we can at the very least "feel" the frontal pressure wave of the first and lowest frequency sound wave. In the first pressure wave I suppose that we experience the transient bass more than we "hear" what is thereafter defined as the steady state signal of this very low frequency. Of course we will also hear higher frequency and steady state harmonics that come along with the lowest frequency of the kick which are of course presented at a much higher frequency. But I would hope that we don't start hearing those higher frequency harmonics before the "kick".

Using the traditional double gated measurement, one can "see" much of the above in the response. If your woofer is large and slow, the lowest octaves at the bottom of the audible spectrum do not rise in pressure to match the impulse response of the higher frequency drivers. In my own fully digital 3 way speaker system I am able to time align the drivers so that the peak of the woofer matches more closely the peak of the midrange and tweeter. The effect of this time alignment is that the rise of the woofer is relatively close to the rise in SPL at the top of the spectrum.

Here is a 2ms duration

Screen%20shot%202010-10-06%20at%2012.49.23%20PM.png


(Please ignore the start time of 63ms... this is the delay of the USB processing time in my soundcard/computer)

Now certainly this doesn't represent an accurate picture of the response at the lowest frequency but does this not show that the woofer has created the frontal soundwave in unison and in comparison to the high frequency drivers? Thinking of Elias' visual presentation (impulse "wavelet"?) it seems to me (on the surface at least) that this concept and visual representation may be very helpful. Does it not provide us the ability to more accurately "see" the coherent response across the audible spectrum at the onset of the impulse?

Feel free to demonstrate the error in my understanding if any.... but I would appreciate if you keep it simple for me. 😉
 
In my own fully digital 3 way speaker system I am able to time align the drivers so that the peak of the woofer matches more closely the peak of the midrange and tweeter. The effect of this time alignment is that the rise of the woofer is relatively close to the rise in SPL at the top of the spectrum.

Now certainly this doesn't represent an accurate picture of the response at the lowest frequency but does this not show that the woofer has created the frontal soundwave in unison and in comparison to the high frequency drivers?

I'm not sure I see that in your figures. It looks like most of your treble energy is happening at 63.3 ms and the woofer is more centered around 64. This is frequently the case. With a digital crossover you can exactly line up the energy mounds, but this seldom gives a good crossover. You would be time aligning the middle of each driver's range, when you really need to time align the crossover region. Your system is smooth through the crossover precisely because you didn't fully time align the respective drivers' energy.

Note also that the reason your LF response extends (artificially) flat below 100 is because the truncation is canceling a negative part of the bass ringing that hangs on much later than the more visible upper range energy.

Notice that Linkwitz chooses his words very carefully: "it is easy to understand why people like to think this should be accompanied by audible defects." meaning he is not so sure that this is the case. At KEF we had a piece of gear that would allow you to take a second order (sealed box) system and dial in exact compensation for its resonance and Q, then replace that cutoff corner with any other resonance and Q. This was a perfect opportunity to explore the audible effects of woofer Q. I did a listening test where I varied Q, with the expectation that higher Q's would change the character and lead to boominess and a very different sound. Somewhat disappointingly, raising the Q seemed more just to increase the volume of bass elements in the vicinity of resonance. The big character change didn't happen, at least not for a reasonable range of Q (less than 2).

The whole "fast bass, slow bass" perception, in my opinion, has nothing to do with transient response, but is more about level proportions between upper bass and lower bass.

David S.
 
Very interesting to look carefully at SL's graphs. At first glance they look like the same figure repeated, but what is the woofer in the upper figure is the midrange in the lower figure, due to the different time scales.

If the typical 3-way is three bandpasses of similar bandwidth, then their impulse responses are also similar except for a time scale stretch inversely related to center frequency. If you made it a 4 way with a subwoofer section below, the impulse response of that section would stretch out for days.

Slow bass indeed.

David S.
 
interesting discussion that now goes in every direction at once

for my part back to the original topic and an answer from outside looking in -> flat is correct, but conventional triangular stereo is not

if You want it right - accurate, realistic, whatever the word 😉 - You have to abandon such conventional stereo

such questions as the title 'Flat' is not correct for a stereo system? arise from our hearing's permanent dissatisfaction from triangular stereo

less or more - it basically sucks

and bringing in of equalizers etc. won't help, it cannot, simply not possible

best regards,
graaf
 