One thing is for sure: a speaker hás to sound different at low levels compared to high levels.
Serious question: assuming the speaker is operating within its limits (i.e., no excessive distortion from being driven too hard), and putting aside psycho/physio-acoustic effects,
why should a speaker sound different at low levels vs high levels?
I've made attempts to measure this "stiction" that supposedly occurs at low levels.
What I did was this: I placed a very sensitive microphone (the sort that gets used for choral recordings) very close to the speaker cone. Then, using REW, I ran a frequency sweep.
Then, I decreased the signal level by 6dB and ran the sweep again.
I repeated this process, looking for any evidence that the speaker suspension (or whatever) was "sticking" - perhaps a change in the frequency response curve or similar.
For the penultimate sweep, I heard only a narrow range of frequencies in the kHz region, where we know the ear is most sensitive. For the final sweep, I heard nothing at all.
The microphone picked up everything perfectly well, though.
What I found was this: there was no change to the measured frequency response, down to levels below my threshold of audibility.
If stiction were occurring, I would expect some changes. Perhaps the lower bass would still come through (larger excursions), but the upper bass (less excursion) would disappear, or drop by more than the 6dB steps I was making.
No changes were observed, so I can't support the stiction theory.
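For anyone wanting to repeat the comparison, here is a rough sketch of the check described above: add the 6dB step back onto the quieter sweep and see whether any frequency point has moved relative to the louder one. The function names, the 0.5dB tolerance and the numbers are all made up for illustration. It assumes you have already got the sweep magnitudes out of your measurement software as plain lists of dB values at matching frequency points.

```python
def level_deviation(reference_db, quieter_db, step_db=6.0):
    """Per-frequency deviation (dB) of the quieter sweep from an ideal linear drop.

    A perfectly linear speaker gives 0 dB everywhere: the quieter sweep is
    simply the reference sweep shifted down by step_db.
    """
    if len(reference_db) != len(quieter_db):
        raise ValueError("sweeps must cover the same frequency points")
    return [(q + step_db) - r for r, q in zip(reference_db, quieter_db)]


def looks_level_dependent(reference_db, quieter_db, step_db=6.0, tolerance_db=0.5):
    """True if any frequency point moved by more than tolerance_db beyond the step.

    tolerance_db is an arbitrary allowance for measurement repeatability.
    """
    deviations = level_deviation(reference_db, quieter_db, step_db)
    return any(abs(d) > tolerance_db for d in deviations)


# Made-up magnitudes (dB SPL) at four frequency points, for illustration only.
reference = [80.0, 82.0, 81.0, 79.0]
linear_drop = [74.0, 76.0, 75.0, 73.0]  # every point fell by exactly 6 dB
sticky_drop = [74.0, 76.0, 71.0, 69.0]  # upper two points fell by 10 dB

print(looks_level_dependent(reference, linear_drop))  # False
print(looks_level_dependent(reference, sticky_drop))  # True
```

The second case is the sort of result I'd expect if the cone stopped following small signals: the higher-frequency points lose more than the step that was applied.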
The most sensible explanation I can find comes down to two things:
- Acoustic signal-to-noise-ratio
- Our hearing mechanisms
For the case of acoustic SNR, let's say that music has 30dB of difference between the loudest peaks and the quietest details. It might be more than that, but it's a number we can play with and adjust later.
I'm sitting in my listening room right now, and I can hear the fridge in the kitchen. It's pretty quiet, perhaps as low as 30dBSPL at my ears.
The implication for music, though, is this: if I'm listening at 60dBSPL peaks (a comfortable level, if a little on the quiet side), then the lower-level details are at 30dBSPL. That's the same level as the fridge, so it's likely that some of those details will be masked.
If I want to hear those low-level details clearly, then I must increase the level of the program material, or switch off the fridge. Since the fridge is keeping my drinks cool, there's only one solution.
How far above the noise do those details need to be? Speaker design offers a comparable case: cone break-up of a mid or woofer can be a problem, and to stop those peaks from becoming audible, the aim is to attenuate them. I've seen some designers aim for 20dB, others as high as 40dB. Some settle in the middle, at 30dB. We'll take that number as our margin.
So, in order for the low-level details of the music to completely mask the noise of my fridge, they must be 30dB above it, at 60dBSPL. The peaks, therefore, would be at 90dBSPL, which is a fairly high level: holding a conversation would be difficult.
Taking the more demanding goal of the signal being 40dB over the noise, we must then subject our ears to 100dBSPL peaks. Bear in mind that the goal here is simply to make sure we're hearing all the detail in the music.
If we want to hear what's 40dB down in the recording (instead of 30dB), we must increase the levels again, resulting in 110dB peaks.
We can clearly see, then, that ensuring proper acoustic SNR will improve the amount of detail we can hear, subject to certain factors.
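The arithmetic above is just three numbers added together: the room noise, the margin we want over that noise, and how far below the peaks the detail sits. A quick sketch (the function names are mine, and the figures are the rough ones used in this post, not measurements):

```python
def required_peak_spl(noise_floor_db, margin_db, detail_below_peak_db):
    """Peak SPL needed so that detail sitting detail_below_peak_db under the
    peaks still ends up margin_db above the room noise."""
    return noise_floor_db + margin_db + detail_below_peak_db


def quietest_detail_spl(peak_db, detail_below_peak_db):
    """SPL of the low-level detail for a given peak listening level."""
    return peak_db - detail_below_peak_db


fridge_db = 30  # rough guess at the fridge noise at the listening position

# Listening at 60 dB SPL peaks: detail 30 dB down lands right on the fridge.
print(quietest_detail_spl(60, 30))            # 30

# 30 dB margin, detail 30 dB down in the recording.
print(required_peak_spl(fridge_db, 30, 30))   # 90

# 40 dB margin, detail 30 dB down.
print(required_peak_spl(fridge_db, 40, 30))   # 100

# 40 dB margin, detail 40 dB down.
print(required_peak_spl(fridge_db, 40, 40))   # 110
```

Plug in your own room noise and see where you end up. A quieter room lowers the required peaks by the same number of dB.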
The mechanisms behind human hearing have been studied by better scientists than I. There's plenty of reading around online for that side of things.
I'd also suggest that subtle (or not) tactile sensations can contribute to our overall enjoyment of music.
I'd also mention that a speaker's harmonic distortion generally rises with SPL, and it is generally low-order distortion, which we might find pleasant to hear. That deviates from strict "high fidelity", though, towards more subjectivist territory.
Chris