Smooth (Flat) vs. Accurate (Hi-Fidelity)

Status
Not open for further replies.
Ex-recording/mixing engineer of 10 years here.

Thanks for such an informative post! It strikes to the heart of Toole's "circle of confusion".

In your experience, when pros mix for high sonic fidelity, do they try to maximize the experience in their mix environment, or do they aim for a sound that would be optimal in what they think of as a normal listening room, even if that sound may not be 100% right in the recording environment?

It's so important to understand the recording side's intentions, but this is so rarely discussed.

Thanks for sharing your insights!
 
It is an important subject, and one on which there's got to be a lot of opinions. 🙂

I have often hoped to make my listening room better than the mastering suite. That may sound odd to some, but I've spoken to a number of producers and engineers who say "I'd love to hear it on a really good system!" Not that they aren't proud of their work environment, but they do think there could be something better than the everyday tools they use.

My experience is more in the editing suite, where I used to do a lot of video monitor calibration. Most post production and TV studios were very happy to use calibrated monitors and loved that they all matched. They'd still keep a regular, non-calibrated consumer TV just as a "home style" reference, tho. The question was always: what does it look like on a typical consumer TV?
 
Thanks Earl and DDF.

DDF, this is a pretty good (old) paper on answering your questions as it covers a variety of control room designs, pros, cons, monitoring preferences, etc. http://usir.salford.ac.uk/9458/2/A_Study_on_the_Acoustics_of_Control_Rooms_for_Critical_Music.pdf

The section on Preferences and Opinions of Professional Audio Control Room Users on page 18 provides anecdotal quotes from 18 audio professionals. As Pano says, lots of opinions – it’s a great read as one can get a sense of what actually goes on in that world. I found my mixes translated better as the monitoring environment became more neutral – but I only found this out over time by. In bad rooms and/or monitors, you ended up checking the mixes overnight as everyone took a copy home listening in the car, home stereo, headphones, ghetto blaster, etc. Come back next day, get everyone’s feedback, make incremental adjustments, do it all over again until everyone is satisfied or the money runs out. Similar stories as in the doc.

Bottom line was that we mixed the sound as wideband as possible at 83 dB SPL measured at the mix position. Our ears have the flattest response in the 80 to 90 dB SPL range, which still allows for transients up to 105 dB SPL. If the mix sounded wideband and balanced at this monitoring level, then it also translated well. Mixing at a higher level actually has the opposite effect, as our ears' frequency response is different at that level, so a mix balanced there will sound thin at 83 dB SPL. That is also a hint as to the level at which to listen critically to hear the balance the engineer was trying to achieve. More here: "How to Make Better Recordings Part 2" (Digital Domain).

35 years ago, the live end dead end (LEDE) control room design came into existence and had a significant impact on control room designs from that point on. If one can get access to this doc: http://www.aes.org/e-lib/browse.cfm?elib=3965 it highlights the key psychoacoustic effect in figure 12: “The psychoacoustic effect of the LEDE design technique is to give the mixer’s ears the acoustic clues of the larger space, thus allowing the perception of hearing the studio rather than the control room.” That’s the most misunderstood concept of this design, as it leverages the Haas effect, or law of the first wave front: https://en.wikipedia.org/wiki/Precedence_effect The masking effect is in play here as well: https://en.wikipedia.org/wiki/Auditory_masking

To get an idea of the Haas effect, this video gives an audible demonstration of it: https://www.youtube.com/watch?v=UQOkSF8auFc One can listen on speakers, but try headphones and really tune into what it sounds like. Note the different “width” or localization of sound as the milliseconds of delay are introduced. Even a 3 ms delay is clearly audible.

How does this translate into early reflections destroying imaging? One way to think of it is that sound travels roughly 1 foot per millisecond. In the video you can hear the effect of a 3 ms (or more) delay on the signal. So if your speaker sound is bouncing off the floor, ceiling, back wall, whatever, and the reflection exceeds the audible masking threshold relative to the direct sound, then the image will be shifted based on the amount of delay introduced by the first reflection points and their amplitudes. See Fig 7.9 in http://www.embedded.com/print/4015907 (interestingly, the data is credited to Toole).
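As a rough illustration of the "1 foot per millisecond" rule of thumb, here is a minimal Python sketch that estimates how late a floor bounce arrives relative to the direct sound. The geometry (speaker and ear both 1 m high, 3 m apart), the speed of sound, and the function name are all assumptions chosen for the example, not measurements from anyone's room.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C


def floor_reflection_delay_ms(speaker_height_m, ear_height_m, distance_m):
    """Extra arrival time of the floor bounce relative to the direct sound."""
    # Direct path: straight line from speaker to ear.
    direct = math.hypot(distance_m, speaker_height_m - ear_height_m)
    # Image-source method: mirror the speaker below the floor and
    # measure the straight line from that mirror image to the ear.
    reflected = math.hypot(distance_m, speaker_height_m + ear_height_m)
    return (reflected - direct) / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical layout: speaker and ears 1 m above the floor, 3 m apart.
delay = floor_reflection_delay_ms(1.0, 1.0, 3.0)
print(f"floor bounce arrives {delay:.2f} ms after the direct sound")
```

The same function works for a ceiling bounce if the heights are measured down from the ceiling instead of up from the floor.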

Why apply this psychoacoustic effect to home listening environments? While acoustic diffusers and absorbers can be expensive, there are a lot of DIY designs that work just as effectively and are reasonably cheap to build, with a reasonable WAF. In my home environment, I have heavy carpet with double underlay from the speakers to the couch to reduce floor bounce below the masking threshold, plus a couple of bass traps and some absorbers on the back wall and ceiling. Taking an ETC measurement using REW helps identify the reflection points and their levels in dB. Reducing the 1, 2 or 3 early reflection spikes shown in the ETC to below the masking threshold does not take a great deal of passive treatment. It is more a question of where than how much.
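To make the idea of reading reflection spikes off an ETC a little more concrete, here is a simplified Python sketch. It is not how REW computes an ETC (a real ETC uses the envelope of the impulse response); this version just looks at rectified samples of a synthetic impulse response. The -20 dB threshold, the 20 ms window, and the synthetic data are hypothetical placeholders, not the masking threshold discussed above.

```python
import math


def early_reflections(impulse, sample_rate_hz, window_ms=20.0, threshold_db=-20.0):
    """List (delay_ms, level_db) of local peaks after the direct sound.

    Levels are relative to the direct-sound peak. Both the window and the
    threshold are illustrative assumptions, not recommended values.
    """
    mags = [abs(x) for x in impulse]
    direct_idx = max(range(len(mags)), key=mags.__getitem__)
    direct = mags[direct_idx]
    window = int(window_ms * sample_rate_hz / 1000.0)
    last = min(len(mags), direct_idx + window + 1)
    found = []
    for i in range(direct_idx + 1, last):
        if mags[i] == 0.0:
            continue  # skip silent samples (log of zero is undefined)
        is_peak = mags[i] >= mags[i - 1] and (i + 1 >= len(mags) or mags[i] >= mags[i + 1])
        level_db = 20.0 * math.log10(mags[i] / direct)
        if is_peak and level_db >= threshold_db:
            delay_ms = (i - direct_idx) * 1000.0 / sample_rate_hz
            found.append((delay_ms, level_db))
    return found


# Synthetic impulse response: direct sound plus two made-up reflections.
fs = 48000
h = [0.0] * 1024
h[100] = 1.0    # direct sound
h[185] = 0.4    # strong early reflection, about 1.8 ms later
h[500] = 0.05   # weak reflection, below the assumed threshold

reflections = early_reflections(h, fs)
for delay_ms, level_db in reflections:
    print(f"reflection at {delay_ms:.2f} ms, {level_db:.1f} dB re direct")
```

Multiplying each delay by roughly 1 foot per millisecond gives the extra path length, which helps work out which surface a given spike is coming from.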

The end result will have a similar effect, where one hears more of the recording before hearing the effects of the listening environment. But that’s my preference. Cheers!
 
Mitch

My room is set up almost exactly like yours. I have a small throw rug on top of a futon - about 4 inches thick - placed between the speaker and the couch. The back wall (behind the speakers) is heavily damped - several inches thick - and there is a ceiling diffuser at the first ceiling reflection point. Otherwise the room is quite reflective. The imaging is superb and yet there is a lot of spaciousness to the sound - the room sounds a lot bigger than it is, which is difficult since it is soundproofed from the rest of the house and as such is quite closed up.
 
After trying the Katz curve and the B&K curve for a couple of days, I definitely agree with rolling off the highs, but I did find that in my system those curves were a bit too aggressive.

For example, I played the sax for a couple of years, and with those curves the tone of Coltrane is too dull and dark, with a definite lack of realism in the resonant nature of the instrument.

I will try the EBU-Tech 3276 curve, which is:
flat at 2 kHz
-3 dB at 10 kHz
-4 dB at 20 kHz
In 1998, the European Broadcast Union produced a Tech note (EBU-Tech 3276) called “Listening conditions for the assessment of sound programme material: monophonic and two–channel stereophonic”. https://tech.ebu.ch/docs/tech/tech3276.pdf Again from a frequency response perspective, the recommendation was to have a flat frequency response out to 2 kHz, followed by a 1 dB per octave rolloff. See Fig 2 on Page 6:

[Image: Tech 3276 Target FR]
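For anyone wanting to enter a target like this into an EQ or measurement tool, here is a minimal Python sketch that turns the three points quoted in the post above (flat at 2 kHz, -3 dB at 10 kHz, -4 dB at 20 kHz) into a target level at any frequency. Interpolating in straight lines on a log-frequency axis, staying flat below 2 kHz, and holding -4 dB above 20 kHz are assumptions made for the example; this is not the official EBU tolerance mask.

```python
import math

# Breakpoints as quoted in the post: (frequency in Hz, level in dB).
BREAKPOINTS = [(2000.0, 0.0), (10000.0, -3.0), (20000.0, -4.0)]


def target_db(freq_hz):
    """Target level in dB, interpolated linearly on a log-frequency axis."""
    if freq_hz <= BREAKPOINTS[0][0]:
        return BREAKPOINTS[0][1]   # assumed flat below the first breakpoint
    if freq_hz >= BREAKPOINTS[-1][0]:
        return BREAKPOINTS[-1][1]  # assumed held at the last breakpoint
    for (f0, d0), (f1, d1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if f0 <= freq_hz <= f1:
            # Fraction of the way from f0 to f1, measured in octaves.
            t = math.log(freq_hz / f0) / math.log(f1 / f0)
            return d0 + t * (d1 - d0)


for f in (1000, 2000, 4000, 8000, 10000, 16000, 20000):
    print(f"{f:>6} Hz: {target_db(f):+.2f} dB")
```

Note that the quoted points work out to a little more than 1 dB per octave between 2 kHz and 10 kHz, so they are an approximation of the curve in the figure rather than an exact reading.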
 

The EBU curve looks quite close to the room curve preferred by trained listeners in Olive's study (attached). Looks like a good next step.
 

Attachment: Room Curve Olive.JPG

Mitch, thanks for your thoughtful reply. It's heartening to know that the consensus was to aim for neutrality and detail in the control environment, and that they try to factor in their room environment.

Thanks for the tip that control environments monitor at 83 dB (77 dB for highly compressed pop). That points to using an 83 dB level when voicing loudspeaker designs on the other end of the process, if the goal is to reproduce something as close as possible, on average, to what the studio intended.

The one link explained calibrating the level: "Adjust the monitor gain to yield 83 dB SPL using a meter with C-weighted, slow response." My background is telecom audio, and the following standard is effective at predicting how frequency response affects perceived loudness. I think adapting it to the recording world could significantly improve calibration accuracy from studio to studio (Annex G, wideband): https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-P.79-199909-S!!PDF-E&type=items
It's quite easy to implement in Excel from a quick measurement.

A couple of surprises for me in the opinions section of the Fazenda paper:
"I find it difficult to make decisions about the balance of instruments on big speakers, it sounds to good"
"Although main monitors will almost invariably give a better detail of sound"
It's interesting that they find main monitors more detailed than near field, since near field has a higher proportion of direct sound to reverb.

Great demo of the Haas effect, thanks for sharing. At 3 ms the image was pulled right over; at 5 to 7 ms the transient parts were pulled right to one side, but any sustained low frequencies were very diffuse (all with headphones). Spaciousness is a lateral phenomenon: it's increased by decreasing inter-aural correlation.

Recording environments use LEDE to see into the mix and obtain the highest perceived SNR possible (Moulton is a fan of this as well). So if aiming for the highest detail on playback, LEDE there makes sense.

That doesn't discredit those that favour spaciousness over pinpoint detail, where dipoles or omnis and more early reflections in the room are helpful. One of the other benefits of early reflections in the listening environment is that they allow you to better hear reverberation buried in the recording (I couldn't find the Olive paper with this outcome, but it's interesting). But as Earl pointed out, this comes at a cost to timbre. Here's a good paper getting into some of this:
http://www.davidgriesinger.com/pitch3.doc

There are many good listening test papers from Hartmann, Griesinger, Bech, Olive and others on the topic, worth investigating for anyone interested in more detail.
 
Thanks for the tip that control environments monitor at 83 dB (77 dB for highly compressed pop),....
But do they? There is a lot of talk on the pro forums that they "Should" but many don't. Cinema has standards, the music industry does not.

I remember seeing somewhere a measurement survey of many mastering suites. It was an FR measurement; I don't know if levels were included. Anyone else seen that?
 

Certainly many don't, and it's a bit of the wild west out there, but for those on the recording side that are quality oriented (i.e. recordings you'd care about in hifi), this is as close to guidance as I think we'll get.
 
Pano, are you referring to this one?
Yes, that's a good one. And all with the SAME monitor, too! Makes you wonder. Alas, no SPL is given there, tho I do know that 83 dB is a widely used target. But 83 dB is only referenced to listening level, I suppose, not to a given level on the digital file.

Would be nice to see a similar study of levels taken with a reference file.
 
Geeez.. Accurate and flat FR are variations of the same theme.
Distortion is the result of a driver inadequate to its purpose.
IMO "pro" control rooms are likely THE worst possible reference points.
 
Wow, I hadn't seen that, but yeah, it seems the EBU curve is very similar to the one from the Olive study.


I also wonder whether very good bass trapping in a room would reduce or increase the preference for more bass.
I've been reading all week about house curves, subwoofer integration and target curves. The bass is ALWAYS boosted. I really wonder whether the preference for boosted bass would be so prevalent in rooms with very good bass treatment.

Flat, for most listeners in a blind setting, is unnaturally bright. An accurate speaker needs, IMO, to be flat, but a flat response in a room will end up sounding brighter than the real instrument and is therefore inaccurate.

Why are "pro" rooms likely the worst reference points?
 
If we can judge by the survey referenced above, then the average control room with Genelec monitors is flat with a little bass bump.
Our house curve does not fit that. What's up?

Per the Toole paper and the attachment I just posted, untrained listeners preferred a room response that's tilted up in the high end, while trained listeners preferred one tilted down in the high end. Olive proposed something in the middle to try and suit both.

It's a mistake to read too much into this though. Who knows what the room responses were in the control environments for the recordings used by the Olive study, for example?

If those recordings were made on a system with a flat-on-average room response, per the Genelecs, it would indicate that Olive's trained listeners like sound less hot than the recording pros that made those recordings. But without that reference (what was the in-room response of the systems on which the recordings used to create these targets were made?), it's really hard to interpret any of these house curve recommendations.
 

Every explanation I've read from Harman and other industry sources is that the average home system has some bass boost because of the way average speakers and rooms are set up, so they recommend the same slight bass boost in the recording environment. By extension, speaker designs should target this response in room as well (as far as they're concerned).

Looking at the responses of highly regarded high end speakers from NRC measurements, they follow this trend as well.

For home theater, it's even more aggressive. Attached is JBL's recommended starting point for the room curve for home theater.
 

Attachment: Harmon Synthesis Curve.JPG