If that is the case, then there should be some trace messages about flushing denormals. @spfenwick could you try a quick run with trace level logging and check for any "flushing subnormal" messages?
In case it's helpful, in my LADSPA plugins I modeled what was done by Richard Taylor in his code to "kill" denormals.
He/we add into all incoming data a square wave signal at the sample rate that has a very low float value, like 1E-15 (-300dB WRT a signal level of 1.0). It's way too low to result in any audible effect but it prevents denormals from occurring and slowing things down.
To do this you add +1E-15 to one sample, -1E-15 to the next one, and repeat for all incoming samples.
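For anyone reading along, here is a minimal Rust sketch of the alternating offset described above. It is not taken from the LADSPA plugins or from Richard Taylor's code: the name `add_denormal_killer`, the constant `KILLER_AMPLITUDE`, and the choice of `f32` samples are assumptions made for illustration only.

```rust
/// Amplitude of the anti-denormal signal; 1e-15 is the value quoted above.
/// (Assumed name, not from any existing code base.)
pub const KILLER_AMPLITUDE: f32 = 1.0e-15;

/// Hypothetical helper: adds an alternating +/- offset to a buffer in place.
/// `offset` carries the current sign across calls, so the square wave
/// continues without a break from one buffer to the next.
pub fn add_denormal_killer(samples: &mut [f32], offset: &mut f32) {
    for sample in samples.iter_mut() {
        *sample += *offset; // one addition per sample
        *offset = -*offset; // flip the sign for the next sample
    }
}
```

The caller would initialise `offset` to `KILLER_AMPLITUDE` once and pass the same variable in for every buffer. Keeping the sign in the caller's state is a design choice here, so that the pattern stays a true square wave even when buffer lengths are odd.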
Thanks @CharlieLaub, I have seen that (of course I have looked at your biquad code 😁) and I considered using denormal killers the same way, but I decided to try just flushing denormals after each chunk instead. That runs a bit faster, and it feels a little sad to spend CPU time on adding a dummy signal.
Did you ever see any issues while a signal is playing?
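To make the "flush after each chunk" idea concrete, here is a rough Rust sketch. It is only an illustration under assumptions and not the actual implementation being discussed: `flush_subnormals` is a hypothetical name, and the filter state is assumed to be a plain slice of `f64` values. The only library call used is the standard `f64::is_subnormal`.

```rust
/// Hypothetical helper: replaces every subnormal value in a filter's state
/// with zero and returns how many values were flushed.
/// Intended to be called once after a whole chunk has been processed,
/// rather than once per sample.
pub fn flush_subnormals(state: &mut [f64]) -> usize {
    let mut flushed = 0;
    for value in state.iter_mut() {
        if value.is_subnormal() {
            *value = 0.0;
            flushed += 1;
        }
    }
    flushed
}
```

The returned count is there so a caller could report it (for example in a trace-level log message) when it is non-zero. The cost is one check per state value per chunk, instead of extra arithmetic on every sample.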
@HenrikEnquist The overhead for adding the denormal killer is just one extra addition and multiplication per sample. The addition adds the value of the denormal killer and the multiplication flips the sign of it in preparation for the addition to the next sample. You can run a frame of data through this process in a single loop and probably gain some speed via optimization. I didn't do anything special IIRC.
I have never encountered any sort of issue related to denormals or the slowing down of audio processing when using my LADSPA plugins.
It looks like the compiler may be generating some strange combination of SSE instructions. I'm not sure about this yet, but it looks dodgy. There is an SSE multiplication that only uses one of the two lanes, but it doesn't seem to clear or load anything into the unused lane. If the unused lane contains some rubbish from before, that could explain how it's possible to stumble on denormals that should not exist.
That would also explain why choosing a different target CPU helps, since that likely generates a different instruction sequence.