Let's build a FIR convolver for Pulseaudio Crossover Rack

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
That's fair. And frankly, filter synthesis tooling does not interest me.
I can generate my own and there are existing tools others can use if they don't know the fundamentals.

I had the impression you were all about the UX/UI?
Do you not find 2) to be a significant user burden?

To recap, 2) was:

Require users to synthesize filters for each different "common" sampling frequency (44.1, 48, 96 kHz, etc.) they will use. The framework can choose which filter to use based on the rate of the input and the rate carried in a filter's metadata.

Actually not really, I generally pin the sampling frequency that pulseaudio runs at internally (and thus all the DSP stuff is done at) to a specific frequency (option "avoid-resampling=false"). Pulseaudio Crossover Rack conveniently shows this in the status bar, so people only need to generate IRs for one sample rate. We can still check, when loading an impulse response file from disk, whether it matches the sample rate that Pulseaudio is running at.

Actually this makes me think of a nice feature where PaXoverRack can do the modifications of /etc/pulse/daemon.conf for you with proposed changes like setting avoid-resampling to false, sample format to float32le and setting up the sample rate according to the user's choice.

Another thing - perhaps your intentions were different from your wording, but one cannot simply "resample the IR". As I said in 4), it's not that simple. One must resample the input stream or reconstruct the filter targeting a different sampling rate

No, you understood me correctly, thanks for that correction! So reconstructing the IR for a different sample rate sounds like there might be quite some noise introduced by rounding, so it doesn't sound very feasible. Let's forget it and just tell the user that they are trying to load an IR for the wrong sample rate. Also, when loading a .paxor file we should check that the sample rate we're running at still matches the stored IR files' sample rate, something to keep in mind.
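A minimal sketch of such a load-time check, assuming the IR is a plain RIFF/WAVE file whose header has already been read into memory. The function names wav_sample_rate and ir_rate_matches are made up for illustration and are not part of PaXoverRack.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t rd_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the sample rate stored in the "fmt " chunk of a RIFF/WAVE
 * header held in buf, or -1 if the data is not a recognisable WAV. */
long wav_sample_rate(const unsigned char *buf, size_t len)
{
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    size_t pos = 12;
    while (pos + 8 <= len) {
        uint32_t chunk_size = rd_le32(buf + pos + 4);
        if (memcmp(buf + pos, "fmt ", 4) == 0) {
            if (chunk_size < 16 || pos + 8 + 16 > len)
                return -1;
            /* skip format tag (2 bytes) and channel count (2 bytes) */
            return (long)rd_le32(buf + pos + 8 + 4);
        }
        /* chunks are padded to an even number of bytes */
        pos += 8 + (size_t)chunk_size + (chunk_size & 1u);
    }
    return -1;
}

/* Hypothetical check: 1 if the IR header matches the running rate. */
int ir_rate_matches(const unsigned char *hdr, size_t len, long running_rate)
{
    return wav_sample_rate(hdr, len) == running_rate;
}
```

In practice a library such as libsndfile would probably do the header parsing (its SF_INFO struct carries a samplerate field), and the same comparison could be repeated for every IR referenced by a .paxor file.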

AFAIK, rePhase is Windows-only software. That presents some obvious issues (but not insurmountable, of course).

That's true, but wine comes to the rescue. rePhase is just a single .exe, no need to even install it and it runs perfectly fine with wine. As do some other programs I recently tried like ARTA and foobar, I was quite impressed by the progress that wine has made.

Actually I could imagine some kind of integration with rePhase where the rePhase configuration file is stored on disk or even in the .paxor file and we offer a button to open rePhase with the specified config for the user. Then the user makes some changes, saves the IR to disk and closes rePhase, which in turn makes PaXoverRack read the newly generated IR. Something along those lines... Didn't really try it but to me sounds doable and very feasible.

As for requiring the FFT length:
For a filter specified as a certain curve in the frequency domain (magnitude and/or phase response), the length of the FFT will determine the resolution that the curve is sampled at. Longer FFT == finer resolution per frequency bin, if one views the magnitude of an FFT as a histogram of energy/freq. Note this is independent of the number of taps (coefficients) an FIR filter is spec'd with, and this parameter is also a resolution of sorts.

So the FFT length just has a performance impact at the time of calculating the IR, correct? And it might affect quality of the generated IR, if chosen too low...

---

Finally, do you think it will be any time soon that I can look at your code? If not, I will continue to reverse-engineer brutefir to get something going ;)
 
Hmmm, I'm thinking about the "big picture" of how a LADSPA FIR plugin would work...

The LADSPA API is written such that from the LADSPA HOST to the plugin code can be passed:
  • the sample rate
  • a few user-defined floating point parameters.
  • buffers of audio data
You cannot pass text (e.g. a filename), and passing tens of thousands of FIR coefficients ain't going to work because you must declare each passed parameter in the code ahead of time. This means you need to know how many there will be at compile time...

On the other hand, what the LADSPA code itself does inside the plugin is completely free and open. The LADSPA HOST will call the API's setup and initialization routines at start-up. I faintly recall that there is a way to have the host call the initialization routine again later without having to fully unload the plugin and load it again - I would need to read the API docs again to refresh my memory. At this point the sample rate is known, so this might be a route for the code to open a known, fixed file name to read in the FIR filter coefficients. I envision a file for each supported sample rate, e.g. FIR_coeff_44100.txt, FIR_coeff_48000.txt, FIR_coeff_96000.txt, etc. As long as the path and filename are known and fixed, the code could remain static and the contents of the external FIR coefficient file(s) could be updated by the user anytime they would like to change the FIR filter.
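The per-sample-rate file idea could look roughly like this. It is only a sketch: the helper names fir_coeff_path and fir_load_coeffs and the one-number-per-token text format are assumptions, and the directory is left to the caller. The sample rate is an unsigned long because that is how LADSPA hands it to the plugin's instantiate() callback.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "<dir>/FIR_coeff_<rate>.txt" into out. Returns 0 on success,
 * -1 if the result does not fit. */
int fir_coeff_path(char *out, size_t out_len, const char *dir,
                   unsigned long sample_rate)
{
    int n = snprintf(out, out_len, "%s/FIR_coeff_%lu.txt", dir, sample_rate);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Read whitespace-separated coefficients from a text file.
 * Returns a malloc'd array (caller frees) and stores the number of
 * taps in *count, or returns NULL if nothing could be read. */
float *fir_load_coeffs(const char *path, size_t *count)
{
    *count = 0;
    FILE *f = fopen(path, "r");
    if (!f)
        return NULL;

    size_t cap = 1024, n = 0;
    float *c = malloc(cap * sizeof *c);
    float v;
    while (c && fscanf(f, "%f", &v) == 1) {
        if (n == cap) {                      /* grow the buffer */
            float *tmp = realloc(c, (cap *= 2) * sizeof *c);
            if (!tmp) { free(c); c = NULL; break; }
            c = tmp;
        }
        c[n++] = v;
    }
    fclose(f);

    if (c && n == 0) { free(c); c = NULL; }  /* empty or unparsable file */
    if (c)
        *count = n;
    return c;
}
```

File access like this belongs in the setup callbacks (instantiate() or activate()), not in run(), since disk I/O can stall the audio thread.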
 
Passing a single float value is enough at this point because you can have a hard-coded directory where to look for IR files (e.g. ~/.config/PaXoverRack/IRs) and then convert the float to an int for simplicity and load <int>.wav from that directory. Easy.
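As a rough sketch of that float-to-filename trick (the function name ir_path_from_param is invented here, and the home directory is passed in by the caller, e.g. from getenv("HOME")):

```c
#include <stdio.h>
#include <string.h>

/* Turn a LADSPA-style float control value into
 * "<home>/.config/PaXoverRack/IRs/<int>.wav".
 * Returns 0 on success, -1 on bad input or if the path does not fit. */
int ir_path_from_param(char *out, size_t out_len, const char *home,
                       float param)
{
    if (!home || strlen(home) == 0 || param < 0.0f)
        return -1;

    long index = (long)(param + 0.5f);   /* round to the nearest integer */
    int n = snprintf(out, out_len, "%s/.config/PaXoverRack/IRs/%ld.wav",
                     home, index);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

A 32-bit float represents every integer up to 2^24 exactly, so using it as a file index is safe for any realistic number of IR files.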

Also I went around that whole problem of pulseaudio not being able to change LADSPA plugin parameters at runtime (at least I don't know of any and got no response whatsoever on the pulseaudio mailing list) by using a shared memory interface which the plugins set up at initialization time. The filenames in /dev/shm are based on the plugin name, a float parameter given at init time and current unix timestamp/nanoseconds to avoid clashes. Details can be found in the code.
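To illustrate the naming scheme only: the sketch below builds a clash-resistant shared memory name from a plugin name, the float given at init time and a timestamp. The exact format string and the function names are assumptions, not what the actual plugins use (see the code for that).

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Build "/<plugin>_<id>_<seconds>_<nanoseconds>" into out.
 * Returns 0 on success, -1 on bad input or if the name does not fit. */
int shm_name(char *out, size_t out_len, const char *plugin, float id,
             const struct timespec *ts)
{
    if (strchr(plugin, '/'))   /* only the leading slash is allowed */
        return -1;
    int n = snprintf(out, out_len, "/%s_%ld_%lld_%09ld", plugin, (long)id,
                     (long long)ts->tv_sec, (long)ts->tv_nsec);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Same, but stamped with the current wall-clock time (C11 timespec_get). */
int shm_name_now(char *out, size_t out_len, const char *plugin, float id)
{
    struct timespec ts;
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
        return -1;
    return shm_name(out, out_len, plugin, id, &ts);
}
```

A name like this could then be handed to shm_open() and mmap() once at initialization time, so that run() only ever touches already-mapped memory.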

To make a long story short, there ARE ways to do all that. Might not be very elegant, but who cares...

PS: a FIR filter plugin could even expose the IR or the already FFT-ed version of it via shared memory, too. That way the IR could be changed without any glitches on the fly.
 
Using /dev/shm instead of a directory on a fixed disk is a good idea. I do this in my GSASysCon for files that have to be read over and over again every couple of seconds. I just use the OS to copy files into a subdirectory of /dev/shm that is specific to my program and that I create. The user can maintain their version on fixed disk, but the code works with the in-mem copy.
 
To recap, 2) was:

Require users to synthesize filters for each different "common" sampling frequency (44.1, 48, 96 kHz, etc.) they will use. The framework can choose which filter to use based on the rate of the input and the rate carried in a filter's metadata.

Actually not really, I generally pin the sampling frequency that pulseaudio runs at internally (and thus all the DSP stuff is done at) to a specific frequency (option "avoid-resampling=false"). Pulseaudio Crossover Rack conveniently shows this in the status bar, so people only need to generate IRs for one sample rate. We can still check, when loading an impulse response file from disk, whether it matches the sample rate that Pulseaudio is running at.

Actually this makes me think of a nice feature where PaXoverRack can do the modifications of /etc/pulse/daemon.conf for you with proposed changes like setting avoid-resampling to false, sample format to float32le and setting up the sample rate according to the user's choice.

This means of "pinning of (re-)sampling frequency" is functionally equivalent to 1) from my list, which I referred to as a normalization of sampling rate, and is what I was cryptically referencing when I said that "pulseaudio can help here".

So the FFT length just has a performance impact at the time of calculating the IR, correct? And it might affect quality of the generated IR, if chosen too low...

Yes. Given an IR of length N, appending M zeros to the IR and performing an FFT of length N+M will improve the frequency domain resolution for interpolation. This is only a concern during filter synthesis, not at runtime.
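To put numbers on that, here is a tiny worked example. The spacing between FFT bins is the sample rate divided by the transform length N+M; the function name below is made up for illustration.

```c
/* Spacing in Hz between adjacent FFT bins for an IR of ir_len samples
 * that has been zero-padded with zero_pad extra samples. */
double fft_bin_spacing_hz(double sample_rate_hz, unsigned long ir_len,
                          unsigned long zero_pad)
{
    return sample_rate_hz / (double)(ir_len + zero_pad);
}
```

At 48 kHz, a 4096-point IR with no padding gives bins 11.71875 Hz apart, while padding it to 65536 points gives bins about 0.73 Hz apart. The padding only interpolates the spectrum onto a finer grid; it adds no new information about the filter.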

Finally, do you think it will be any time soon that I can look at your code? If not, I will continue to reverse-engineer brutefir to get something going ;)

I am somewhat conflicted on the timing aspect. I would appreciate another set of eyes on my code (feedback, finding mistakes, improvement suggestions, etc.) at some point. But I have very limited time at the moment to dedicate to my own project, let alone multiple projects. I also do not want to hand over what is a substantial amount of effort on my part and be excluded or limited in role and direction just because I lack the time right now to meet others' schedules. I am both impressed and a little envious of the time/effort you have put forward :), and also feel a little awkward sharing something that is not polished to my standards.

I am also not sold on the notion of using LADSPA for integration into the audio stack. My concerns are that the (IMO) crippling limitations of that framework must be worked around at great (or inelegant) effort, and perhaps worse, the codebase must be compromised to accommodate or become heavily dependent upon pulseaudio or even PaXoverRack. Right or wrong, that is my current thinking. Additional evidence could either bolster or change my opinion of course.
 
Passing a single float value is enough at this point because you can have a hard-coded directory where to look for IR files (e.g. ~/.config/PaXoverRack/IRs) and then convert the float to an int for simplicity and load <int>.wav from that directory. Easy.

Easy? Sure.
But this sort of compromise is a fine example of limitations imposed by insisting on LADSPA.

Also I went around that whole problem of pulseaudio not being able to change LADSPA plugin parameters at runtime (at least I don't know of any and got no response whatsoever on the pulseaudio mailing list) by using a shared memory interface which the plugins set up at initialization time. The filenames in /dev/shm are based on the plugin name, a float parameter given at init time and current unix timestamp/nanoseconds to avoid clashes. Details can be found in the code.

To make a long story short, there ARE ways to do all that. Might not be very elegant, but who cares...

This problem (runtime control) is a general one, and I'm glad you've thought about and done something to address it.

I have been considering something similar for my codebase, but using sockets so that some measure of control can be exposed locally and/or remotely. So far, I've only stubbed out the command thread and UI socket - it's not actually handling commands or communicating with the core convolution engine context. It does offer some attractive possibilities though...

PS: a FIR filter plugin could even expose the IR or the already FFT-ed version of it via shared memory, too. That way the IR could be changed without any glitches on the fly.

There will almost certainly be audible glitches if filters are naively changed under operation - that is taking the "In" out of Linear Time-Invariant. Naturally, there are ways to mitigate audible artifacts.
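One common mitigation, offered here only as an illustration and not necessarily what was meant above, is to run the old and the new filter in parallel for one block and crossfade between their outputs. The function name crossfade_block is hypothetical.

```c
#include <stddef.h>

/* Blend two already-filtered blocks of n samples: the output starts
 * as old_out and moves linearly to new_out by the end of the block. */
void crossfade_block(const float *old_out, const float *new_out,
                     float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* weight of the new filter ramps from 0 to 1 */
        float w = (n > 1) ? (float)i / (float)(n - 1) : 1.0f;
        out[i] = (1.0f - w) * old_out[i] + w * new_out[i];
    }
}
```

The price is convolving twice during the transition block, after which the old filter can be dropped.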
 
I have been considering something similar for my codebase, but using sockets so that some measure of control can be exposed locally and/or remotely. So far, I've only stubbed out the command thread and UI socket - it's not actually handling commands or communicating with the core convolution engine context. It does offer some attractive possibilities though..

As parameter updates must be done inside the run() function, sockets are out of the question for LADSPA as they might block. So I decided to use shared memory because it's fast and efficient, though admittedly not very elegant!


Regarding your codebase - it's up to you. I will copy concepts from other people anyways, if not from you it will likely be brutefir. You can make up your mind in the next weeks... I will come back and bug you by the time I sort of finished the measurement stuff and have time to tackle the FIR filters :D
 
I am also not sold on the notion of using LADSPA for integration into the audio stack. My concerns are that the (IMO) crippling limitations of that framework must be worked around at great (or inelegant) effort, and perhaps worse, the codebase must be compromised to accommodate or become heavily dependent upon pulseaudio or even PaXoverRack. Right or wrong, that is my current thinking. Additional evidence could either bolster or change my opinion of course.

One more thought, sorry for the multitude of posts in a row.

You might pack it into a library for me and give me a simple convolve(float* in, float* ir, float* out, int nSamples) function and I'm happy... Yes, sometimes I'm dreaming... lalala ;):D

PS: I know it's not that simple as laid out, as far as I understand for performance reasons you will want to pre-FFT the IR and store it etc...
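For reference, the naive time-domain version of that dream function might look like the sketch below. It takes one extra argument (irLen) compared with the signature above, since the IR length has to come from somewhere; that addition is my own assumption.

```c
#include <stddef.h>

/* Direct-form convolution: out[n] = sum over k of ir[k] * in[n - k].
 * Input before index 0 is treated as silence, and only the first
 * nSamples output samples are produced (the tail is dropped). */
void convolve(const float *in, const float *ir, float *out,
              size_t nSamples, size_t irLen)
{
    for (size_t n = 0; n < nSamples; n++) {
        float acc = 0.0f;
        size_t kmax = (n + 1 < irLen) ? n + 1 : irLen;
        for (size_t k = 0; k < kmax; k++)
            acc += ir[k] * in[n - k];
        out[n] = acc;
    }
}
```

The cost is nSamples * irLen multiply-adds per call, which is why long IRs are normally handled with a pre-FFT-ed IR and block-wise (partitioned) FFT convolution instead. A streaming version would also need to carry input history from one block to the next.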
 
If not LADSPA, which API does your convolver use, or in other words, how did you integrate it into your audio chain?

As I said previously, I'm exploring the ALSA ioplug mechanism. Same means as pulseaudio itself uses to integrate with ALSA. I've got an instrumented, skeleton plugin in place right now, and am building on that. Now you know where I am at progress-wise :).

I've said elsewhere, creating a convolution engine is not the hard part, system integration is the hard part. It's a little disheartening and sad that such is the state of the linux audio stack.

I first played around with the extplug and that did not offer a sufficient interface. Meh, it was only ~350 lines and I don't mind throwing code away if I at least learned something useful along the way. And I did...
 
Actually, for the lack of knowledge, creating the convolution engine will be the hard part for me. All the rest will fall into place, I'm pretty sure.

But what the heck, I'm plenty stubborn and I will get it all done eventually. Not even a year ago there was zero code in my ladspa-t5-plugins folder :D
 
As parameter updates must be done inside the run() function, sockets are out of the question for LADSPA as they might block. So I decided to use shared memory because it's fast and efficient, though admittedly not very elegant!

I'm sure you know this, but sockets don't have to block (SOCK_NONBLOCK).
That is what I'm doing.
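A minimal sketch of the non-blocking pattern on Linux, using a local socket pair to stand in for the control connection. The function names are invented for illustration and this is not taken from cvngn.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Create a connected pair of non-blocking datagram sockets.
 * SOCK_NONBLOCK is a Linux extension to the socket type argument. */
int make_control_pair(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, fds);
}

/* Try to read one pending command without ever blocking.
 * Returns the number of bytes read, 0 if nothing is waiting,
 * or -1 on a real error. */
ssize_t poll_command(int fd, void *buf, size_t len)
{
    ssize_t n = recv(fd, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;   /* no command queued right now */
    return n;
}
```

With this, the processing loop can call poll_command() once per block and carry on immediately if it returns 0. Whether a system call per block is acceptable in a realtime audio callback is a separate question.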

You might pack it into a library for me and give me a simple convolve(float* in, float* ir, float* out, int nSamples) function and I'm happy... Yes, sometimes I'm dreaming... lalala ;):D

PS: I know it's not that simple as laid out, as far as I understand for performance reasons you will want to pre-FFT the IR and store it etc...

Dreaming is good :).

BTW, you're not far off.
A snippet of actual use of my library, called cvngn (ConVolution eNGiNe).
The name was shamelessly and tastelessly inspired by/pilfered from nginx.

Code:
    /* Config cvngn */
    cvngn_context_t *ctxt = cvngn_init_context(blocksize, sfinfo_in.channels,
                                               sfinfo_in.channels, sfinfo_in.samplerate,
                                               CVNGN_S16, -1.0f);

    cvngn_load_coeffs(ctxt, "left", l_filename);
    cvngn_load_coeffs(ctxt, "right", r_filename);

    cvngn_init_channel(ctxt, 0, 0, "left", false, false, 0.0f, "left");
    cvngn_init_channel(ctxt, 1, 1, "right", false, false, 0.0f, "right");
Then loop for each block of input - load, convolve, copy output buffers wherever:
Code:
        cvngn_load_ibufs(ctxt, rbuf, rcnt);
        cvngn_convolve_all(ctxt);


        obuf_t *ch0 = ctxt->obufs[0];
        obuf_t *ch1 = ctxt->obufs[1];
Let me see where I can get to, in terms of progress, over the weekend.
AFAIK, my wife has no "plans for us", so I may actually have time to play.
 
Sorry for the radio silence. I've been busy.


I got deep into the ioplug approach for ALSA integration, then realized that it was going to be way more work than I bargained for. So I back-tracked, reverting to an extplug-based approach that carries some important lessons learned in pursuing the ioplug design.


I also made some minor API changes, lots of internal changes, and added some benchmarking scripts to the suite of tests I've been developing in tandem (checking math, api smoke testing, etc).


Maybe more later? Am at work now :-(
 
Hey Tfive and others interested in this effort and related work:

I just stumbled on this DSP library:
KFR

See the section "FIR filtering" from the above (documentation) page. Seems to be templated. If I was a better programmer and more familiar with FIR using FFT I would give it a try. With some help and guidance I could give it a try...

It's available for free under the GNU General Public License v2:
KFR - C++ DSP Framework | Purchase a license

What do you think?
 
Looked at that lib some time ago. It's a nice piece of work.
You will need to write some C wrappers/adapters to use it within the ALSA or Pulseaudio APIs.
 
My goal is a little different than what Tfive is doing. I want to use it as part of a LADSPA plugin like I mentioned HERE. I primarily use DSP for loudspeaker crossovers. In this application, the filters are set up and then left alone. For that reason I would have the user inline the FIR coefficients in the code as a const float or similar. Each time they wish to change the FIR filter the LADSPA plugin would need to be re-compiled, but this is no problem for me.
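Roughly what I have in mind, as a sketch only: the coefficients below are a placeholder 4-tap moving average rather than a real crossover filter, and the names fir_state_t and fir_run are made up. The state struct keeps the most recent inputs so that the filter carries over correctly from one processing block to the next.

```c
#include <stddef.h>

/* Placeholder coefficients compiled into the plugin; a real crossover
 * would paste its own (much longer) list here and recompile. */
static const float FIR_COEFFS[] = { 0.25f, 0.25f, 0.25f, 0.25f };
#define FIR_TAPS (sizeof FIR_COEFFS / sizeof FIR_COEFFS[0])

typedef struct {
    float history[FIR_TAPS];  /* circular buffer of recent inputs */
    size_t pos;               /* where the next input will be written */
} fir_state_t;

/* Filter n samples from in to out, updating the history in s. */
void fir_run(fir_state_t *s, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        s->history[s->pos] = in[i];
        float acc = 0.0f;
        size_t idx = s->pos;              /* newest sample first */
        for (size_t k = 0; k < FIR_TAPS; k++) {
            acc += FIR_COEFFS[k] * s->history[idx];
            idx = (idx == 0) ? FIR_TAPS - 1 : idx - 1;
        }
        out[i] = acc;
        s->pos = (s->pos + 1) % FIR_TAPS;
    }
}
```

The state has to start zeroed (e.g. fir_state_t s = {{0}, 0};). This direct form is fine for short filters, while long crossover FIRs would want FFT-based convolution instead.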
 
Don't really know about the licencing situation with the GPL vs the BSD licence I release all my projects under. I hate the GPL for the reason that you can never be really sure if you can use the stuff released under it or not, all the crap of programs vs. libraries etc. The real open source licence is the BSD licence IMO.

The API looks quite reasonable, will have a look at it some time
 