jojip--that's basically my question as well. It's really going to come down to how fast the NEON hardware is on the quad A53. The double-precision FP ops are lost on us (I don't need my math to be perfect through -300 dB), but I do wonder whether FP math or int math will be faster on the hardware. Another option is to do a single long FIR and then split the signals with an IIR filter block, which, ostensibly, should need less hardware to crunch.
I can't help you with the I2S link at the moment, as my original plan is to use a USB DAC (I can work from there).
Speak for yourself I guess. In my LADSPA plugins (DSP IIR crossover filters done in software) all the calculations are done in double precision (real numbers). I am looking forward to trying the 64-bit processing capability of the Pi 3, since this should speed up that portion of my code.
That's fine, it's just wildly excessive. I'll carry the baton for FP32 being far more than good enough: beyond loading down your hardware, what *are* you gaining from FP64 over FP32? With a 24-bit mantissa, especially if you're smart about your coefficients and do volume control at the last second, the accumulated errors are very, very, VERY small. I haven't even checked what the differences between FP32 and FP64 are after dithering to a 24-bit (even if full-scale) PCM stream.
No need for controversy, though, hardware is generally beefy enough to handle it.
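For anyone who wants to put a number on the FP32 vs. FP64 question rather than argue it, here is a rough sketch. It runs the same low-pass biquad twice, once in plain Python floats (FP64) and once with every coefficient and intermediate result rounded to FP32, and reports the largest difference. Everything in it is my own assumption, not anybody's actual plugin code: the cookbook-style coefficients, the 80 Hz corner at 96 kHz, the direct form I structure, and the `f32` helper. A low corner frequency in direct form I is close to a worst case, so other structures or higher corners will likely look better.

```python
import math
import struct

def f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def lowpass_biquad(fc, fs, q=0.7071):
    """Cookbook-style low-pass coefficients, normalized so a0 == 1."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    cosw = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cosw) / 2.0 / a0, (1.0 - cosw) / a0, (1.0 - cosw) / 2.0 / a0]
    a = [-2.0 * cosw / a0, (1.0 - alpha) / a0]
    return b, a

def run_biquad(x, b, a, rnd=lambda v: v):
    """Direct form I; rnd() is applied after every multiply and add."""
    b = [rnd(c) for c in b]
    a = [rnd(c) for c in a]
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        xn = rnd(xn)
        acc = rnd(b[0] * xn)
        acc = rnd(acc + rnd(b[1] * x1))
        acc = rnd(acc + rnd(b[2] * x2))
        acc = rnd(acc - rnd(a[0] * y1))
        acc = rnd(acc - rnd(a[1] * y2))
        x2, x1 = x1, xn
        y2, y1 = y1, acc
        out.append(acc)
    return out

fs = 96000.0                       # assumed sample rate
b, a = lowpass_biquad(80.0, fs)    # assumed 80 Hz corner
x = [0.5 * math.sin(2.0 * math.pi * 50.0 * n / fs) for n in range(9600)]

y64 = run_biquad(x, b, a)            # reference: FP64 throughout
y32 = run_biquad(x, b, a, rnd=f32)   # emulated FP32 throughout

err = max(abs(p - q) for p, q in zip(y64, y32))
err_db = 20.0 * math.log10(err) if err > 0 else float("-inf")
print(f"max |FP64 - FP32| = {err:.3e} ({err_db:.1f} dB re full scale)")
```

Swap in your own coefficients and filter chain to see where the difference lands relative to a 24-bit output.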
Hah, I suppose this is a safe point to jump off to a new thread. I'm wont to implement my system as a hybrid IIR + FIR (a single, long tap to handle low frequency effects) or FIR throughout, so we have different goals. I would be interested to see what kind of error rate you're seeing in FP32 vs. FP64, though. Sounds like quite a long filter chain.
Edit to add: Difference Between FIR and IIR Filters might be helpful for those confused about our respective concerns.
Most current minicomputers like the Raspberry Pi, Orange Pi, ODROID, etc. have an FPU in hardware. There is no "slow" FPU emulation going on, and hasn't been for years!
The beauty of FIR filters, and quite possibly their most important feature, is that they can be implemented with integer math. As you are surely aware, everyone wants small, low power, low cost, portable devices. These devices typically use a processor similar to the Texas Instruments MSP430, or an FPGA, or an ASIC.
These types of processors work great and are as common as dirt, but seldom have a floating point math core.
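To make the integer-math point concrete, here is a minimal sketch of a FIR filter that never touches floating point once the samples and taps are converted. The Q15 format, the 4-tap moving average, and the helper names `to_q15` and `fir_q15` are my assumptions for illustration only. Python integers never overflow, so the accumulator width is not modeled here; on real hardware you would pick a 32- or 64-bit accumulator to hold the Q30 products.

```python
def to_q15(x):
    """Convert a float in [-1, 1) to a Q15 integer, saturating at the limits."""
    return max(-32768, min(32767, int(round(x * 32768.0))))

def fir_q15(samples, taps):
    """Direct-form FIR using only integer multiplies, adds, and shifts."""
    history = [0] * len(taps)
    out = []
    for s in samples:
        history = [s] + history[:-1]      # newest sample first
        acc = 0
        for h, t in zip(history, taps):
            acc += h * t                  # Q15 * Q15 gives a Q30 product
        acc = (acc + (1 << 14)) >> 15     # round and scale back to Q15
        out.append(max(-32768, min(32767, acc)))
    return out

taps = [to_q15(0.25)] * 4    # 4-tap moving average, each tap 0.25
x = [to_q15(0.5)] * 8        # constant input at half scale
y = fir_q15(x, taps)
print(y)                     # ramps up, then settles at 16384 (0.5 in Q15)
```

The only non-integer step is the one-time conversion of coefficients, which can be done offline.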
I read through a couple of tech reviews of the Pi 3 yesterday. It seems that the performance has been improved more or less across the board: memory, graphics processor, CPU/FPU frequency, etc. This has resulted in an improvement in "speed", as measured with benchmarks and real applications completing in less time, of 50%-100% with the price point remaining the same! Awesome! The addition of onboard WiFi (found to be decent in terms of performance) saves me the cost and resources of a USB WiFi dongle compared to the Pi 2.
After multiple failed efforts to order Orange Pi PCs from China and the price of those climbing from what was around $20 to more like $30+ now, combined with the poor quality of the OS ports to the Orange Pi, I can finally kiss that platform goodbye and embrace the Pi 3. I should be able to move all the code and other tools that I have developed for audio from my Pi 2 to the Pi 3 without a hitch. Judging by the excellent online community for software support and tips, as well as the good quality hardware of the Pis I have used to date, it's a no-brainer to stick with this platform for all my future audio projects.
Too bad that MCM has already sold out of Pi 3 stock!
Charlie--my point was less architecture-specific and more about the computational demands of each filter design and the importance of having a large mantissa for IIR. That said, NEON speedup on integer ops seems to be greater than on FP ops (JPEG is int-heavy), but it's a far cry from emulation.
Second, hybrid FIR/IIR data flow would go:
Input PCM -> Resample to (FP32/96k) -> long-FIR (2ch) -> IIR (6ch) -> Resample to output PCM (TBD)
Where that long FIR will help with DRC/bulk EQ/phase preemption, and the IIR filters can be very lightweight/idealized--get the most bang:buck out of that very expensive FIR step. Or, flip-flop, where low-frequency corrections are done with IIR and everything else is done less expensively with a shorter-tap FIR. (Will explore all architectures)
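A rough sketch of that flow for one channel, to show how the pieces chain together. This is only a skeleton under my own assumptions: the resampling steps are left out, the "long FIR" is a plain direct-form loop with whatever taps you hand it, and the IIR split uses first-order one-pole filters as placeholders for a real crossover. The names `process_channel` and `split_band` and the 300 Hz / 3 kHz crossover points are hypothetical. Two input channels times three bands gives the six output channels.

```python
import math

def fir(x, taps):
    """Direct-form FIR, standing in for the long correction filter."""
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * x[n - k]
        out.append(acc)
    return out

def one_pole_lowpass(x, fc, fs):
    """First-order IIR low-pass: y[n] = y[n-1] + g * (x[n] - y[n-1])."""
    g = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    y, out = 0.0, []
    for s in x:
        y += g * (s - y)
        out.append(y)
    return out

def split_band(x, fc, fs):
    """Split into a low band and its complementary remainder."""
    low = one_pole_lowpass(x, fc, fs)
    high = [s - l for s, l in zip(x, low)]
    return low, high

def process_channel(pcm, correction_taps, crossovers, fs):
    """One channel: FIR correction first, then a 3-way IIR split."""
    corrected = fir(pcm, correction_taps)
    low, rest = split_band(corrected, crossovers[0], fs)
    mid, high = split_band(rest, crossovers[1], fs)
    return low, mid, high

fs = 96000.0
pcm = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(480)]
low, mid, high = process_channel(pcm, [1.0], (300.0, 3000.0), fs)
```

With the pass-through tap `[1.0]` the three bands sum back to the input, which is a handy sanity check before dropping in real correction taps and real crossover filters.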
Obviously FIR gives you a lot more flexibility, and you'd probably go INT32 instead of FP32 for intermediate calculations.
Ultimately, it's important to look at all the compromises and pick the best working solution after characterization (even if that means a simpler, less efficient solution with more hardware thrown at the problem). Or if you're happy, just roll with what you've got.
FIR filters can be run through BruteFIR on RPi. Most of the heavy mathematical lifting is done by the FFTW library, which already has optimizations for NEON.
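For anyone wondering why an FFT library ends up doing the heavy lifting for a FIR filter: convolution in the time domain is multiplication in the frequency domain, so a long filter can be applied with a few FFTs instead of one multiply per tap per sample. The sketch below is not BruteFIR's code and does not call FFTW. It uses a tiny pure-Python radix-2 FFT as a stand-in and processes the whole signal as one block, whereas BruteFIR, as I understand it, partitions the filter into blocks to keep latency down. All function names here are mine.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def fft_convolve(x, h):
    """Linear convolution via zero-padded FFTs."""
    n_out = len(x) + len(h) - 1
    size = 1
    while size < n_out:
        size *= 2
    X = fft(list(x) + [0.0] * (size - len(x)))
    H = fft(list(h) + [0.0] * (size - len(h)))
    y = fft([p * q for p, q in zip(X, H)], inverse=True)
    return [v.real / size for v in y[:n_out]]   # apply the 1/N scaling here

def direct_convolve(x, h):
    """Textbook time-domain convolution, for comparison."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.25, 0.125]
y_fft = fft_convolve(x, h)
y_ref = direct_convolve(x, h)
print(max(abs(p - q) for p, q in zip(y_fft, y_ref)))   # tiny rounding difference
```

The two results agree to within rounding, and the FFT route scales far better as the tap count grows, which is where an optimized library earns its keep.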