Open Source DSP XOs

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
To clarify...

I spent 8 years writing code for the Blackfin DSP, which is deployed in thousands of radio broadcast transmitters. Turn on an AM/FM radio and spin the dial, odds are you'll land on at least one station where you'll hear a 16-bit Blackfin doing 32-bit audio processing.

The Blackfin is a 32-bit RISC processor (has 32-bit registers and a 32-bit ALU) so it can handle 32-bit data natively for everything but multiplication. And with that, the processor does two 16x16->40 MACs in a clock cycle, but you can use 3 MAC instructions in 2 clocks, across 2 accumulators, to accomplish an effective 32x32->71 MAC. So a 500MHz Blackfin is more or less equivalent to a 250MHz, 32-bit, single-MAC DSP.
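
To make the three-multiply trick concrete, here is a minimal C sketch of the arithmetic only. It assumes the three products are high x high, high x low and low x high, with the low x low term dropped; that split, and the names `q31_mul_exact` and `q31_mul_3mac`, are illustrative assumptions rather than anything taken from Blackfin documentation. It does not model the real instructions or accumulator widths.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: exact 1.31 x 1.31 -> 1.31 multiply via a 64-bit product. */
static int32_t q31_mul_exact(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;  /* 2.62 product */
    p >>= 31;                             /* back to 1.31 (arithmetic shift assumed) */
    if (p > INT32_MAX) p = INT32_MAX;     /* only -1.0 * -1.0 overflows */
    return (int32_t)p;
}

/* Same multiply built from three 16x16 products; low x low is dropped. */
static int32_t q31_mul_3mac(int32_t a, int32_t b)
{
    int32_t ah = a >> 16;     /* signed high halves (arithmetic shift assumed) */
    int32_t bh = b >> 16;
    int32_t al = a & 0xFFFF;  /* unsigned low halves */
    int32_t bl = b & 0xFFFF;

    /* Accumulate in units of 2^16: hi*hi weighs 2^32, cross terms weigh 2^16. */
    int64_t acc = (int64_t)ah * bh * 65536;
    acc += (int64_t)ah * bl;
    acc += (int64_t)al * bh;

    acc >>= 15;               /* product / 2^16 / 2^15 -> 1.31 */
    if (acc > INT32_MAX) acc = INT32_MAX;
    if (acc < INT32_MIN) acc = INT32_MIN;
    return (int32_t)acc;
}

/* 0.5 * 0.5 = 0.25 in 1.31 format, computed both ways. */
static void q31_mul_demo(void)
{
    assert(q31_mul_exact(0x40000000, 0x40000000) == 0x20000000);
    assert(q31_mul_3mac(0x40000000, 0x40000000) == 0x20000000);
}
```

Dropping the low x low term costs at most 2 LSBs of the 32-bit result in this sketch, which is why three multiplies are usually considered good enough for audio work.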

For a little crossover project, something like the BF592 processor would be perfectly suited. 200/400MHz, 64 lead LFCSP, costs <10 bucks in low quantities, doesn't require much care/feeding (nearby SPI flash to boot from, or a small uC to throw a code image at it). Unfortunately, I can't really recommend a Blackfin for an open source crossover project, for a couple of reasons:

- You need proprietary tools (VisualDSP++) to develop for it. There is a Linux/GCC port to the Blackfin, but as of a couple of years ago it couldn't generate code that could run wholly within the DSP's own SRAM.
- To get a VisualDSP license, you can either download it and use it for 30 days, buy it outright for a few k$, or buy a $250 evaluation kit which comes with a license for your chip.

So onto ARMs...

I wouldn't do a clock-by-clock comparison of any of the ARM processors with "DSP instructions" to a real DSP chip like a Blackfin, SHARC, C5000, DSP56K, etc. A multiply-accumulate instruction doesn't make a DSP; you need to be able to do all of the following in a single clock to earn that title:

- One or more MACs
- Fetching from two memory pointers
- Writing to one memory pointer
- Incrementing/decrementing all three memory pointers with address wrapping.
- Loop management
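
For anyone who wants to see what that list looks like in practice, here is a minimal C sketch of a 1.15 FIR step over a circular delay line. The struct and function names are hypothetical. On a real DSP, each pass through the loop body (the MAC, both fetches, the wrapped pointer update and the loop bookkeeping) would issue in one clock; on a general-purpose core each of those is typically its own instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const int16_t *coeff;  /* taps in 1.15 format; coeff[0] applies to the newest sample */
    int16_t *delay;        /* circular delay line, same length as coeff */
    size_t taps;
    size_t head;           /* index of the newest sample in delay[] */
} fir_q15_t;

static int16_t fir_q15_step(fir_q15_t *f, int16_t x)
{
    /* Memory write plus pointer increment with address wrapping. */
    f->head = (f->head + 1) % f->taps;
    f->delay[f->head] = x;

    int64_t acc = 0;       /* stands in for a wide DSP accumulator */
    size_t idx = f->head;
    for (size_t k = 0; k < f->taps; k++) {                     /* loop management */
        acc += (int32_t)f->coeff[k] * (int32_t)f->delay[idx];  /* two fetches + MAC */
        idx = (idx == 0) ? f->taps - 1 : idx - 1;              /* wrapped decrement */
    }

    acc >>= 15;            /* 2.30 sum back to 1.15 (arithmetic shift assumed) */
    if (acc > INT16_MAX) acc = INT16_MAX;
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}

/* Impulse of 0.5 through taps {0.5, 0.25} gives 0.25, then 0.125, then 0. */
static void fir_q15_demo(void)
{
    static const int16_t coeff[2] = { 0x4000, 0x2000 };
    int16_t delay[2] = { 0, 0 };
    fir_q15_t f = { coeff, delay, 2, 0 };

    assert(fir_q15_step(&f, 0x4000) == 0x2000);
    assert(fir_q15_step(&f, 0) == 0x1000);
    assert(fir_q15_step(&f, 0) == 0);
}
```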

ARM isn't there yet, and I'm sick of explaining this to chip representatives that come to visit and tell me "DSPs are obsolete with these new ARMs!" :D

But that being said, a carefully programmed 100+MHz Cortex-M can probably rip through more IIR filters than you'll ever need for any home crossover application, even if it's spending most of its CPU cycles doing pointer arithmetic.
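
As a rough idea of the per-sample workload, here is a minimal single-precision Direct Form I biquad in C with a simple cascade helper. It assumes a part with an FPU (an M4F, say); a fixed-point version would follow the same structure. The names are hypothetical and the coefficients in the demo are chosen only to make the arithmetic easy to check, not to form a real crossover.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    float b0, b1, b2;      /* feedforward coefficients */
    float a1, a2;          /* feedback coefficients (a0 normalized to 1) */
    float x1, x2, y1, y2;  /* previous two inputs and outputs */
} biquad_t;

/* y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2] */
static float biquad_step(biquad_t *s, float x)
{
    float y = s->b0 * x + s->b1 * s->x1 + s->b2 * s->x2
            - s->a1 * s->y1 - s->a2 * s->y2;
    s->x2 = s->x1;
    s->x1 = x;
    s->y2 = s->y1;
    s->y1 = y;
    return y;
}

/* Run one sample through n sections in series, e.g. one crossover branch. */
static float cascade_step(biquad_t *sections, size_t n, float x)
{
    for (size_t i = 0; i < n; i++)
        x = biquad_step(&sections[i], x);
    return x;
}

/* Step response of a one-pole lowpass written as a degenerate biquad. */
static void biquad_demo(void)
{
    biquad_t lp = { 0.5f, 0.0f, 0.0f, -0.5f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };

    assert(biquad_step(&lp, 1.0f) == 0.5f);
    assert(biquad_step(&lp, 1.0f) == 0.75f);
    assert(biquad_step(&lp, 1.0f) == 0.875f);
}
```

Each section is five multiplies and four adds per sample, so even a multi-way crossover with several sections per branch is a small number of operations at audio sample rates.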


Blackfin is 16 bit, so 32 bit processing takes quite a few clocks. The 32 bit data path and 80 bit MAC on the SHARCs are a more proper competitor to the M4's 32 bit data path and 64 bit MAC. A DSC like the M4 tends to be faster on IIR, while DSPs like the SHARC are faster on FIR, and even biquads at fairly low normalized frequencies don't benefit from the extra 16 bits in the SHARC MAC. So it's logical diversification on the DSP side, greater flexibility of a microcontroller aside.

Each SPORT has SCLK, WCLK, and two data lines which can operate in either direction. From a quick look in the neighborhood of page 23-27 it sounds like the data lines are independent of each other. So with the four half sports on the 176 pin LQFP one could probably do one I2S in and three I2S out for, say, SPDIF receive and three way XO. Or USB in and four way XO.

If one opts for a part compatible with the SPORT's packed I2S format then IO is a non-issue for most scenarios. The CS4365 and CS4385 DACs and CS42526 and CS42528 codecs come to mind.

Interesting offering. Given the right tools and pricing it could be a viable alternative to NXP's LPC4300s for the mainline case where only one "codec"'s worth of IO is needed.
 
So a 500MHz Blackfin is more or less equivalent to a 250MHz, 32-bit, single-MAC DSP.
Yup. :) In some ways I'd rather use a Blackfin or SHARC than an M4---especially the audio SHARCs---but there's no way I can justify the VisualDSP++ license cost for a DIY project. And, given the pace my projects go at, I'm not hitting code complete and stable in a 30 day trial window. Several of the eval boards are attractive for getting around the limitations of miniDSP's miniSHARC but the projects I'm doing ultimately want their own boards. My read of Analog's eval board docs was that the license is locked to the specific eval board you get. If it's actually such that you can program any instance of the part you choose on any board that'd be quite a bit more powerful.

Base editions of the M4 tools are free downloads and debuggers sell on eBay for as low as 99 cents plus shipping. That's a more accessible price point for supporting something like an open source DIY project base than a few hundred for an eval board. And, as you point out, the M4 has enough oomph for IIR and short FIR. The ADSP-21369 on the miniSHARC isn't exactly a slouch but, last I checked on miniDSP's long FIR support (seven months ago or so), it was rather a struggle to fit filters of a size useful for bass correction onto the processor even at 44.1. So, for DIY stuff, PCs or embedded PCs and a pro audio interface remain the most viable long FIR approach for most folks. That's a capability I've had set up for years.

So I wound up targeting the M4 despite its limitations compared to a DSP.

that site doesn't explain anything related to hardware.
Hope reading more carefully helps...
 
I want to thank twest820 and gmarsh for their comments on various processors and their comparative abilities. I've designed with a couple of different DSP chips, but it's good to hear your experience with other chips that I've not yet used.

The discussion of fixed point is also welcome. I've done extensive fixed point on C5000, mostly in assembly, but also in C where it made sense. Nice to see you taking the time to explain the details here.
 

Take two fractional values of 0.5, multiply them together using an integer multiplier, and tell me what you get. Maybe there is some other trick to it, but all of the DSP books that I have are very vague on this side of things.

And how long has this thread been going with talk of M4 this and M4 that, but where are the tools and where is the hardware? So far nothing. There is an old saying that you get what you pay for. ;)
 
In a 16-bit DSP with the usual "1.15" format, the value 0.5 is 0x4000 or 16384.

If you multiply 16384*16384 together with an integer multiplier, resulting in a 32-bit number, you get 268435456 or 0x10000000.

In "2.30" format, this corresponds to 0.25. Typically you'd shift the result 1 to the left to put yourself in 1.31 format (with the result being 0x20000000), and then to go back to 16-bit 1.15 format, just drop the low 16 bits to give you 0x2000 - which in the original 1.15 format, is 0.25.

Things get weirder when you include sign bits and such.
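
The same steps can be sketched in C. This is a minimal illustration with hypothetical names, not code from any particular DSP library. It folds the "shift left by 1, drop the low 16 bits" sequence into a single right shift by 15, and it handles the one signed corner case that overflows, -1.0 times -1.0, by saturating.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t q15_t;  /* 1.15 fractional: value = raw / 32768 */

/* Multiply two 1.15 values using a plain integer multiplier. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* 2.30 product, e.g. 0x10000000 */

    /* -1.0 * -1.0 = +1.0 does not fit in 1.15, so clamp to the largest value. */
    if (a == INT16_MIN && b == INT16_MIN)
        return INT16_MAX;

    /* Left shift by 1 (to 1.31) then dropping the low 16 bits equals >> 15.
       An arithmetic right shift of negative values is assumed. */
    return (q15_t)(p >> 15);
}

/* The worked example from the post: 0.5 * 0.5 = 0.25. */
static void q15_mul_demo(void)
{
    assert(q15_mul(0x4000, 0x4000) == 0x2000);
}
```

Note that the result is truncated rather than rounded; hardware fractional multipliers often offer a rounding option, which this sketch leaves out.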
 
I'm banging out a C55x design right now at the day job. There have been a few times that I've contemplated flying to Texas to key someone's car, mostly over CCS and emulator issues, but so far so good. Nobody else even comes close to what the C55x can do for power efficiency.
 
I don't want to get too far off-topic, but since this is "open source DSP" I think it's still mostly on-topic:
Anyone working on C5000 firmware who is also using Texas Instruments' open-source dspLib should check out my errata section in their Wiki
DSPLIB - Texas Instruments Wiki
During my development process, I found five bugs in the source. These Wiki entries describe the problems and give information useful for correcting the errors.
 
Yes, I had a look at the SHARC programmer's reference manual, and there are indeed different variants of the multiply instruction depending on whether you are dealing with fracts or ints. So you don't use the integer variant if you want to multiply two fracts in a single-cycle multiplier. No doubt the Cortex-M4 has similar instructions to deal with fracts or integers ;)
 
I want to thank twest820 and gmarsh for their comments on various processors and their comparative abilities.
You're welcome. :)

And another not quite as old saying that goes 'don't feed the trolls'.
Hmm, I wouldn't necessarily go that far, though the continuing confusion over instruction sets isn't something I've a new response to. It did remind me to check on what's been happening with NXP's acquisition of Code Red, the answer being that LPCXpresso 6 was released last week, with the maximum program download on the free version doubling to 256k. LPCXpresso 6 includes an NGX_LPC4337-Xplorer example, so it looks like an update to the LPC4330-Xplorer's coming. The LPC4337 also happens to be the only 4300 part with flash that DigiKey and Mouser stock in LQFP.

Could end up being a convenient configuration of offerings despite being limited to utilizing a quarter of the flash. Email to sales@ngxtechnologies.com bounces due to misconfigured forwarding on their end so I've got nothing in regards to a date for availability.
 
M4F too, but only available in BGA. (Besides, it'll be a small miracle if I get my LPC4337 board done before the part's LQFP package moves from qualification to production. So it's not like I'm stuck on that. :p)

The LPC4330 Xplorer apparently underwent a silent morph to the LPC4337 Xplorer (ref here, here, and here). Based on my email exchanges with them NGX sales is unaware of this despite similar changes on the LPC1850 Xplorer.
 
Those guys from PHILIPS/NXP know what they are doing. The NXP LPC4370 looks like a seductive 32-bit audio DSP engine with high quality USB-audio connectivity (asynchronous 24/96 USB-audio?) and Ethernet audio streaming.
- Cortex-M0 #1 for dealing with USB/Ethernet audio protocols without involving the Cortex-M4
- Cortex-M4 left unencumbered so it can be dedicated entirely to audio DSP - this enables reintroducing audio DSP architectures based on one hardware interrupt per sample, with the full audio DSP routine being executed each time a new sample comes in - audio latency is kept minimal
- Cortex-M0 #2 dedicated to SGPIO emulating four I2S lanes, connecting to audio ADCs and audio DACs (multichannel audio, up to 8 channels).

Is there a prototyping board available?
I wish there were one embedding a WM8580/8581 multichannel Audio Codec, because in such a combination MCLK gets generated locally by the Codec (high quality audio clock not involving the DSP engine), and because the WM8580/WM8581 provides an SPDIF transceiver as a bonus, which can help in building modern PHILIPS DSS930, DSS940 and DSC950 incarnations.

Is there a patent on the trick Philips used in 1994 in the DSC/DSS for superimposing a low bitrate serial channel on the SPDIF datastream? This was needed for conveying the listening volume setting. See the attached pictures.
 

Attachments

  • Philips DSC-950 EBU + DIG CONTROL Interface.jpg (134.6 KB)
  • Philips DSS-940 EBU + DIG CONTROL Interface.jpg (351.6 KB)
The LPC4330 Xplorer apparently underwent a silent morph to the LPC4337 Xplorer (ref here, here, and here). Based on my email exchanges with them NGX sales is unaware of this despite similar changes on the LPC1850 Xplorer.
This looks frightening. I've just had a few Infineon XMC4500 Relax Lite Kits delivered from HITEX. That Cortex-M4 engine looks especially clean.
XMC4500 Relax Lite Kit
HITEX is now using the same concept as Embedded Artists: a built-in hardware programmer-debugger containing its own XMC4500, which you can separate from the XMC4500 target. All pins are brought out using standard pin headers. This motivates me to design a "shield" dedicated to audio DSP, containing a multichannel Audio Codec. Possibly a PHILIPS DSC clone, and a PHILIPS DSS clone. Oh, I forgot to mention: on the XMC4500, the (many) SPI ports feature an I2S mode.
 

Attachments

  • XMC4500 Relax Lite Kit.png (69.8 KB)
This looks frightening.
It's common to sell things without knowing what they are and use misleading marketing names. Though usually the naming is misleading in the other direction. ;) And, looking back at the emails, they don't rule out NGX just never responding to my follow up questions to their poorly composed initial reply. Agree this isn't a particularly good customer experience but the hardware seems fine and NGX has the only 43xx eval board offerings available under USD 100.

The UDA1380 codec on the LPC4330, er, LPC4337 Xplorer's nothing special. But it's enough to verify one's I2S or USB in -> DSP -> I2S out code. SGPIO0 to 14 are exposed on the Xplorer headers. So an alternate codec or DAC can be strapped to it. Or, more minimally, one can verify one's DSP XO and SGPIO code wiggles the proper pins. I like the arrangement as it allows almost total toolchain and firmware verification before sending an LPC4337 PCB to fab.

I just have, oh, five or six other projects to finish up first.
 
SGPIO 0 to 14 are exposed on the Xplorer headers. So an alternate codec or DAC can be strapped to it.
Well, I never succeeded in setting up a proper development environment for the few Xplorer boards I have here. If somebody has a kind of framework from which I can start (how to configure the development environment, how to run a very basic USB-audio -> twin I2S output with 4 channels), that would help a lot of people (including me). Starting from there, one can safely design a "digital audio shield" for the Xplorer, hosting a WM8580 or WM8581 Codec. Possibly implementing a PHILIPS DSC clone or a PHILIPS DSS clone, with a few straps to select the mode.
 