idea for homebrew recording studio over ethernet

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
RAW Ethernet?

Have you considered skipping IP altogether and just throwing out raw Ethernet packets? IP seems overkill since you're not using any routers. In fact, if you had one Ethernet card per channel, you end up with a dedicated full-duplex point-to-point link (via crossover cables). This should help simplify your "black box" now that you don't need to cram an IP stack on top of what you already have to do.
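To make the "raw Ethernet" idea concrete, here is a minimal sketch of building an Ethernet II frame by hand. The EtherType 0x88B5 (one of the values IEEE sets aside for local experimental use), the MAC addresses, and the function names are assumptions for illustration; nothing in this thread specifies them.

```python
import struct

# Assumption: 0x88B5 is an EtherType reserved for local experimental use;
# the thread does not name one.
ETHERTYPE_EXPERIMENTAL = 0x88B5
MIN_PAYLOAD = 46   # shorter payloads are zero-padded to the minimum frame size
MAX_PAYLOAD = 1500  # standard Ethernet MTU


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_frame(dst_mac, src_mac, payload):
    """Build an Ethernet II frame without the CRC (the NIC normally appends it)."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the standard 1500-byte MTU")
    header = (mac_to_bytes(dst_mac) + mac_to_bytes(src_mac)
              + struct.pack("!H", ETHERTYPE_EXPERIMENTAL))
    return header + payload.ljust(MIN_PAYLOAD, b"\x00")


# Hypothetical locally administered addresses for one black box and the master.
frame = build_frame("02:00:00:00:00:01", "02:00:00:00:00:02", b"audio")
```

On Linux, a frame like this could probably be pushed out through an `AF_PACKET` / `SOCK_RAW` socket bound to the interface (root privileges needed); a hardware black box would hand the same bytes straight to its Ethernet controller.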

Since you're working with raw point-to-point Ethernet, packets will always arrive in order, and there will never be any collisions. Retransmission on CRC error is usually *NOT* done by the Ethernet controller. You just have to ensure that the black box has the smarts and enough buffers to deal with the occasional re-transmit.

The retransmission protocol can be as simple as sending an acknowledgement for every packet sent. A better algorithm would be much like what ZModem uses: the black box continuously sends packets. The master machine only sends on an error (i.e. the timestamp of the packet before the corrupt packet). This is a bit more complex because the black box must keep enough old packets around in case they need retransmission.
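A rough sketch of that ZModem-like scheme, under some assumptions: packets carry a simple sequence number instead of a timestamp, the history depth of 64 packets is arbitrary, and the class names are made up for illustration.

```python
from collections import OrderedDict


class Sender:
    """Black-box side: streams packets and keeps a bounded history for resends."""

    def __init__(self, history_size=64):
        self.history = OrderedDict()  # seq -> payload, oldest first
        self.history_size = history_size
        self.next_seq = 0

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.history[seq] = payload
        if len(self.history) > self.history_size:
            self.history.popitem(last=False)  # forget the oldest packet
        return seq, payload

    def handle_nak(self, last_good_seq):
        """Resend everything after the last packet the master received intact."""
        return [(s, p) for s, p in self.history.items() if s > last_good_seq]


class Receiver:
    """Master side: stays silent unless a packet is corrupt or missing."""

    def __init__(self):
        self.last_good_seq = -1
        self.received = []

    def receive(self, seq, payload, crc_ok=True):
        """Return None when all is well, else the sequence number to report."""
        if crc_ok and seq == self.last_good_seq + 1:
            self.last_good_seq = seq
            self.received.append(payload)
            return None
        if seq <= self.last_good_seq:
            return None  # duplicate of something already stored
        return self.last_good_seq  # error: report the last good packet
```

The master only ever transmits the value returned by `receive`, so in the normal case the link from master to black box stays idle. The history size sets how long a round trip the black box can tolerate before old packets are gone.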

If you don't like having lots of Ethernet cards on your computer, make sure you use a switching hub. Never use a dumb repeating hub! A dumb repeating hub will cause lots of collisions under heavy loads. With a switching hub, you end up with each channel getting a point-to-point link to the switch with zero collisions. The switch has its own internal buffers and it will automatically interleave the packets, so you'll never get collisions on the PC <-> switch link. The bad news is that you might overwhelm the connection to your PC or the buffers in the switch. In either case, depending on how cheap your switch is, you might start losing packets.

WARNING: Not all Ethernet chipsets are made equal. Just like there are modems and WinModems, there are smarter and dumber Ethernet chipsets. Some make the software drivers do the CRC calculations, some do it for you. Some have larger buffers, some have smaller ones. This can and will affect performance under high loads. You have been warned.

Definitely use 100Mbit Ethernet not only for the bandwidth, but because you can probably get lower latencies out of it.
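For a feel of the bandwidth side, a back-of-the-envelope sketch. The 24-bit / 48 kHz format and the 50% usable fraction of the link are assumptions chosen for illustration, not figures from this thread, and packet header overhead is ignored.

```python
def stream_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Raw audio payload rate in bits per second (no packet overhead)."""
    return channels * sample_rate_hz * bits_per_sample


def max_channels(link_bps, usable_fraction, sample_rate_hz, bits_per_sample):
    """How many channels fit if only part of the link is realistically usable."""
    usable_bps = int(link_bps * usable_fraction)
    return usable_bps // (sample_rate_hz * bits_per_sample)


eight_channels = stream_bit_rate(8, 48_000, 24)            # 9,216,000 bit/s
on_100mbit = max_channels(100_000_000, 0.5, 48_000, 24)    # 43 channels
on_10mbit = max_channels(10_000_000, 0.5, 48_000, 24)      # 4 channels
```

Under these assumptions 10 Mbit would already be tight for a modest multitrack setup, while 100 Mbit leaves plenty of headroom.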

If you put a periodic counter in your black box, simply attach the timestamp to each packet. If your black box is pure hardware (i.e. FPGAs or other programmable logic), then simply hook a high-resolution clock signal to an adder.
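As an illustration of stamping each packet, here is one possible layout: a 64-bit counter value followed by a sample count and 16-bit signed samples. The field sizes and function names are assumptions; the post only says to attach the counter value.

```python
import struct

# Assumed header layout: 64-bit counter value, then 16-bit sample count.
HEADER = struct.Struct("!QH")


def pack_packet(counter, samples):
    """Prefix 16-bit signed samples with the counter value of the first sample."""
    body = struct.pack("!%dh" % len(samples), *samples)
    return HEADER.pack(counter, len(samples)) + body


def unpack_packet(packet):
    """Recover (counter, samples) from a packet built by pack_packet."""
    counter, count = HEADER.unpack_from(packet)
    samples = list(struct.unpack_from("!%dh" % count, packet, HEADER.size))
    return counter, samples


packet = pack_packet(123456789, [100, -200, 300])
```

In a pure hardware black box the same thing would just be the counter register latched into the first bytes of the transmit buffer.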

If your black box is a standard PC, use the Pentium's 64-bit high resolution counter register, though with that much overhead (sound card, PCI, CPU, OS), network-induced jitter might be the least of your concerns. In this case, you might as well record to the local hard drive and reassemble during post processing.

As long as you stamp each packet, you can align the streams during post-processing. Assuming the black boxes have a reasonably accurate clock, there shouldn't be much jitter. What little jitter there is between each black box's clocks will definitely be smaller than the round trip latencies (ping times) over Ethernet. As such, there's really nothing you can do to fix the inter-clock jitter in real-time.
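The alignment step could look something like this sketch. It assumes every stream's timestamp is expressed in samples on a common time base and ignores drift between the clocks; the function name and data layout are made up for illustration.

```python
def align_streams(streams):
    """Trim streams so they all start at the same timestamp and have equal length.

    streams maps a channel name to (start_timestamp, samples), where the
    timestamp is the sample index of the first sample on a shared time base.
    """
    latest_start = max(start for start, _ in streams.values())
    trimmed = {
        name: samples[latest_start - start:]
        for name, (start, samples) in streams.items()
    }
    length = min(len(samples) for samples in trimmed.values())
    return {name: samples[:length] for name, samples in trimmed.items()}


aligned = align_streams({
    "guitar": (100, list(range(0, 10))),   # started 3 samples earlier
    "vocals": (103, list(range(10, 20))),
})
```

Here the first three guitar samples are discarded so that both tracks begin at timestamp 103.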

As far as giving the "go" signal... why bother? When your black box is running, just have it start slamming out samples. The master machine simply ignores the samples until it's ready to record. After all, this is how traditional analogue recording equipment usually works anyways.

The basic strategy is to unload as much work onto the master computer as possible. The critical real-time work is all in the black boxes, so they must be as simple (and thus hopefully as deterministic) as possible. They will probably be the toughest part, especially if you're gonna use FPGAs or embedded CPUs. Hopefully they'll teach you that in school :)
 
Daryl, you can't correctly assume that point to point will work for this application, since there are more than two elements peer to peer or client to server in operation, as I will explain further below. And further, you cannot correctly assume or assert that "raw" Ethernet packets will arrive in order, for many reasons, but for one specific to this application: it's not possible to presume that every packet will be transmitted in order with respect to a multithreaded higher-layer application which may need to open several communication streams between the functional partners. What you are essentially and incorrectly trying to draw an analogy to is an external bus system in which multiple point-to-point communications are strictly ordered, somewhat like in Fibre Channel or Token Ring, but even more so with respect to this higher-layer application's procedural requirements. Think about all of the glue logic that is required just to make a "simple" internal bus system work for a single multi-element autonomous system. Ethernet does not work that way, and was not designed to work that way. And applications running over Ethernet do not work that way; they need some help from IP and the other communication protocols I mentioned. To be successful in the marketplace, this application would have to work in a way that will allow other companies to introduce complementary products and functions. Sorry, and respectfully, if you think differently at this point you have some more studying to do. I would prefer to be one of those people who encourage other people's creative thinking, but I'm not going to participate in a conversation where the engineering basics are mangled. The right way to think about this is to study current engineering trends, instead of trying to re-invent the wheel. See the texts I recommended.
Your overall approach, trying to limit this to a star-configured, point-to-point, single-threaded application, would be so restrictive and limited as to be ridiculous. Further, when applications are created, standard libraries are used from software vendors who provide all of the protocols needed for the full stack required to support all of the application's functionality. If you were actually designing the application, you would concentrate on the application and again use standard libraries for the interface as well. At this stage of the game an application is not going to come into existence with one person doing it all (hardware, software, etc.) from scratch, A to Z, or even just all of the hardware or just all of the software from scratch. If this were a single autonomous system using a simple serial protocol for communication between two end elements it could be done that way, but we were talking about moving the whole set of capabilities forward several decades and making it truly networkable. Ethernet and the other protocols I mentioned were designed to do a whole lot more than a serial protocol; don't hobble Ethernet or the application under discussion to make it fit into a simpler scheme than it was meant to serve. I think that there are a number of half-considered or confused ideas in this thread and I have a feeling this discussion may turn less constructive, so I'm out. I hoped to move the discussion forward by introducing some points about how it is really done in the real world, but the conversation is now regressing. Best wishes to you and your projects. This message is genuinely not meant as a flame, but I will not participate in this discussion any further.
 
i think the TCP/IP idea rocks, it will be extremely scalable, even once firewire has already gone by the wayside.

I think your timing issue is not that tough. the time stamp idea, imho, is the way to go. it's the typical method internet streaming media uses to keep audio and video synchronized. if you connect a physical sync clock to your black boxes, and wrap the sample packet in a sync packet before you wrap it in a TCP/IP packet, the PC will be able to re-sync them, no prob.

A great source of information on this process can be found in the Windows Media SDKs

PS: if you're putting everything on a PCB, just control your ethernet chip with a micro-controller. leave some room for flexibility in the design.

cheers mate, good luck
 
ps

Leschwartz has pretty much hit the nail on the head, no wheel re-invention in this project. use tcp/ip for what it is, as it already works wonderfully. your packeting and transmission from the black box is a two-chip affair (off-the-shelf, even). One thing i do disagree with Leschwartz on is not being able to do everything solo, at least that's what I interpreted his post to mean. Sure, in engineering companies every procedure is departmentalized, but I'm sure lots of us in this forum have worn all the hats from time to time. probably plenty of us are even now concurrently working on hardware, firmware, and software development... and if you're not you should be... shame on you :D

later, Dan
 
Leschwartz is on the right track. TCP/IP bogs down when the black box has to re-send a packet. Real-time protocols are better for data that doesn't need to be re-analysed. Video conferencing works on a real-time protocol. In video conferencing software, do you want to wait for the last frame to be re-sent, or do you want to skip that frame and go on to the next?
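The "skip it and move on" behaviour can be sketched as below. This is only an illustration of the idea, not any particular real-time protocol: the sequence numbering, the fixed frame size, and filling gaps with silence are all assumptions.

```python
def play_out(packets, frame_size):
    """Build the output stream without ever waiting for a retransmission.

    packets is an iterable of (seq, samples) in arrival order. Missing frames
    are replaced by silence; frames that arrive too late are dropped.
    """
    output = []
    expected = 0
    for seq, samples in packets:
        if seq < expected:
            continue  # late or duplicate frame: its slot has already been played
        output.extend([0] * frame_size * (seq - expected))  # silence for the gap
        output.extend(samples)
        expected = seq + 1
    return output


# Frame 1 arrives after frame 2, so it is treated as lost.
stream = play_out([(0, [1, 1]), (2, [3, 3]), (1, [2, 2]), (3, [4, 4])], 2)
```

For live monitoring a brief dropout like this is usually preferable to a stall; for a recording you may well prefer the retransmitting approach and a bigger buffer.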
 
Ultimate solution

Use this one:

http://magic.gibson.com/thisismagic.html

MaGIC, a new digital transfer protocol from Gibson Labs.

This looks like a very promising format.

Media-Accelerated Global Information Carrier (MaGIC)

Revolutionary new Digital Media distribution standard
Audio, video, control information
Standard CAT-5 (Ethernet) cable
Vastly increased bandwidth (more data)
Much lower latency (higher speed)
Much longer cable lengths with no signal degradation
Only available distribution method for new DVD-A (DVD-Audio) format specifications
Scalable, linkable and upgradeable system
Multiple applications:
High-Resolution Audio / Video distribution
Home Automation Systems
Commercial Wiring Installations
Consumer Products
Telecommunications Products

Low-latency Real-time System

250 µs point-to-point latency times across 100 meters
Works as free-standing or controlled (host-based)
Intelligent, self-identifying networking
Studio and stage systems include Routing Matrix software

Standard Connections

Conforms to IEEE 802.3 physical layer
Uses standard CAT-5 cable and RJ-45 connector
Supports wireless networking
Cost-efficient, easily-accessible cables and jacks

Superior Bandwidth

Up to 32 channels (fully duplexed)
Up to 32-bit data stream
Up to 192 kHz sampling rates
Supports multiple control channels (up to 100 MIDI)
Conforms to multi-channel DVD-A 32/192k specification
Supplies Phantom Power
Scalable and upgradeable
 
I think this concept sounds interesting, and it has certainly been applied in other applications, i.e. digital, etc.

Some thoughts:

In terms of Firewire being long gone, I think you will find that that does not happen. 1394b will push data rates out past 800 Mbps and give 100 meter distances. Add in the isochronous data transfer capability and you have a set of capabilities that are beyond Ethernet for certain applications.

I think the biggest issue is going to be synchronization and jitter, as many people have stated. While A/Ds have gotten cheap, it still takes money and design expertise to design a really good low-jitter clock source to take advantage of those A/Ds.

Not only will you need a low jitter clock source in each of your Ethernet enabled modules, but those modules will all need to be locked in step in some method. A highly accurate clock will not help, since you still need to synchronize the modules. If you can synchronize the modules, you could probably live without as much clock accuracy as you could simply re-sample the signals at the PC end to remove clock speed variations between the modules. An accurate clock would probably improve the issue though.
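The re-sampling at the PC end might look like this sketch. Linear interpolation is used only because it is short; a real recorder would likely want a proper band-limited resampler. The function name and the example numbers are assumptions.

```python
def resample_to_length(samples, n_out):
    """Stretch or squeeze a block to n_out samples using linear interpolation."""
    if n_out < 2 or len(samples) < 2:
        return list(samples)
    step = (len(samples) - 1) / (n_out - 1)  # input samples per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), len(samples) - 2)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[left + 1] * frac)
    return out


# Toy example: a 5-sample ramp stretched to 9 samples.
stretched = resample_to_length([0, 10, 20, 30, 40], 9)
```

For example, if a module delivered 48,005 samples over an interval in which the master's time base counted 48,000, the block would be resampled to 48,000 samples so that it lines up with the other modules.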

In terms of micro-synchronization, you could probably grab a 125MHz clock off the Ethernet PHY Receiver. Have to check some data sheets to see which of them bring this out. They need to generate a recovered clock, it is just a matter of whether you have access to it.

Even if you do not, you may want to consider a different method. Your PC will likely be the standard time base for the system (I think it will have to be)... and that is another problem, because the clock on the $20 PCI NIC in your PC is neither accurate nor low-jitter. Actually, nothing in your PC is terribly accurate beyond the time clock, and even then....

As your PC is the "master" of the system, you could send out a unique "data stream", i.e. a known byte sequence of, say, 8-10 bytes. Of course, you will need to detect whether it collides or not. On the receiver end, you could put a CPLD/FPGA between the Ethernet MAC and Ethernet PHY and detect this sequence. This will give you access to an unprocessed data stream that has little added other than the jitter/latency of clock and data recovery. I don't think you would be able to transmit the unique data stream with enough regularity or frequency to lock a PLL to it, but you could use it to "reset" or "set" a time base on the individual modules that could then be used to time-stamp the audio packets. To make this all work, though, I think you would need to develop a custom Ethernet card for the PC that could also detect this time-base signal, as you would have little to no control over when you initiated the data signal and how much time it takes to actually come out of the NIC; different NICs have different levels of latency, jitter, etc.
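A behavioural model of that sync detector, purely to show the logic; the real thing would be a shift register and comparator in the CPLD/FPGA. The 8-byte pattern is hypothetical, and counting one tick per received byte stands in for a proper clock.

```python
class SyncDetector:
    """Watches a received byte stream and resets a local time base on a match."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.window = bytearray()  # the most recent len(pattern) bytes
        self.counter = 0           # ticks since the last sync (one per byte here)

    def feed(self, byte):
        """Push one received byte; return True if it completed the pattern."""
        self.counter += 1
        self.window.append(byte)
        if len(self.window) > len(self.pattern):
            del self.window[0]
        if bytes(self.window) == self.pattern:
            self.counter = 0  # "reset" the module's time base
            return True
        return False


# Hypothetical 8-byte sync sequence sent by the master.
detector = SyncDetector(b"\xa5\x5a\xc3\x3c\x96\x69\x0f\xf0")
received = b"\x00\x01" + detector.pattern + b"\x11\x22\x33"
hits = [i for i, b in enumerate(received) if detector.feed(b)]
```

After the match, `counter` keeps running and can be copied into each outgoing audio packet as its timestamp.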

I think this is one of those problems that sounds simple to start, but when you start to turn it into a robust quality system, you find there are lots of pitfalls, and fixing them removes some of the advantages you started out with. I think what you are trying to do is doable, it is all a matter of the quality statement you are trying to achieve.
 
my concept

Hey guys, just thought I'd put a few keystrokes to keep this idea/thread alive...

alvaius - this isn't meant as a complaint or insult, but I think you're suggesting over-designing the project. I can say with a high degree of certainty that letting software take care of synchronization is the best method - using the ethernet as a semi-transparent transmission medium. This entire process could be implemented over any physical medium, including Firewire. 2nd, I didn't say that Firewire was gone or obsolete in any way. To re-phrase what I said: TCP/IP will still be around long after Firewire is obsolete.

johan - I read the standard and the website about Magic. It's very cool, and looks basically like what Jason wants to do. The synchronization method they use is very interesting, but it requires more or less tearing apart and rebuilding a custom protocol stack. They claim 80 picoseconds of jitter. I find this standard very interesting, but expensive to implement initially.

jason - take a look at the Magic site and look at the over-all concept. Consider a $600 eval board and if you want to make your own software, another $2500.

Gibson is a musical instrument company that wants to integrate this technology as a way to bring music from the instrument itself into the recording software. That's where we reach a big difference between what this product is and our homebrew setup. Latencies aren't a big issue for us.

In the Magic system, effects done in real time justify the powerful SHARC DSP. Custom protocol stack assembly/disassembly, I would gather, is done by the Xilinx PLD (in reference to the pictures of the Magic Eval. kit). The other small square chips to the left of the Xilinx are probably something like the Cirrus CS8952 Ethernet driver.

I have used the CS8900 controller before. They work great and are easy to control with a micro. I imagine the 8952 (100Mbps) is relatively similar.

A minimal amount of real-time processing needs to occur here: basically wrapping the samples in a [media type] compatible synchronization packet and then in a TCP/IP packet. So get rid of the SHARC and the Xilinx parts. Although they aren't really expensive, they take a very high level of expertise and equipment to use.

Motorola, Analog Devices & Microchip Technology offer hybrid DSP/micro devices: the powerful processing of a DSP with the simple control of a micro. They use perhaps 50 to 100 instructions. That's nothing compared with a pure DSP or microprocessor, so they aren't difficult to get running.

The DSP/micros run at up to 100-150 MHz and offer enough processing power to take data from your ADC, build the media synchronization packets, and run the TCP/IP stack. Perhaps you could set up a small RAM buffer with it too, to make up for network latency or stalls. It would also cover the basic micro functions of controlling the ethernet chipset and the ADC chipset as well. Keep in mind that this setup is only for 1 "black box".
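How small that RAM buffer can be depends on how long a stall you want to ride out. A quick sizing sketch, where the single 24-bit / 48 kHz channel and the 200 ms stall are assumed numbers for illustration only:

```python
import math


def buffer_bytes(channels, sample_rate_hz, bytes_per_sample, stall_ms):
    """Bytes of audio produced while the network is stalled for stall_ms."""
    return math.ceil(channels * sample_rate_hz * bytes_per_sample * stall_ms / 1000)


# One black box, one channel, 24-bit samples at 48 kHz, 200 ms network stall.
needed = buffer_bytes(1, 48_000, 3, 200)  # 28,800 bytes
```

Under those assumptions a few tens of kilobytes per box would do, which is within reach of a small external SRAM.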

Also I mentioned "media type" synchronization. By media type, I mean whatever SDK you will use for creating your recording/editing software. Your timing/synchronization packets generated in the black box will need to be compatible with whatever methodology is described in the SDK.

Myself - I am quite familiar with Windows Media SDK. This whole project could be done in Windows Media, but there must be 100's of other brands. I don't know about prices, but I do know that Windows Media SDK is free. (and sometimes it works with Windows!)

As for "black box" synchronization, I would recommend a 50 ohm coax, non-terminating at the box, connected with BNC "T" connectors on the front or rear panel of each box. Provide a very stable master clock, and terminate the open end of the cable run. Careful attention to grounding, signal path and trace impedances will give you a very accurately synchronized clock to each ADC module.

Over all it's not as fancy as the Magic system, but I don't think it's wise to reinvent a real-time transmission protocol for a homebrew set. As well, I think a lot of the real-time DSP functions aren't really necessary in a recording environment. They can all be done in the computer afterward.

Anyway, bedtime (again!) cheers guys, good night.

Dan

PS: Ultimately, with enough internet bandwidth, you could have a remote recording studio - Or - Say you want to record 'your' band live in another city, but you can't go... they take the black boxes with them and plug them into a broadband internet connection, then you just sit at home and record, controlling the mixing and everything. Of course that's just an afterthought.
 
have you thought of 'udp'?

hi
going back to your original post about taking inspiration from the picoweb project - have you thought of using udp sockets rather than tcp/ip sockets?
(just look up 'udp' information on google).

in my realtime systems programming work, i've found that the full-shebang handshaking of tcp/ip just bogs down the network (you'd be lucky to achieve 25-40% of the bandwidth in a 100Mbps system, but you'd ensure 100% data continuity). using udp, you can really get 80-90% of the bandwidth, with some dropped frames, but usually you can design integrity checks into your data format, such that elementary checks at the receiving end make sure that you are not getting garbage.
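One way such integrity checks could be designed into the data format: a sequence number, a length field, and a CRC32 trailer around each datagram payload. The layout and function names are assumptions for illustration, not a description of any existing format.

```python
import struct
import zlib

# Assumed layout: 32-bit sequence number, 16-bit payload length,
# payload bytes, then a 32-bit CRC over everything before it.
HEADER = struct.Struct("!IH")


def encode(seq, payload):
    """Wrap a block of audio bytes so the receiver can detect garbage."""
    body = HEADER.pack(seq, len(payload)) + payload
    return body + struct.pack("!I", zlib.crc32(body))


def decode(datagram):
    """Return (seq, payload), or None if the datagram fails the checks."""
    if len(datagram) < HEADER.size + 4:
        return None
    body = datagram[:-4]
    (crc,) = struct.unpack("!I", datagram[-4:])
    if zlib.crc32(body) != crc:
        return None
    seq, length = HEADER.unpack_from(body)
    if length != len(body) - HEADER.size:
        return None
    return seq, body[HEADER.size:]


datagram = encode(7, b"abc")
```

The encoded bytes would go out through an ordinary `SOCK_DGRAM` socket with `sendto`; gaps in the sequence numbers on the receiving side tell you which frames were dropped.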

in general though, i like the way this thread started out. let me know if you want to collaborate; i've built the picoweb thingy some time ago (it was a lot of fun!) and have a fair amount of DSP/realtime systems programming background.

i had been toying around with ideas on the reverse of what you're suggesting, i.e.
consumer hard-disk music storage unit (tivo etc)
-> broadcasting digitally over wireless LAN
-> powered speakers with built-in wireless receivers and DAC to
convert the digital stream and reproduce music.
it would be hard to make it a hi-fi application, but it would be some fun. ... as would be your recording idea!

cheers -
pradeep
 
Madalo, no insult taken.

However, I was trying to describe a system that works exclusively over ethernet/TCP-IP, without extra wires, modified protocol stacks, etc. The system you describe requires an extra co-ax to be run to every unit: extra wires, extra drivers, isolation, etc. This did not seem to be in keeping with the original post. Even with your scheme, you would still need some level of sophistication in your clocking scheme to guarantee an equal number of samples from every unit, or the ability to measure unit time.

At the module end, I think my idea is actually fairly simple to implement and could cover off all the timing and synchronization issues. The only thing I don't like is my suggestion for a custom PCI NIC for the PC, but I am open to thoughts on how to fix this and achieve the goals.


I think that it is a given that 100 Mbps is needed. To that end, I would suggest using an MCF5272 from Motorola, which has a 10/100 Ethernet MAC built in. They are reasonably inexpensive and you can readily find a Linux port for them. They also have a multiply-accumulate (MAC) unit if you want to do some basic DSP work. Add a $2 PHY from National and you are all set. You have access to the MII port between the two and could watch for data patterns for synchronization.

I am not sure that I would make much use of DSP on the modules. A 1.7 GHz Pentium 4 could take the place of numerous DSPs, whether they be SHARC, DSP563xx, etc. This keeps the modules simpler. (If you don't believe me on the DSP horsepower claim, go to BDTI and check their tests.) Contrary to what TI may tell you, the fastest DSP processor is not a DSP at all. It would be a toss-up between a Motorola 7455 and a 2.8 GHz P4.

One issue you will find with 100meg ethernet is that almost all the MAC and MAC\PHY combos have ethernet interfaces with the exception of a few chips from SMSC. That is why I suggest the MCF5272. However if you want an external MAC\PHY, you may want to consider the MCF5249 processor for the great audio interfaces and large internal memory, not to mention flexible ports. It may make for an easier system than a DSP.
 
The subject of sending audio over networks was recently discussed on www.churchsoundcheck.com. This is more than just an intellectual exercise. There are several players in this market for live sound and recording studios. To quote Ray Rayburn, who definitely knows something about digital audio:

Gibson’s MaGIC
Yamaha's mLAN
Peak Audio's CobraNet

They all are similar but not exactly the same. The similarity is that they provide a multi-channel digital audio transport mechanism. Here are a few of the differences:

AES3 which is sometimes called AES/EBU carries two channels over a shielded twisted pair cable or coax. It is not a network but strictly point to point. Maximum distance is around 100 meters.

SPDIF is the "consumer version" of AES3 and runs only over coax and relatively short distances.

MaGIC uses the "physical layer" of Ethernet. In other words they use Cat 5 cable and "RJ45" style connectors but after that the similarity ends. They say they can send 32 audio channels each direction over a Cat 5 cable of up to 100 meters long. This is not a network but strictly a point to point connection. Their big advantage is low latency. MaGIC is looking like it’s getting a bit closer to reality. The guitar with the MaGIC output has been announced but not delivered. They are selling development boards in an effort to try to get manufacturers on board. Last year they did not even have development boards.

MADI is a multichannel version of AES3 and allows up to 56 channels over a cable but with shorter distances such as in a control room.

mLAN uses Firewire or IEEE1394. It is a real network. It is compatible with other mLAN devices on the 1394 network, but not all audio over 1394 is mLAN so you must not assume that non mLAN devices will work just because they are also using 1394. One of the limitations is distance. 1394 works fine over short hops around a room, but is not suitable for long distance connections. I note that Yamaha has licensed CobraNet and has come out with their first CobraNet products.

CobraNet is fully compliant with all the Ethernet rules, and is transported over standard Ethernet networks or other networks that can carry Ethernet such as ATM. It is a real network and allows signals to come on and off the network at any point. A single Cat 5e can carry up to 64 channels of audio each way if Fast Ethernet is used (100BaseT). The same cable can carry up to 640 channels of audio each way if Gigabit Ethernet is used. Fiber versions are available. Over copper wire a single run can be up to 100 meters and using fiber distances in excess of 70,000 meters are possible. Standard Ethernet techniques can be used to build networks that are highly reliable with automatic changeover of failed components or links. Something like 28 companies have licensed CobraNet, and many products are available.

Conclusions:

Just about all of the above can be converted to AES3 as the most basic interface type so you should be able to get almost any sort of digital audio converted to any other form.

AES3 is the most basic and almost everything else can be converted to this.

SPDIF is only on your consumer gear, but can often just connect to an AES3 input and work. If not converter boxes are available.

MaGIC you don't have to worry about yet since nothing is shipping :>)

MADI is usually only found on recording studio gear and can be converted to AES3.

mLAN is found on some recording studio gear and a few other things and can be converted to AES3 if needed.

CobraNet is what the vast majority of the sound reinforcement gear is using, and can be converted to or from AES3.

So I would say you don't have to worry unless you buy into something which can't convert to AES3.

Ray A. Rayburn
Application Engineer
Peak Audio a division of Cirrus Logic, Inc.
http://www.PeakAudio.com/
 
haldor - interesting post, it's commercially viable! yeah! (in churches no less!)

alvaius - thanks for the reply. for a stand-alone single ethernet cable connection, i don't see any way around making a custom nic. that's essentially what the Magic system uses, but not pci & with more stack mods. i haven't yet taken a look at the others haldor mentioned. looking at it this way, a custom nic wouldn't be any harder than custom hardware in the black boxes. the only disadvantage i see of the single ethernet cable is you may be limited to a single node. I think you have to forget about routers and switches. you could definitely make the custom nic fully TCP/IP compatible even though it has a lot of trick features.

i have a feeling that it's possible to do it either way, nic or coax. jason's (cellular mitosis) original post was based around the word "cheap" and using off the shelf computer parts for the networking segment. Other than that, I think we're generally headed in the right (at least constructive) direction. it's too bad some people left the discussion so early.

pradeep - if any of this becomes more than just a day dream, your dsp skills will be appreciated :D

i like udp, it's more efficient for real time. good for my windows media. it's the real-time breakfast of champions. i have to note though, that if we're using this system to record, rather than for real-time transmission, and an over-run (latency) buffer in the transmitter is an option, wouldn't 100% data integrity be a better trade than the efficiency (think CLASS A)? it's not running a very large number of channels anyway. i guess that's why i've never veered far away from the TCP/IP stack. i dunno, what would we call this thing? "quasi-real-time virtual recording"? :)

anyway, now i've started to make us sound like a product development team, but.... what we're talking about is slowly becoming the same caliber as the current systems being marketed. where's the manager!? - has anyone seen a post from Jason recently? By the time he reads this, we may be finished - ahead of schedule, but over budget :D

Cheers, Dan

PS: - on a project i once went over budget on coca-cola -
 