Audio over ethernet and PC Ethernet controller

Status
This old topic is closed. If you want to reopen this topic, contact a moderator using the "Report Post" button.
Okay, I know there has been a lot of discussion about this subject, but I have not found a good answer to the following question: Can a PC's Gigabit Ethernet controller be used with realtime audio-over-Ethernet systems?

The network would have just two endpoints: the computer and an FPGA-based ADC/DAC.

The things I'm unsure about are order of packets and clock.

Does the PC software (including drivers, of course) have control over the order in which the Ethernet controller sends frames? So, if the driver tells the controller to transmit frame A and right after that tells it to transmit frame B, is it possible that the FPGA receives frame B first and then frame A? Of course, this assumes a direct physical connection between them, without switches or anything else.

As for word clock: is there any chance that it could be transmitted accurately enough, with the PC being either master or slave?

I did find a DIY project where a DAC was connected to a computer's Ethernet port (WTF DAC | Peufeu's Electronic Stuff). However, the project seems to be frozen and there is no note about why. Maybe the PC's Ethernet controller turned out to be unsuitable for this kind of application?
 
Can a PC's Gigabit Ethernet controller be used with realtime audio-over-Ethernet systems?

The short answer is yes; after all, you should have enough bandwidth.

There are 2 principal types of communication commonly taking place over networks. These are the virtual circuit and the datagram. Virtual circuits establish a fixed route for an end-to-end connection between 2 points on a network, typically Frame Relay or ATM. Sequential and error free delivery of packets is virtually guaranteed, in part due to a variety of enhancements.

Internet (Ethernet) communications are, in general terms, based on IP, the Internet Protocol. IP operates at the network layer of the OSI reference model and is part of a suite of protocols known as TCP/IP. It uses datagrams to communicate over a packet-switched network. TCP provides end-to-end error control and sequence control equivalent to that provided by virtual circuits. The later IPv6 adds header fields (traffic class, flow label) intended to support quality-of-service handling for services such as VoIP, but bandwidth is not a concern in this case.

Establishing an IP connection over Ethernet is typically straightforward from the PC end: many programming languages provide the facilities for it (sockets). Each socket is mapped by the operating system to a communicating application process or thread. Stream, or connection-oriented (TCP), sockets are often used where a data stream such as audio is to be transferred, although other protocols can be used.

At the receiving end any FPGA must contain sufficient intelligence to establish a bound (addressable) socket in the listening state. Then the client (PC) can initiate a TCP connection. Obviously if gigabit ethernet is employed the hardware must accommodate the line pairs employed. Gigabit ethernet employs various technologies described in the IEEE 802.3-2008 standard.

w
 
Can I suggest running old 100BASE-TX and using the 2 unused pairs for analogue audio?
LOL.

1 Gbps is very technical. The data is encoded across all 4 pairs. If you get it to work, you've got
$$$$$$ in your bank account.

Altera seems to offer a free Ethernet IP core for their FPGAs. I thought of using that. But if I can't find a free, working IP core for 1 Gbps Ethernet, I certainly won't start writing my own and will use 100BASE-TX Ethernet instead.

I don't believe that TCP is suitable for transferring audio in realtime. There simply is no time to care about lost data. Lost data will cause some noise in the audio, but if we want to do software monitoring, keeping latencies low is more important.

Also, I'm unsure whether IP protocol should be used at all, or if I should write my own protocol. As there would be only two points connected to network, something simpler than IP would do well.
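To give an idea of how small a home-made protocol directly on top of Ethernet could be, here is a minimal sketch in Python. The payload layout (a 32-bit sequence number followed by one 24-bit sample per channel) and the function names are made up for illustration. The EtherType 0x88B5 is one of the values IEEE sets aside for local experimental use.

```python
import struct

# EtherType reserved by IEEE for local experimental protocols.
ETHERTYPE_LOCAL_EXPERIMENTAL = 0x88B5
MIN_PAYLOAD_BYTES = 46  # shortest payload a valid Ethernet frame may carry


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_frame(dst_mac: str, src_mac: str, seq: int, samples: list) -> bytes:
    """Build an Ethernet II frame (without preamble and FCS).

    Hypothetical payload layout: 32-bit big-endian sequence number,
    then one signed 24-bit big-endian sample per channel.
    """
    header = (
        mac_to_bytes(dst_mac)
        + mac_to_bytes(src_mac)
        + struct.pack("!H", ETHERTYPE_LOCAL_EXPERIMENTAL)
    )
    payload = struct.pack("!I", seq) + b"".join(
        s.to_bytes(3, "big", signed=True) for s in samples
    )
    # Pad short payloads with zeros so the frame reaches the legal minimum.
    if len(payload) < MIN_PAYLOAD_BYTES:
        payload += bytes(MIN_PAYLOAD_BYTES - len(payload))
    return header + payload


frame = build_frame("02:00:00:00:00:01", "02:00:00:00:00:02", 7, [1, -1])
print(len(frame), frame[12:14].hex())
```

On Linux, a frame like this could presumably be handed to the controller through a raw AF_PACKET socket (root privileges needed), with the hardware appending the frame check sequence. Because of the zero padding, the receiver has to know the channel count in advance.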

With 48 kHz audio there are 48000 audio frames per second. Is there any way to make sure that packets are transmitted (and then received) in the right order? If I tell my Ethernet controller driver to transmit some data at constant time steps, is it certain that the controller really sends the data in the same order in which I asked the driver to send it?

EDIT: Of course, there are some ways to reorder the packets at the FPGA end if needed. But I'd still like to know whether there is any maximum latency in sending data, i.e. a maximum time from asking the driver to send data to the data actually going out on the copper cable.
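Whatever the controller does, the receiving end can at least detect reordering and loss if every packet carries a counter. A minimal sketch, assuming a hypothetical 32-bit sequence number in each packet (the class and field names are illustrative):

```python
class SequenceChecker:
    """Tracks a wrapping packet counter and classifies each arrival."""

    def __init__(self, modulo=1 << 32):
        self.modulo = modulo
        self.expected = None  # next sequence number we hope to see
        self.lost = 0         # packets skipped over (they may still arrive late)
        self.late = 0         # packets that arrived after a newer one

    def observe(self, seq):
        if self.expected is None:  # the first packet defines the start
            self.expected = (seq + 1) % self.modulo
            return "ok"
        delta = (seq - self.expected) % self.modulo
        if delta >= self.modulo // 2:
            # The packet is behind the expected position: reordered or duplicate.
            self.late += 1
            return "late"
        self.lost += delta  # delta == 0 means nothing was skipped
        self.expected = (seq + 1) % self.modulo
        return "ok" if delta == 0 else "gap"


checker = SequenceChecker()
statuses = [checker.observe(s) for s in (0, 1, 3, 2, 4)]
print(statuses, checker.lost, checker.late)
```

In this sketch a packet that shows up late is counted in both `lost` and `late`; a real receiver would decide whether to drop it or slot it back into a buffer.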
 
Altera seems to offer a free Ethernet IP core for their FPGAs. I thought of using that. But if I can't find a free, working IP core for 1 Gbps Ethernet, I certainly won't start writing my own and will use 100BASE-TX Ethernet instead.

Good plan. You don't need gigabit.

I don't believe that TCP is suitable for transferring audio in realtime. There simply is no time to care about lost data. Lost data will cause some noise in the audio, but if we want to do software monitoring, keeping latencies low is more important.

Why not? But if you don't like TCP there are many other protocols you can use. RTP and RTCP run on top of UDP, or you can use UDP itself. With UDP the receiving app needs to detect loss or corruption, but you have a lot of excess bandwidth. 48 kHz 16-bit stereo is still only 1,536,000 bps (plus overhead); you can repeat the data 50 times at 100 Mbps and still have spare bandwidth.
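The arithmetic behind that figure, as a short Python sketch (raw payload only, ignoring all protocol overhead; the function name is just for illustration):

```python
def audio_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw PCM bit rate in bits per second, without any framing overhead."""
    return sample_rate_hz * bits_per_sample * channels


stereo_bps = audio_bit_rate(48_000, 16, 2)
link_bps = 100_000_000  # nominal 100 Mbps Ethernet

# How many complete copies of the stream would fit into the nominal link rate.
copies = link_bps // stereo_bps
print(stereo_bps, copies)
```

So 50 repetitions would use about 77 Mbps of raw payload, which leaves some room for headers on a 100 Mbps link.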

Also, I'm unsure whether IP protocol should be used at all, or if I should write my own protocol. As there would be only two points connected to network, something simpler than IP would do well.

What is simpler than using something that is already written?

With 48 kHz audio there are 48000 audio frames per second. Is there any way to make sure that packets are transmitted (and then received) in the right order? If I tell my Ethernet controller driver to transmit some data at constant time steps, is it certain that the controller really sends the data in the same order in which I asked the driver to send it?

Use TCP or another protocol which will manage this for you. Why do you care about latency? You can stream sound and video over your home network with off-the-shelf equipment. The system CAN do it. All you need to do is what has already been done.

Of course, there are some ways to reorder the packets at the FPGA end if needed. But I'd still like to know whether there is any maximum latency in sending data, i.e. a maximum time from asking the driver to send data to the data actually going out on the copper cable.

If you are using Windows, there is always an unknown with regard to timing, due to interrupts. You could use RT Linux I guess, but again, the processor bandwidth is so high in modern machines, plus you have the additional capacity to manage timing in the FPGA, I don't see this being a problem.

w
 
As I said, at one end of the Ethernet cable there would be a computer, and at the other end something DIY and FPGA-based. That's the reason I'm not very keen on using UDP/TCP, or even an IP-based system: I don't know of any (free) UDP/TCP/IP FPGA cores, and I don't really want to write one myself. Also, an implementation would probably take quite a lot of resources. But if you know a good way of doing this, please tell me.

I forgot to mention in my first post what I would use this system for. It would be used to transfer 12 or 24 channels of 24-bit 48 kHz audio in a studio, from the computer to DACs and from ADCs to the computer. On the PC I'm running Linux with an RT kernel.
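As a rough check of what this channel count means on the wire, the sketch below estimates the line rate per direction when the samples are sent in bare Ethernet frames. It assumes the standard per-frame cost of 38 bytes (preamble and start delimiter, header, frame check sequence, inter-frame gap) and no IP/UDP headers; how many audio frames go into one packet is a free design choice here.

```python
# Per-frame cost on the wire besides the payload:
# 8 (preamble + SFD) + 14 (header) + 4 (FCS) + 12 (inter-frame gap).
ETH_OVERHEAD_BYTES = 8 + 14 + 4 + 12
MIN_PAYLOAD_BYTES, MAX_PAYLOAD_BYTES = 46, 1500


def wire_bit_rate(channels, bytes_per_sample, sample_rate_hz, frames_per_packet):
    """Line rate in bit/s for one direction, bare Ethernet framing only."""
    payload = channels * bytes_per_sample * frames_per_packet
    if payload > MAX_PAYLOAD_BYTES:
        raise ValueError("payload does not fit into one standard frame")
    payload = max(payload, MIN_PAYLOAD_BYTES)  # short frames get padded
    packets_per_second = sample_rate_hz / frames_per_packet
    return (payload + ETH_OVERHEAD_BYTES) * 8 * packets_per_second


one_per_packet = wire_bit_rate(24, 3, 48_000, 1)       # one packet per sample period
sixteen_per_packet = wire_bit_rate(24, 3, 48_000, 16)  # adds 16/48000 s of buffering
print(one_per_packet, sixteen_per_packet)
```

Under these assumptions even one packet per sample period stays well below 100 Mbps for 24 channels, and since the link is full duplex the two directions do not share that budget.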

I took a look at Livewire, but I didn't find any papers describing its operation in depth.

Transferring the audio data itself won't be a problem.

But the thing I'm very unsure about is how to make the computer's sound system sync to word clock. I can make the master clock device send a packet marking each word clock tick, at intervals of 1/48000 seconds. Then I can tell the sound server to sync to those packets. Do you think this would work?
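One way to look at this: a single packet arrival on a PC is likely to be timestamped with jitter comparable to or larger than one sample period (about 20.8 µs at 48 kHz), so the clock would have to be recovered by averaging over many packets rather than from each tick. A minimal sketch of that idea, assuming packets that each carry a fixed number of audio frames, no packet loss, and arrival timestamps taken from the local clock (all names are illustrative):

```python
def estimate_sample_rate(arrival_times_s, frames_per_packet):
    """Least-squares slope of 'frames received so far' against arrival time.

    Returns the sender's sample rate as seen by the local clock, in Hz.
    Assumes every packet arrived, so packet i carries frame i * frames_per_packet.
    """
    n = len(arrival_times_s)
    if n < 2:
        raise ValueError("need at least two arrival times")
    frame_counts = [i * frames_per_packet for i in range(n)]
    mean_t = sum(arrival_times_s) / n
    mean_f = sum(frame_counts) / n
    var_t = sum((t - mean_t) ** 2 for t in arrival_times_s)
    cov_tf = sum(
        (t - mean_t) * (f - mean_f) for t, f in zip(arrival_times_s, frame_counts)
    )
    return cov_tf / var_t


# Simulated sender running slightly fast, with +/-20 microseconds of arrival jitter.
true_rate_hz = 48_005.0
arrivals = [
    i * 64 / true_rate_hz + (20e-6 if i % 2 else -20e-6) for i in range(1000)
]
estimated_rate_hz = estimate_sample_rate(arrivals, 64)
print(round(estimated_rate_hz, 2))
```

The estimate could then drive a resampler or a slowly adjusted local clock. With real traffic the frame positions would come from sequence numbers rather than from the packet index.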
 
So this is a digital audio workstation (DAW) for use in a studio environment.

If you were buying this off-the-shelf it might comprise an Apple Mac Pro, possibly 8-core Xeon, MOTU 828Mk3 (maybe) firewire audio interface (28 channels in, if you can believe the blurb), some add-on ADCs and some expensive software. It would work out of the box, and if it didn't, you'd have the right to complain because it would have cost a fortune. I have seen Apple video editing suites in operation, they are very good. I would expect the audio ones to be very good too.

You have started with RT Linux, so this makes things easier, since it is open source. Better to start with a powerful computer, too.

This is a very big job for one person starting from scratch. You will not complete it in a year, even if you have no day job. It's tempting to use the expertise you have, but when do you want to see it working? Meanwhile teams of people will build affordable systems with as good or better functionality.

I guess you will have to start with many things unknown and hope you can resolve them as time goes by. Most professional systems are Firewire with some USB coming along. I would think about following this example, rather than Ethernet. Or plug your audio interfaces into PCI slots and run multiple cables to the DACs/ADCs, I think you can get 24 channels in. This will cut down the development work involved.

Look at some of the example set-ups on here:- TweakHeadz Lab Electronic Musician's Hangout

What do you think?

w
 
Thanks for your advice. In fact, I already have recording equipment and software (Presonus FP10 + Linux + Ardour) that I have been very satisfied with. However, I now need more channels. I want to stick with Linux, and that makes it harder to find suitable hardware. As for MOTU devices, ffado.org (the Linux FireWire audio driver project) says that "MOTU is hostile towards Linux"... Generally speaking, support for PCI cards is better than for FireWire devices. I have thought of getting an RME HDSPe RayDAT and two ADAT converters. Then I would be able to easily add more channels as the need arises. I will probably never need more than 32 channels.

I'm moving my instruments to a separate recording room, and I need 15 meters of cabling between the recording room and the computer. An ideal solution would let me transfer at least 12 channels of full-duplex digital audio through one digital cable. But the only widespread standard for digital multicore is ADAT, which can carry only 8 channels. MADI equipment would be far too expensive, and AES50 is used by very few manufacturers. It seems very likely that I will have to use either ADAT or analog multicore cabling.

As for that Ethernet DIY project, I know it will be very time consuming.

The reason I thought of using Ethernet rather than USB2/FireWire/PCI(Express) is that it will be much easier to make an FPGA talk to a computer through Ethernet than through any of the other interfaces mentioned. If you know of any DIY project that uses PCI or FireWire, I'd be glad to hear about it.

I found a thing called NetJack, an extension to JACK (the Linux audio server, used with almost all Linux DAW setups). It can do realtime audio transfers between two computers physically connected via Ethernet (Internet/WLAN not supported). Take a look at WalkThrough/User/NetJack2 - Jack Audio Connection Kit - Trac. I see no reason why the other end couldn't be an FPGA. I would have to deal with UDP somehow, though.
 
Altera seems to offer a free Ethernet IP core for their FPGAs. I thought of using that. But if I can't find a free, working IP core for 1 Gbps Ethernet, I certainly won't start writing my own and will use 100BASE-TX Ethernet instead.

The Ethernet cores available from Altera, Xilinx, etc. just get you the MAC layer and not any protocol processing. In other words, the FPGA can receive Ethernet frames off the wire, but it is up to other logic to decode and act upon them. At the MAC level you can filter on destination address and check the frame checksum, but that's about it.

I don't believe that TCP is suitable for transferring audio in realtime. There simply is no time to care about lost data. Lost data will cause some noise in the audio, but if we want to do software monitoring, keeping latencies low is more important.

TCP is not appropriate for what you want to do, simply because the protocol overhead requires a processor, and you're thinking about a DIY project in an fpga.

Also, I'm unsure whether IP protocol should be used at all, or if I should write my own protocol. As there would be only two points connected to network, something simpler than IP would do well.

The advantage to using IP is you can take advantage of all the existing equipment out there - fpga MAC cores, networking cards, switches, etc. that already talk IP. What you want to do can certainly be accomplished using UDP, no need to go invent something new. Multiple channels can be implemented by using a different UDP port number for each channel.
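A minimal sketch of the port-per-channel idea on the PC side, in Python. The base port number, the 24-bit big-endian sample encoding, and the function names are assumptions made for illustration, not part of any existing protocol:

```python
import socket

BASE_PORT = 5004  # hypothetical: channel n is sent to UDP port BASE_PORT + n


def channel_port(channel: int) -> int:
    """UDP destination port used for a given audio channel."""
    return BASE_PORT + channel


def encode_samples(samples) -> bytes:
    """Pack signed 24-bit samples as big-endian 3-byte words."""
    return b"".join(s.to_bytes(3, "big", signed=True) for s in samples)


def decode_samples(payload: bytes) -> list:
    """Inverse of encode_samples, as the receiving end would do it."""
    return [
        int.from_bytes(payload[i:i + 3], "big", signed=True)
        for i in range(0, len(payload), 3)
    ]


def send_channel_block(sock: socket.socket, host: str, channel: int, samples) -> None:
    """Send one block of samples for one channel as a single datagram."""
    sock.sendto(encode_samples(samples), (host, channel_port(channel)))


block = [0, 1, -1, 8_388_607, -8_388_608]
print(channel_port(3), decode_samples(encode_samples(block)) == block)
```

A sender would create one socket with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and call `send_channel_block` once per channel and block. Note that one datagram per channel multiplies the packet count by the number of channels; interleaving all channels into one datagram is the obvious alternative if that becomes a burden.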

With 48 kHz audio there are 48000 audio frames per second. Is there any way to make sure that packets are transmitted (and then received) in the right order? If I tell my Ethernet controller driver to transmit some data at constant time steps, is it certain that the controller really sends the data in the same order in which I asked the driver to send it?

EDIT: Of course, there are some ways to reorder the packets at the FPGA end if needed. But I'd still like to know whether there is any maximum latency in sending data, i.e. a maximum time from asking the driver to send data to the data actually going out on the copper cable.

The answer is that ordering and latency depend on the OS and the software driver for the Ethernet interface. In general, on a local area network, packets will arrive at the destination in the order they were sent. Only on a WAN, where there are multiple routes to a destination, do you have to worry about reordering. But I have seen some Unix implementations that send in reverse order when a large datagram is fragmented to fit the network MTU, i.e. the last fragment gets sent first.

Latency depends on how efficient the OS network stack and driver are, how the application that creates the buffers of audio data to send is written, whether the app runs in kernel space or user space (which affects context-switching overhead), what else is going on in the system, and so on. Generally you can buffer adequately to handle the variation in latency from one packet to the next.
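How much buffering is "adequate" follows directly from the worst-case variation you expect. A minimal sketch, assuming you have measured or guessed a maximum packet-to-packet jitter and that the buffer is filled in whole packets (the function name and example numbers are illustrative):

```python
def jitter_buffer_frames(sample_rate_hz: int, max_jitter_us: int, frames_per_packet: int) -> int:
    """Smallest buffer, in audio frames and rounded up to whole packets,
    that keeps playing while a packet is up to max_jitter_us late."""

    def ceil_div(a: int, b: int) -> int:
        return -(-a // b)

    frames_needed = ceil_div(sample_rate_hz * max_jitter_us, 1_000_000)
    packets_needed = ceil_div(frames_needed, frames_per_packet)
    return packets_needed * frames_per_packet


frames = jitter_buffer_frames(48_000, 2_000, 64)  # 2 ms of jitter, 64-frame packets
added_latency_ms = 1000 * frames / 48_000
print(frames, round(added_latency_ms, 2))
```

The buffer depth is the price paid in latency, so the jitter figure is worth measuring on the actual machine rather than guessing.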

If you didn't understand any of this, go read the RFCs - start here: http://tools.ietf.org/rfc/index, and/or go get a good book on Ethernet.

Regarding sending the clock from source to destination: There has been a lot of work done on this over the years, go look up SRTS (Synchronous Residual Time Stamp) to see one way it can be solved with high accuracy. Probably overkill for what you want to do.
 