BIP IP driver for LINUX performance
Our benchmark with TCP
Test method details
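    length = ...;
    buf = malloc(length);
    for (i = 0; i < nbtest; i++) {
        char *p = buf;      /* receive cursor, reset for each round trip */
        int left = length;  /* bytes of the reply still to be received */
        starttimer();
        send(tcp_fd, buf, length, 0);
        do {
            /* recv may return fewer bytes than requested, so loop */
            n = recv(tcp_fd, p, left, 0);
            p += n;
            left -= n;
        } while (left > 0);
        end_timer();
        time[i] = timer_value();
    }

The code of the receiver is the same, except that the send and recv are done in the opposite order. The TCP connection has been set up with the following options (on each side):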
    ndelay = 1;
    snd_buf = 65500;
    rcv_buf = 65500;
    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (char *)&snd_buf, sizeof(int));
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (char *)&rcv_buf, sizeof(int));
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, (char *)&ndelay, sizeof(int));

Note that in the Linux TCP implementation, the TCP flow-control window actually used is about half the SO_RCVBUF option size, so about 32000 bytes.
Overview of some implementation details
The Linux 2.0 TCP implementation has a bug that makes performance drop (to about 2 Mbytes/s) for certain message sizes. This behaviour can be observed with both the Myricom driver and our BIP-IP driver; a patch that corrects it is available. (Note that all tests were done with this patch applied.)
Note that this patch only eliminates the performance drop for TCP connections that have the Nagle algorithm disabled with the TCP_NODELAY option. Fortunately, this is the case for PVM, LAM-MPI and MPICH over IP.
Another problem is that Linux kernel allocation is not really designed for network buffers larger than a page, so we have also written a small patch that allows larger MTUs to be used efficiently. Basically, this patch allows the Linux kernel to keep a pool of "large" buffers that are recycled.
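The idea of recycling large buffers rather than allocating them afresh can be illustrated in user space. The following is a minimal sketch, not the kernel patch itself: the names `buf_pool`, `pool_get` and `pool_put`, the buffer size and the pool depth are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_BUF_SIZE (16 * 1024) /* hypothetical "large" buffer size */
#define POOL_MAX_FREE 8           /* hypothetical cap on cached buffers */

/* A fixed-depth stack of previously released buffers. */
struct buf_pool {
    void *free_list[POOL_MAX_FREE];
    int nfree;
};

/* Hand out a recycled buffer if one is cached, else allocate a new one. */
void *pool_get(struct buf_pool *p)
{
    assert(p != NULL);
    if (p->nfree > 0)
        return p->free_list[--p->nfree];
    return malloc(POOL_BUF_SIZE);
}

/* Return a buffer to the pool; free it only when the pool is full. */
void pool_put(struct buf_pool *p, void *buf)
{
    assert(p != NULL);
    if (buf == NULL)
        return;
    if (p->nfree < POOL_MAX_FREE)
        p->free_list[p->nfree++] = buf;
    else
        free(buf);
}
```

Because every buffer in the pool has the same size, a released buffer can satisfy the next request directly, which avoids repeated multi-page allocations on the hot path.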
Netperf benchmarks
Throughput benchmarks
BENCH TYPE | PACKET SIZE (bytes) | THROUGHPUT (Mb/s) | SYSTEM | NETWORK
---|---|---|---|---
UDP_STREAM | 16384 | 356.11 | DEC500/PPro | Myrinet
UDP_STREAM | 7872 | 352.42 | PPro | Myrinet with BIP
UDP_STREAM | 8192 | 287.79 | DEC500/DEC500 | Myrinet
TCP_STREAM | 1048576 | 750.16 | SGI Power Challenge | HiPPI
TCP_STREAM | 16384 | 338.05 | PPro | Myrinet with BIP
TCP_STREAM | 7872 | 315.89 | PPro | Myrinet with BIP
TCP_STREAM | 65536 | 271.35 | DEC500/DEC500 | Myrinet

BENCH TYPE | REQUEST SIZE (bytes) | TRANSACTIONS/s | SYSTEM | NETWORK
---|---|---|---|---
UDP_RR | 1 | 8826.16 | PPro | Myrinet with BIP
UDP_RR | 1 | 4403.99 | HP K460 | FC-266
TCP_RR | 1 | 7506.20 | PPro | Myrinet with BIP
TCP_RR | 1 | 4184.95 | Dual PPro | FastEthernet
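To relate these figures to the Mbytes/s and latency values used elsewhere, the conversions can be sketched as below. This assumes that Mb/s means 10^6 bits per second, that one Mbyte is 10^6 bytes, and that the request/response tests keep a single transaction in flight, so the mean round-trip time is the inverse of the transaction rate. The function names are hypothetical.

```c
#include <assert.h>

/* Mean round-trip time in microseconds for a request/response test,
 * assuming one transaction in flight at a time. */
double rr_round_trip_us(double transactions_per_s)
{
    assert(transactions_per_s > 0.0);
    return 1e6 / transactions_per_s;
}

/* Throughput in Mbytes/s from Mb/s, assuming decimal megabits and
 * megabytes (8 bits per byte, no other scaling). */
double mbits_to_mbytes(double mbits_per_s)
{
    return mbits_per_s / 8.0;
}
```

Under these assumptions, the UDP_RR rate of 8826.16 transactions/s corresponds to a round trip of about 113 microseconds, and the UDP_STREAM figure of 356.11 Mb/s to about 44.5 Mbytes/s.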
Last modified: Thu Aug 21 14:06:34 CEST 1997 © BIP team