Many new workstations and servers are coming with integrated gigabit network cards nowadays, but quite a few people soon discover that they can’t transfer data much faster than they did with 100 Mb/s network cards. Multiple factors can affect your ability to transfer at higher speeds, and most of them revolve around operating system settings. In this article we will discuss the necessary steps to make your new gigabit-enabled server obtain close to gigabit speeds in Linux, FreeBSD, and Windows.
First and foremost, we must realize that there are hardware limitations to consider. Just because someone throws a gigabit network card in a server doesn’t mean the hardware can keep up. Network cards are normally connected to the PCI bus via a free PCI slot. In older workstation and non-server-class motherboards the PCI slots are normally 32 bit, 33MHz. This means they can transfer at a theoretical peak of 133MB/s, but since the bus is shared between many parts of the computer, realistically it’s limited to around 80MB/s in the best case. Gigabit network cards are 1000Mb/s, or 125MB/s. If the PCI bus is only capable of 80MB/s, this is a major limiting factor for gigabit network cards. 80MB/s works out to 640Mb/s, which is still quite a bit faster than most gigabit network card installations actually achieve, but remember this is probably the best-case scenario. If there are other data-hungry PCI cards in the server, you’ll likely see much less throughput. The only way around this bottleneck is to purchase a motherboard with a 66MHz PCI slot, which can do 266MB/s at 32 bits. The newer 64 bit PCI slots are capable of 533MB/s on a 66MHz bus. These are beginning to come standard on all server-class motherboards.
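The bus arithmetic above is easy to check. The following is a minimal Python sketch, assuming the theoretical peak is simply one bus-width transfer per clock cycle and that the nominal PCI clocks are 33.33MHz and 66.66MHz; the function names are ours, chosen for illustration.

```python
def pci_peak_mb_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak in MB/s: one transfer of bus_width_bits per clock cycle."""
    return bus_width_bits / 8 * clock_mhz


def mb_to_mbit(mb_per_s):
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


if __name__ == "__main__":
    # Nominal clock rates are an assumption; real-world throughput is lower
    # because the bus is shared and has protocol overhead.
    for width, clock in [(32, 33.33), (32, 66.66), (64, 66.66)]:
        peak = pci_peak_mb_per_s(width, clock)
        print(f"{width} bit @ {clock}MHz: about {peak:.0f}MB/s peak")

    # The realistic 80MB/s figure for a shared 32 bit, 33MHz bus, in Mb/s:
    print(f"80MB/s = {mb_to_mbit(80)}Mb/s")  # prints 640
    # Gigabit Ethernet line rate expressed in MB/s:
    print(f"1000Mb/s = {1000 / 8}MB/s")  # prints 125.0
```

The printed peaks land within a megabyte or so of the figures quoted in the text; the small differences come from how the clock rate is rounded.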
Assuming we’re using decent hardware that can keep up with the data rates necessary for gigabit, there is now another obstacle: the operating system. For testing, we used two identical servers: Intel Server motherboards, Pentium 4 3.0 GHz, 1GB RAM, integrated 10/100/1000 Intel network card. One was running Gentoo Linux with a 2.6 SMP kernel, and the other was running FreeBSD 5.3 with an SMP kernel to take advantage of the Pentium 4’s HyperThreading capabilities. We were lucky to have a gigabit-capable switch, but the same results could be accomplished by connecting both servers directly to each other.
For testing speeds between two servers, we don’t want to use FTP or anything that will require data be fetched from disk. Memory to memory transfers are a much better test, and many tools exist to do this. For our tests, we used ttcp (http://www.pcausa.com/Utilities/pcattcp.htm).
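To illustrate what a memory-to-memory test does, here is a rough Python sketch. It is not ttcp and not how we produced our numbers: it only shows the idea of sending a buffer that already sits in RAM and timing how long the receiver takes to drain it. The function names are hypothetical, and the demo runs over the loopback interface, which never touches the network card; for a real measurement the receiver would have to run on the second server.

```python
import socket
import threading
import time


def run_receiver(server_sock, result):
    """Accept one connection and count bytes until the sender closes it."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def memory_to_memory_test(total_bytes=16 * 1024 * 1024, host="127.0.0.1"):
    """Send at least total_bytes from RAM; return (bytes_received, Mb/s)."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind((host, 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=run_receiver, args=(server_sock, result))
    receiver.start()

    payload = b"\0" * 65536  # the data lives in memory, so no disk is involved
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            client.sendall(payload)
            sent += len(payload)
    receiver.join()  # stop the clock only after everything has arrived
    elapsed = time.perf_counter() - start
    server_sock.close()
    return result["bytes"], result["bytes"] * 8 / elapsed / 1e6


if __name__ == "__main__":
    received, rate = memory_to_memory_test()
    print(f"received {received} bytes at {rate:.0f} Mb/s")
```

An interpreted script like this adds its own overhead, so a dedicated tool remains the better choice for real benchmarking.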
The first test between these two servers was not pretty. The maximum rate was around 230 Mb/s, about two times as fast as a 100Mb/s network card. This is an improvement, but far from optimal. In actuality, most people will see even worse performance out of the box. However, with a few minor setting changes, we quickly realized major speed improvements – more than a threefold improvement over the initial test.
Many people recommend setting the MTU of your network interface larger. This basically means telling the network card to send a larger sized Ethernet frame. While this may be useful when connecting two hosts directly together, it becomes less useful when connecting through a switch that doesn’t support larger MTUs. At any rate, this isn’t necessary. 900Mb/s can be attained at the normal 1500 byte MTU setting.
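A quick calculation shows why the standard MTU is not the limiting factor. This Python sketch assumes standard Ethernet framing overhead (8 bytes of preamble, a 14 byte header, a 4 byte checksum, and a 12 byte inter-frame gap) and 20 byte IP and TCP headers without VLAN tags; the function name is ours.

```python
# Per-frame bytes on the wire that are not part of the IP packet:
# preamble + start delimiter (8), Ethernet header (14), FCS (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12
# IPv4 header (20) plus TCP header (20), both without options.
IP_TCP_HEADERS = 20 + 20


def tcp_goodput_mbps(mtu, link_mbps=1000.0, tcp_option_bytes=0):
    """Upper bound on TCP payload throughput for a given MTU, in Mb/s."""
    payload = mtu - IP_TCP_HEADERS - tcp_option_bytes
    wire_bytes = mtu + ETHERNET_OVERHEAD
    return link_mbps * payload / wire_bytes


if __name__ == "__main__":
    print(f"MTU 1500: {tcp_goodput_mbps(1500):.0f} Mb/s")
    # TCP timestamps take 12 bytes of option space in every segment.
    print(f"MTU 1500 with timestamps: {tcp_goodput_mbps(1500, tcp_option_bytes=12):.0f} Mb/s")
    print(f"MTU 9000: {tcp_goodput_mbps(9000):.0f} Mb/s")
```

Under these assumptions the ceiling at a 1500 byte MTU is already well above 900Mb/s, so a larger MTU only buys a few extra percent of bandwidth.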
For attaining maximum throughput, the most important options involve TCP window sizes. The TCP window controls the flow of data, and is negotiated during the start of a TCP connection. Using too small a size will result in slowness, since TCP can only use the smaller of the two end systems’ capabilities. It is quite a bit more complex than this, but here’s the information you really need to know:
For both Linux and FreeBSD we’re using the sysctl utility. For all of the following options, entering the command ‘sysctl variable=number’ should do the trick. To view the current settings use: ‘sysctl
Maximum window size:
Default window size:
FreeBSD, sending and receiving:
Linux, sending and receiving:
net.core.wmem_default = 65536
net.core.rmem_default = 65536
This enables the useful window scaling options defined in RFC 1323, which allow the windows to dynamically grow larger than the sizes we specified above.
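The reason window size matters so much is that a sender can have at most one window of unacknowledged data in flight per round trip. The sketch below works through that relationship in Python. The round-trip times are assumed values for illustration, not measurements from our servers, and the function names are ours.

```python
def max_throughput_mbps(window_bytes, rtt_seconds):
    """Throughput ceiling in Mb/s when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def required_window_bytes(link_mbps, rtt_seconds):
    """Window needed to keep a link full (the bandwidth-delay product)."""
    return link_mbps * 1e6 / 8 * rtt_seconds


if __name__ == "__main__":
    # Without window scaling, the 16-bit TCP window field tops out at 65535 bytes.
    for rtt_ms in (0.2, 1.0, 10.0):  # assumed round-trip times
        rtt = rtt_ms / 1000
        ceiling = max_throughput_mbps(65535, rtt)
        needed = required_window_bytes(1000, rtt)
        print(f"RTT {rtt_ms}ms: 64KB window caps at {ceiling:.0f} Mb/s, "
              f"gigabit needs about {needed:.0f} bytes")
```

On a quiet LAN the round trip is short enough that modest windows suffice, but the required window grows in direct proportion to latency, which is where scaling beyond 64KB becomes necessary.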
When sending large amounts of data, we can run the operating system out of buffers. This option should be enabled before attempting to use the above settings. To increase the amount of “mbufs” available:
net.ipv4.tcp_mem = 98304 131072 196608
These quick changes will skyrocket TCP performance. Afterwards we were able to run ttcp and attain around 895 Mb/s every time – quite an impressive data rate. There are other options available for adjusting the UDP datagram sizes as well, but we’re mainly focusing on TCP here.
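On Linux, every sysctl variable is also exposed as a file under /proc/sys, with the dots in the name replaced by slashes. The following Python sketch uses that mapping to check the current values without changing anything. The helper names are ours, the sketch is Linux-only, and the simple dot-to-slash translation assumes no dots inside an individual name component (as can happen with some interface names).

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a sysctl name such as net.core.rmem_default to its /proc/sys file."""
    return Path(root) / name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the variable's fields as a list of strings, or None if unreadable."""
    try:
        return sysctl_path(name, root).read_text().split()
    except OSError:
        return None  # not Linux, variable missing, or no permission


if __name__ == "__main__":
    for name in ("net.core.wmem_default",
                 "net.core.rmem_default",
                 "net.ipv4.tcp_mem"):
        print(name, "=", read_sysctl(name))
```

Multi-value variables such as net.ipv4.tcp_mem come back as several fields, one per number.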
Windows XP / 2000 Server / Server 2003
The magical location for TCP settings in the registry editor is:
We need to add a registry DWORD named TcpWindowSize, and enter a sufficiently large size. 131400 (make sure you click on decimal) should be enough.
Tcp1323Opts should be set to 3. This enables both rfc1323 scaling and timestamps.
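The value 3 is a combination of two bit flags: 1 turns on window scaling and 2 turns on timestamps. A small Python sketch makes the encoding explicit; the constant and function names are ours, and only the numeric values correspond to the registry setting.

```python
WINDOW_SCALING = 0x1  # bit 0: RFC 1323 window scaling
TIMESTAMPS = 0x2      # bit 1: RFC 1323 timestamps


def describe_tcp1323opts(value):
    """Decode a Tcp1323Opts value into the features it enables."""
    return {
        "window_scaling": bool(value & WINDOW_SCALING),
        "timestamps": bool(value & TIMESTAMPS),
    }


if __name__ == "__main__":
    for value in range(4):
        print(value, describe_tcp1323opts(value))
```

If timestamps are not wanted, a value of 1 would enable window scaling alone.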
And similarly to Unix, we also want to increase the TCP buffer sizes:
One last important note for Windows XP users needs to be made. If you’ve installed Service Pack 2, then there is another likely culprit of poor network performance. As explained in Knowledge Base article 842264, Microsoft says that disabling Internet Connection Sharing after an SP2 install should fix performance issues.
The above tweaks should enable your sufficiently fast server to attain much faster data rates over TCP. If your specific application makes significant use of UDP, then it will be worth looking into similar options relating to UDP datagram sizes. Remember, we obtained close to 900Mb/s with a very fast Pentium 4 machine, server-class motherboard, and quality Intel network card. Results may vary wildly, but adjusting the above settings is a necessary step toward realizing your server’s capabilities.