From:	smurf@noris.de (Matthias Urlichs)
Subject: NFS read performance ugliness
Date:	30 Oct 1998 16:13:10 +0100
To:	linux-kernel@vger.rutgers.edu

I was performance-testing the kernel and userspace NFS implementations
yesterday, to see whether we have any chance of outperforming a NetApp server.

As of 2.1.127-pre3 (with the NFS patches from the latest knfsd package),
we do NOT.

The main problem is horrible read performance across NFS, which seems to be
related to the server not doing its job WRT reading ahead from disk.

I verified this by doing a few tests with bonnie:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
Barrac.   350  4333 88.7  6622 12.2  2394  8.5  4154 83.6  5275  6.8  92.0  1.7
+DCAS-W   250  4304 88.2  6634 12.1  2692  9.7  3052 61.5  5750  7.4 122.7  2.1
 +2SCSI   350  4344 89.1  8757 16.1  3534 12.9  3309 66.6  7784 11.1 101.7  2.0
NetApp   1000  2019 41.5  2243  3.8  1522  3.8  2691 57.0  4723  7.9  95.6  2.8
nfs       350  2047 41.8  5445 10.5  1516  4.1   976 20.3  1017  1.8  31.0  1.2
knfs      350  2421 49.1  4639 11.0  1799  5.1   970 20.4  1027  1.9  32.2  1.2
           80  2464 49.1  4132 11.2  1655  4.8  1459 30.2  1741  3.3 453.6  9.6
           70  3057 63.0  4340 12.2  1713  4.3  1848 38.0  3582  6.7 969.6 30.3
           60  3522 74.4  4184  8.0  2064  5.4  2536 51.3  4946  4.4 376.6  4.4
           50  2574 52.0  2447  8.1  5735 21.1  4697 91. 187292 98. 4067.4 26.5

The +2SCSI test is as fast as I managed to push the system with the
adapters and disks I have for testing (two Symbios adapters, one of them
wide SCSI, and two disks RAID-0'd together with 4k chunks). "NetApp" is the performance
of the Network Appliance box I'm testing. Below that are various tests with
a Linux 2.1.127 client and the server (running on the +2SCSI array).
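
To make it concrete, here is roughly what the "Block Input" column
measures: sequential reads in fixed-size chunks, timed end to end. This is
a minimal sketch in the spirit of bonnie, not its actual code; the scratch
file name and the 8 KB chunk size are placeholders.

/*
 * Minimal sketch of a bonnie-style "Block Input" test: sequential
 * reads in 8 KB chunks, reported in K/sec.  Not bonnie's actual
 * code; file name and chunk size are placeholders.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
	char buf[8192];
	struct timeval t0, t1;
	long total = 0;
	double sec;
	ssize_t n;
	int fd;

	fd = open(argc > 1 ? argv[1] : "Bonnie.scratch", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	gettimeofday(&t0, NULL);
	while ((n = read(fd, buf, sizeof buf)) > 0)	/* sequential 8K reads */
		total += n;
	gettimeofday(&t1, NULL);
	sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%ld KB in %.2f s = %.0f K/sec\n",
	       total / 1024, sec, sec > 0 ? total / 1024.0 / sec : 0.0);
	close(fd);
	return 0;
}

Point it at a file bigger than RAM if you want to measure the disk
rather than the buffer cache.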

Since the client has 64 MB of RAM and the server has 96, the last three
lines demonstrate that reading from the server is plenty fast when the data
are in-cache on the Linux server. (With 50 MB, the data obviously fit into
the client's buffer cache...)

As you can see, write performance is rather good (though unsafe -- the
NetApp has NVRAM), but why can't the server read faster? It must be related
to disk readahead, but I played with readahead values (via "hdparm -a")
from 64 to 250 for both the MD device and the physical partition devices,
and nothing changed. Good ideas urgently needed.
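
For reference, as far as I can tell "hdparm -a" boils down to the
BLKRASET/BLKRAGET ioctls on the block device, so the tuning I tried is
equivalent to this little sketch (the device path and the value 128 are
examples only, and BLKRASET needs root):

/*
 * Sketch of what "hdparm -a N" appears to do under the hood: set the
 * block-device readahead with BLKRASET and read it back with
 * BLKRAGET.  Device path and value are examples, not recommendations.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* BLKRAGET, BLKRASET */

int main(int argc, char **argv)
{
	long ra = 0;
	int fd;

	fd = open(argc > 1 ? argv[1] : "/dev/md0", O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, BLKRASET, 128L) < 0)	/* new readahead (sector count, per hdparm's docs) */
		perror("BLKRASET");
	if (ioctl(fd, BLKRAGET, &ra) < 0)
		perror("BLKRAGET");
	else
		printf("readahead is now %ld\n", ra);
	close(fd);
	return 0;
}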

Related to that, is it true that I cannot tell the Linux kernel how long
to cache NFS replies? This would be a huge win, esp. for stat() calls and
directory contents. I don't care if the client does not immediately see a
change on the server, as long as the NFS-mounted /www we're considering
will not spam the server with getattr() requests.
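
The access pattern I'm worried about is as trivial as the sketch below: a
web server stat()ing the same files over and over. Without attribute
caching, each of those stat() calls can turn into a GETATTR on the wire
(the path and loop count are made up for illustration).

/*
 * Sketch of the worrying access pattern: repeated stat() of a file
 * on the NFS mount.  Each call may become an NFS GETATTR if nothing
 * is cached.  Path and count are hypothetical.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	struct stat st;
	int i;

	for (i = 0; i < 1000; i++) {
		if (stat("/www/index.html", &st) < 0) {
			perror("stat");
			return 1;
		}
	}
	printf("size %ld, mtime %ld\n",
	       (long) st.st_size, (long) st.st_mtime);
	return 0;
}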

NB: Obviously, all tests were run with 100 Mbit Ethernet cards (tulip
driver) and a 100 Mbit switch, full duplex.

-- 
Matthias Urlichs  |  noris network GmbH   |   smurf@noris.de  |  ICQ: 20193661
The quote was selected randomly. Really.    |      http://www.noris.de/~smurf/
-- 
Reality is a crutch for people who can't cope with drugs.
                -- Lily Tomlin
