Date: Thu, 23 Dec 1999 23:07:24 +0100 (CET)
From: arjan@fenrus.demon.nl (Arjan van de Ven)
To: zab@zabbo.net (Zach Brown)
Subject: Re: Bloat? (khttpd)

[I respond to Zach's mail, but most of it is in reply to the entire thread]

Hi,

Zach Brown wrote:
> what we _do_ want is a nice API for doing async bulk file->file transfers.
> Thats all the meat of khttpd really is. We can get this with a bit of
> thinking and proper async io. This will happen in time.

I really hope so. Linux will benefit from this big time.

kHTTPd (and any webserver, and in a way any NFS/FTP server) is two
things:

1) A Request-decoder/Startup (header) part
2) Bulk data transfer

kHTTPd is not very good at #2. "Stock" Apache is very bad at #1, as it
needs a lot of syscalls (9 or so, I don't remember exactly) to do this.

Nobody objects to having #2 inside the kernel. Linux needs that for
performance, for FTP/HTTP/NFS (be it userspace or kernelspace) and for
file->file operations. (In fact sendfile() does this in a synchronous
way, and has been in the 2.2 kernels for ages.) For #2, it makes no
sense to switch to userspace a lot, as userspace effectively would only
increment some counter.

Webservers usually serve a lot of small files (.html and .gif/.png).
For benchmarks and other file-server-like situations, latency counts
above all. kHTTPd achieves low latency by _not_ doing the syscalls in
#1, by not doing all the rare complex stuff (it "bounces" those requests
to userspace), and by reducing the number of context-switches in the
fast path.

phhttpd, as I understand it, reduces the number of syscalls by using
pre-calculated headers, and reduces the number of context-switches by
"grouping" events into blocks of async-RT signals (not to speak of the
reduced overhead compared to select()/poll() implementations). It also
seems to "bounce" all non-static requests to a modified Apache. I haven't
benchmarked phhttpd vs kHTTPd yet, as I have not patched my Apache for
phhttpd yet.

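To make the two ideas above a bit more concrete (a header that is
calculated once, and a bulk transfer that stays inside the kernel), here
is a minimal userspace sketch in Python. It is not code from kHTTPd or
phhttpd. The names build_header, serve_file and demo are hypothetical,
and os.sendfile() is only Python's wrapper around the Linux sendfile()
syscall, so this assumes a Linux host.

```python
import os
import socket
import tempfile


def build_header(size: int) -> bytes:
    # Hypothetical helper: the response header is computed once per file
    # and can be reused for every request for that file.
    return (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Length: {size}\r\n"
        "\r\n"
    ).encode("ascii")


def serve_file(conn: socket.socket, path: str, header: bytes) -> int:
    """Send the prepared header, then the file body via sendfile().

    Returns the number of body bytes handed to the kernel.
    """
    conn.sendall(header)
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            # The kernel copies file data to the socket directly;
            # userspace only advances the offset counter.
            n = os.sendfile(conn.fileno(), f.fileno(), sent, size - sent)
            if n == 0:
                break
            sent += n
    return sent


def demo() -> bytes:
    # Assumed test setup: a small temporary file and a local socket pair
    # standing in for a real client connection.
    body = b"<html>hello</html>\n"
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(body)
        path = tmp.name
    header = build_header(len(body))
    server, client = socket.socketpair()
    try:
        serve_file(server, path, header)
        server.close()  # signal end-of-stream to the reader
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        server.close()
        client.close()
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Note that the loop in serve_file() does nothing with the data itself,
which is the "userspace only increments some counter" point: each
return to userspace there is pure overhead.
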
(Zach: Can phhttpd run without these Apache modifications, or can Apache
run without phhttpd afterwards?)

The discussion about "bloat" cannot be about the file size of the kernel
tarball, as this overhead is minimal. It also cannot be "Everything that
can be done in userspace should be banned from the kernel". Even the
TCP/IP stack and the VFS would have to be banned.

It is the common perception that it is a task of the kernel to give
file data to "processes" that ask for it. The VFS is there for userspace
programs; kNFSd is there to do this for remote processes over the NFS
protocol, for the very same reason (latency) as mentioned above. kHTTPd
is _one way_ of doing this for remote processes over the HTTP protocol.
HTTP is much cleaner than NFS in many ways, even though the RFC is 175
pages long.

kHTTPd could also have been done the Microsoft way, that is, by doing
everything from interrupt handlers. That would increase performance,
sure, but it would blast all modularity to bits.

I am not arguing that all such protocols should be implemented inside
the kernel (far from that), but there are a few protocols that matter a
lot in the outside world. HTTP is increasingly important, and not just
for benchmarks. GNOME and KDE will provide transparent network access
through HTTP, for example. This will change HTTP in the direction of a
file-server protocol, where latency counts.

About the remark from Alexey that kHTTPd isn't modular at all: There is
(was) some miscommunication between Alexey and me, and the issues at
hand will be resolved shortly in a way that I hope is satisfactory for
both Alexey and me.

I look forward to further improvements in phhttpd and have all
confidence that in the end, Linux as a whole will become better from
discussions like this (at least from the parts that use real arguments
and facts).

Greetings,
   Arjan van de Ven