Date: Sat, 19 Jun 1999 07:47:38 +0200 (CEST) From: Arjan van de Ven <arjan@fenrus.demon.nl> To: linux-kernel@vger.rutgers.edu Subject: kHTTPd: Good or Bad Hi, In the last few days, a discussion about khttpd has begun. Some find it bloat, others see a use for it. I have not had time to answer many of the discussions, but if feel that I need to explain my point of view and the current status of kHTTPd. Implementation -------------- Some argument was made about the "ugly" implementation of kHTTPd. kHTTPd is currently designed much like Apache: There are X threads blocking in accept(), one of these threads get a new connection and handles the requests exclusively. When the request is handled, it blocks in accept() again. This design is far from optimal, it places a huge burden on the scheduler. There are some inconveniences in Linux that enhance this effect (for example, a threads wakes up on an incomming connection, but has to go to sleep right away while waiting for data). I am building a version that handles things differently (one thread per CPU that does all the work), but the current design was easy to implement as a proof of concept. An other point was made that the way kHTTPd handles userspace is uggly. This is certainly true for versions up to 0.1.1, but in the (pre)0.1.2 version this is handled in a much more graceful way. phhttp ------ Alan Cox argumented that phhttpd was faster than kHTTPd. I have not seen any numbers that indicate that, but I have not performed any benchmark myself either. phhttpd "cheats" though, it caches all files into memory and doesn't do a lot of things a httpd should, such as sending "Date:", "Last-Modified:", "Content-Type:" headers. Some of them might be easy to build into phhttpd, but the "Date:" header is more complex as it requires the daemon to send the _current_ that at the time of sending. kHTTPd doesn't cheat, sends those headers AND supports"If-Modified-Since" requests. If someone benchmarks kHTTPd agains phhttpd, this should be recognized. Cache ----- The point was made by an Apache developer that kHTTPd would do a better job if it was just caching requests and that userspace "commands" kHTTPd to send a certain file. If this is true, Alan Cox is right and kHTTPd shouldn't be in the kernel, but the problem with the kernel should be fixed. I don't think this is true though: 1) There already is a cache in the kernel, the buffercache. Caching things twice is insane 2) Caching the header is a joke since building one costs only one sprintf (At least, from within the kernel) and some good bookkeeping. Why in the kernel ----------------- kHTTPd solves one specific problem. Someone wants to see a file, and kHTTPd gets that file to this someone. knfsd also does this. The VFS also does this (it's just that the VFS doesn't use TCP/IP to communicate). Why are knfsd and the VFS in the kernel: because it is silly not to do this. I see no difference between a remote client that asks over NFS if a specific file has changed and a remote client that asks over http if a specific file has changed. The same for reading the content of this file. I find the argument that kHTTPd bloats the kernel and phhttpd doesn't somewhat strange. kHTTPd requires no kernel patches (it helps if you export some extra symbols though) and no modifications to the userspace-client of your choise. phhttpd _does_ require kernelpatches and although the concept can be used by other http-daemons, it means that those _USERSPACE_ daemons have to be tuned to Linux only. For Apache, this might work. But there are a lot of other (great) webservers, good at a specific job. They benifit from kHTTPd without any burden on their developers, but their architecture prohibits the use of the phhttpd-tricks. Greetings, Arjan van de Ven - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/