=====

TUX  -  Kernel checksum-caching httpd server/accelerator

Copyright (C) 2000 Red Hat, Inc.
Licensed under the terms of the GNU General Public License

=====


1. Introduction
---------------

   TUX is an HTTP daemon (webserver) for Linux. TUX is different from other
   webservers in that it runs partially from within the Linux-kernel as a
   module (device driver).  It caches partial TCP checksum data and uses
   the partial checksums to speed network transmission of data.  It also
   (given sufficiently capable networking cards) enables scatter-gather
   DMA directly from the page cache to the network, avoiding extra data
   copies.

   TUX handles static pages directly, and can work in concert with
   kernel modules, user-space modules, and regular user-space
   web server daemons to provide dynamic content.  Regular user-space
   daemons do not need to be altered in any way for TUX to use them
   to provide content, but in order for TUX to cache dynamic content,
   user-space code has to use a new interface based on the tux(2)
   system call.

   Static web pages are not a very complex thing to serve, but these are
   very important nevertheless, since virtually all images are static,
   and a large portion of the html pages are static also. A "regular"
   webserver has little added value for static pages; it is simply a
   "copy file to network" operation.  This can be done very efficiently
   from within the Linux kernel; for example, the nfs (network file system)
   daemon performs a similar task and also runs in the kernel.

   But dynamic content is becoming a larger and larger part of the
   web, and TUX provides a way to cache dynamic content as well.
   TUX modules (which can be built in kernel space or in user space;
   user space is recommended) can create "objects" which are stored
   using the page cache and their precomputed TCP checksum data is
   cached as well.  To respond to a request for dynamic data, a TUX
   module can send a mix of dynamically-generated data and cached
   pre-generated objects, taking maximal advantage of TUX's zero-copy
   cached checksum architecture.  (Kernel-space modules are currently
   the only modules capable of making use of TUX's SSI support; that
   will change in the future.)

   This completely new architecture for providing dynamic content
   requires a new API.  Existing standard APIs for CGI are not
   sufficient to be mapped to TUX's API.  This means that existing
   CGI applications must be re-coded in order to take advantage of
   TUX's architecture.  TUX can, however, call CGI programs via
   its CGI module, so you can choose to convert only programs that
   need TUX's speed to the TUX API and run other programs using
   the standard CGI interface.  TUX can also redirect requests
   to another webserver, such as Apache, so on a single site, you
   can mix and match static content, TUX modules, old-style CGIs,
   and programs written to other webservers' APIs.

   Whenever TUX isn't sure what to do (that is, encounters input
   that it is not prepared to handle), it always redirects the
   request to Apache to handle in an RFC-compliant manner.

   Note: This document sometimes uses "Apache" instead of "any webserver
   you might ever want to use", just for reasons of readability.


2. Quick Start  
--------------

   1) Build a kernel with TUX support built in or as a module (if
      necessary).
   2) Configure the TUX subsystem via /proc/sys/net/http/* (or sysctl
      equivalents)
   3) Start the tux(8) daemon
   4) Stop tux with the stoptux(8) program

   N.B. Your distribution may include a /etc/rc.d/init.d/tux script;
   if so, your distribution's documentation overrides this.


3. Configuration 
----------------

   Modes of operation
   ==================


   There is one recommended mode of operation:

   1) TUX is main webserver, "Apache" is assistant running on port 8080
      (or whatever):
	clientport   -> 8080 (or whatever)
 	serverport   -> 80

   There is one less useful mode of operation:

   2) "Apache" is main webserver, TUX is assistant
	clientport   -> 80
  	serverport   -> 8080 (or whatever)

   
   Configuring TUX
   ==================

   Before you can start using TUX, you have to configure it. This
   is done through the /proc filesystem, and can thus be done from inside
   a script. Most parameters can only be set when TUX is not active.

   The following things need configuration:

   1) The port where TUX should listen for requests
   2) The port (on "localhost") where "Apache" is listening
   3) The location of the documents (documentroot)
   4) The strings that indicate dynamic content (optional)
      [  "cgi-bin" is added by default ]

   It is very important that the documentroot for TUX matches the
   documentroot for the userspace-daemon, as TUX might "redirect"
   any request to this userspace-daemon.

   A typical script (for the first mode of operation) to do this would 
   look like:

	#!/bin/sh
	modprobe tux
	echo 8080 > /proc/sys/net/http/clientport
	echo 80 > /proc/sys/net/http/serverport
	echo /var/www > /proc/sys/net/http/documentroot
	echo php3 > /proc/sys/net/http/dynamic
	echo shtml > /proc/sys/net/http/dynamic
	tux $THREADS $DOCROOT $MODULES

   For the second mode of operation, this would be:

	#!/bin/sh
	modprobe tux
	echo 80 > /proc/sys/net/http/clientport
	echo 8080 > /proc/sys/net/http/serverport
	echo /var/www > /proc/sys/net/http/documentroot
	echo php3 > /proc/sys/net/http/dynamic
	echo shtml > /proc/sys/net/http/dynamic
	tux $THREADS $DOCROOT $MODULES

   If the clientport is 8080, you also have to change the configuration of the 
   userspace daemon. For Apache, you do this by changing

   Port 80

   to 

   Port 8080

   in /etc/apache/conf/httpd.conf. For security reasons, you can also change 
   
   BindAddress *

   to

   BindAddress 127.0.0.1

   (in the same file) to prevent outside users from accessing Apache 
   directly. Only do this if TUX is the main webserver.


   If you have /etc/sysctl.conf, you probably want to use it instead
   of echoing numbers directly into /proc/sys/net/http/*.  For example,
	echo 8080 > /proc/sys/net/http/clientport
   becomes a line in /etc/sysctl.conf like
	net.http.clientport = 8080


   For each CGI program you have (under $DOCROOT/cgi-bin/ by default),
   there must be a corresponding file under $DOCROOT (not in the cgi-bin
   directory) to tell TUX that it has permission to run the CGI program.
   So for TUX to run $DOCROOT/cgi-bin/foo/bar/xx, $DOCROOT/foo/bar/xx must
   exist and have the permissions specified in mode_cgi.  The file
   $DOCROOT/foo/bar/xx must, of course, be executable.
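
   The mapping from a CGI program to its permission file can be sketched
   as a small shell function.  The name cgi_permission_file is
   hypothetical, and the function only computes the path; it does not
   create the file or set its mode.

```shell
#!/bin/sh
# Hypothetical helper (not part of TUX).
# Usage: cgi_permission_file DOCROOT CGI_PATH
# Prints the path of the permission file for a CGI program that lives
# under DOCROOT/cgi-bin/, following the rule described above.
cgi_permission_file() {
	docroot=$1
	cgi=$2
	case "$cgi" in
		"$docroot"/cgi-bin/*)
			# Drop the DOCROOT/cgi-bin/ prefix and re-root under DOCROOT.
			rel=${cgi#"$docroot/cgi-bin/"}
			printf '%s\n' "$docroot/$rel"
			;;
		*)
			echo "not under $docroot/cgi-bin: $cgi" >&2
			return 1
			;;
	esac
}

# The example from the text above.
cgi_permission_file /var/www /var/www/cgi-bin/foo/bar/xx
```

   With the default mode_cgi of S_IXUGO, the resulting file would
   presumably be created with something like "touch" followed by
   "chmod a+x"; check the mode_cgi value on your system first.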

   
   Stopping TUX
   ===============
   In order to change the configuration, you should stop TUX by running
   "/etc/rc.d/init.d/tux stop" at a command prompt.

   If this doesn't work fast enough for you (the command above can wait for
   a remote connection to close down), you can send the daemons a "HUP"
   signal after you have told them to stop, using "killall -HUP tux". This
   will cause the daemon threads to stop immediately.

   Note that the daemons will restart immediately if they are not told to
   stop.

   

4. Permissions
--------------
   The security model of TUX is very strict. It can be, since there is a 
   userspace daemon that can handle the complex exceptions. 

   TUX only serves a file if

	1)  There is no "?" in the URL
	2)  The URL starts with a "/"
	3)  The file indicated by the URL exists
	4)  The file is world-readable (*)
	5)  The file is not a directory, is not executable, and does not
	    have the sticky bit set (*)
	6)  The URL doesn't contain any "forbidden" substrings such as ".."
	    and "cgi-bin" (*)
	7)  The mime-type is known (*)

   The items marked with a (*) are configurable through the
   sysctl-parameters in /proc/sys/net/http.


   In all cases where any of the above conditions isn't met, the
   userspace-daemon is handed the request.
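
   To predict whether a given URL would be served by TUX or handed to
   the userspace daemon, the default rules above can be approximated in
   shell.  This is only a sketch: tux_would_serve is a hypothetical
   helper, it assumes the default settings (S_IROTH required; directory,
   sticky and execute bits forbidden; ".." and "cgi-bin" as forbidden
   substrings), it does not model the mime-type check in rule 7, and it
   is not how TUX itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper (not part of TUX).
# Usage: tux_would_serve DOCROOT URL
# Returns 0 if rules 1-6 pass under the default settings, 1 otherwise.
tux_would_serve() {
	docroot=$1
	url=$2
	case "$url" in
		*\?*) return 1 ;;             # rule 1: no "?" in the URL
	esac
	case "$url" in
		/*) ;;                        # rule 2: URL starts with "/"
		*) return 1 ;;
	esac
	case "$url" in
		*..*|*cgi-bin*) return 1 ;;   # rule 6: default forbidden substrings
	esac
	file=$docroot$url
	[ -e "$file" ] || return 1        # rule 3: the file exists
	[ -d "$file" ] && return 1        # rule 5: not a directory
	# Rules 4 and 5: world-readable, no execute bit, no sticky bit.
	# find prints the path only if every permission test matches.
	found=$(find "$file" -perm -0004 \
		! -perm -0100 ! -perm -0010 ! -perm -0001 ! -perm -1000)
	[ -n "$found" ]
}

if tux_would_serve /var/www /index.html; then
	echo "TUX would serve /index.html directly"
else
	echo "/index.html would go to the userspace daemon"
fi
```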



5. Parameters
-------------
   The following parameters are settable through /proc/sys/net/http.
   (Permissions are set via hexadecimal values, not the symbolic values
   shown here for clarity.)
 
	Name		Default		Description

	serverport	80		The port on which TUX listens

	clientport	8080		The port of the userspace
					http-daemon

	threads		2		The number of server-threads. Should
					be at most 1 per CPU.

	documentroot	/var/www	The directory where the
					document files are located

	start		0		Set to 1 to start TUX 
					(this also resets "stop" to 0)

	stop		0		Set to 1 to stop TUX
					(this also resets "start" to 0)

	unload		0		Set to 1 to prepare TUX for
					unloading of the module

	sloppymime	0		If set to 1, unknown mime-types are
					set to text/html. If set to 0,
					files with unknown mime-types are
					handled by the userspace daemon

	perm_required	S_IROTH		Minimum permissions required
					(for values see "man 2 stat")
	
	perm_forbid	dir+sticky+	Permission mask with "forbidden"
			execute		permissions.
					(for values see "man 2 stat")
	
	mode_cgi	S_IXUGO		Permission mask for CGI permission
					files.

	dynamic		cgi-bin ..	Strings that, if they appear as a
					substring of the URL, indicate
					"dynamic content"

	maxconnect	1000		Maximum number of concurrent
					connections
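
   To review the current settings before starting TUX, the parameter
   files can be listed in one pass.  A minimal sketch, assuming each
   parameter appears as a regular file under /proc/sys/net/http as
   described above; show_tux_params is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of TUX).
# Usage: show_tux_params [DIR]
# Prints every regular file in DIR as "name = value".
# DIR defaults to /proc/sys/net/http, the location named above.
show_tux_params() {
	dir=${1:-/proc/sys/net/http}
	for f in "$dir"/*; do
		# Skip subdirectories and the unexpanded glob if DIR is empty
		# or missing (for example, when the tux module is not loaded).
		[ -f "$f" ] || continue
		printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
	done
}

show_tux_params
```

   If the directory does not exist, the function prints nothing.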