Date:         Mon, 26 Jun 2000 21:05:34 -0400
From: Patrick Reynolds <reynolds@CS.DUKE.EDU>
Subject:      Linux capability bounding set weakness
To: BUGTRAQ@SECURITYFOCUS.COM

Linux capability bounding sets are not as secure as originally intended,
particularly for disabling the loading of kernel modules, as suggested in
the documentation for the 'lcap' package and in two back issues of Linux
Weekly News.

Background: recent Linux kernels include a system setting in /proc called
the "capability bounding set" that allows administrators to set which
POSIX-ish capabilities should be denied to all processes on the system.
That is, if you disable a capability in /proc/sys/kernel/cap-bound, no
process on the system can possess this capability, and no process except
init may re-enable the capability in /proc/sys/kernel/cap-bound.  (No
existing init supports this feature AFAIK, so the capability bounding set
is effectively irreversible.)

However, the capability bounding set is useless unless you disable
/dev/mem, because /proc/sys/kernel/cap-bound maps directly to the cap_bset
variable in kernel memory.  With a quick poke (remember peek and poke from
the days of BASIC on C64s and IBM PCs?) into /dev/mem, you can reset the
cap_bset variable, re-enabling any or all capabilities, despite the
intended one-way-ness of the capability bounding set.  To get the address
for cap_bset, just:
  $ grep cap_bset System.map
  c01d46b0 D cap_bset
Subtract 0xc0000000 (the kernel segment maps to 0xc0000000 on x86s, so
here this amounts to stripping off the leading 'c') and you get the raw
physical memory address (i.e., offset into /dev/mem) to write to:
0x001d46b0 in this example.  On an x86, cap_bset is a 32-bit,
little-endian integer.
Write 0xffffffff to it to re-enable all capabilities.  (This does not give
processes these capabilities; it just prevents the kernel from universally
denying them as intended.)
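
The address arithmetic above can be sketched in a few lines of Python.
This is only an illustration of the calculation, not an exploit: it never
opens /dev/mem.  The function names are made up for this example, and the
0xc0000000 base is the x86 value quoted above; other architectures or
kernel configurations may differ.

```python
import struct

# x86 kernel virtual base, as quoted in the post (an assumption elsewhere).
KERNEL_BASE = 0xC0000000


def physical_offset(system_map_line: str) -> int:
    """Turn a System.map line like 'c01d46b0 D cap_bset' into a /dev/mem offset."""
    address_field = system_map_line.split()[0]
    virtual_address = int(address_field, 16)
    return virtual_address - KERNEL_BASE


def all_capabilities_word() -> bytes:
    """Encode 0xffffffff as a 32-bit little-endian integer."""
    return struct.pack("<I", 0xFFFFFFFF)


offset = physical_offset("c01d46b0 D cap_bset")
print(hex(offset))                    # 0x1d46b0
print(all_capabilities_word().hex())  # ffffffff
```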

To make capability bounding sets at all useful, you have to disable
CAP_SYS_RAWIO, which governs access to /dev/mem.  Be advised that doing so
will break X and any other user-space program that needs raw access to
memory or I/O ports.

As an aside, more fun with module security...  Even if you compile a
kernel with module loading completely disabled, a clever attacker could
still load custom, module-like code into the kernel using /dev/mem.  It's
trickier than changing cap-bound, but it's still feasible, because page
tables and syscall tables are similarly exposed through /dev/mem.

Exploit: read open(2) and mmap(2) and write it yourself.

Fix: if you disable anything in the capability bounding set, you must also
disable CAP_SYS_RAWIO and CAP_SYS_MODULE.
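
As a rough sketch of what that means for the bitmask: CAP_SYS_MODULE is
capability number 16 and CAP_SYS_RAWIO is number 17 in
<linux/capability.h>, so both bits have to be cleared together.  The
helper name and the 0xffffffff starting value below are illustrative
assumptions; on a real system, start from whatever
/proc/sys/kernel/cap-bound currently holds.

```python
# Capability numbers from <linux/capability.h>.
CAP_SYS_MODULE = 16
CAP_SYS_RAWIO = 17


def drop_capabilities(bounding_set: int, *capabilities: int) -> int:
    """Clear the bit for each given capability in a 32-bit bounding set."""
    for cap in capabilities:
        bounding_set &= ~(1 << cap)
    # Keep the result within 32 bits, matching the x86 cap_bset width.
    return bounding_set & 0xFFFFFFFF


# Illustrative starting value only; read the real one from cap-bound.
new_set = drop_capabilities(0xFFFFFFFF, CAP_SYS_MODULE, CAP_SYS_RAWIO)
print(hex(new_set))  # 0xfffcffff
```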

--Patrick