[LWN Logo]
[Timeline]
Date:	Wed, 13 Sep 2000 01:56:39 -0400
To:	linux-kernel@vger.kernel.org
Subject: Update Linux 2.4 Status/TODO list
From:	tytso@mit.edu


OK, here's the updated Linux 2.4 bug list.  I let myself get a bit
behind, so it took me a while to process through all of my backlogged
l-k mail archives to assemble this list.  As always, it's complete as I
can make it, but it's not perfect.  In particualar, some bugs listed on
this page may have been fixed already.  If so, or if you know some bug
that didn't make on to this list, please let me know.

For people who are wondering what changed, the differences from the last
major release of this page can be found at 

	http://linux24.sourceforge.net/status-changes.html

As always, if you're curious what state this document is in, you can
always get the latest copy by going to:

	http://linux24.sourceforge.net

						- Ted

                         Linux 2.4 Status/TODO Page

   Last modified: [tytso:20000913.0151EDT]

   Hopefully up to date as of: test8

1. Should Be Fixed (Confirmation Wanted)

     * Fbcon races (cursor problems when running continual streaming
       output mixed with printk + races when switching from X while doing
       continuous rapid printing --- Alan)

2. Capable Of Corrupting Your FS/data

     * Use PCI DMA by default in IDE is unsafe (must not do so on via
       VPx, x < 3) (requires chipset tuning to be enabled according to
       Andre Hedrick --- we need to turn this on by default -- TYT)
     * Fix the OOPS in usb-storage from the error-recovery handler.
       (reported by Matthew Dharm)
     * Non-atomic page-map operations can cause loss of dirty bit on
       pages (sct, alan)

3. Security

     * Fix module remove race bug (still to be done: TTY, ldisc, I2C,
       video_device - Al Viro) (Rogier Wolff will handle ATM)

4. Boot Time Failures

     * Use PCI DMA 'lost interrupt' problem with some hw [which ?] (NEC
       Versa LX with PIIX tuning)
     * HT6560/UMC8672 ide sets up stuff too early (before region stuff
       can be done)
     * Crashes on boot on some Compaqs ? (may be fixed)
     * Boot hangs on a range of Dell docking stations (Latitude)
          + Almost certainly related: PCI code doesn't see devices behind
            DECchip 21150 PCI bridges (used in Dell Latitude). Reported
            by Simon Trimmer . (Patch from Martin Mares exists but it
            disables cardbus devices, according to Tigran.)
          + Derek Fawcus at Cisco reports similar problems with Toshiba
            Tecra 8000 attached to the DeskStation V+ docking station.
            (once again, caused by bridge returning 0 when reading the
            I/O base/limit and Memory base/limit registers which confuses
            the new PCI resource code).
     * IBM Thinkpad 390 won't boot since 2.3.11 (See Decklin Foster for
       more info)

5. Compile errors

     * arcnet/com20020-isa.c doesn't compile, as of 2.4.0-test8. Dan
       Aloni has a fix
     * drivers/sound/cs46xx.c has compile errors test7 and test8 (C
       Sanjayan Rosenmund)

6. In Progress

     * Finish I2O merge (Intel/Alan)
     * Restore O_SYNC functionality (Stephen) - core code and ext2 done
     * Fix all remaining PCI code to use pci_enable_device (mostly done)
     * Fix, um, interesting races around dup2() and friends. (Al Viro)
     * Finish the audit/code review of the code dealing with descriptor
       tables. (Al Viro)
     * DMFE is not SMP safe (Frank Davis patch exists, but hasn't gotten
       much commens yet)
     * Audit all char and block drivers to ensure they are safe with the
       2.3 locking - a lot of them are not especially on the
       read()/write() path. (Frank Davis --- moving slowly; if someone
       wants to help, contact Frank)

7. Obvious Projects For People (well if you have the hardware..)

     * Make syncppp use new ppp code
     * Fix SPX socket code

8. Fix Exists But Isnt Merged

     * Update SGI VisWS to new-style IRQ handling (Ingo)
     * Support MP table above 1Gig (Ingo)
     * Dont panic on boot when meeting HP boxes with wacked APIC table
       numbering (AC)
     * Scheduler bugs in RT (Dimitris)
     * AIC7xxx doesnt work non PCI ? (Doug says OK, new version due
       anyway)
     * Fix boards with different TSC per CPU and kill TSC use on them
     * Floppy last block cache flush error
     * PPC-specific: won't boot on 601 CPU's (powermac) (Andreas Tobler;
       Paul Mackerras has fix in PPC tree)
     * IRDA fixes (patches from Russell King sent to Linus and DAG)
          + IRDA calls get_random_bytes before random is set up 
          + Infinite loop in IrDA parameter code
          + Device name in /proc/net/irda/irias is not updated when
            /proc/sys/net/irda/devname is written
          + IrDA Discovery slot allocation is not random
     * Splitting a posix lock causes an infinite loop (Stephen Rothwell)
     * Many network device drivers don't call MOD_INC_USE_COUNT in
       dev->open. (Paul Gortmaker has patches)
     * 2.4.0-test8 has a BUG at ll_rw_blk:711. (Johnny Accot, Steffen
       Luitz) (Al Viro has a patch)
     * using ramfs with highmem enabled can yield a kernel NULL pointer
       dereference. (wollny@cns.mpg.de has a patch)
     * Writing past end of removeable device can cause data corruption
       bugs in the future (Jari Ruusu)

9. To Do

     * Check all devices use resources properly (Everyone now has to use
       request_region and check the return since we no longer single
       thread driver inits in all module cases. Also memory regions are
       now requestable and a lot of old drivers dont know this yet. --
       Alan Cox)
     * Tulip hang on rmmod/crashes sometimes
     * Devfs races (mostly done - Al Viro)
     * Fix further NFS races (Al Viro)
     * Test other file systems on write
     * Fix mount failures due to copy_* user mishandling
     * Check all file systems are either LFS compliant or error large
       files
     * Issue with notifiers that try to deregister themselves? (lnz;
       notifier locking change by Garzik should backed out, according to
       Jeff)
     * Mount of new fs over existing mointpoint should return an error
       unless forced (Andrew McNabb, Alan Cox)
     * Misc locking problems
          + drivers/pcmcia/ds.c: ds_read & ds_write. SMP locks are
            missing, on UP the sleep_on() use is unsafe.
          + drivers/usb/*.c
               o usblp_{read,write}(): on UP concurrent read/write
                 operations could corrupt the urb structures if
                 copy_to_user sleeps, on SMP it's worse.
          + do_execve (Al Viro, reported by Manfred)
          + fix the quota races (Al Viro)
          + complete the ext2 races fixes (truncate) (Al Viro)
          + fix the UFS, minixfs and sysvfs SMP races(the latter couple
            is broken as ext2 was, UFS is _completely_ broken; eats
            filesystems) (Al Viro)
     * SCSI CD-ROM doesn't work on filesystems with < 2kb block size
       (Jens Axboe will fix)
     * Remove (now obsolete) checks for ->sb == NULL (Al Viro)
     * Audit list of drivers that dereference ioremap's return (Abramo
       Bagnara)
     * 2.4.0-test2 breaks the behaviour of the ether=0,0,eth1 boot
       parameter (dwguest)
     * ISAPnP can reprogram active devices (2.4.0-test5, Elmer Joandi,
       alan)
     * Multilink PPP can get the kernel into a tight loop which spams the
       console and freezes the machine (Aaron Tiensivu)
     * Writing to tapes > 2.4G causes tar to fail with EIO (using
       2.4.0-test7-pre5; it works under 2.4.0-test1-ac18 --- Tigran
       Aivazian)
     * mm->rss is modified in some places without holding the
       page_table_lock (sct)
     * Copying between two encrypting loop devices causes an immediate
       deadlock in the request queue (Andi Kleen)
     * FAT filesystem doesn't support 2kb sector sizes (did under 2.2.16,
       doesn't under 2.4.0test7. Kazu Makashima, alan)
     * The new hot plug PCI interface does not provide a method for
       passing the correct device name to cardmgr (David Hinds, alan)
     * PIIXn tuning can hang laptop (2.4.0-test8-pre6, David Ford)
     * non-PNP SB AWE32 has tobles in 2.4.0-test7 and newer (Gerard
       Sharp) (Paul Laufer has a potential patch)
     * Oops in dquot_transfer (David Ford, Martin Diehl) (Jan Kara has a
       potential patch)
     * Loading the qlogicfc driver in 2.4.0-test8 causes the kernel to
       loop forver reporting SCSI disks that aren't present (Paul
       Hubbard)

10. To Do But Non Showstopper

     * Go through as 2.4pre kicks in and figure what we should mark
       obsolete for the final 2.4 (i.e. XT hard disk support?)
     * Union mount (Al Viro)
     * Per Process rtsigio limit
     * iget abuse in knfsd
     * ISAPnP IRQ handling failing on SB1000 + resource handling bug
     * Parallel ports should set SA_SHIRQ if PCI (eg in Plip)
     * Devfs compiled in but not mounted causes crap for ->mnt_devname of
       root (Al Viro)
     * PCMCIA/Cardbus hangs (Basically unusable - Hinds pcmcia code is
       reliable)
          + PCMCIA crashes on unloading pci_socket
     * RTL 8139 cards sometimes stop responding. Both drivers don't
       handle this quite good enough yet. (reported by Rogier Wolff)
     * ATM phy-chip-driver interface change for Firestream ATM card
       (Rogier Wolff)
     * Loop device can still hang (William Stearns has script that will
       hang 2.4.0-test7)
     * USB pegasus driver doesn't work since 2.4.0test5 (David Ford)
     * TLAN nic appears to be adding a timer twice (2.4.0test8pre6, Arjan
       ve de Ven)

11. To Check

     * Check O_APPEND atomicity bug fixing is complete
     * Protection on i_size (sct) [Al Viro mostly done]
     * Mikulas claims we need to fix the getblk/mark_buffer_uptodate
       thing for 2.3.x as well
     * Network block device seems broken by block device changes
     * VFS?VM - mmap/write deadlock (demo code seems to show lock is
       there)
     * kiobuf seperate lock functions/bounce/page_address fixes
     * Fix routing by fwmark
     * rw semaphores on inodes to fix read/truncate races ? [Probably
       fixed]
     * Not all device drivers are safe now the write inode lock isnt
       taken on write
     * Multiwrite IDE breaks on a disk error [minor issue at best]
       (hopefully fixed)
     * ACPI/APM suspend issue - IDE related stuff ? (requires full
       taskfile support that was vetoed by Linus)
     * NFS bugs are fixed
     * Chase reports of SMB not working
     * Some AWE cards are not being found by ISAPnP ??
     * SHM segments not always being detached and destroyed right ?
       (problem reported by Lincoln Dale)
     * RAM disk contents vanishing on cramfs (block change) and bforget
       cases
     * ACPI hangs on boot for some systems (Are there any cases left ?)
     * Disappointing performance of Software Raid, esp. write performance
       (reported by Nils Rennebarth)
     * List of potential problems found by Stanford students using g++
       hacks:
          + Andy Chou's list of mismatched spinlocks and interrupts/bh
            enable/disable.
          + Seth Andrew Hallem's list potentially sleeping functions
            called with interrupts off or spinlocks held.
          + Dawson Engler's list of potential kmalloc/kfree bugs
     * Potential races in file locking code (Christian Ehrhardt)
          + locks_verify_area checks the wrong range if O_APPEND is set
            and the current file position is not at the end of the file.
          + dito if the file position changes between the call to
            locks_verify_area and the actual read/write (requires a
            shared file pointer, an attacker can use this to circumvent
            virtually any mandatory lock).
          + active writes should prevent anyone from getting mandatory
            locks for the area beeing written.
          + active reads should prevent anyone from getting mandatory
            write locks for the area beeing read.
     * Possible ppp problem (fail to connect; may be user error; reported
       by Matt Spong; claims worked on 2.3.40)
     * cdrecord doesn't work (produces CD-ROM coasters) w/o any errors
       reported, works under 2.2 (Damon LoCascio)
     * Possible race in b-tree code for HFS, HPFS, NTFS: insert into the
       tree whild doing a readdir (Matthew Wilcox)
     * Stressing the VM (IOPS SPEC SFS) with HIGHMEM turned on can hang
       system (linux-2.4.0test5, Ying Chen, Rik van Riel)
     * Eepro100 driver can sometimes report out of resources on reboot
       (Josue Emmanuel Amaro)

12. Probably Post 2.4

     * per super block write_super needs an async flag
     * addres_space needs a VM pressure/flush callback (Ingo)
     * per file_op rw_kiovec
     * rw sempahores on page faults (mmap_sem) (Currently protected by
       mmap_sem)
     * module remove race bugs (ipchains modules -- Rusty; won't fix for
       2.4)
     * NCR5380 isnt smp safe (Frank Davis --- belives the driver should
       be rewritten)
     _________________________________________________________________

Fixed

     * Incredibly slow loopback tcp bug (believed fixed about 2.3.48)
     * COMX series WAN now merged
     * VM needs rebalancing or we have a bad leak
     * SHM works chroot
     * SHM back compatibility
     * Intel i960 problems with I2O
     * Symbol clashes and other mess from _three_ copies of zlib!
     * PCI buffer overruns
     * Shared memory changes change the API breaking applications (eg
       gimp)
     * Finish softnet driver port over and cleanups
     * via rhine oopses under load ?
     * SCSI generic driver crashes controllers (need to pass
       PCI_DIR_UNKNOWN..)
     * UMSDOS fixups resync (not quite done)
     * Make NTFS sort of work
     * Any user can crash FAT fs code with ftruncate
     * AFFS fixups
     * Directory race fix for UFS
     * Security holes in execve()
     * Lan Media WAN update for 2.3
     * Get the Emu10K merged
     * Paride seems to need fixes for the block changes yet
     * Kernel corrupts fs and gs in some situations (Ulrich has demo
       code)
     * 1.07 AMI MegaRAID
     * Merge 2.2.15 changes (Alan)
     * Get RAID 0.90 in (Ingo)
     * S/390 Merge
     * NFS DoS fix (security)
     * Fix Space.c duplicate string/write to constants
     * Elevator and block handling queue change errors are all sorted
     * Make sure all drivers return 1 from their __setup functions (Done
       ?)
     * Enhanced disk statistics
     * Complete vfsmount merge (Al Viro)
     * Merge removed-buf-open directory stuff into VFS (Al Viro)
     * Problems with ip autoconfig according to Zaitcev
     * NFS causes dup kmem_create on reload (Trond)
     * vmalloc(GFP_DMA) is needed for DMA drivers (Ingo)
     * TLB flush should use highest priority (Ingo)
     * SMP affinity code creates multiple dirs with the same name (Ingo)
     * Set SMP affinity mask to actual cpu online mask (needed for some
       boards) (Ingo)
     * heavy swapping corrupts ptes (believed so)
     * pci_set_master forces a 64 latency on low latency setting
       devices.Some boards require all cards have latency <= 32
     * msync fails on NFS (probably fixed anyway)
     * Find out what has ruined disk I/O throughput. (mostly)
     * PIII FXSAVE/FXRESTORE support
     * The netdev name changing stuff broke GRE
     * put_user is broken for i386 machines (security) - sem stuff may be
       wrong too
     * BusLogic crashes when you cat /proc/scsi/BusLogic/0 (Robert de
       Vries)
     * Finish sorting out VM balancing (Rik Van Riel, Juan Quintela et
       al)
     * Fix eth= command line
     * 8139 + bridging fails
     * RtSig limit handling bug
     * Signals leak kernel memory (security) [FIX in ac tree]
     * TTY and N_HDLC layer called poll_wait twice per fd and corrupt
       memory
     * ATM layer calls poll_wait twice per fd and corrupts memory
     * Random calls poll_wait twice per fd and corrupts memory
     * PCI sound calls poll_wait twice per fd and corrupts memory
     * sbus audio calls poll_wait twice per fd and corrupts memory
     * IBM MCA driver breaks on Device_Inquiry at boot
     * SHM code corrupts memory (Russell)
     * Linux sends a 1K buffer with SCSI inquiries. The ANSI-SCSI limit
       is 255.
     * Linux uses TEST_UNIT_READY to chck for device presence on a
       PUN/LUN. The INQUIRY is the only valid test allowed by the spec.
     * truncate_inode_pages does unsafe page cache operations
     * Fix the ptrace code to be back compatible and add a new PTRACE
       call set for getting the PIII extra registers
     * EPIC100 fixes
     * Tlan and Epic100 crash under load
     * Fix hpfs_unlink (Al Viro)
     * exec loader permissions
     * Locking on getcwd
     * E820 memory setup causes crashes/corruption on some laptops[**VERY
       NASTY**] (fixed in test5)
     * Debian report that the gcc 2.95 possibly miscompiles fault.c or
       mm/remap.c (Perl script available from Arjan) (fixed in test2 or
       3)
     * Dcache threading (Al Viro)
     * Sockfs races (removing NULL ->i_sb stuf) (Al Viro)
     * Module remove race bug (done: anything with file_operations, fb
       stuff, procfs stuff - Al Viro)
     * DEFXX driver appears broken (reported fixed by Jeff Garzik)
     * Some FB drivers check the A000 area and find it busy then bomb out
       (checked and fixed, reported by Jeff Garzik)
     * Stick lock_kernel() calls around OSS driver with issues to hard to
       fix nicely for 2.4 itself (Alan, fixed)
     * Merge the current Compaq RAID driver into 2.4 (fixed, reported by
       thomas.hiller@sap.com)
     * mount crashes on Alpha platforms (fixed, reported by Thorsten
       Kranzkowski)
     * IDE fails on some VIA boards (eg the i-opener) (reported fixed by
       Konrad Stepien)
     * access_process_mm oops/lockup if task->mm changes (Manfred) [user
       can cause deliberately]
     * PCMCIA IRQ routing should now be fixed modulo ISA cards and bios
       doesn't tell us that an IRQ is ISA-only (Martin Mares)
     * TB Multisound driver hasnt been updated for new isa I/O totally.
       (reported fixed by John Coiner; see
       http://atv.ne.mediaone.net/linux-multisound)
     * yenta (PCMCIA) and pci_socket modules have mutual dependency
       (cardbus_register, yenta_operations) (test5, worked in test3)
       (reported fixed by Erik Mouw)
     * Keyboard/mouse problems (should be fixed?)
     * Loop device hangs (Peter Enderborg can duplicate by writing large
       file to dosfs mounted via loop device; Steve Dodd reports deadlock
       issues; Linus says fixed in test6)
     * Floppy driver broken by VFS changes. Other drivers may be too
       (Stuff gets called after _close now - unload race possibly too;
       should be fixed in test6)
     * OSS module remove races (fixed by Christoph Hellwig)
     * Merge the 2.2 ServeRAID driver into 2.4 (Christoph Hellwig)
     * AHA27xx is broken (maybe 28xx too) (reported fixed by Doug
       Ledford)
     * Merge the network fixes (DaveM)
     * Finish 64bit vfs merges (lockf64 and friends missing -- willy?)
       (Andreas Jaeger reports that lockf64 has been added for Intel and
       Alpha; other architectures may not be done, but if not, they won't
       build :-)
     * Can't compile CONFIG_IBMTR and CONFIG_PCMCIA_IBMTR in kernel at
       once; kernel link failes (rasmus; fixed in a kludgy way by not
       allowing the combination by arjan)
     * Merge the RIO driver
     * Potential deadlock in EMU10K driver when running SMP
       (orre@nada.kth.se, Alan)
     * Symbol clashes from ppp and irda compression code (Arjan van de
       Ven)
     * Kernel build has race conditions when building modversions.h
       (Mikael Pettersson)
     * USB Pegasus driver explodes on disconnect (lots of printk and/or
       OOPS spewage to the console. David Ford) (reported fixed by Petko
       Manolov)
     * unsafe sleep_on in ibmcam and ov511 drivers (never was a problem,
       according to Mark McClelland)
     * netfilter doesn't compile correctly (test7-pre3, reported by Pau
       Aliagas, fixed by test7-pre6, rusty)
     * Innd data corruption, probably caused by bug truncation bug (Rik
       van Riel)
     * If all the ISO NLS's are modules, there can be an undefined ref to
       CONFIG_NLS_DEFAULT in inode.c (Dale Amon --- not a bug CONFIG_NLS
       is forced)
     * Fix sysinfo interface so it is binary compatible with 2.2.x (i.e.
       mem_unit=1), except when memory >= 4Gb (Erik Andersen)
     * Some people report 2.3.x serial problems (reported fixed by Shaya
       Potter)
     * Some Ultra-I sbus sparc64 systems fail to boot since 2.4.0-test3,
       may be due to specific memory configurations. (reported fixed by
       davem)

Probably Hardware Bugs

     * Data corruption on IDE disks (Generic PCI DMA and SiS support
       Steven Walter) (sounds like PCChips #M599LMR motherboard doesn't
       disable UDMA when a non-UDMA cable is used. If you disable UDMA in
       the BIOS, then there is no problem. hardware bug?)
     * AHA29xx driver appears to stomp other cards (may be BIOS; probably
       motherboard has assigned to small of a range to a card, so that
       it's overlapping with some other card -- Doug Ledford)
     * USB hangs on APM suspend on some machines (fixed more most; Alan
       has one still that fails but the BIOS has 'issues')
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/