Date: Mon, 18 Sep 2000 16:43:00 -0700
From: Dave Higgen <dhiggen@valinux.com>
To: nfs@lists.sourceforge.net
Subject: [NFS] Server-side NFS3 patch for 2.2.18-pre9: take 2.

> Alan,
>
> I'm attaching the server-side fixes+NFS3 patch resynchronized to
> 2.2.18-pre9.
>
> Hope we can get this in for 2.2.18 to complete the set, now that the
> client-side stuff is in. Thanks,
>
>
> Dave Higgen

Ackk.... I just realized there was some .orig file cruft left in there.
Terribly sorry about that. Here's the cleaned-up version.

Dave Higgen

Content-Disposition: inline; filename="dhiggen-over-2.2.18-pre9"

diff -Naur pre9/linux/Documentation/Changes test/linux/Documentation/Changes --- pre9/linux/Documentation/Changes Mon Sep 4 10:39:15 2000 +++ test/linux/Documentation/Changes Mon Sep 18 14:35:50 2000 @@ -651,9 +651,9 @@ ftp://ftp.mathematik.th-darmstadt.de/pub/linux/okir/dontuse/nfs-server-2.2beta40.tar.gz ftp://linux.nrao.edu/mirrors/fb0429.mathematik.th-darmstadt.de/pub/linux/okir/dontuse/nfs-server-2.2beta40.tar.gz -The kernel-level 1.4.7 release: -ftp://ftp.varesearch.com/pub/support/hjl/knfsd/knfsd-1.4.7.tar.gz -ftp://ftp.kernel.org/pub/linux/devel/gcc/knfsd-1.4.7.tar.gz + +The kernel-level nfs-utils-0.1.6 release: +ftp://nfs.sourceforge.net/pub/nfs/nfs-utils-0.1.6.tar.gz Net-tools =========
diff -Naur pre9/linux/Documentation/Configure.help test/linux/Documentation/Configure.help --- pre9/linux/Documentation/Configure.help Mon Sep 18 13:58:31 2000 +++ test/linux/Documentation/Configure.help Mon Sep 18 14:35:50 2000 @@ -7614,31 +7614,33 @@ NFS server support CONFIG_NFSD - If you want your Linux box to act as a NFS *server*, so that other + If you want your Linux box to act as
an NFS *server*, so that other computers on your local network which support NFS can access certain directories on your box transparently, you have two options: you can use the self-contained user space program nfsd, in which case you - should say N here, or you can say Y and use this new experimental - kernel based NFS server. The advantage of the kernel based solution - is that it is faster; it might not be completely stable yet, though. + should say N here, or you can say Y and use the kernel based NFS + server. The kernel based solution is faster and is now the recommended + solution: no further development is occurring on the userspace server and + support of it may be discontinued in future. In either case, you will need support software; the respective locations are given in the file Documentation/Changes in the NFS section. - Please read the NFS-HOWTO, available via FTP (user: anonymous) from - ftp://metalab.unc.edu/pub/Linux/docs/HOWTO. + Please read the NFS-HOWTO, available from + http://www.linuxdoc.org/HOWTO/NFS-HOWTO.html . + The NFS server is also available as a module ( = code which can be inserted in and removed from the running kernel whenever you want). The module is called nfsd.o. If you want to compile it as a module, say M here and read Documentation/modules.txt. If unsure, say N. -Emulate Sun NFS daemon -CONFIG_NFSD_SUN - If you would like for the server to allow clients to access - directories that are mount points on the local filesystem (this is - how nfsd behaves on Sun systems), say yes here. If unsure, say N. +Provide NFSv3 server support (EXPERIMENTAL) +CONFIG_NFSD_V3 + If you would like to include the NFSv3 server as well as the NFSv2 + server, say Y here. File locking, via the NLMv4 protocol, is now + supported. If unsure, say N. 
OS/2 HPFS filesystem support (read only) CONFIG_HPFS_FS diff -Naur pre9/linux/fs/Config.in test/linux/fs/Config.in --- pre9/linux/fs/Config.in Mon Sep 18 13:58:34 2000 +++ test/linux/fs/Config.in Mon Sep 18 14:35:50 2000 @@ -69,14 +69,15 @@ if [ "$CONFIG_INET" = "y" ]; then tristate 'Coda filesystem support (advanced network fs)' CONFIG_CODA_FS tristate 'NFS filesystem support' CONFIG_NFS_FS + if [ "$CONFIG_NFS_FS" != "n" ]; then + bool ' NFS Version 3 filesystem support' CONFIG_NFS_V3 + fi if [ "$CONFIG_NFS_FS" = "y" -a "$CONFIG_IP_PNP" = "y" ]; then bool ' Root file system on NFS' CONFIG_ROOT_NFS fi - if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then - tristate 'NFS server support' CONFIG_NFSD - fi - if [ "$CONFIG_EXPERIMENTAL" = "y" -a "$CONFIG_NFSD" != "n" ]; then - bool ' Emulate SUN NFS server' CONFIG_NFSD_SUN + tristate 'NFS server support' CONFIG_NFSD + if [ "$CONFIG_NFSD" != "n" -a "$CONFIG_EXPERIMENTAL" = "y" ]; then + bool ' NFS Version 3 server support (EXPERIMENTAL)' CONFIG_NFSD_V3 fi if [ "$CONFIG_NFS_FS" = "y" -o "$CONFIG_NFSD" = "y" ]; then define_bool CONFIG_SUNRPC y @@ -89,9 +90,6 @@ define_bool CONFIG_SUNRPC n define_bool CONFIG_LOCKD n fi - fi - if [ "$CONFIG_NFS_FS" != "n" -o "$CONFIG_NFSD" != "n" ]; then - bool ' NFS Version 3' CONFIG_NFS_V3 fi tristate 'SMB filesystem support (to mount WfW shares etc.)' CONFIG_SMB_FS if [ "$CONFIG_SMB_FS" != "n" ]; then diff -Naur pre9/linux/fs/ext2/ialloc.c test/linux/fs/ext2/ialloc.c --- pre9/linux/fs/ext2/ialloc.c Tue Jan 4 10:12:23 2000 +++ test/linux/fs/ext2/ialloc.c Mon Sep 18 14:35:50 2000 @@ -478,8 +478,14 @@ if (inode->u.ext2_i.i_flags & EXT2_SYNC_FL) inode->i_flags |= MS_SYNCHRONOUS; insert_inode_hash(inode); + /* + * dhXXX: + * To be really picky we should set i_generation to one more than + * whatever's on the disk, to ensure a monotonic advance of + * generation for NFS. But the odds of duplicating the last igen + * are only 1 in 2^32... 
+ */ inode->i_generation = inode_generation_count++; - inode->u.ext2_i.i_version = inode->i_generation; mark_inode_dirty(inode); unlock_super (sb); diff -Naur pre9/linux/fs/ext2/inode.c test/linux/fs/ext2/inode.c --- pre9/linux/fs/ext2/inode.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/ext2/inode.c Mon Sep 18 14:35:50 2000 @@ -47,14 +47,14 @@ */ void ext2_delete_inode (struct inode * inode) { - if (inode->i_ino == EXT2_ACL_IDX_INO || + if (is_bad_inode(inode) || + inode->i_ino == EXT2_ACL_IDX_INO || inode->i_ino == EXT2_ACL_DATA_INO) return; inode->u.ext2_i.i_dtime = CURRENT_TIME; - /* When we delete an inode, we increment its i_version. If it - is ever read in from disk again, it will have a different - i_version. */ - inode->u.ext2_i.i_version++; + /* When we delete an inode, we increment its i_generation. + If it is read in from disk again, the generation will differ. */ + inode->i_generation++; mark_inode_dirty(inode); ext2_update_inode(inode, IS_SYNC(inode)); inode->i_size = 0; @@ -511,9 +511,20 @@ inode->i_ctime = le32_to_cpu(raw_inode->i_ctime); inode->i_mtime = le32_to_cpu(raw_inode->i_mtime); inode->u.ext2_i.i_dtime = le32_to_cpu(raw_inode->i_dtime); + /* We now have enough fields to check if the inode was active or not. 
+ * This is needed because nfsd might try to access dead inodes + * the test is that same one that e2fsck uses + * NeilBrown 1999oct15 + */ + if (inode->i_nlink == 0 && (inode->i_mode == 0 || inode->u.ext2_i.i_dtime)) { + /* this inode is deleted */ + brelse (bh); + goto bad_inode; + } inode->i_blksize = PAGE_SIZE; /* This is the optimal IO size (for stat), not the fs block size */ inode->i_blocks = le32_to_cpu(raw_inode->i_blocks); inode->i_version = ++global_event; + inode->i_generation = le32_to_cpu(raw_inode->i_generation); inode->u.ext2_i.i_new_inode = 0; inode->u.ext2_i.i_flags = le32_to_cpu(raw_inode->i_flags); inode->u.ext2_i.i_faddr = le32_to_cpu(raw_inode->i_faddr); @@ -535,8 +546,6 @@ << 32; #endif } - inode->u.ext2_i.i_version = le32_to_cpu(raw_inode->i_version); - inode->i_generation = inode->u.ext2_i.i_version; inode->u.ext2_i.i_block_group = block_group; inode->u.ext2_i.i_next_alloc_block = 0; inode->u.ext2_i.i_next_alloc_goal = 0; @@ -647,6 +656,7 @@ raw_inode->i_ctime = cpu_to_le32(inode->i_ctime); raw_inode->i_mtime = cpu_to_le32(inode->i_mtime); raw_inode->i_blocks = cpu_to_le32(inode->i_blocks); + raw_inode->i_generation = cpu_to_le32(inode->i_generation); raw_inode->i_dtime = cpu_to_le32(inode->u.ext2_i.i_dtime); raw_inode->i_flags = cpu_to_le32(inode->u.ext2_i.i_flags); raw_inode->i_faddr = cpu_to_le32(inode->u.ext2_i.i_faddr); @@ -663,7 +673,6 @@ raw_inode->i_size_high = cpu_to_le32(inode->i_size >> 32); #endif } - raw_inode->i_version = cpu_to_le32(inode->u.ext2_i.i_version); if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) raw_inode->i_block[0] = cpu_to_le32(kdev_t_to_nr(inode->i_rdev)); else for (block = 0; block < EXT2_N_BLOCKS; block++) diff -Naur pre9/linux/fs/ext2/ioctl.c test/linux/fs/ext2/ioctl.c --- pre9/linux/fs/ext2/ioctl.c Tue Jan 4 10:12:23 2000 +++ test/linux/fs/ext2/ioctl.c Mon Sep 18 14:35:50 2000 @@ -68,15 +68,14 @@ mark_inode_dirty(inode); return 0; case EXT2_IOC_GETVERSION: - return put_user(inode->u.ext2_i.i_version, 
(int *) arg); + return put_user(inode->i_generation, (int *) arg); case EXT2_IOC_SETVERSION: if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) return -EPERM; if (IS_RDONLY(inode)) return -EROFS; - if (get_user(inode->u.ext2_i.i_version, (int *) arg)) + if (get_user(inode->i_generation, (int *) arg)) return -EFAULT; - inode->i_generation = inode->u.ext2_i.i_version; inode->i_ctime = CURRENT_TIME; mark_inode_dirty(inode); return 0; diff -Naur pre9/linux/fs/lockd/Makefile test/linux/fs/lockd/Makefile --- pre9/linux/fs/lockd/Makefile Mon Sep 18 13:58:34 2000 +++ test/linux/fs/lockd/Makefile Mon Sep 18 14:35:50 2000 @@ -9,11 +9,11 @@ O_TARGET := lockd.o O_OBJS := clntlock.o clntproc.o host.o svc.o svclock.o svcshare.o \ - svcproc.o svcsubs.o mon.o xdr.o + svcproc.o svcsubs.o mon.o xdr.o -ifdef CONFIG_NFS_V3 - O_OBJS += xdr4.o -endif +#ifdef CONFIG_NFS_V3 + O_OBJS += xdr4.o svc4proc.o +#endif OX_OBJS := lockd_syms.o M_OBJS := $(O_TARGET) diff -Naur pre9/linux/fs/lockd/lockd_syms.c test/linux/fs/lockd/lockd_syms.c --- pre9/linux/fs/lockd/lockd_syms.c Tue Jan 4 10:12:23 2000 +++ test/linux/fs/lockd/lockd_syms.c Mon Sep 18 14:35:50 2000 @@ -23,6 +23,7 @@ #include <linux/sunrpc/clnt.h> #include <linux/sunrpc/svc.h> #include <linux/lockd/lockd.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/syscall.h> /* Start/stop the daemon */ @@ -40,5 +41,18 @@ /* Configuration at insmod time */ EXPORT_SYMBOL(nlmsvc_grace_period); EXPORT_SYMBOL(nlmsvc_timeout); + +/* NLM4 exported symbols */ +EXPORT_SYMBOL(nlm4_lck_denied_grace_period); +EXPORT_SYMBOL(nlm4_lck_denied); +EXPORT_SYMBOL(nlm4_lck_blocked); +EXPORT_SYMBOL(nlm4_rofs); +EXPORT_SYMBOL(nlm4_stale_fh); +EXPORT_SYMBOL(nlm4_granted); +EXPORT_SYMBOL(nlm4_deadlock); +EXPORT_SYMBOL(nlm4_failed); +EXPORT_SYMBOL(nlm4_fbig); +EXPORT_SYMBOL(nlm4_lck_denied_nolocks); + #endif /* CONFIG_MODULES */ diff -Naur pre9/linux/fs/lockd/svc.c test/linux/fs/lockd/svc.c --- pre9/linux/fs/lockd/svc.c Mon Sep 4 10:39:22 2000 +++ 
test/linux/fs/lockd/svc.c Mon Sep 18 14:35:50 2000 @@ -230,6 +230,7 @@ error = -ENOMEM; serv = svc_create(&nlmsvc_program, 0, NLMSVC_XDRSIZE); + if (!serv) { printk(KERN_WARNING "lockd_up: create service failed\n"); goto out; @@ -353,7 +354,7 @@ static struct svc_version nlmsvc_version3 = { 3, 24, nlmsvc_procedures, NULL }; -#ifdef CONFIG_NFSD_NFS3 +#if (defined(CONFIG_NFSD) || defined(CONFIG_NFSD_MODULE)) && defined(CONFIG_NFS_V3) static struct svc_version nlmsvc_version4 = { 4, 24, nlmsvc_procedures4, NULL }; @@ -363,7 +364,7 @@ &nlmsvc_version1, NULL, &nlmsvc_version3, -#ifdef CONFIG_NFSD_NFS3 +#if (defined(CONFIG_NFSD) || defined(CONFIG_NFSD_MODULE)) && defined(CONFIG_NFS_V3) &nlmsvc_version4, #endif }; @@ -383,6 +384,7 @@ int lockdctl(int cmd, void *opaque_argp, void *opaque_resp) { +#if 0 int err; MOD_INC_USE_COUNT; @@ -398,4 +400,17 @@ MOD_DEC_USE_COUNT; return err; +#else + /* + * For the moment, unless a real need for locks on NFS root + * emerges, we revert to automatic lockd start. But we will leave the + * manual call machinery in place in case we ever want to go + * back to it. I felt a warning was useful here, but many didn't like + * that, so I'll suppress it. - dhiggen + */ +#if 0 + printk("lockd: note, lockd is automatic in this kernel. Remove rpc.lockd from any rc scripts.\n"); +#endif + return (0); +#endif } diff -Naur pre9/linux/fs/lockd/svc4proc.c test/linux/fs/lockd/svc4proc.c --- pre9/linux/fs/lockd/svc4proc.c Wed Dec 31 16:00:00 1969 +++ test/linux/fs/lockd/svc4proc.c Mon Sep 18 14:35:50 2000 @@ -0,0 +1,561 @@ +/* + * linux/fs/lockd/svc4proc.c + * + * Lockd server procedures. We don't implement the NLM_*_RES + * procedures because we don't use the async procedures. 
+ * + * Copyright (C) 1996, Olaf Kirch <okir@monad.swb.de> + */ + +#include <linux/types.h> +#include <linux/sched.h> +#include <linux/malloc.h> +#include <linux/in.h> +#include <linux/sunrpc/svc.h> +#include <linux/sunrpc/clnt.h> +#include <linux/nfsd/nfsd.h> +#include <linux/lockd/xdr4.h> +#include <linux/lockd/lockd.h> +#include <linux/lockd/share.h> +#include <linux/lockd/sm_inter.h> + + +#define NLMDBG_FACILITY NLMDBG_CLIENT + +static u32 nlm4svc_callback(struct svc_rqst *, u32, struct nlm_res *); +static void nlm4svc_callback_exit(struct rpc_task *); + +/* + * Obtain client and file from arguments + */ +static u32 +nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_host **hostp, struct nlm_file **filp) +{ + struct nlm_host *host = NULL; + struct nlm_file *file = NULL; + struct nlm_lock *lock = &argp->lock; + u32 error = 0; + + /* nfsd callbacks must have been installed for this procedure */ + if (!nlmsvc_ops) + return nlm4_lck_denied_nolocks; + + /* Obtain handle for client host */ + if (rqstp->rq_client == NULL) { + printk(KERN_NOTICE + "lockd: unauthenticated request from (%08x:%d)\n", + ntohl(rqstp->rq_addr.sin_addr.s_addr), + ntohs(rqstp->rq_addr.sin_port)); + return nlm4_lck_denied_nolocks; + } + + /* Obtain host handle */ + if (!(host = nlmsvc_lookup_host(rqstp)) + || (argp->monitor && !host->h_monitored && nsm_monitor(host) < 0)) + goto no_locks; + *hostp = host; + + /* Obtain file pointer. Not used by FREE_ALL call. 
*/ + if (filp != NULL) { + if ((error = nlm_lookup_file(rqstp, &file, &lock->fh)) != 0) + goto no_locks; + *filp = file; + + /* Set up the missing parts of the file_lock structure */ + lock->fl.fl_file = &file->f_file; + lock->fl.fl_owner = (fl_owner_t) host; + } + + return 0; + +no_locks: + if (host) + nlm_release_host(host); + /* check the error to see if its a stale fh error */ + if (error) + return error; + return nlm4_lck_denied_nolocks; +} + +/* + * NULL: Test for presence of service + */ +static int +nlm4svc_proc_null(struct svc_rqst *rqstp, void *argp, void *resp) +{ + dprintk("lockd: NULL called\n"); + return rpc_success; +} + +/* + * TEST: Check for conflicting lock + */ +static int +nlm4svc_proc_test(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_file *file; + + dprintk("lockd: TEST4 called\n"); + resp->cookie = argp->cookie; + + /* Don't accept test requests during grace period */ + if (nlmsvc_grace_period) { + resp->status = nlm4_lck_denied_grace_period; + return rpc_success; + } + + /* Obtain client and file */ + if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) + return rpc_success; + + /* Now check for conflicting locks */ + resp->status = nlmsvc_testlock(file, &argp->lock, &resp->lock); + + dprintk("lockd: TEST4 status %d\n", ntohl(resp->status)); + nlm_release_host(host); + nlm_release_file(file); + return rpc_success; +} + +static int +nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_file *file; + + dprintk("lockd: LOCK called\n"); + + resp->cookie = argp->cookie; + + /* Don't accept new lock requests during grace period */ + if (nlmsvc_grace_period && !argp->reclaim) { + resp->status = nlm4_lck_denied_grace_period; + return rpc_success; + } + + /* Obtain client and file */ + if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) + return rpc_success; + +#if 0 + /* 
If supplied state doesn't match current state, we assume it's + * an old request that time-warped somehow. Any error return would + * do in this case because it's irrelevant anyway. + * + * NB: We don't retrieve the remote host's state yet. + */ + if (host->h_nsmstate && host->h_nsmstate != argp->state) { + resp->status = nlm4_lck_denied_nolocks; + } else +#endif + + /* Now try to lock the file */ + resp->status = nlmsvc_lock(rqstp, file, &argp->lock, + argp->block, &argp->cookie); + + dprintk("lockd: LOCK status %d\n", ntohl(resp->status)); + nlm_release_host(host); + nlm_release_file(file); + return rpc_success; +} + +static int +nlm4svc_proc_cancel(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_file *file; + + dprintk("lockd: CANCEL called\n"); + + resp->cookie = argp->cookie; + + /* Don't accept requests during grace period */ + if (nlmsvc_grace_period) { + resp->status = nlm4_lck_denied_grace_period; + return rpc_success; + } + + /* Obtain client and file */ + if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) + return rpc_success; + + /* Try to cancel request. 
*/ + resp->status = nlmsvc_cancel_blocked(file, &argp->lock); + + dprintk("lockd: CANCEL status %d\n", ntohl(resp->status)); + nlm_release_host(host); + nlm_release_file(file); + return rpc_success; +} + +/* + * UNLOCK: release a lock + */ +static int +nlm4svc_proc_unlock(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_file *file; + + dprintk("lockd: UNLOCK called\n"); + + resp->cookie = argp->cookie; + + /* Don't accept new lock requests during grace period */ + if (nlmsvc_grace_period) { + resp->status = nlm4_lck_denied_grace_period; + return rpc_success; + } + + /* Obtain client and file */ + if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) + return rpc_success; + + /* Now try to remove the lock */ + resp->status = nlmsvc_unlock(file, &argp->lock); + + dprintk("lockd: UNLOCK status %d\n", ntohl(resp->status)); + nlm_release_host(host); + nlm_release_file(file); + return rpc_success; +} + +/* + * GRANTED: A server calls us to tell that a process' lock request + * was granted + */ +static int +nlm4svc_proc_granted(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + resp->cookie = argp->cookie; + + dprintk("lockd: GRANTED called\n"); + resp->status = nlmclnt_grant(&argp->lock); + dprintk("lockd: GRANTED status %d\n", ntohl(resp->status)); + return rpc_success; +} + +/* + * `Async' versions of the above service routines. They aren't really, + * because we send the callback before the reply proper. I hope this + * doesn't break any clients. 
+ */ +static int +nlm4svc_proc_test_msg(struct svc_rqst *rqstp, struct nlm_args *argp, + void *resp) +{ + struct nlm_res res; + u32 stat; + + dprintk("lockd: TEST_MSG called\n"); + + if ((stat = nlm4svc_proc_test(rqstp, argp, &res)) == 0) + stat = nlm4svc_callback(rqstp, NLMPROC_TEST_RES, &res); + return stat; +} + +static int +nlm4svc_proc_lock_msg(struct svc_rqst *rqstp, struct nlm_args *argp, + void *resp) +{ + struct nlm_res res; + u32 stat; + + dprintk("lockd: LOCK_MSG called\n"); + + if ((stat = nlm4svc_proc_lock(rqstp, argp, &res)) == 0) + stat = nlm4svc_callback(rqstp, NLMPROC_LOCK_RES, &res); + return stat; +} + +static int +nlm4svc_proc_cancel_msg(struct svc_rqst *rqstp, struct nlm_args *argp, + void *resp) +{ + struct nlm_res res; + u32 stat; + + dprintk("lockd: CANCEL_MSG called\n"); + + if ((stat = nlm4svc_proc_cancel(rqstp, argp, &res)) == 0) + stat = nlm4svc_callback(rqstp, NLMPROC_CANCEL_RES, &res); + return stat; +} + +static int +nlm4svc_proc_unlock_msg(struct svc_rqst *rqstp, struct nlm_args *argp, + void *resp) +{ + struct nlm_res res; + u32 stat; + + dprintk("lockd: UNLOCK_MSG called\n"); + + if ((stat = nlm4svc_proc_unlock(rqstp, argp, &res)) == 0) + stat = nlm4svc_callback(rqstp, NLMPROC_UNLOCK_RES, &res); + return stat; +} + +static int +nlm4svc_proc_granted_msg(struct svc_rqst *rqstp, struct nlm_args *argp, + void *resp) +{ + struct nlm_res res; + u32 stat; + + dprintk("lockd: GRANTED_MSG called\n"); + + if ((stat = nlm4svc_proc_granted(rqstp, argp, &res)) == 0) + stat = nlm4svc_callback(rqstp, NLMPROC_GRANTED_RES, &res); + return stat; +} + +/* + * SHARE: create a DOS share or alter existing share. 
+ */ +static int +nlm4svc_proc_share(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_file *file; + + dprintk("lockd: SHARE called\n"); + + resp->cookie = argp->cookie; + + /* Don't accept new lock requests during grace period */ + if (nlmsvc_grace_period && !argp->reclaim) { + resp->status = nlm4_lck_denied_grace_period; + return rpc_success; + } + + /* Obtain client and file */ + if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) + return rpc_success; + + /* Now try to create the share */ + resp->status = nlmsvc_share_file(host, file, argp); + + dprintk("lockd: SHARE status %d\n", ntohl(resp->status)); + nlm_release_host(host); + nlm_release_file(file); + return rpc_success; +} + +/* + * UNSHARE: Release a DOS share. + */ +static int +nlm4svc_proc_unshare(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_file *file; + + dprintk("lockd: UNSHARE called\n"); + + resp->cookie = argp->cookie; + + /* Don't accept requests during grace period */ + if (nlmsvc_grace_period) { + resp->status = nlm4_lck_denied_grace_period; + return rpc_success; + } + + /* Obtain client and file */ + if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) + return rpc_success; + + /* Now try to lock the file */ + resp->status = nlmsvc_unshare_file(host, file, argp); + + dprintk("lockd: UNSHARE status %d\n", ntohl(resp->status)); + nlm_release_host(host); + nlm_release_file(file); + return rpc_success; +} + +/* + * NM_LOCK: Create an unmonitored lock + */ +static int +nlm4svc_proc_nm_lock(struct svc_rqst *rqstp, struct nlm_args *argp, + struct nlm_res *resp) +{ + dprintk("lockd: NM_LOCK called\n"); + + argp->monitor = 0; /* just clean the monitor flag */ + return nlm4svc_proc_lock(rqstp, argp, resp); +} + +/* + * FREE_ALL: Release all locks and shares held by client + */ +static int +nlm4svc_proc_free_all(struct svc_rqst *rqstp, 
struct nlm_args *argp, + void *resp) +{ + struct nlm_host *host; + + /* Obtain client */ + if (nlm4svc_retrieve_args(rqstp, argp, &host, NULL)) + return rpc_success; + + nlmsvc_free_host_resources(host); + nlm_release_host(host); + return rpc_success; +} + +/* + * SM_NOTIFY: private callback from statd (not part of official NLM proto) + */ +static int +nlm4svc_proc_sm_notify(struct svc_rqst *rqstp, struct nlm_reboot *argp, + void *resp) +{ + struct sockaddr_in saddr = rqstp->rq_addr; + struct nlm_host *host; + + dprintk("lockd: SM_NOTIFY called\n"); + if (saddr.sin_addr.s_addr != htonl(INADDR_LOOPBACK) + || ntohs(saddr.sin_port) >= 1024) { + printk(KERN_WARNING + "lockd: rejected NSM callback from %08x:%d\n", + ntohl(rqstp->rq_addr.sin_addr.s_addr), + ntohs(rqstp->rq_addr.sin_port)); + return rpc_system_err; + } + + /* Obtain the host pointer for this NFS server and try to + * reclaim all locks we hold on this server. + */ + saddr.sin_addr.s_addr = argp->addr; + if ((host = nlm_lookup_host(NULL, &saddr, IPPROTO_UDP, 1)) != NULL) { + nlmclnt_recovery(host, argp->state); + nlm_release_host(host); + } + + /* If we run on an NFS server, delete all locks held by the client */ + if (nlmsvc_ops != NULL) { + struct svc_client *clnt; + saddr.sin_addr.s_addr = argp->addr; + if ((clnt = nlmsvc_ops->exp_getclient(&saddr)) != NULL + && (host = nlm_lookup_host(clnt, &saddr, 0, 0)) != NULL) { + nlmsvc_free_host_resources(host); + } + nlm_release_host(host); + } + + return rpc_success; +} + +/* + * This is the generic lockd callback for async RPC calls + */ +static u32 +nlm4svc_callback(struct svc_rqst *rqstp, u32 proc, struct nlm_res *resp) +{ + struct nlm_host *host; + struct nlm_rqst *call; + + if (!(call = nlmclnt_alloc_call())) + return rpc_system_err; + + host = nlmclnt_lookup_host(&rqstp->rq_addr, + rqstp->rq_prot, rqstp->rq_vers); + if (!host) { + rpc_free(call); + return rpc_system_err; + } + + call->a_flags = RPC_TASK_ASYNC; + call->a_host = host; + memcpy(&call->a_args, 
resp, sizeof(*resp)); + + if (nlmsvc_async_call(call, proc, nlm4svc_callback_exit) < 0) + return rpc_system_err; + + return rpc_success; +} + +static void +nlm4svc_callback_exit(struct rpc_task *task) +{ + struct nlm_rqst *call = (struct nlm_rqst *) task->tk_calldata; + + if (task->tk_status < 0) { + dprintk("lockd: %4d callback failed (errno = %d)\n", + task->tk_pid, -task->tk_status); + } + nlm_release_host(call->a_host); + rpc_free(call); +} + +/* + * NLM Server procedures. + */ + +#define nlm4svc_encode_norep nlm4svc_encode_void +#define nlm4svc_decode_norep nlm4svc_decode_void +#define nlm4svc_decode_testres nlm4svc_decode_void +#define nlm4svc_decode_lockres nlm4svc_decode_void +#define nlm4svc_decode_unlockres nlm4svc_decode_void +#define nlm4svc_decode_cancelres nlm4svc_decode_void +#define nlm4svc_decode_grantedres nlm4svc_decode_void + +#define nlm4svc_proc_none nlm4svc_proc_null +#define nlm4svc_proc_test_res nlm4svc_proc_null +#define nlm4svc_proc_lock_res nlm4svc_proc_null +#define nlm4svc_proc_cancel_res nlm4svc_proc_null +#define nlm4svc_proc_unlock_res nlm4svc_proc_null +#define nlm4svc_proc_granted_res nlm4svc_proc_null + +struct nlm_void { int dummy; }; + +#define PROC(name, xargt, xrest, argt, rest) \ + { (svc_procfunc) nlm4svc_proc_##name, \ + (kxdrproc_t) nlm4svc_decode_##xargt, \ + (kxdrproc_t) nlm4svc_encode_##xrest, \ + NULL, \ + sizeof(struct nlm_##argt), \ + sizeof(struct nlm_##rest), \ + 0, \ + 0 \ + } +struct svc_procedure nlmsvc_procedures4[] = { + PROC(null, void, void, void, void), + PROC(test, testargs, testres, args, res), + PROC(lock, lockargs, res, args, res), + PROC(cancel, cancargs, res, args, res), + PROC(unlock, unlockargs, res, args, res), + PROC(granted, testargs, res, args, res), + PROC(test_msg, testargs, norep, args, void), + PROC(lock_msg, lockargs, norep, args, void), + PROC(cancel_msg, cancargs, norep, args, void), + PROC(unlock_msg, unlockargs, norep, args, void), + PROC(granted_msg, testargs, norep, args, void), + 
PROC(test_res, testres, norep, res, void), + PROC(lock_res, lockres, norep, res, void), + PROC(cancel_res, cancelres, norep, res, void), + PROC(unlock_res, unlockres, norep, res, void), + PROC(granted_res, grantedres, norep, res, void), + PROC(none, void, void, void, void), + PROC(none, void, void, void, void), + PROC(none, void, void, void, void), + PROC(none, void, void, void, void), + PROC(share, shareargs, shareres, args, res), + PROC(unshare, shareargs, shareres, args, res), + PROC(nm_lock, lockargs, res, args, res), + PROC(free_all, notify, void, args, void), + + /* statd callback */ + PROC(sm_notify, reboot, void, reboot, void), +}; diff -Naur pre9/linux/fs/lockd/svclock.c test/linux/fs/lockd/svclock.c --- pre9/linux/fs/lockd/svclock.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/lockd/svclock.c Mon Sep 18 14:35:50 2000 @@ -26,6 +26,7 @@ #include <linux/sunrpc/clnt.h> #include <linux/sunrpc/svc.h> #include <linux/lockd/nlm.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/lockd.h> @@ -98,9 +99,10 @@ lock->fl.fl_end, lock->fl.fl_type); for (head = &nlm_blocked; (block = *head); head = &block->b_next) { fl = &block->b_call.a_args.lock.fl; - dprintk(" check f=%p pd=%d %ld-%ld ty=%d\n", + dprintk("lockd: check f=%p pd=%d %ld-%ld ty=%d cookie=%x\n", block->b_file, fl->fl_pid, fl->fl_start, - fl->fl_end, fl->fl_type); + fl->fl_end, fl->fl_type, + *(u32 *)(&block->b_call.a_args.cookie.data)); if (block->b_file == file && nlm_compare_locks(fl, &lock->fl)) { if (remove) *head = block->b_next; @@ -129,6 +131,8 @@ struct nlm_block *block; for (block = nlm_blocked; block; block = block->b_next) { + dprintk("cookie: head of blocked queue %p, block %p\n", + nlm_blocked, block); if (nlm_cookie_match(&block->b_call.a_args.cookie,cookie)) break; } @@ -280,6 +284,7 @@ { struct file_lock *conflock; struct nlm_block *block; + struct inode *inode = file->f_file.f_dentry->d_inode; int error; dprintk("lockd: nlmsvc_lock(%04x/%ld, ty=%d, pi=%d, %ld-%ld, bl=%d)\n", @@ -289,6 
+294,10 @@ lock->fl.fl_start, lock->fl.fl_end, wait); + + /* Checking for read only file system */ + if (IS_RDONLY(inode)) + return nlm4_rofs; /* Lock file against concurrent access */ down(&file->f_sema); @@ -301,7 +310,7 @@ again: if (!(conflock = posix_test_lock(&file->f_file, &lock->fl))) { error = posix_lock_file(&file->f_file, &lock->fl, 0); - + if (block) nlmsvc_delete_block(block, 0); up(&file->f_sema); @@ -309,18 +318,19 @@ dprintk("lockd: posix_lock_file returned %d\n", -error); switch(-error) { case 0: - return nlm_granted; - case EDEADLK: /* no applicable NLM status */ + return nlm4_granted; + case EDEADLK: + return nlm4_deadlock; case EAGAIN: - return nlm_lck_denied; + return nlm4_lck_denied; default: /* includes ENOLCK */ - return nlm_lck_denied_nolocks; + return nlm4_lck_denied_nolocks; } } if (!wait) { up(&file->f_sema); - return nlm_lck_denied; + return nlm4_lck_denied; } /* If we don't have a block, create and initialize it. Then @@ -328,7 +338,7 @@ if (block == NULL) { dprintk("lockd: blocking on this lock (allocating).\n"); if (!(block = nlmsvc_create_block(rqstp, file, lock, cookie))) - return nlm_lck_denied_nolocks; + return nlm4_lck_denied_nolocks; goto again; } @@ -343,7 +353,7 @@ } up(&file->f_sema); - return nlm_lck_blocked; + return nlm4_lck_blocked; } /* @@ -364,14 +374,15 @@ if ((fl = posix_test_lock(&file->f_file, &lock->fl)) != NULL) { dprintk("lockd: conflicting lock(ty=%d, %ld-%ld)\n", - fl->fl_type, fl->fl_start, fl->fl_end); + fl->fl_type, fl->fl_start, fl->fl_end ); + conflock->caller = "somehost"; /* FIXME */ conflock->oh.len = 0; /* don't return OH info */ conflock->fl = *fl; - return nlm_lck_denied; + return nlm4_lck_denied; } - return nlm_granted; + return nlm4_granted; } /* @@ -399,7 +410,7 @@ lock->fl.fl_type = F_UNLCK; error = posix_lock_file(&file->f_file, &lock->fl, 0); - return (error < 0)? nlm_lck_denied_nolocks : nlm_granted; + return (error < 0)? 
nlm4_lck_denied_nolocks : nlm4_granted; } /* @@ -425,7 +436,7 @@ if ((block = nlmsvc_lookup_block(file, lock, 1)) != NULL) nlmsvc_delete_block(block, 1); up(&file->f_sema); - return nlm_granted; + return nlm4_granted; } /* @@ -541,6 +552,8 @@ unsigned long timeout; dprintk("lockd: GRANT_MSG RPC callback\n"); + dprintk("callback: looking for cookie %x \n", + *(u32 *)(call->a_args.cookie.data)); if (!(block = nlmsvc_find_block(&call->a_args.cookie))) { dprintk("lockd: no block for cookie %x\n", *(u32 *)(call->a_args.cookie.data)); return; @@ -616,7 +629,7 @@ dprintk("nlmsvc_retry_blocked(%p, when=%ld)\n", nlm_blocked, nlm_blocked? nlm_blocked->b_when : 0); - while ((block = nlm_blocked) && block->b_when < jiffies) { + while ((block = nlm_blocked) && block->b_when <= jiffies) { dprintk("nlmsvc_retry_blocked(%p, when=%ld, done=%d)\n", block, block->b_when, block->b_done); if (block->b_done) @@ -627,6 +640,5 @@ if ((block = nlm_blocked) && block->b_when != NLM_NEVER) return (block->b_when - jiffies); - return MAX_SCHEDULE_TIMEOUT; } diff -Naur pre9/linux/fs/lockd/svcproc.c test/linux/fs/lockd/svcproc.c --- pre9/linux/fs/lockd/svcproc.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/lockd/svcproc.c Mon Sep 18 14:35:50 2000 @@ -14,6 +14,8 @@ #include <linux/sunrpc/svc.h> #include <linux/sunrpc/clnt.h> #include <linux/nfsd/nfsd.h> +#include <linux/lockd/xdr.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/lockd.h> #include <linux/lockd/share.h> #include <linux/lockd/sm_inter.h> @@ -23,6 +25,7 @@ static u32 nlmsvc_callback(struct svc_rqst *, u32, struct nlm_res *); static void nlmsvc_callback_exit(struct rpc_task *); +static u32 cast_to_nlm(u32, u32); /* * Obtain client and file from arguments @@ -93,6 +96,7 @@ { struct nlm_host *host; struct nlm_file *file; + u32 status; dprintk("lockd: TEST called\n"); resp->cookie = argp->cookie; @@ -108,9 +112,12 @@ return rpc_success; /* Now check for conflicting locks */ - resp->status = nlmsvc_testlock(file, &argp->lock, 
&resp->lock); + status = nlmsvc_testlock(file, &argp->lock, &resp->lock); + dprintk("test: status before %d\n", ntohl(status)); + resp->status = cast_to_nlm(status, rqstp->rq_vers); - dprintk("lockd: TEST status %d\n", ntohl(resp->status)); + dprintk("lockd: TEST status %d vers %d\n", + ntohl(resp->status), rqstp->rq_vers); nlm_release_host(host); nlm_release_file(file); return rpc_success; @@ -122,6 +129,7 @@ { struct nlm_host *host; struct nlm_file *file; + u32 status; dprintk("lockd: LOCK called\n"); @@ -150,8 +158,9 @@ #endif /* Now try to lock the file */ - resp->status = nlmsvc_lock(rqstp, file, &argp->lock, - argp->block, &argp->cookie); + status = nlmsvc_lock(rqstp, file, &argp->lock, + argp->block, &argp->cookie); + resp->status = cast_to_nlm(status, rqstp->rq_vers); dprintk("lockd: LOCK status %d\n", ntohl(resp->status)); nlm_release_host(host); @@ -165,6 +174,7 @@ { struct nlm_host *host; struct nlm_file *file; + u32 status; dprintk("lockd: CANCEL called\n"); @@ -181,7 +191,8 @@ return rpc_success; /* Try to cancel request. 
*/ - resp->status = nlmsvc_cancel_blocked(file, &argp->lock); + status = nlmsvc_cancel_blocked(file, &argp->lock); + resp->status = cast_to_nlm(status, rqstp->rq_vers); dprintk("lockd: CANCEL status %d\n", ntohl(resp->status)); nlm_release_host(host); @@ -198,6 +209,7 @@ { struct nlm_host *host; struct nlm_file *file; + u32 status; dprintk("lockd: UNLOCK called\n"); @@ -214,7 +226,8 @@ return rpc_success; /* Now try to remove the lock */ - resp->status = nlmsvc_unlock(file, &argp->lock); + status = nlmsvc_unlock(file, &argp->lock); + resp->status = cast_to_nlm(status, rqstp->rq_vers); dprintk("lockd: UNLOCK status %d\n", ntohl(resp->status)); nlm_release_host(host); @@ -322,6 +335,7 @@ { struct nlm_host *host; struct nlm_file *file; + u32 status; dprintk("lockd: SHARE called\n"); @@ -338,7 +352,8 @@ return rpc_success; /* Now try to create the share */ - resp->status = nlmsvc_share_file(host, file, argp); + status = nlmsvc_share_file(host, file, argp); + resp->status = cast_to_nlm(status, rqstp->rq_vers); dprintk("lockd: SHARE status %d\n", ntohl(resp->status)); nlm_release_host(host); @@ -355,6 +370,7 @@ { struct nlm_host *host; struct nlm_file *file; + u32 status; dprintk("lockd: UNSHARE called\n"); @@ -371,7 +387,8 @@ return rpc_success; /* Now try to lock the file */ - resp->status = nlmsvc_unshare_file(host, file, argp); + status = nlmsvc_unshare_file(host, file, argp); + resp->status = cast_to_nlm(status, rqstp->rq_vers); dprintk("lockd: UNSHARE status %d\n", ntohl(resp->status)); nlm_release_host(host); @@ -495,6 +512,27 @@ kfree(call); } +static u32 +cast_to_nlm(u32 status, u32 vers) +{ + + if (vers != 4){ + switch(ntohl(status)){ + case NLM_LCK_GRANTED: + case NLM_LCK_DENIED: + case NLM_LCK_DENIED_NOLOCKS: + case NLM_LCK_BLOCKED: + case NLM_LCK_DENIED_GRACE_PERIOD: + break; + default: + status = htonl(NLM_LCK_DENIED_NOLOCKS); + } + } + + return (status); + +} + /* * NLM Server procedures. 
*/ diff -Naur pre9/linux/fs/lockd/svcshare.c test/linux/fs/lockd/svcshare.c --- pre9/linux/fs/lockd/svcshare.c Mon Apr 7 11:35:30 1997 +++ test/linux/fs/lockd/svcshare.c Mon Sep 18 14:35:50 2000 @@ -12,6 +12,7 @@ #include <linux/sunrpc/clnt.h> #include <linux/sunrpc/svc.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/lockd.h> #include <linux/lockd/share.h> @@ -35,13 +36,13 @@ goto update; if ((argp->fsm_access & share->s_mode) || (argp->fsm_mode & share->s_access )) - return nlm_lck_denied; + return nlm4_lck_denied; } share = (struct nlm_share *) kmalloc(sizeof(*share) + oh->len, GFP_KERNEL); if (share == NULL) - return nlm_lck_denied_nolocks; + return nlm4_lck_denied_nolocks; /* Copy owner handle */ ohdata = (u8 *) (share + 1); @@ -58,7 +59,7 @@ update: share->s_access = argp->fsm_access; share->s_mode = argp->fsm_mode; - return nlm_granted; + return nlm4_granted; } /* @@ -75,13 +76,13 @@ if (share->s_host == host && nlm_cmp_owner(share, oh)) { *shpp = share->s_next; kfree(share); - return nlm_granted; + return nlm4_granted; } } /* X/Open spec says return success even if there was no * corresponding share. 
*/ - return nlm_granted; + return nlm4_granted; } /* diff -Naur pre9/linux/fs/lockd/svcsubs.c test/linux/fs/lockd/svcsubs.c --- pre9/linux/fs/lockd/svcsubs.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/lockd/svcsubs.c Mon Sep 18 14:35:50 2000 @@ -13,6 +13,7 @@ #include <linux/sunrpc/clnt.h> #include <linux/nfsd/nfsfh.h> #include <linux/nfsd/export.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/lockd.h> #include <linux/lockd/share.h> #include <linux/lockd/sm_inter.h> @@ -69,7 +70,7 @@ dprintk("lockd: creating file for %s/%u\n", kdevname(u32_to_kdev_t(fh->fh_dev)), fh->fh_ino); - nfserr = nlm_lck_denied_nolocks; + nfserr = nlm4_lck_denied_nolocks; file = (struct nlm_file *) kmalloc(sizeof(*file), GFP_KERNEL); if (!file) goto out_unlock; @@ -104,7 +105,10 @@ out_free: kfree(file); - nfserr = nlm_lck_denied; + if (nfserr == 1) + nfserr = nlm4_stale_fh; + else + nfserr = nlm4_lck_denied; goto out_unlock; } diff -Naur pre9/linux/fs/lockd/xdr.c test/linux/fs/lockd/xdr.c --- pre9/linux/fs/lockd/xdr.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/lockd/xdr.c Mon Sep 18 14:35:50 2000 @@ -16,8 +16,10 @@ #include <linux/sunrpc/clnt.h> #include <linux/sunrpc/svc.h> #include <linux/sunrpc/stats.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/lockd.h> #include <linux/lockd/sm_inter.h> +#include <linux/nfs2.h> #define NLMDBG_FACILITY NLMDBG_XDR #define NLM_MAXSTRLEN 1024 @@ -51,6 +53,17 @@ nlm_lck_blocked = htonl(NLM_LCK_BLOCKED); nlm_lck_denied_grace_period = htonl(NLM_LCK_DENIED_GRACE_PERIOD); + nlm4_granted = htonl(NLM_LCK_GRANTED); + nlm4_lck_denied = htonl(NLM_LCK_DENIED); + nlm4_lck_denied_nolocks = htonl(NLM_LCK_DENIED_NOLOCKS); + nlm4_lck_blocked = htonl(NLM_LCK_BLOCKED); + nlm4_lck_denied_grace_period = htonl(NLM_LCK_DENIED_GRACE_PERIOD); + nlm4_deadlock = htonl(NLM_DEADLCK); + nlm4_rofs = htonl(NLM_ROFS); + nlm4_stale_fh = htonl(NLM_STALE_FH); + nlm4_fbig = htonl(NLM_FBIG); + nlm4_failed = htonl(NLM_FAILED); + inited = 1; nlm_register_stats(); @@ -559,7 
+572,7 @@ (kxdrproc_t) nlmclt_encode_##argtype, \ (kxdrproc_t) nlmclt_decode_##restype, \ MAX(NLM_##argtype##_sz, NLM_##restype##_sz) << 2, \ - 0 \ + 0 \ } static struct rpc_procinfo nlm_procedures[] = { diff -Naur pre9/linux/fs/lockd/xdr4.c test/linux/fs/lockd/xdr4.c --- pre9/linux/fs/lockd/xdr4.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/lockd/xdr4.c Mon Sep 18 14:35:50 2000 @@ -1,5 +1,5 @@ /* - * linux/fs/lockd/xdr.c + * linux/fs/lockd/xdr4.c * * XDR support for lockd and the lock client. * @@ -18,6 +18,7 @@ #include <linux/sunrpc/stats.h> #include <linux/lockd/lockd.h> #include <linux/lockd/sm_inter.h> +#include <linux/nfs3.h> #define NLMDBG_FACILITY NLMDBG_XDR #define NLM_MAXSTRLEN 1024 @@ -25,6 +26,10 @@ #define QUADLEN(len) (((len) + 3) >> 2) +u32 nlm4_granted, nlm4_lck_denied, nlm4_lck_denied_nolocks, + nlm4_lck_blocked, nlm4_lck_denied_grace_period, nlm4_deadlock, + nlm4_rofs, nlm4_stale_fh, nlm4_fbig, nlm4_failed; + typedef struct nlm_args nlm_args; @@ -170,11 +175,13 @@ static u32 * nlm4_encode_testres(u32 *p, struct nlm_res *resp) { + + dprintk("xdr: before encode_testres (p %p resp %p)\n", p, resp); if (!(p = nlm4_encode_cookie(p, &resp->cookie))) return 0; *p++ = resp->status; - if (resp->status == nlm_lck_denied) { + if (resp->status == nlm4_lck_denied) { struct file_lock *fl = &resp->lock.fl; *p++ = (fl->fl_type == F_RDLCK)? 
xdr_zero : xdr_one; @@ -189,8 +196,12 @@ p = xdr_encode_hyper(p, 0); else p = xdr_encode_hyper(p, fl->fl_end - fl->fl_start + 1); + dprintk("xdr: encode_testres (status %d pid %d type %d start %ld end %ld)\n", resp->status, fl->fl_pid, fl->fl_type, fl->fl_start, fl->fl_end); + + } + dprintk("xdr: after encode_testres (p %p resp %p)\n", p, resp); return p; } @@ -540,7 +551,7 @@ (kxdrproc_t) nlm4clt_encode_##argtype, \ (kxdrproc_t) nlm4clt_decode_##restype, \ MAX(NLM4_##argtype##_sz, NLM4_##restype##_sz) << 2, \ - 0 \ + 0 \ } static struct rpc_procinfo nlm4_procedures[] = { diff -Naur pre9/linux/fs/nfs/inode.c test/linux/fs/nfs/inode.c --- pre9/linux/fs/nfs/inode.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/nfs/inode.c Mon Sep 18 14:35:50 2000 @@ -177,10 +177,9 @@ nfs_reqlist_free(server); -#if 0 if (!(server->flags & NFS_MOUNT_NONLM)) lockd_down(); /* release rpc.lockd */ -#endif + rpciod_down(); /* release rpciod */ kfree(server->hostname); @@ -507,11 +506,10 @@ /* We're airborne */ unlock_super(sb); -#if 0 /* Check whether to start the lockd process */ if (!(server->flags & NFS_MOUNT_NONLM)) lockd_up(); -#endif + return sb; /* Yargs. It didn't work out. */ diff -Naur pre9/linux/fs/nfs/nfsroot.c test/linux/fs/nfs/nfsroot.c --- pre9/linux/fs/nfs/nfsroot.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/nfs/nfsroot.c Mon Sep 18 14:35:50 2000 @@ -243,9 +243,12 @@ memset(&nfs_data, 0, sizeof(nfs_data)); nfs_port = -1; nfs_data.version = NFS_MOUNT_VERSION; - /* It is ok to have lockd in nfs root since it will be started - later manually in the rc script. */ - nfs_data.flags = 0; + /* + * dhiggen: nobody has yet demonstrated a convincing need for NFS root + * locking, so I am reverting to automatic lockd start and disabling + * NLM locking on an NFS root. 
+ */ + nfs_data.flags = NFS_MOUNT_NONLM; nfs_data.timeo = 7; nfs_data.retrans = 3; nfs_data.acregmin = 3; diff -Naur pre9/linux/fs/nfsd/Makefile test/linux/fs/nfsd/Makefile --- pre9/linux/fs/nfsd/Makefile Tue Dec 29 11:42:25 1998 +++ test/linux/fs/nfsd/Makefile Mon Sep 18 14:35:50 2000 @@ -11,6 +11,9 @@ O_OBJS := nfssvc.o nfsctl.o nfsproc.o nfsfh.o vfs.o \ export.o auth.o lockd.o nfscache.o nfsxdr.o \ stats.o +ifdef CONFIG_NFSD_V3 + O_OBJS += nfs3proc.o nfs3xdr.o +endif M_OBJS := $(O_TARGET) diff -Naur pre9/linux/fs/nfsd/export.c test/linux/fs/nfsd/export.c --- pre9/linux/fs/nfsd/export.c Tue Oct 26 17:53:42 1999 +++ test/linux/fs/nfsd/export.c Mon Sep 18 14:35:50 2000 @@ -105,20 +105,6 @@ return exp; } -/* - * Check whether there are any exports for a device. - */ -static int -exp_device_in_use(kdev_t dev) -{ - struct svc_client *clp; - - for (clp = clients; clp; clp = clp->cl_next) { - if (exp_find(clp, dev)) - return 1; - } - return 0; -} /* * Look up the device of the parent fs. @@ -168,9 +154,8 @@ } } while (NULL != (exp = exp->ex_next)); } while (nfsd_parentdev(&xdev)); - if (xdentry == xdentry->d_parent) { + if (IS_ROOT(xdentry)) break; - } } while ((xdentry = xdentry->d_parent)); exp = NULL; out: @@ -204,7 +189,7 @@ #endif goto out; } - if (ndentry == ndentry->d_parent) + if (IS_ROOT(ndentry)) break; } } while (NULL != (exp = exp->ex_next)); @@ -287,6 +272,12 @@ goto finish; err = -EINVAL; + if (!(inode->i_sb->s_type->fs_flags & FS_REQUIRES_DEV) || + inode->i_sb->s_op->read_inode == NULL) { + dprintk("exp_export: export of invalid fs type.\n"); + goto finish; + } + if ((parent = exp_child(clp, dev, dentry)) != NULL) { dprintk("exp_export: export not valid (Rule 3).\n"); goto finish; @@ -366,16 +357,6 @@ exp->ex_parent = unexp->ex_parent; } - /* - * Check whether this is the last export for this device, - * and if so flush any cached dentries. 
- */ - if (!exp_device_in_use(unexp->ex_dev)) { -printk("exp_do_unexport: %s last use, flushing cache\n", - kdevname(unexp->ex_dev)); - nfsd_fh_flush(unexp->ex_dev); - } - dentry = unexp->ex_dentry; inode = dentry->d_inode; if (unexp->ex_dev != inode->i_dev || unexp->ex_ino != inode->i_ino) @@ -628,7 +609,9 @@ { NFSEXP_UIDMAP, {"uidmap", ""}}, { NFSEXP_KERBEROS, { "kerberos", ""}}, { NFSEXP_SUNSECURE, { "sunsecure", ""}}, - { NFSEXP_CROSSMNT, {"crossmnt", ""}}, + { NFSEXP_CROSSMNT, {"nohide", ""}}, + { NFSEXP_NOSUBTREECHECK, {"no_subtree_check", ""}}, + { NFSEXP_NOAUTHNLM, {"no_auth_nlm", ""}}, { 0, {"", ""}} }; diff -Naur pre9/linux/fs/nfsd/lockd.c test/linux/fs/nfsd/lockd.c --- pre9/linux/fs/nfsd/lockd.c Mon Aug 31 11:03:38 1998 +++ test/linux/fs/nfsd/lockd.c Mon Sep 18 14:35:50 2000 @@ -30,11 +30,21 @@ fh.fh_handle = *f; fh.fh_export = NULL; - nfserr = nfsd_open(rqstp, &fh, S_IFREG, 0, filp); + nfserr = nfsd_open(rqstp, &fh, S_IFREG, MAY_LOCK, filp); if (!nfserr) dget(filp->f_dentry); fh_put(&fh); - return nfserr; + /* nlm and nfsd don't share error codes. + * we invent: 0 = no error + * 1 = stale file handle + * 2 = other error + */ + if (nfserr == 0) + return 0; + else if (nfserr == nfserr_stale) + return 1; + else return 2; + } static void diff -Naur pre9/linux/fs/nfsd/nfs3proc.c test/linux/fs/nfsd/nfs3proc.c --- pre9/linux/fs/nfsd/nfs3proc.c Mon Apr 12 10:07:36 1999 +++ test/linux/fs/nfsd/nfs3proc.c Mon Sep 18 15:03:53 2000 @@ -3,7 +3,7 @@ * * Process version 3 NFS requests. 
* - * Copyright (C) 1996 Olaf Kirch <okir@monad.swb.de> + * Copyright (C) 1996, 1997, 1998 Olaf Kirch <okir@monad.swb.de> */ #include <linux/linkage.h> @@ -18,19 +18,33 @@ #include <linux/version.h> #include <linux/unistd.h> #include <linux/malloc.h> +#include <linux/major.h> #include <linux/sunrpc/svc.h> #include <linux/nfsd/nfsd.h> #include <linux/nfsd/cache.h> #include <linux/nfsd/xdr3.h> - -typedef struct svc_rqst svc_rqst; -typedef struct svc_buf svc_buf; +#include <linux/nfs3.h> +#include <linux/ext2_fs.h> #define NFSDDBG_FACILITY NFSDDBG_PROC #define RETURN(st) { resp->status = (st); return (st); } +static int nfs3_ftypes[] = { + 0, /* NF3NON */ + S_IFREG, /* NF3REG */ + S_IFDIR, /* NF3DIR */ + S_IFBLK, /* NF3BLK */ + S_IFCHR, /* NF3CHR */ + S_IFLNK, /* NF3LNK */ + S_IFSOCK, /* NF3SOCK */ + S_IFIFO, /* NF3FIFO */ +}; + +/* + * Reserve room in the send buffer + */ static void svcbuf_reserve(struct svc_buf *buf, u32 **ptr, int *len, int nr) { @@ -38,6 +52,9 @@ *len = buf->buflen - buf->len - nr; } +/* + * NULL call. + */ static int nfsd3_proc_null(struct svc_rqst *rqstp, void *argp, void *resp) { @@ -46,7 +63,6 @@ /* * Get a file's attributes - * N.B. After this call resp->fh needs an fh_put */ static int nfsd3_proc_getattr(struct svc_rqst *rqstp, struct nfsd_fhandle *argp, @@ -54,18 +70,17 @@ { int nfserr; - dprintk("nfsd: GETATTR %x/%ld\n", + dprintk("nfsd: GETATTR(3) %x/%ld\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh)); + (long)SVCFH_INO(&argp->fh)); - resp->fh = argp->fh; + fh_copy(&resp->fh, &argp->fh); nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_NOP); RETURN(nfserr); } /* * Set a file's attributes - * N.B. 
After this call resp->fh needs an fh_put */ static int nfsd3_proc_setattr(struct svc_rqst *rqstp, struct nfsd3_sattrargs *argp, @@ -73,31 +88,30 @@ { int nfserr; - dprintk("nfsd: SETATTR %x/%ld\n", + dprintk("nfsd: SETATTR(3) %x/%ld\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh)); + (long)SVCFH_INO(&argp->fh)); - resp->fh = argp->fh; + fh_copy(&resp->fh, &argp->fh); nfserr = nfsd_setattr(rqstp, &resp->fh, &argp->attrs); RETURN(nfserr); } /* * Look up a path name component - * N.B. After this call _both_ resp->dirfh and resp->fh need an fh_put */ static int nfsd3_proc_lookup(struct svc_rqst *rqstp, struct nfsd3_diropargs *argp, - struct nfsd3_lookupres *resp) + struct nfsd3_diropres *resp) { int nfserr; - dprintk("nfsd: LOOKUP %x/%ld %s\n", + dprintk("nfsd: LOOKUP(3) %x/%ld %s\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), argp->name); - resp->dirfh = argp->fh; + fh_copy(&resp->dirfh, &argp->fh); nfserr = nfsd_lookup(rqstp, &resp->dirfh, argp->name, argp->len, @@ -109,12 +123,20 @@ * Check file access */ static int -nfsd3_proc_access(struct svc_rqst *rqstp, struct nfsd_fhandle *argp, +nfsd3_proc_access(struct svc_rqst *rqstp, struct nfsd3_accessargs *argp, struct nfsd3_accessres *resp) { - /* to be done */ - resp->fh = argp->fh; - return nfserr_notsupp; + int nfserr; + + dprintk("nfsd: ACCESS(3) %x/%ld 0x%x\n", + SVCFH_DEV(&argp->fh), + (long)SVCFH_INO(&argp->fh), + argp->access); + + fh_copy(&resp->fh, &argp->fh); + resp->access = argp->access; + nfserr = nfsd_access(rqstp, &resp->fh, &resp->access); + RETURN(nfserr); } /* @@ -127,23 +149,23 @@ u32 *path; int dummy, nfserr; - dprintk("nfsd: READLINK %x/%ld\n", + dprintk("nfsd: READLINK(3) %x/%ld\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh)); + (long)SVCFH_INO(&argp->fh)); /* Reserve room for status, post_op_attr, and path length */ - svcbuf_reserve(&rqstp->rq_resbuf, &path, &dummy, 1 + 22 + 1); + svcbuf_reserve(&rqstp->rq_resbuf, &path, &dummy, + 1 + NFS3_POST_OP_ATTR_WORDS + 
1); /* Read the symlink. */ + fh_copy(&resp->fh, &argp->fh); resp->len = NFS3_MAXPATHLEN; - nfserr = nfsd_readlink(rqstp, &argp->fh, (char *) path, &resp->len); - fh_put(&argp->fh); + nfserr = nfsd_readlink(rqstp, &resp->fh, (char *) path, &resp->len); RETURN(nfserr); } /* * Read a portion of a file. - * N.B. After this call resp->fh needs an fh_put */ static int nfsd3_proc_read(struct svc_rqst *rqstp, struct nfsd3_readargs *argp, @@ -152,9 +174,9 @@ u32 * buffer; int nfserr, avail; - dprintk("nfsd: READ %x/%ld %lu bytes at %lu\n", + dprintk("nfsd: READ(3) %x/%ld %lu bytes at %lu\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), (unsigned long) argp->count, (unsigned long) argp->offset); @@ -162,30 +184,29 @@ * 1 (status) + 22 (post_op_attr) + 1 (count) + 1 (eof) * + 1 (xdr opaque byte count) = 26 */ - svcbuf_reserve(&rqstp->rq_resbuf, &buffer, &avail, 26); - - if ((avail << 2) < argp->count) { - printk(KERN_NOTICE - "oversized read request from %08lx:%d (%d bytes)\n", - ntohl(rqstp->rq_addr.sin_addr.s_addr), - ntohs(rqstp->rq_addr.sin_port), - argp->count); - argp->count = avail; - } + svcbuf_reserve(&rqstp->rq_resbuf, &buffer, &avail, + 1 + NFS3_POST_OP_ATTR_WORDS + 3); resp->count = argp->count; - resp->fh = argp->fh; + if ((avail << 2) < resp->count) + resp->count = avail << 2; + + fh_copy(&resp->fh, &argp->fh); nfserr = nfsd_read(rqstp, &resp->fh, argp->offset, (char *) buffer, &resp->count); + if (nfserr == 0) { + struct inode *inode = resp->fh.fh_dentry->d_inode; + + resp->eof = (argp->offset + resp->count) >= inode->i_size; + } RETURN(nfserr); } /* * Write data to a file - * N.B. 
After this call resp->fh needs an fh_put */ static int nfsd3_proc_write(struct svc_rqst *rqstp, struct nfsd3_writeargs *argp, @@ -193,19 +214,21 @@ { int nfserr; - dprintk("nfsd: WRITE %x/%ld %d bytes at %ld\n", + dprintk("nfsd: WRITE(3) %x/%ld %d bytes at %ld%s\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), argp->len, - (unsigned long) argp->offset); + (unsigned long) argp->offset, + argp->stable? " stable" : ""); - resp->fh = argp->fh; + fh_copy(&resp->fh, &argp->fh); nfserr = nfsd_write(rqstp, &resp->fh, argp->offset, argp->data, argp->len, argp->stable); resp->committed = argp->stable; + resp->count = argp->count; RETURN(nfserr); } @@ -213,20 +236,18 @@ * With NFSv3, CREATE processing is a lot easier than with NFSv2. * At least in theory; we'll see how it fares in practice when the * first reports about SunOS compatibility problems start to pour in... - * N.B. After this call _both_ resp->dirfh and resp->fh need an fh_put */ static int nfsd3_proc_create(struct svc_rqst *rqstp, struct nfsd3_createargs *argp, - struct nfsd3_createres *resp) + struct nfsd3_diropres *resp) { svc_fh *dirfhp, *newfhp = NULL; struct iattr *attr; - int mode; u32 nfserr; - dprintk("nfsd: CREATE %x/%ld %s\n", + dprintk("nfsd: CREATE(3) %x/%ld %s\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), argp->name); dirfhp = fh_copy(&resp->dirfh, &argp->fh); @@ -243,131 +264,114 @@ if (!(attr->ia_valid & ATTR_MODE)) { attr->ia_valid |= ATTR_MODE; attr->ia_mode = S_IFREG; + } else { + attr->ia_mode = (attr->ia_mode & ~S_IFMT) | S_IFREG; } - mode = attr->ia_mode & ~S_IFMT; /* Now create the file and set attributes */ - nfserr = nfsd_create(rqstp, dirfhp, argp->name, argp->len, - attr, S_IFREG, 0, newfhp); + nfserr = nfsd_create_v3(rqstp, dirfhp, argp->name, argp->len, + attr, newfhp, + argp->createmode, argp->verf); RETURN(nfserr); } -/* N.B. Is nfsd3_attrstat * correct for resp?? table says "void" */ +/* + * Make directory. 
This operation is not idempotent. + */ static int -nfsd3_proc_remove(struct svc_rqst *rqstp, struct nfsd3_diropargs *argp, - struct nfsd3_attrstat *resp) +nfsd3_proc_mkdir(struct svc_rqst *rqstp, struct nfsd3_createargs *argp, + struct nfsd3_diropres *resp) { int nfserr; - dprintk("nfsd: REMOVE %x/%ld %s\n", + dprintk("nfsd: MKDIR(3) %x/%ld %s\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), argp->name); - /* Is this correct?? */ - fh_copy(&resp->fh, &argp->fh); + argp->attrs.ia_valid &= ~ATTR_SIZE; + fh_copy(&resp->dirfh, &argp->fh); + fh_init(&resp->fh); + nfserr = nfsd_create(rqstp, &resp->dirfh, argp->name, argp->len, + &argp->attrs, S_IFDIR, 0, &resp->fh); - /* Unlink. -S_IFDIR means file must not be a directory */ - nfserr = nfsd_unlink(rqstp, &resp->fh, -S_IFDIR, argp->name, argp->len); - /* - * N.B. Should be an fh_put here ... nfsd3_proc_rmdir has one, - * or else as an xdr release function - */ - fh_put(&resp->fh); RETURN(nfserr); } static int -nfsd3_proc_rename(struct svc_rqst *rqstp, struct nfsd3_renameargs *argp, - void *resp) +nfsd3_proc_symlink(struct svc_rqst *rqstp, struct nfsd3_symlinkargs *argp, + struct nfsd3_diropres *resp) { int nfserr; - dprintk("nfsd: RENAME %x/%ld %s -> %x/%ld %s\n", + dprintk("nfsd: SYMLINK(3) %x/%ld %s -> %s\n", SVCFH_DEV(&argp->ffh), - SVCFH_INO(&argp->ffh), - argp->fname, - SVCFH_DEV(&argp->tfh), - SVCFH_INO(&argp->tfh), - argp->tname); + (long)SVCFH_INO(&argp->ffh), + argp->fname, argp->tname); - nfserr = nfsd_rename(rqstp, &argp->ffh, argp->fname, argp->flen, - &argp->tfh, argp->tname, argp->tlen); - fh_put(&argp->ffh); - fh_put(&argp->tfh); + fh_copy(&resp->dirfh, &argp->ffh); + fh_init(&resp->fh); + nfserr = nfsd_symlink(rqstp, &resp->dirfh, argp->fname, argp->flen, + argp->tname, argp->tlen, + &resp->fh, &argp->attrs); RETURN(nfserr); } +/* + * Make socket/fifo/device. 
+ */ static int -nfsd3_proc_link(struct svc_rqst *rqstp, struct nfsd3_linkargs *argp, - void *resp) +nfsd3_proc_mknod(struct svc_rqst *rqstp, struct nfsd3_mknodargs *argp, + struct nfsd3_diropres *resp) { - int nfserr; + int nfserr, type; + dev_t rdev = 0; - dprintk("nfsd: LINK %x/%ld -> %x/%ld %s\n", - SVCFH_DEV(&argp->ffh), - SVCFH_INO(&argp->ffh), - SVCFH_DEV(&argp->tfh), - SVCFH_INO(&argp->tfh), - argp->tname); - - nfserr = nfsd_link(rqstp, &argp->tfh, argp->tname, argp->tlen, - &argp->ffh); - fh_put(&argp->ffh); - fh_put(&argp->tfh); - RETURN(nfserr); -} - -static int -nfsd3_proc_symlink(struct svc_rqst *rqstp, struct nfsd3_symlinkargs *argp, - void *resp) -{ - struct svc_fh newfh; - int nfserr; + dprintk("nfsd: MKNOD(3) %x/%ld %s\n", + SVCFH_DEV(&argp->fh), + (long)SVCFH_INO(&argp->fh), + argp->name); - dprintk("nfsd: SYMLINK %x/%ld %s -> %s\n", - SVCFH_DEV(&argp->ffh), - SVCFH_INO(&argp->ffh), - argp->fname, argp->tname); + fh_copy(&resp->dirfh, &argp->fh); + fh_init(&resp->fh); - memset(&newfh, 0, sizeof(newfh)); + if (argp->ftype == 0 || argp->ftype >= NF3BAD) + return nfserr_inval; + if (argp->ftype == NF3CHR || argp->ftype == NF3BLK) { + if ((argp->ftype == NF3CHR && argp->major >= MAX_CHRDEV) + || (argp->ftype == NF3BLK && argp->major >= MAX_BLKDEV) + || argp->minor > 0xFF) + return nfserr_inval; + rdev = ((argp->major) << 8) | (argp->minor); + } else + if (argp->ftype != NF3SOCK && argp->ftype != NF3FIFO) + return nfserr_inval; - /* - * Create the link, look up new file and set attrs. - */ - nfserr = nfsd_symlink(rqstp, &argp->ffh, argp->fname, argp->flen, - argp->tname, argp->tlen, - &newfh); - if (!nfserr) { - argp->attrs.ia_valid &= ~ATTR_SIZE; - nfserr = nfsd_setattr(rqstp, &newfh, &argp->attrs); - } + type = nfs3_ftypes[argp->ftype]; + nfserr = nfsd_create(rqstp, &resp->dirfh, argp->name, argp->len, + &argp->attrs, type, rdev, &resp->fh); - fh_put(&argp->ffh); - fh_put(&newfh); RETURN(nfserr); } /* - * Make directory. 
This operation is not idempotent. - * N.B. After this call resp->fh needs an fh_put + * Remove file/fifo/socket etc. */ static int -nfsd3_proc_mkdir(struct svc_rqst *rqstp, struct nfsd3_createargs *argp, - struct nfsd3_diropres *resp) +nfsd3_proc_remove(struct svc_rqst *rqstp, struct nfsd3_diropargs *argp, + struct nfsd3_attrstat *resp) { int nfserr; - dprintk("nfsd: MKDIR %x/%ld %s\n", + dprintk("nfsd: REMOVE(3) %x/%ld %s\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), argp->name); - argp->attrs.ia_valid &= ~ATTR_SIZE; - nfserr = nfsd_create(rqstp, &argp->fh, argp->name, argp->len, - &argp->attrs, S_IFDIR, 0, &resp->fh); - fh_put(&argp->fh); + /* Unlink. -S_IFDIR means file must not be a directory */ + fh_copy(&resp->fh, &argp->fh); + nfserr = nfsd_unlink(rqstp, &resp->fh, -S_IFDIR, argp->name, argp->len); RETURN(nfserr); } @@ -376,17 +380,58 @@ */ static int nfsd3_proc_rmdir(struct svc_rqst *rqstp, struct nfsd3_diropargs *argp, - void *resp) + struct nfsd3_attrstat *resp) { int nfserr; - dprintk("nfsd: RMDIR %x/%ld %s\n", + dprintk("nfsd: RMDIR(3) %x/%ld %s\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), + (long)SVCFH_INO(&argp->fh), argp->name); - nfserr = nfsd_unlink(rqstp, &argp->fh, S_IFDIR, argp->name, argp->len); - fh_put(&argp->fh); + fh_copy(&resp->fh, &argp->fh); + nfserr = nfsd_unlink(rqstp, &resp->fh, S_IFDIR, argp->name, argp->len); + RETURN(nfserr); +} + +static int +nfsd3_proc_rename(struct svc_rqst *rqstp, struct nfsd3_renameargs *argp, + struct nfsd3_renameres *resp) +{ + int nfserr; + + dprintk("nfsd: RENAME(3) %x/%ld %s -> %x/%ld %s\n", + SVCFH_DEV(&argp->ffh), + (long)SVCFH_INO(&argp->ffh), + argp->fname, + SVCFH_DEV(&argp->tfh), + (long)SVCFH_INO(&argp->tfh), + argp->tname); + + fh_copy(&resp->ffh, &argp->ffh); + fh_copy(&resp->tfh, &argp->tfh); + nfserr = nfsd_rename(rqstp, &resp->ffh, argp->fname, argp->flen, + &resp->tfh, argp->tname, argp->tlen); + RETURN(nfserr); +} + +static int +nfsd3_proc_link(struct 
svc_rqst *rqstp, struct nfsd3_linkargs *argp, + struct nfsd3_linkres *resp) +{ + int nfserr; + + dprintk("nfsd: LINK(3) %x/%ld -> %x/%ld %s\n", + SVCFH_DEV(&argp->ffh), + (long)SVCFH_INO(&argp->ffh), + SVCFH_DEV(&argp->tfh), + (long)SVCFH_INO(&argp->tfh), + argp->tname); + + fh_copy(&resp->fh, &argp->ffh); + fh_copy(&resp->tfh, &argp->tfh); + nfserr = nfsd_link(rqstp, &resp->tfh, argp->tname, argp->tlen, + &resp->fh); RETURN(nfserr); } @@ -395,46 +440,85 @@ */ static int nfsd3_proc_readdir(struct svc_rqst *rqstp, struct nfsd3_readdirargs *argp, - struct nfsd3_readdirres *resp) + struct nfsd3_readdirres *resp) +{ + u32 * buffer; + int nfserr, count; + unsigned int want; + + dprintk("nfsd: READDIR(3) %x/%ld %d bytes at %d\n", + SVCFH_DEV(&argp->fh), + (long)SVCFH_INO(&argp->fh), + argp->count, (u32) argp->cookie); + + /* Reserve buffer space for status, attributes and verifier */ + svcbuf_reserve(&rqstp->rq_resbuf, &buffer, &count, + 1 + NFS3_POST_OP_ATTR_WORDS + 2); + + /* Make sure we've room for the NULL ptr & eof flag, and shrink to + * client read size */ + if ((count -= 2) > (want = (argp->count >> 2) - 2)) + count = want; + + /* Read directory and encode entries on the fly */ + fh_copy(&resp->fh, &argp->fh); + nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t) argp->cookie, + nfs3svc_encode_entry, + buffer, &count, argp->verf); + memcpy(resp->verf, argp->verf, 8); + resp->count = count; + + RETURN(nfserr); +} + +/* + * Read a portion of a directory, including file handles and attrs. + * For now, we choose to ignore the dircount parameter. 
+ */ +static int +nfsd3_proc_readdirplus(struct svc_rqst *rqstp, struct nfsd3_readdirargs *argp, + struct nfsd3_readdirres *resp) { u32 * buffer; - int nfserr, count; + int nfserr, count, want; - dprintk("nfsd: READDIR %x/%ld %d bytes at %d\n", + dprintk("nfsd: READDIR+(3) %x/%ld %d bytes at %d\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh), - argp->count, argp->cookie); + (long)SVCFH_INO(&argp->fh), + argp->count, (u32) argp->cookie); - /* Reserve buffer space for status */ - svcbuf_reserve(&rqstp->rq_resbuf, &buffer, &count, 1); + /* Reserve buffer space for status, attributes and verifier */ + svcbuf_reserve(&rqstp->rq_resbuf, &buffer, &count, + 1 + NFS3_POST_OP_ATTR_WORDS + 2); /* Make sure we've room for the NULL ptr & eof flag, and shrink to * client read size */ - if ((count -= 8) > argp->count) - count = argp->count; + if ((count -= 2) > (want = argp->count >> 2)) + count = want; /* Read directory and encode entries on the fly */ - nfserr = nfsd_readdir(rqstp, &argp->fh, (loff_t) argp->cookie, - nfssvc_encode_entry, - buffer, &count); + fh_copy(&resp->fh, &argp->fh); + nfserr = nfsd_readdir(rqstp, &resp->fh, (loff_t) argp->cookie, + nfs3svc_encode_entry_plus, + buffer, &count, argp->verf); + memcpy(resp->verf, argp->verf, 8); resp->count = count; - fh_put(&argp->fh); RETURN(nfserr); } /* - * Get file system info + * Get file system stats */ static int -nfsd3_proc_statfs(struct svc_rqst * rqstp, struct nfsd_fhandle *argp, - struct nfsd3_statfsres *resp) +nfsd3_proc_fsstat(struct svc_rqst * rqstp, struct nfsd_fhandle *argp, + struct nfsd3_fsstatres *resp) { int nfserr; - dprintk("nfsd: STATFS %x/%ld\n", + dprintk("nfsd: FSSTAT(3) %x/%ld\n", SVCFH_DEV(&argp->fh), - SVCFH_INO(&argp->fh)); + (long)SVCFH_INO(&argp->fh)); nfserr = nfsd_statfs(rqstp, &argp->fh, &resp->stats); fh_put(&argp->fh); @@ -442,104 +526,165 @@ } /* - * NFSv2 Server procedures. - * Only the results of non-idempotent operations are cached. 
+ * Get file system info */ -#define nfsd3_proc_none NULL -#define nfssvc_encode_void NULL -#define nfssvc_decode_void NULL -#define nfssvc_release_void NULL -struct nfsd3_void { int dummy; }; +static int +nfsd3_proc_fsinfo(struct svc_rqst * rqstp, struct nfsd_fhandle *argp, + struct nfsd3_fsinfores *resp) +{ + int nfserr; -#define PROC(name, argt, rest, relt, cache) \ - { (svc_procfunc) nfsd3_proc_##name, \ - (kxdrproc_t) nfssvc_decode_##argt, \ - (kxdrproc_t) nfssvc_encode_##rest, \ - (kxdrproc_t) nfssvc_release_##relt, \ - sizeof(struct nfsd3_##argt), \ - sizeof(struct nfsd3_##rest), \ - 0, \ - cache \ - } -struct svc_procedure nfsd3_procedures2[18] = { - PROC(null, void, void, void, RC_NOCACHE), - PROC(getattr, fhandle, attrstat, fhandle, RC_NOCACHE), - PROC(setattr, sattrargs, attrstat, fhandle, RC_REPLBUFF), - PROC(none, void, void, void, RC_NOCACHE), - PROC(lookup, diropargs, diropres, fhandle2,RC_NOCACHE), - PROC(readlink, fhandle, readlinkres, void, RC_NOCACHE), - PROC(read, readargs, readres, fhandle, RC_NOCACHE), - PROC(none, void, void, void, RC_NOCACHE), - PROC(write, writeargs, attrstat, fhandle, RC_REPLBUFF), - PROC(create, createargs, diropres, fhandle2,RC_REPLBUFF), - PROC(remove, diropargs, void,/* ??*/ void, RC_REPLSTAT), - PROC(rename, renameargs, void, void, RC_REPLSTAT), - PROC(link, linkargs, void, void, RC_REPLSTAT), - PROC(symlink, symlinkargs, void, void, RC_REPLSTAT), - PROC(mkdir, createargs, diropres, fhandle, RC_REPLBUFF), - PROC(rmdir, diropargs, void, void, RC_REPLSTAT), - PROC(readdir, readdirargs, readdirres, void, RC_REPLSTAT), - PROC(statfs, fhandle, statfsres, void, RC_NOCACHE), -}; + dprintk("nfsd: FSINFO(3) %x/%ld\n", + SVCFH_DEV(&argp->fh), + (long)SVCFH_INO(&argp->fh)); + resp->f_rtmax = NFSSVC_MAXBLKSIZE; + resp->f_rtpref = NFSSVC_MAXBLKSIZE; + resp->f_rtmult = PAGE_SIZE; + resp->f_wtmax = NFSSVC_MAXBLKSIZE; + resp->f_wtpref = NFSSVC_MAXBLKSIZE; + resp->f_wtmult = PAGE_SIZE; + resp->f_dtpref = PAGE_SIZE; + 
resp->f_maxfilesize = LONG_MAX; + resp->f_properties = NFS3_FSF_DEFAULT; + + nfserr = fh_verify(rqstp, &argp->fh, 0, MAY_NOP); + + /* Check special features of the file system. May request + * different read/write sizes for file systems known to have + * problems with large blocks */ + if (nfserr == 0) { + struct super_block *sb = argp->fh.fh_dentry->d_inode->i_sb; + + /* Note that we don't care for remote fs's here */ + if (sb->s_magic == 0x4d44 /* MSDOS_SUPER_MAGIC */) { + resp->f_properties = NFS3_FSF_BILLYBOY; + } + } + + fh_put(&argp->fh); + RETURN(nfserr); +} /* - * Map errnos to NFS errnos. + * Get pathconf info for the specified file */ -int -nfserrno (int errno) +static int +nfsd3_proc_pathconf(struct svc_rqst * rqstp, struct nfsd_fhandle *argp, + struct nfsd3_pathconfres *resp) { - static struct { - int nfserr; - int syserr; - } nfs_errtbl[] = { - { NFS_OK, 0 }, - { NFSERR_PERM, EPERM }, - { NFSERR_NOENT, ENOENT }, - { NFSERR_IO, EIO }, - { NFSERR_NXIO, ENXIO }, - { NFSERR_ACCES, EACCES }, - { NFSERR_EXIST, EEXIST }, - { NFSERR_NODEV, ENODEV }, - { NFSERR_NOTDIR, ENOTDIR }, - { NFSERR_ISDIR, EISDIR }, - { NFSERR_INVAL, EINVAL }, - { NFSERR_FBIG, EFBIG }, - { NFSERR_NOSPC, ENOSPC }, - { NFSERR_ROFS, EROFS }, - { NFSERR_NAMETOOLONG, ENAMETOOLONG }, - { NFSERR_NOTEMPTY, ENOTEMPTY }, -#ifdef EDQUOT - { NFSERR_DQUOT, EDQUOT }, -#endif - { NFSERR_STALE, ESTALE }, - { NFSERR_WFLUSH, EIO }, - { -1, EIO } - }; - int i; - - for (i = 0; nfs_errtbl[i].nfserr != -1; i++) { - if (nfs_errtbl[i].syserr == errno) - return htonl (nfs_errtbl[i].nfserr); + int nfserr; + + dprintk("nfsd: PATHCONF(3) %x/%ld\n", + SVCFH_DEV(&argp->fh), + (long)SVCFH_INO(&argp->fh)); + + /* Set default pathconf */ + resp->p_link_max = 255; /* at least */ + resp->p_name_max = 255; /* at least */ + resp->p_no_trunc = 0; + resp->p_chown_restricted = 1; + resp->p_case_insensitive = 0; + resp->p_case_preserving = 1; + + nfserr = fh_verify(rqstp, &argp->fh, 0, MAY_NOP); + + if (nfserr == 0) { + struct 
super_block *sb = argp->fh.fh_dentry->d_inode->i_sb; + + /* Note that we don't care for remote fs's here */ + switch (sb->s_magic) { + case EXT2_SUPER_MAGIC: + resp->p_link_max = EXT2_LINK_MAX; + resp->p_name_max = EXT2_NAME_LEN; + break; + case 0x4d44: /* MSDOS_SUPER_MAGIC */ + resp->p_case_insensitive = 1; + resp->p_case_preserving = 0; + break; + } } - printk (KERN_INFO "nfsd: non-standard errno: %d\n", errno); - return nfserr_io; + + fh_put(&argp->fh); + RETURN(nfserr); } -#if 0 -static void -nfsd3_dump(char *tag, u32 *buf, int len) + +/* + * Commit a file (range) to stable storage. + */ +static int +nfsd3_proc_commit(struct svc_rqst * rqstp, struct nfsd3_commitargs *argp, + struct nfsd3_commitres *resp) { - int i; + int nfserr; - printk(KERN_NOTICE - "nfsd: %s (%d words)\n", tag, len); + dprintk("nfsd: COMMIT(3) %x/%ld %d@%ld\n", + SVCFH_DEV(&argp->fh), + (long)SVCFH_INO(&argp->fh), + argp->count, + (unsigned long) argp->offset); - for (i = 0; i < len && i < 32; i += 8) - printk(KERN_NOTICE - " %08lx %08lx %08lx %08lx" - " %08lx %08lx %08lx %08lx\n", - buf[i], buf[i+1], buf[i+2], buf[i+3], - buf[i+4], buf[i+5], buf[i+6], buf[i+7]); + if (argp->offset > NFS_OFFSET_MAX) + return nfserr_inval; + + fh_copy(&resp->fh, &argp->fh); + nfserr = nfsd_commit(rqstp, &resp->fh, argp->offset, argp->count); + + RETURN(nfserr); } -#endif + + +/* + * NFSv3 Server procedures. + * Only the results of non-idempotent operations are cached. 
+ */ +#define nfs3svc_decode_voidargs NULL +#define nfs3svc_release_void NULL +#define nfs3svc_decode_fhandleargs nfs3svc_decode_fhandle +#define nfs3svc_encode_attrstatres nfs3svc_encode_attrstat +#define nfs3svc_encode_wccstatres nfs3svc_encode_wccstat +#define nfsd3_mkdirargs nfsd3_createargs +#define nfsd3_readdirplusargs nfsd3_readdirargs +#define nfsd3_fhandleargs nfsd_fhandle +#define nfsd3_fhandleres nfsd3_attrstat +#define nfsd3_attrstatres nfsd3_attrstat +#define nfsd3_wccstatres nfsd3_attrstat +#define nfsd3_createres nfsd3_diropres +#define nfsd3_voidres nfsd3_voidargs +struct nfsd3_voidargs { int dummy; }; + +#define PROC(name, argt, rest, relt, cache) \ + { (svc_procfunc) nfsd3_proc_##name, \ + (kxdrproc_t) nfs3svc_decode_##argt##args, \ + (kxdrproc_t) nfs3svc_encode_##rest##res, \ + (kxdrproc_t) nfs3svc_release_##relt, \ + sizeof(struct nfsd3_##argt##args), \ + sizeof(struct nfsd3_##rest##res), \ + 0, \ + cache \ + } +struct svc_procedure nfsd_procedures3[22] = { + PROC(null, void, void, void, RC_NOCACHE), + PROC(getattr, fhandle, attrstat, fhandle, RC_NOCACHE), + PROC(setattr, sattr, wccstat, fhandle, RC_REPLBUFF), + PROC(lookup, dirop, dirop, fhandle2, RC_NOCACHE), + PROC(access, access, access, fhandle, RC_NOCACHE), + PROC(readlink, fhandle, readlink, fhandle, RC_NOCACHE), + PROC(read, read, read, fhandle, RC_NOCACHE), + PROC(write, write, write, fhandle, RC_REPLBUFF), + PROC(create, create, create, fhandle2, RC_REPLBUFF), + PROC(mkdir, mkdir, create, fhandle2, RC_REPLBUFF), + PROC(symlink, symlink, create, fhandle2, RC_REPLBUFF), + PROC(mknod, mknod, create, fhandle2, RC_REPLBUFF), + PROC(remove, dirop, wccstat, fhandle, RC_REPLBUFF), + PROC(rmdir, dirop, wccstat, fhandle, RC_REPLBUFF), + PROC(rename, rename, rename, fhandle2, RC_REPLBUFF), + PROC(link, link, link, fhandle2, RC_REPLBUFF), + PROC(readdir, readdir, readdir, fhandle, RC_NOCACHE), + PROC(readdirplus,readdirplus, readdir, fhandle, RC_NOCACHE), + PROC(fsstat, fhandle, fsstat, void, 
RC_NOCACHE), + PROC(fsinfo, fhandle, fsinfo, void, RC_NOCACHE), + PROC(pathconf, fhandle, pathconf, void, RC_NOCACHE), + PROC(commit, commit, commit, fhandle, RC_REPLBUFF) +}; diff -Naur pre9/linux/fs/nfsd/nfs3xdr.c test/linux/fs/nfsd/nfs3xdr.c --- pre9/linux/fs/nfsd/nfs3xdr.c Mon Apr 7 11:35:31 1997 +++ test/linux/fs/nfsd/nfs3xdr.c Mon Sep 18 14:35:50 2000 @@ -3,7 +3,7 @@ * * XDR support for nfsd/protocol version 3. * - * Copyright (C) 1995, 1996 Olaf Kirch <okir@monad.swb.de> + * Copyright (C) 1995, 1996, 1997 Olaf Kirch <okir@monad.swb.de> */ #include <linux/types.h> @@ -17,16 +17,16 @@ #define NFSDDBG_FACILITY NFSDDBG_XDR -u32 nfs_ok, nfserr_perm, nfserr_noent, nfserr_io, nfserr_nxio, - nfserr_acces, nfserr_exist, nfserr_nodev, nfserr_notdir, - nfserr_isdir, nfserr_fbig, nfserr_nospc, nfserr_rofs, - nfserr_nametoolong, nfserr_dquot, nfserr_stale; - #ifdef NFSD_OPTIMIZE_SPACE # define inline #endif /* + * Size of encoded NFS3 file handle, in words + */ +#define NFS3_FHANDLE_WORDS (1 + XDR_QUADLEN(sizeof(struct knfs_fh))) + +/* * Mapping of S_IF* types to NFS file types */ static u32 nfs3_ftypes[] = { @@ -37,48 +37,9 @@ }; /* - * Initialization of NFS status variables - */ -void -nfs3xdr_init(void) -{ - static int inited = 0; - - if (inited) - return; - - nfs_ok = htonl(NFS_OK); - nfserr_perm = htonl(NFSERR_PERM); - nfserr_noent = htonl(NFSERR_NOENT); - nfserr_io = htonl(NFSERR_IO); - nfserr_nxio = htonl(NFSERR_NXIO); - nfserr_acces = htonl(NFSERR_ACCES); - nfserr_exist = htonl(NFSERR_EXIST); - nfserr_nodev = htonl(NFSERR_NODEV); - nfserr_notdir = htonl(NFSERR_NOTDIR); - nfserr_isdir = htonl(NFSERR_ISDIR); - nfserr_fbig = htonl(NFSERR_FBIG); - nfserr_nospc = htonl(NFSERR_NOSPC); - nfserr_rofs = htonl(NFSERR_ROFS); - nfserr_nametoolong = htonl(NFSERR_NAMETOOLONG); - nfserr_dquot = htonl(NFSERR_DQUOT); - nfserr_stale = htonl(NFSERR_STALE); - - inited = 1; -} - -/* * XDR functions for basic NFS types */ static inline u32 * -enc64(u32 *p, u64 val) -{ - *p++ = (val >> 
32); - *p++ = (val & 0xffffffff); - return p; -} - -static inline u32 * dec64(u32 *p, u64 *valp) { *valp = ((u64) ntohl(*p++)) << 32; @@ -103,13 +64,10 @@ static inline u32 * decode_fh(u32 *p, struct svc_fh *fhp) { - if (*p++ != sizeof(struct knfs_fh)) + if (ntohl(*p++) != sizeof(struct knfs_fh)) return NULL; memcpy(&fhp->fh_handle, p, sizeof(struct knfs_fh)); - fhp->fh_inode = NULL; - fhp->fh_export = NULL; - return p + (sizeof(struct knfs_fh) >> 2); } @@ -179,27 +137,35 @@ iap->ia_gid = ntohl(*p++); } if (*p++) { + u64 newsize; + iap->ia_valid |= ATTR_SIZE; - iap->ia_size = ntohl(*p++); + p = dec64(p, &newsize); + if (newsize <= NFS_OFFSET_MAX) + iap->ia_size = (u32) newsize; + else + iap->ia_size = ~(size_t) 0; } - if ((tmp = *p++) == 1) { + if ((tmp = ntohl(*p++)) == 1) { /* set to server time */ iap->ia_valid |= ATTR_ATIME; - } else if (tmp == 2) { + } else if (tmp == 2) { /* set to client time */ iap->ia_valid |= ATTR_ATIME | ATTR_ATIME_SET; iap->ia_atime = ntohl(*p++), p++; } - if ((tmp = *p++) != 0) { - iap->ia_valid |= ATTR_MTIME | ATTR_MTIME_SET; - } else if (tmp == 2) { + if ((tmp = ntohl(*p++)) == 1) { /* set to server time */ iap->ia_valid |= ATTR_MTIME; + } else if (tmp == 2) { /* set to client time */ + iap->ia_valid |= ATTR_MTIME | ATTR_MTIME_SET; iap->ia_mtime = ntohl(*p++), p++; } return p; } static inline u32 * -encode_fattr3(struct svc_rqst *rqstp, u32 *p, struct inode *inode) +encode_fattr3(struct svc_rqst *rqstp, u32 *p, struct dentry *dentry) { + struct inode *inode = dentry->d_inode; + if (!inode) { printk("nfsd: NULL inode in %s:%d", __FILE__, __LINE__); return NULL; @@ -215,7 +181,16 @@ } else { p = enc64(p, (u64) inode->i_size); } - p = enc64(p, inode->i_blksize * inode->i_blocks); + /* + * For the 'used' member, we take i_blocks if set; assuming 512-byte + * units. Some FSs don't set this, so all we can do then is + * use the size. 
+ */ + if (inode->i_blocks) { + p = enc64(p, ((u64)inode->i_blocks)<<9 ); + } else { + p = enc64(p, (u64) inode->i_size); + } *p++ = htonl((u32) MAJOR(inode->i_rdev)); *p++ = htonl((u32) MINOR(inode->i_rdev)); p = enc64(p, (u64) inode->i_dev); @@ -227,19 +202,54 @@ return p; } +static inline u32 * +encode_saved_post_attr(struct svc_rqst *rqstp, u32 *p, struct svc_fh *fhp) +{ + struct inode *inode = fhp->fh_dentry->d_inode; + + /* Attributes to follow */ + *p++ = xdr_one; + + *p++ = htonl(nfs3_ftypes[(fhp->fh_post_mode & S_IFMT) >> 12]); + *p++ = htonl((u32) fhp->fh_post_mode); + *p++ = htonl((u32) fhp->fh_post_nlink); + *p++ = htonl((u32) nfsd_ruid(rqstp, fhp->fh_post_uid)); + *p++ = htonl((u32) nfsd_rgid(rqstp, fhp->fh_post_gid)); + if (S_ISLNK(fhp->fh_post_mode) && fhp->fh_post_size > NFS3_MAXPATHLEN) { + p = enc64(p, (u64) NFS3_MAXPATHLEN); + } else { + p = enc64(p, (u64) fhp->fh_post_size); + } + if (fhp->fh_post_blocks) { + p = enc64(p, ((u64)fhp->fh_post_blocks)<<9); + } else { + p = enc64(p, (u64) fhp->fh_post_size); + } + *p++ = htonl((u32) MAJOR(fhp->fh_post_rdev)); + *p++ = htonl((u32) MINOR(fhp->fh_post_rdev)); + p = enc64(p, (u64) inode->i_dev); + p = enc64(p, (u64) inode->i_ino); + p = encode_time3(p, fhp->fh_post_atime); + p = encode_time3(p, fhp->fh_post_mtime); + p = encode_time3(p, fhp->fh_post_ctime); + + return p; +} + /* * Encode post-operation attributes. * The inode may be NULL if the call failed because of a stale file * handle. In this case, no attributes are returned. 
*/ static u32 * -encode_post_op_attr(struct svc_rqst *rqstp, u32 *p, struct inode *inode) +encode_post_op_attr(struct svc_rqst *rqstp, u32 *p, struct dentry *dentry) { - if (inode == NULL) { - *p++ = xdr_zero; - return p; + if (dentry && dentry->d_inode != NULL) { + *p++ = xdr_one; /* attributes follow */ + return encode_fattr3(rqstp, p, dentry); } - return encode_fattr3(rqstp, p, inode); + *p++ = xdr_zero; + return p; } /* @@ -248,17 +258,22 @@ static u32 * encode_wcc_data(struct svc_rqst *rqstp, u32 *p, struct svc_fh *fhp) { - struct inode *inode = fhp->fh_inode; + struct dentry *dentry = fhp->fh_dentry; - if (fhp->fh_post_version == inode->i_version) { - *p++ = xdr_one; - p = enc64(p, (u64) fhp->fh_pre_size); - p = encode_time3(p, fhp->fh_pre_mtime); - p = encode_time3(p, fhp->fh_pre_ctime); - } else { - *p++ = xdr_zero; + if (dentry && dentry->d_inode && fhp->fh_post_saved) { + if (fhp->fh_pre_saved) { + *p++ = xdr_one; + p = enc64(p, (u64) fhp->fh_pre_size); + p = encode_time3(p, fhp->fh_pre_mtime); + p = encode_time3(p, fhp->fh_pre_ctime); + } else { + *p++ = xdr_zero; + } + return encode_saved_post_attr(rqstp, p, fhp); } - return encode_post_op_attr(rqstp, p, inode); + /* no pre- or post-attrs */ + *p++ = xdr_zero; + return encode_post_op_attr(rqstp, p, dentry); } /* @@ -299,10 +314,12 @@ struct nfsd3_sattrargs *args) { if (!(p = decode_fh(p, &args->fh)) - || !(p = decode_sattr3(p, &args->attrs)) - || (*p++ && !(p = decode_time3(p, &args->guardtime)))) + || !(p = decode_sattr3(p, &args->attrs))) return 0; + if ((args->check_guard = ntohl(*p++)) != 0) + p = decode_time3(p, &args->guardtime); + return xdr_argsize_check(rqstp, p); } @@ -333,10 +350,10 @@ struct nfsd3_readargs *args) { if (!(p = decode_fh(p, &args->fh)) - || !(p = dec64(p, &args->offset)) - || !(p = dec64(p, &args->count))) + || !(p = dec64(p, &args->offset))) return 0; + args->count = ntohl(*p++); return xdr_argsize_check(rqstp, p); } @@ -345,14 +362,14 @@ struct nfsd3_writeargs *args) { if 
(!(p = decode_fh(p, &args->fh)) - || !(p = dec64(p, &args->offset)) - || !(p = dec64(p, &args->count))) + || !(p = dec64(p, &args->offset))) return 0; + args->count = ntohl(*p++); args->stable = ntohl(*p++); args->len = ntohl(*p++); args->data = (char *) p; - p += (args->len + 3) >> 2; + p += XDR_QUADLEN(args->len); return xdr_argsize_check(rqstp, p); } @@ -366,11 +383,12 @@ return 0; switch (args->createmode = ntohl(*p++)) { - case 0: case 1: + case NFS3_CREATE_UNCHECKED: + case NFS3_CREATE_GUARDED: if (!(p = decode_sattr3(p, &args->attrs))) return 0; break; - case 2: + case NFS3_CREATE_EXCLUSIVE: args->verf = p; p += 2; break; @@ -460,8 +478,9 @@ { if (!(p = decode_fh(p, &args->fh))) return 0; - args->cookie = ntohl(*p++); + p = dec64(p, &args->cookie); args->verf = p; p += 2; + args->dircount = ~0; args->count = ntohl(*p++); return xdr_argsize_check(rqstp, p); @@ -473,7 +492,7 @@ { if (!(p = decode_fh(p, &args->fh))) return 0; - args->cookie = ntohl(*p++); + p = dec64(p, &args->cookie); args->verf = p; p += 2; args->dircount = ntohl(*p++); args->count = ntohl(*p++); @@ -485,9 +504,9 @@ nfs3svc_decode_commitargs(struct svc_rqst *rqstp, u32 *p, struct nfsd3_commitargs *args) { - if (!(p = decode_fh(p, &args->fh)) - || !(p = dec64(p, &args->offset))) + if (!(p = decode_fh(p, &args->fh))) return 0; + p = dec64(p, &args->offset); args->count = ntohl(*p++); return xdr_argsize_check(rqstp, p); @@ -496,12 +515,23 @@ /* * XDR encode functions */ +/* + * There must be an encoding function for void results so svc_process + * will work properly. 
+ */ +int +nfs3svc_encode_voidres(struct svc_rqst *rqstp, u32 *p, void *dummy) +{ + return xdr_ressize_check(rqstp, p); +} + /* GETATTR */ int nfs3svc_encode_attrstat(struct svc_rqst *rqstp, u32 *p, struct nfsd3_attrstat *resp) { - if (!(p = encode_fattr3(rqstp, p, resp->fh.fh_inode))) + if (resp->status == 0 + && !(p = encode_fattr3(rqstp, p, resp->fh.fh_dentry))) return 0; return xdr_ressize_check(rqstp, p); } @@ -518,15 +548,14 @@ /* LOOKUP */ int -nfs3svc_encode_lookupres(struct svc_rqst *rqstp, u32 *p, - struct nfsd3_lookupres *resp) +nfs3svc_encode_diropres(struct svc_rqst *rqstp, u32 *p, + struct nfsd3_diropres *resp) { if (resp->status == 0) { p = encode_fh(p, &resp->fh); - if (!(p = encode_fattr3(rqstp, p, resp->fh.fh_inode))) - return 0; + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); } - p = encode_post_op_attr(rqstp, p, resp->dirfh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->dirfh.fh_dentry); return xdr_ressize_check(rqstp, p); } @@ -535,7 +564,7 @@ nfs3svc_encode_accessres(struct svc_rqst *rqstp, u32 *p, struct nfsd3_accessres *resp) { - p = encode_post_op_attr(rqstp, p, resp->fh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); if (resp->status == 0) *p++ = htonl(resp->access); return xdr_ressize_check(rqstp, p); @@ -546,7 +575,7 @@ nfs3svc_encode_readlinkres(struct svc_rqst *rqstp, u32 *p, struct nfsd3_readlinkres *resp) { - p = encode_post_op_attr(rqstp, p, resp->fh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); if (resp->status == 0) { *p++ = htonl(resp->len); p += XDR_QUADLEN(resp->len); @@ -559,7 +588,7 @@ nfs3svc_encode_readres(struct svc_rqst *rqstp, u32 *p, struct nfsd3_readres *resp) { - p = encode_post_op_attr(rqstp, p, resp->fh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); if (resp->status == 0) { *p++ = htonl(resp->count); *p++ = htonl(resp->eof); @@ -587,11 +616,12 @@ /* CREATE, MKDIR, SYMLINK, MKNOD */ int nfs3svc_encode_createres(struct svc_rqst *rqstp, u32 
*p, - struct nfsd3_createres *resp) + struct nfsd3_diropres *resp) { if (resp->status == 0) { + *p++ = xdr_one; p = encode_fh(p, &resp->fh); - p = encode_post_op_attr(rqstp, p, resp->fh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); } p = encode_wcc_data(rqstp, p, &resp->dirfh); return xdr_ressize_check(rqstp, p); @@ -612,7 +642,7 @@ nfs3svc_encode_linkres(struct svc_rqst *rqstp, u32 *p, struct nfsd3_linkres *resp) { - p = encode_post_op_attr(rqstp, p, resp->fh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); p = encode_wcc_data(rqstp, p, &resp->tfh); return xdr_ressize_check(rqstp, p); } @@ -622,73 +652,116 @@ nfs3svc_encode_readdirres(struct svc_rqst *rqstp, u32 *p, struct nfsd3_readdirres *resp) { - p = encode_post_op_attr(rqstp, p, resp->fh.fh_inode); + p = encode_post_op_attr(rqstp, p, resp->fh.fh_dentry); if (resp->status == 0) { /* stupid readdir cookie */ - *p++ = ntohl(resp->fh.fh_inode->i_mtime); - *p++ = xdr_zero; - p = resp->list_end; + memcpy(p, resp->verf, 8); p += 2; + p += XDR_QUADLEN(resp->count); } return xdr_ressize_check(rqstp, p); } -#define NFS3_ENTRYPLUS_BAGGAGE ((1 + 20 + 1 + NFS3_FHSIZE) << 2) -int -nfs3svc_encode_entry(struct readdir_cd *cd, const char *name, - int namlen, unsigned long offset, ino_t ino) +/* + * Encode a directory entry. This one works for both normal readdir + * and readdirplus. + * The normal readdir reply requires 2 (fileid) + 1 (stringlen) + * + string + 2 (cookie) + 1 (next) words, i.e. 6 + strlen. + * + * The readdirplus baggage is 1+21 words for post_op_attr, plus the + * file handle. 
+ */ + +#define NFS3_ENTRY_BAGGAGE (2 + 1 + 2 + 1) +#define NFS3_ENTRYPLUS_BAGGAGE (1 + 21 + 1 + (NFS3_FHSIZE >> 2)) +static int +encode_entry(struct readdir_cd *cd, const char *name, + int namlen, off_t offset, ino_t ino, int plus) { u32 *p = cd->buffer; int buflen, slen, elen; - struct svc_fh fh; - if (offset > ~((u64) 0)) - return -EINVAL; if (cd->offset) - *cd->offset = htonl(offset); + enc64(cd->offset, (u64) offset); - /* For readdirplus, look up the inode */ - if (cd->plus && nfsd_lookup(cd->rqstp, cd->dirfh, name, namlen, &fh)) + /* nfsd_readdir calls us with name == 0 when it wants us to + * set the last offset entry. */ + if (name == 0) return 0; + /* + dprintk("encode_entry(%.*s @%ld%s)\n", + namlen, name, (long) offset, plus? " plus" : ""); + */ + /* truncate filename if too long */ if (namlen > NFS3_MAXNAMLEN) namlen = NFS3_MAXNAMLEN; slen = XDR_QUADLEN(namlen); - elen = slen + (cd->plus? NFS3_ENTRYPLUS_BAGGAGE : 0); - if ((buflen = cd->buflen - elen - 4) < 0) { + elen = slen + NFS3_ENTRY_BAGGAGE + + (plus? 
NFS3_ENTRYPLUS_BAGGAGE : 0); + if ((buflen = cd->buflen - elen) < 0) { cd->eob = 1; - if (cd->plus) - fh_put(&fh); return -EINVAL; } - *p++ = xdr_one; /* mark entry present */ - *p++ = xdr_zero; /* file id (64 bit) */ - *p++ = htonl((u32) ino); - *p++ = htonl((u32) namlen); /* name length & name */ + *p++ = xdr_one; /* mark entry present */ + p = enc64(p, ino); /* file id */ +#ifdef XDR_ENCODE_STRING_TAKES_LENGTH + p = xdr_encode_string(p, name, namlen); /* name length & name */ +#else + /* just like nfsproc.c */ + *p++ = htonl((u32) namlen); + p[slen - 1] = 0; /* don't leak kernel data */ memcpy(p, name, namlen); p += slen; +#endif + + cd->offset = p; /* remember pointer */ + p = enc64(p, NFS_OFFSET_MAX); /* offset of next entry */ /* throw in readdirplus baggage */ - if (cd->plus) { - p = encode_post_op_attr(cd->rqstp, p, fh.fh_inode); - p = encode_fh(p, &fh); - fh_put(&fh); - } + if (plus) { + struct svc_fh fh; - cd->offset = p; /* remember pointer */ - p = enc64(p, ~(u64) 0); /* offset of next entry */ + fh_init(&fh); + /* Disabled for now because of lock-up */ + if (0 && nfsd_lookup(cd->rqstp, cd->dirfh, name, namlen, &fh) == 0) { + p = encode_post_op_attr(cd->rqstp, p, fh.fh_dentry); + p = encode_fh(p, &fh); + fh_put(&fh); + } else { + /* Didn't find this entry... weird. + * Proceed without the attrs and fh anyway.
+ */ + *p++ = 0; + *p++ = 0; + } + } cd->buflen = buflen; cd->buffer = p; return 0; } +int +nfs3svc_encode_entry(struct readdir_cd *cd, const char *name, + int namlen, off_t offset, ino_t ino) +{ + return encode_entry(cd, name, namlen, offset, ino, 0); +} + +int +nfs3svc_encode_entry_plus(struct readdir_cd *cd, const char *name, + int namlen, off_t offset, ino_t ino) +{ + return encode_entry(cd, name, namlen, offset, ino, 1); +} + /* FSSTAT */ int -nfs3svc_encode_statfsres(struct svc_rqst *rqstp, u32 *p, - struct nfsd3_statfsres *resp) +nfs3svc_encode_fsstatres(struct svc_rqst *rqstp, u32 *p, + struct nfsd3_fsstatres *resp) { struct statfs *s = &resp->stats; u64 bs = s->f_bsize; @@ -722,9 +795,9 @@ *p++ = htonl(resp->f_wtpref); *p++ = htonl(resp->f_wtmult); *p++ = htonl(resp->f_dtpref); - *p++ = htonl(resp->f_maxfilesize); + p = enc64(p, resp->f_maxfilesize); + *p++ = xdr_one; *p++ = xdr_zero; - *p++ = htonl(1000000000 / HZ); *p++ = htonl(resp->f_properties); } @@ -741,8 +814,8 @@ if (resp->status == 0) { *p++ = htonl(resp->p_link_max); *p++ = htonl(resp->p_name_max); - *p++ = xdr_one; /* always reject long file names */ - *p++ = xdr_one; /* chown restricted */ + *p++ = htonl(resp->p_no_trunc); + *p++ = htonl(resp->p_chown_restricted); *p++ = htonl(resp->p_case_insensitive); *p++ = htonl(resp->p_case_preserving); } @@ -769,7 +842,7 @@ */ int nfs3svc_release_fhandle(struct svc_rqst *rqstp, u32 *p, - struct nfsd_fhandle *resp) + struct nfsd3_attrstat *resp) { fh_put(&resp->fh); return 1; @@ -777,7 +850,7 @@ int nfs3svc_release_fhandle2(struct svc_rqst *rqstp, u32 *p, - struct nfsd3_fhandle2 *resp) + struct nfsd3_fhandle_pair *resp) { fh_put(&resp->fh1); fh_put(&resp->fh2); diff -Naur pre9/linux/fs/nfsd/nfscache.c test/linux/fs/nfsd/nfscache.c --- pre9/linux/fs/nfsd/nfscache.c Sun Jan 24 21:54:35 1999 +++ test/linux/fs/nfsd/nfscache.c Mon Sep 18 14:35:50 2000 @@ -143,8 +143,9 @@ nfsd_cache_lookup(struct svc_rqst *rqstp, int type) { struct svc_cacherep *rh, *rp; - 
struct svc_client *clp = rqstp->rq_client; u32 xid = rqstp->rq_xid, + proto = rqstp->rq_prot, + vers = rqstp->rq_vers, proc = rqstp->rq_proc; unsigned long age; @@ -158,7 +159,9 @@ while ((rp = rp->c_hash_next) != rh) { if (rp->c_state != RC_UNUSED && xid == rp->c_xid && proc == rp->c_proc && - exp_checkaddr(clp, rp->c_client)) { + proto == rp->c_prot && vers == rp->c_vers && + time_before(jiffies, rp->c_timestamp + 120*HZ) && + memcmp((char*)&rqstp->rq_addr, (char*)&rp->c_addr, rqstp->rq_addrlen)==0) { nfsdstats.rchits++; goto found_entry; } @@ -195,7 +198,11 @@ rp->c_state = RC_INPROG; rp->c_xid = xid; rp->c_proc = proc; - rp->c_client = rqstp->rq_addr.sin_addr; + memcpy(&rp->c_addr, &rqstp->rq_addr, sizeof(rp->c_addr)); + rp->c_prot = proto; + rp->c_vers = vers; + rp->c_timestamp = jiffies; + hash_refile(rp); /* release any buffer */ @@ -210,12 +217,13 @@ found_entry: /* We found a matching entry which is either in progress or done. */ age = jiffies - rp->c_timestamp; - rp->c_timestamp = jiffies; - lru_put_front(rp); /* Request being processed or excessive rexmits */ if (rp->c_state == RC_INPROG || age < RC_DELAY) return RC_DROPIT; + + rp->c_timestamp = jiffies; + lru_put_front(rp); /* From the hall of fame of impractical attacks: * Is this a user who tries to snoop on the cache? 
*/ diff -Naur pre9/linux/fs/nfsd/nfsctl.c test/linux/fs/nfsd/nfsctl.c --- pre9/linux/fs/nfsd/nfsctl.c Wed May 3 17:16:46 2000 +++ test/linux/fs/nfsd/nfsctl.c Mon Sep 18 14:35:50 2000 @@ -18,7 +18,6 @@ #include <linux/fcntl.h> #include <linux/net.h> #include <linux/in.h> -#include <linux/version.h> #include <linux/unistd.h> #include <linux/malloc.h> #include <linux/proc_fs.h> @@ -363,7 +362,6 @@ do_nfsservctl = NULL; nfsd_export_shutdown(); nfsd_cache_shutdown(); - nfsd_fh_free(); remove_proc_entry("fs/nfs/time-diff-margin", NULL); remove_proc_entry("fs/nfs/exports", NULL); remove_proc_entry("fs/nfs", NULL); diff -Naur pre9/linux/fs/nfsd/nfsfh.c test/linux/fs/nfsd/nfsfh.c --- pre9/linux/fs/nfsd/nfsfh.c Wed May 3 17:16:46 2000 +++ test/linux/fs/nfsd/nfsfh.c Mon Sep 18 14:35:50 2000 @@ -5,6 +5,7 @@ * * Copyright (C) 1995, 1996 Olaf Kirch <okir@monad.swb.de> * Portions Copyright (C) 1999 G. Allen Morris III <gam3@acm.org> + * Extensive rewrite by Neil Brown <neilb@cse.unsw.edu.au> Southern-Spring 1999 */ #include <linux/sched.h> @@ -22,334 +23,50 @@ #define NFSDDBG_FACILITY NFSDDBG_FH #define NFSD_PARANOIA 1 /* #define NFSD_DEBUG_VERBOSE 1 */ -/* #define NFSD_DEBUG_VERY_VERBOSE 1 */ -extern unsigned long max_mapnr; - -#define NFSD_FILE_CACHE 0 -#define NFSD_DIR_CACHE 1 -struct fh_entry { - struct dentry * dentry; - unsigned long reftime; - ino_t ino; - kdev_t dev; -}; - -#define NFSD_MAXFH \ - (((nfsd_nservers + 1) >> 1) * PAGE_SIZE/sizeof(struct fh_entry)) -static struct fh_entry *filetable = NULL; -static struct fh_entry *dirstable = NULL; static int nfsd_nr_verified = 0; static int nfsd_nr_put = 0; -static unsigned long nfsd_next_expire = 0; - -static int add_to_fhcache(struct dentry *, int); -struct dentry * lookup_inode(kdev_t, ino_t, ino_t); - -static LIST_HEAD(fixup_head); -static LIST_HEAD(path_inuse); -static int nfsd_nr_fixups = 0; -static int nfsd_nr_paths = 0; -#define NFSD_MAX_PATHS 500 -#define NFSD_MAX_FIXUPS 500 -#define NFSD_MAX_FIXUP_AGE 30*HZ - 
-struct nfsd_fixup { - struct list_head lru; - unsigned long reftime; - ino_t dirino; - ino_t ino; - kdev_t dev; - ino_t new_dirino; -}; - -struct nfsd_path { - struct list_head lru; - unsigned long reftime; - int users; - ino_t ino; - kdev_t dev; - char name[1]; -}; - -static struct nfsd_fixup * -find_cached_lookup(kdev_t dev, ino_t dirino, ino_t ino) -{ - struct list_head *tmp = fixup_head.next; - - for (; tmp != &fixup_head; tmp = tmp->next) { - struct nfsd_fixup *fp; - - fp = list_entry(tmp, struct nfsd_fixup, lru); -#ifdef NFSD_DEBUG_VERY_VERBOSE -printk("fixup %lu %lu, %lu %lu %s %s\n", - fp->ino, ino, - fp->dirino, dirino, - kdevname(fp->dev), kdevname(dev)); -#endif - if (fp->ino != ino) - continue; - if (fp->dirino != dirino) - continue; - if (fp->dev != dev) - continue; - fp->reftime = jiffies; - list_del(tmp); - list_add(tmp, &fixup_head); - return fp; - } - return NULL; -} - -/* - * Save the dirino from a rename. - */ -void -add_to_rename_cache(ino_t new_dirino, - kdev_t dev, ino_t dirino, ino_t ino) -{ - struct nfsd_fixup *fp; - - if (dirino == new_dirino) - return; - - fp = find_cached_lookup(dev, - dirino, - ino); - if (fp) { - fp->new_dirino = new_dirino; - return; - } - - /* - * Add a new entry. The small race here is unimportant: - * if another task adds the same lookup, both entries - * will be consistent. - */ - fp = kmalloc(sizeof(struct nfsd_fixup), GFP_KERNEL); - if (fp) { - fp->dirino = dirino; - fp->ino = ino; - fp->dev = dev; - fp->new_dirino = new_dirino; - list_add(&fp->lru, &fixup_head); - nfsd_nr_fixups++; - } -} - -/* - * Save the dentry pointer from a successful lookup. - */ - -static void free_fixup_entry(struct nfsd_fixup *fp) -{ - list_del(&fp->lru); -#ifdef NFSD_DEBUG_VERY_VERBOSE -printk("free_rename_entry: %lu->%lu %lu/%s\n", - fp->dirino, - fp->new_dirino, - fp->ino, - kdevname(fp->dev), - (jiffies - fp->reftime)); -#endif - kfree(fp); - nfsd_nr_fixups--; -} - -/* - * Copy a dentry's path into the specified buffer. 
- */ -static int copy_path(char *buffer, struct dentry *dentry, int namelen) -{ - char *p, *b = buffer; - int result = 0, totlen = 0, len; - - while (1) { - struct dentry *parent; - dentry = dentry->d_covers; - parent = dentry->d_parent; - len = dentry->d_name.len; - p = (char *) dentry->d_name.name + len; - totlen += len; - if (totlen > namelen) - goto out; - while (len--) - *b++ = *(--p); - if (dentry == parent) - break; - dentry = parent; - totlen++; - if (totlen > namelen) - goto out; - *b++ = '/'; - } - *b = 0; - - /* - * Now reverse in place ... - */ - p = buffer; - while (p < b) { - char c = *(--b); - *b = *p; - *p++ = c; - } - result = 1; -out: - return result; -} - -/* - * Add a dentry's path to the path cache. - */ -static int add_to_path_cache(struct dentry *dentry) -{ - struct inode *inode = dentry->d_inode; - struct dentry *this; - struct nfsd_path *new; - int len, result = 0; - -#ifdef NFSD_DEBUG_VERBOSE -printk("add_to_path_cache: caching %s/%s\n", -dentry->d_parent->d_name.name, dentry->d_name.name); -#endif - /* - * Get the length of the full pathname. - */ -restart: - len = 0; - this = dentry; - while (1) { - struct dentry *parent; - this = this->d_covers; - parent = this->d_parent; - len += this->d_name.len; - if (this == parent) - break; - this = parent; - len++; - } - /* - * Allocate a structure to hold the path. - */ - new = kmalloc(sizeof(struct nfsd_path) + len, GFP_KERNEL); - if (new) { - new->users = 0; - new->reftime = jiffies; - new->ino = inode->i_ino; - new->dev = inode->i_dev; - result = copy_path(new->name, dentry, len); - if (!result) - goto retry; - list_add(&new->lru, &path_inuse); - nfsd_nr_paths++; -#ifdef NFSD_DEBUG_VERBOSE -printk("add_to_path_cache: added %s, paths=%d\n", new->name, nfsd_nr_paths); -#endif - } - return result; - - /* - * If the dentry's path length changed, just try again. 
- */ -retry: - kfree(new); - printk(KERN_DEBUG "add_to_path_cache: path length changed, retrying\n"); - goto restart; -} - -/* - * Search for a path entry for the specified (dev, inode). - */ -static struct nfsd_path *get_path_entry(kdev_t dev, ino_t ino) -{ - struct nfsd_path *pe; - struct list_head *tmp; - for (tmp = path_inuse.next; tmp != &path_inuse; tmp = tmp->next) { - pe = list_entry(tmp, struct nfsd_path, lru); - if (pe->ino != ino) - continue; - if (pe->dev != dev) - continue; - list_del(tmp); - list_add(tmp, &path_inuse); - pe->users++; - pe->reftime = jiffies; -#ifdef NFSD_PARANOIA -printk("get_path_entry: found %s for %s/%ld\n", pe->name, kdevname(dev), ino); -#endif - return pe; - } - return NULL; -} - -static void put_path(struct nfsd_path *pe) -{ - pe->users--; -} - -static void free_path_entry(struct nfsd_path *pe) -{ - if (pe->users) - printk(KERN_DEBUG "free_path_entry: %s in use, users=%d\n", - pe->name, pe->users); - list_del(&pe->lru); - kfree(pe); - nfsd_nr_paths--; -} struct nfsd_getdents_callback { - struct nfsd_dirent *dirent; - ino_t dirino; /* parent inode number */ - int found; /* dirent inode matched? */ + struct qstr *name; /* name that was found. name->name already points to a buffer */ + unsigned long ino; /* the inum we are looking for */ + int found; /* inode matched? */ int sequence; /* sequence counter */ }; -struct nfsd_dirent { - ino_t ino; /* preset to desired entry */ - int len; - char name[256]; -}; - /* - * A rather strange filldir function to capture the inode number - * for the second entry (the parent inode) and the name matching - * the specified inode number. + * A rather strange filldir function to capture + * the name matching the specified inode number. 
*/ -static int filldir_one(void * __buf, const char * name, int len, +static int filldir_one(void * __buf, const char * name, int len, off_t pos, ino_t ino) { struct nfsd_getdents_callback *buf = __buf; - struct nfsd_dirent *dirent = buf->dirent; + struct qstr *qs = buf->name; + char *nbuf = (char*)qs->name; /* cast is to get rid of "const" */ int result = 0; buf->sequence++; -#ifdef NFSD_DEBUG_VERY_VERBOSE -printk("filldir_one: seq=%d, ino=%lu, name=%s\n", buf->sequence, ino, name); +#ifdef NFSD_DEBUG_VERBOSE +dprintk("filldir_one: seq=%d, ino=%ld, name=%s\n", buf->sequence, ino, name); #endif - if (buf->sequence == 2) { - buf->dirino = ino; - goto out; - } - if (dirent->ino == ino) { - dirent->len = len; - memcpy(dirent->name, name, len); - dirent->name[len] = '\0'; + if (buf->ino == ino) { + qs->len = len; + memcpy(nbuf, name, len); + nbuf[len] = '\0'; buf->found = 1; result = -1; } -out: return result; } /* - * Read a directory and return the parent inode number and the name - * of the specified entry. The dirent must be initialized with the - * inode number of the desired entry. + * Read a directory and return the name of the specified entry. + * i_sem is already down(). */ -static int get_parent_ino(struct dentry *dentry, struct nfsd_dirent *dirent) +static int get_ino_name(struct dentry *dentry, struct qstr *name, unsigned long ino) { struct inode *dir = dentry->d_inode; int error; @@ -372,15 +89,13 @@ if (!file.f_op->readdir) goto out_close; - buffer.dirent = dirent; - buffer.dirino = 0; + buffer.name = name; + buffer.ino = ino; buffer.found = 0; buffer.sequence = 0; while (1) { int old_seq = buffer.sequence; - down(&dir->i_sem); error = file.f_op->readdir(&file, &buffer, filldir_one); - up(&dir->i_sem); if (error < 0) break; @@ -391,7 +106,6 @@ if (old_seq == buffer.sequence) break; } - dirent->ino = buffer.dirino; out_close: if (file.f_op->release) @@ -400,716 +114,380 @@ return error; } -/* - * Look up a dentry given inode and parent inode numbers. 
- * - * This relies on the ability of a Unix-like filesystem to return - * the parent inode of a directory as the ".." (second) entry. - * - * This could be further optimized if we had an efficient way of - * searching for a dentry given the inode: as we walk up the tree, - * it's likely that a dentry exists before we reach the root. +/* this should be provided by each filesystem in an nfsd_operations interface as + * iget isn't really the right interface */ -struct dentry * lookup_inode(kdev_t dev, ino_t dirino, ino_t ino) +static struct dentry *nfsd_iget(struct super_block *sb, unsigned long ino, __u32 generation) { - struct super_block *sb; - struct dentry *root, *dentry, *result; - struct inode *dir; - char *name; - unsigned long page; - ino_t root_ino; - int error; - struct nfsd_dirent dirent; - result = ERR_PTR(-ENOMEM); - page = __get_free_page(GFP_KERNEL); - if (!page) - goto out; - - /* - * Get the root dentry for the device. - */ - result = ERR_PTR(-ENOENT); - sb = get_super(dev); - if (!sb) - goto out_page; - root = dget(sb->s_root); - root_ino = root->d_inode->i_ino; /* usually 2 */ - - name = (char *) page + PAGE_SIZE; - *(--name) = 0; - - /* - * Walk up the tree to construct the name string. - * When we reach the root inode, look up the name - * relative to the root dentry. + /* + * ext2fs' read_inode has been strengthened to return a bad_inode if + * the inode had been deleted. + * + * Currently we don't know the generation for parent directory, + * so a generation of 0 means "accept any" */ - while (1) { - if (ino == root_ino) { - if (*name == '/') - name++; - /* - * Note: this dput()s the root dentry. - */ - result = lookup_dentry(name, root, 0); - break; - } - - /* - * Fix for /// bad export bug: if dirino is the root, - * get the real root dentry rather than creating a temporary - * "root" dentry.
XXX We could extend this to use - * any existing dentry for the located 'dir', but all - * of this code is going to be completely rewritten soon, - * so I won't bother. - */ - - if (dirino == root_ino) { - dentry = dget(root); - } - else { - result = ERR_PTR(-ENOENT); - dir = iget_in_use(sb, dirino); - if (!dir) - goto out_root; - dentry = d_alloc_root(dir, NULL); - if (!dentry) - goto out_iput; - } - - /* - * Get the name for this inode and the next parent inode. - */ - dirent.ino = ino; - error = get_parent_ino(dentry, &dirent); - result = ERR_PTR(error); - dput(dentry); - if (error) - goto out_root; - /* - * Prepend the name to the buffer. - */ - result = ERR_PTR(-ENAMETOOLONG); - name -= (dirent.len + 1); - if ((unsigned long) name <= page) - goto out_root; - memcpy(name + 1, dirent.name, dirent.len); - *name = '/'; - - /* - * Make sure we can't get caught in a loop ... - */ - if (dirino == dirent.ino && dirino != root_ino) { - printk(KERN_DEBUG - "lookup_inode: looping?? (ino=%ld, path=%s)\n", - dirino, name); - goto out_root; - } - ino = dirino; - dirino = dirent.ino; - } - -out_page: - free_page(page); -out: - return result; - - /* - * Error exits ... - */ -out_iput: - result = ERR_PTR(-ENOMEM); - iput(dir); -out_root: - dput(root); - goto out_page; -} - -/* - * Find an entry in the cache matching the given dentry pointer. - */ -static struct fh_entry *find_fhe(struct dentry *dentry, int cache, - struct fh_entry **empty) -{ - struct fh_entry *fhe; - int i, found = (empty == NULL) ? 1 : 0; - - if (!dentry) - goto out; - - fhe = (cache == NFSD_FILE_CACHE) ? &filetable[0] : &dirstable[0]; - for (i = 0; i < NFSD_MAXFH; i++, fhe++) { - if (fhe->dentry == dentry) { - fhe->reftime = jiffies; - return fhe; - } - if (!found && !fhe->dentry) { - found = 1; - *empty = fhe; - } - } -out: - return NULL; -} - -/* - * Expire a cache entry. 
- */ -static void expire_fhe(struct fh_entry *empty, int cache) -{ - struct dentry *dentry = empty->dentry; - -#ifdef NFSD_DEBUG_VERBOSE -printk("expire_fhe: expiring %s %s/%s, d_count=%d, ino=%lu\n", -(cache == NFSD_FILE_CACHE) ? "file" : "dir", -dentry->d_parent->d_name.name, dentry->d_name.name, dentry->d_count,empty->ino); -#endif - empty->dentry = NULL; /* no dentry */ - /* - * Add the parent to the dir cache before releasing the dentry, - * and check whether to save a copy of the dentry's path. - */ - if (dentry != dentry->d_parent) { - struct dentry *parent = dget(dentry->d_parent); - if (add_to_fhcache(parent, NFSD_DIR_CACHE)) - nfsd_nr_verified++; - else - dput(parent); - /* - * If we're expiring a directory, copy its path. - */ - if (cache == NFSD_DIR_CACHE) { - add_to_path_cache(dentry); - } - } - dput(dentry); - nfsd_nr_put++; -} - -/* - * Look for an empty slot, or select one to expire. - */ -static void expire_slot(int cache) -{ - struct fh_entry *fhe, *empty = NULL; - unsigned long oldest = -1; - int i; - - fhe = (cache == NFSD_FILE_CACHE) ? &filetable[0] : &dirstable[0]; - for (i = 0; i < NFSD_MAXFH; i++, fhe++) { - if (!fhe->dentry) - goto out; - if (fhe->reftime < oldest) { - oldest = fhe->reftime; - empty = fhe; - } - } - if (empty) - expire_fhe(empty, cache); - -out: - return; -} - -/* - * Expire any cache entries older than a certain age. - */ -static void expire_old(int cache, int age) -{ - struct fh_entry *fhe; - int i; - -#ifdef NFSD_DEBUG_VERY_VERBOSE -printk("expire_old: expiring %s older than %d\n", -(cache == NFSD_FILE_CACHE) ? "file" : "dir", age); -#endif - fhe = (cache == NFSD_FILE_CACHE) ? &filetable[0] : &dirstable[0]; - for (i = 0; i < NFSD_MAXFH; i++, fhe++) { - if (!fhe->dentry) - continue; - if ((jiffies - fhe->reftime) > age) - expire_fhe(fhe, cache); - } - - /* - * Trim the fixup cache ... 
- */ - while (nfsd_nr_fixups > NFSD_MAX_FIXUPS) { - struct nfsd_fixup *fp; - fp = list_entry(fixup_head.prev, struct nfsd_fixup, lru); - if ((jiffies - fp->reftime) < NFSD_MAX_FIXUP_AGE) - break; - free_fixup_entry(fp); - } - - /* - * Trim the path cache ... - */ - while (nfsd_nr_paths > NFSD_MAX_PATHS) { - struct nfsd_path *pe; - pe = list_entry(path_inuse.prev, struct nfsd_path, lru); - if (pe->users) - break; - free_path_entry(pe); - } -} - -/* - * Add a dentry to the file or dir cache. - * - * Note: As NFS file handles must have an inode, we don't accept - * negative dentries. - */ -static int add_to_fhcache(struct dentry *dentry, int cache) -{ - struct fh_entry *fhe, *empty = NULL; - struct inode *inode = dentry->d_inode; - + struct inode *inode; + struct list_head *lp; + struct dentry *result; + inode = iget_in_use(sb, ino); if (!inode) { -#ifdef NFSD_PARANOIA -printk("add_to_fhcache: %s/%s rejected, no inode!\n", -dentry->d_parent->d_name.name, dentry->d_name.name); -#endif - return 0; - } + dprintk("nfsd_iget: failed to find ino: %lu on %s\n", + ino, bdevname(sb->s_dev)); + return ERR_PTR(-ESTALE); + } + if (is_bad_inode(inode) + || (generation && inode->i_generation != generation) + ) { + /* we didn't find the right inode.. */ + dprintk("fh_verify: Inode %lu, Bad count: %d %d or version %u %u\n", + inode->i_ino, + inode->i_nlink, inode->i_count, + inode->i_generation, + generation); -repeat: - fhe = find_fhe(dentry, cache, &empty); - if (fhe) { - return 0; + iput(inode); + return ERR_PTR(-ESTALE); } - - /* - * Not found ... make a new entry. + /* now to find a dentry. 
+ * If possible, get a well-connected one */ - if (empty) { - empty->dentry = dentry; - empty->reftime = jiffies; - empty->ino = inode->i_ino; - empty->dev = inode->i_dev; - return 1; - } - - /* if nfsd_server is zero, NFSD_MAXFH will be zero too, so - * find_fhe() will NEVER find the file handle NOR an empty space, - * and expire_slot will not be able to expire any file handle, - * because NFSD_MAXFH is zero ... */ - - if (nfsd_nservers <= 0) { - return 0; - } - - expire_slot(cache); - goto repeat; -} - -/* - * Find an entry in the dir cache for the specified inode number. - */ -static struct fh_entry *find_fhe_by_ino(kdev_t dev, ino_t ino) -{ - struct fh_entry * fhe = &dirstable[0]; - int i; - - for (i = 0; i < NFSD_MAXFH; i++, fhe++) { - if (fhe->ino == ino && fhe->dev == dev) { - fhe->reftime = jiffies; - return fhe; + for (lp = inode->i_dentry.next; lp != &inode->i_dentry ; lp=lp->next) { + result = list_entry(lp,struct dentry, d_alias); + if (! (result->d_flags & DCACHE_NFSD_DISCONNECTED)) { + dget(result); + iput(inode); + return result; } } - return NULL; + result = d_alloc_root(inode, NULL); + if (result == NULL) { + iput(inode); + return ERR_PTR(-ENOMEM); + } + result->d_flags |= DCACHE_NFSD_DISCONNECTED; + d_rehash(result); /* so a dput won't lose it */ + return result; } -/* - * Find the (directory) dentry with the specified (dev, inode) number. - * Note: this leaves the dentry in the cache. +/* this routine links an IS_ROOT dentry into the dcache tree.
It gains "parent" + * as a parent and "name" as a name + * It should possibly go in dcache.c */ -static struct dentry *find_dentry_by_ino(kdev_t dev, ino_t ino) +int d_splice(struct dentry *target, struct dentry *parent, struct qstr *name) { - struct fh_entry *fhe; - struct nfsd_path *pe; - struct dentry * dentry; - -#ifdef NFSD_DEBUG_VERBOSE -printk("find_dentry_by_ino: looking for inode %ld\n", ino); -#endif - /* - * Special case: inode number 2 is the root inode, - * so we can use the root dentry for the device. - */ - if (ino == 2) { - struct super_block *sb = get_super(dev); - if (sb) { + struct dentry *tdentry; #ifdef NFSD_PARANOIA -printk("find_dentry_by_ino: getting root dentry for %s\n", kdevname(dev)); -#endif - if (sb->s_root) { - dentry = dget(sb->s_root); - goto out; - } else { + if (!IS_ROOT(target)) + printk("nfsd: d_splice with no-root target: %s/%s\n", parent->d_name.name, name->name); + if (!(target->d_flags & DCACHE_NFSD_DISCONNECTED)) + printk("nfsd: d_splice with non-DISCONNECTED target: %s/%s\n", parent->d_name.name, name->name); +#endif + name->hash = full_name_hash(name->name, name->len); + tdentry = d_alloc(parent, name); + if (tdentry == NULL) + return -ENOMEM; + d_move(target, tdentry); + + /* tdentry will have been made a "child" of target (the parent of target) + * make it an IS_ROOT instead + */ + list_del(&tdentry->d_child); + tdentry->d_parent = tdentry; + d_rehash(target); + dput(tdentry); + + /* if parent is properly connected, then we can assert that + * the children are connected, but it must be a singular (non-forking) + * branch + */ + if (!(parent->d_flags & DCACHE_NFSD_DISCONNECTED)) { + while (target) { + target->d_flags &= ~DCACHE_NFSD_DISCONNECTED; + parent = target; + if (list_empty(&parent->d_subdirs)) + target = NULL; + else { + target = list_entry(parent->d_subdirs.next, struct dentry, d_child); #ifdef NFSD_PARANOIA + /* must be only child */ + if
(target->d_child.next != &parent->d_subdirs + || target->d_child.prev != &parent->d_subdirs) + printk("nfsd: d_splice found non-singular disconnected branch: %s/%s\n", + parent->d_name.name, target->d_name.name); #endif } } } + return 0; +} - /* - * Search the dentry cache ... - */ - fhe = find_fhe_by_ino(dev, ino); - if (fhe) { - dentry = dget(fhe->dentry); - goto out; - } - /* - * Search the path cache ... - */ - dentry = NULL; - pe = get_path_entry(dev, ino); - if (pe) { - struct dentry *res; - res = lookup_dentry(pe->name, NULL, 0); - if (!IS_ERR(res)) { - struct inode *inode = res->d_inode; - if (inode && inode->i_ino == ino && - inode->i_dev == dev) { - dentry = res; -#ifdef NFSD_PARANOIA -printk("find_dentry_by_ino: found %s/%s, ino=%ld\n", -dentry->d_parent->d_name.name, dentry->d_name.name, ino); -#endif - if (add_to_fhcache(dentry, NFSD_DIR_CACHE)) { - dget(dentry); - nfsd_nr_verified++; - } - put_path(pe); - } else { - dput(res); - put_path(pe); - /* We should delete it from the cache. */ - free_path_entry(pe); +/* this routine finds the dentry of the parent of a given directory + * it should be in the filesystem accessed by nfsd_operations + * it assumes lookup("..") works. + */ +struct dentry *nfsd_findparent(struct dentry *child) +{ + struct dentry *tdentry, *pdentry; + tdentry = d_alloc(child, &(const struct qstr) {"..", 2, 0}); + if (!tdentry) + return ERR_PTR(-ENOMEM); + + /* I'm going to assume that if the returned dentry is different, then + * it is well connected. But nobody returns different dentrys do they? + */ + pdentry = child->d_inode->i_op->lookup(child->d_inode, tdentry); + d_drop(tdentry); /* we never want ".." hashed */ + if (!pdentry) { + /* I don't want to return a ".." dentry. 
+ * I would prefer to return an unconnected "IS_ROOT" dentry, + * though a properly connected dentry is even better + */ + /* if first or last of alias list is not tdentry, use that + * else make a root dentry + */ + struct list_head *aliases = &tdentry->d_inode->i_dentry; + if (aliases->next != aliases) { + pdentry = list_entry(aliases->next, struct dentry, d_alias); + if (pdentry == tdentry) + pdentry = list_entry(aliases->prev, struct dentry, d_alias); + if (pdentry == tdentry) + pdentry = NULL; + if (pdentry) dget(pdentry); + } + if (pdentry == NULL) { + pdentry = d_alloc_root(igrab(tdentry->d_inode), NULL); + if (pdentry) { + pdentry->d_flags |= DCACHE_NFSD_DISCONNECTED; + d_rehash(pdentry); } - } else { -#ifdef NFSD_PARANOIA -printk("find_dentry_by_ino: %s lookup failed\n", pe->name); -#endif - put_path(pe); - /* We should delete it from the cache. */ - free_path_entry(pe); } + if (pdentry == NULL) + pdentry = ERR_PTR(-ENOMEM); } -out: - return dentry; + dput(tdentry); /* it is not hashed, it will be discarded */ + return pdentry; } -/* - * Look for an entry in the file cache matching the dentry pointer, - * and verify that the (dev, inode) numbers are correct. If found, - * the entry is removed from the cache. - */ -static struct dentry *find_dentry_in_fhcache(struct knfs_fh *fh) +static struct dentry *splice(struct dentry *child, struct dentry *parent) { -/* FIXME: this must use the dev/ino/dir_ino triple. */ -#if 0 - struct fh_entry * fhe; - - fhe = find_fhe(fh->fh_dcookie, NFSD_FILE_CACHE, NULL); - if (fhe) { - struct dentry *parent, *dentry; - struct inode *inode; - - dentry = fhe->dentry; - inode = dentry->d_inode; + int err = 0; + struct qstr qs; + char namebuf[256]; + struct list_head *lp; + struct dentry *tmp; + /* child is an IS_ROOT (anonymous) dentry, but it is hypothesised that + * it should be a child of parent. + * We see if we can find a name and, if we can - splice it in. 
+ * We hold the i_sem on the parent the whole time to try to follow + * locking protocols. + */ + qs.name = namebuf; + down(&parent->d_inode->i_sem); - if (!inode) { -#ifdef NFSD_PARANOIA -printk("find_dentry_in_fhcache: %s/%s has no inode!\n", -dentry->d_parent->d_name.name, dentry->d_name.name); -#endif + /* Now, things might have changed while we waited. + * Possibly a friendly filesystem found child and spliced it in in + * response to a lookup (though nobody does this yet). + * In this case, just succeed. + */ + if (child->d_parent == parent) goto out; + /* Possibly a new dentry has been made for this child->d_inode in + * parent by a lookup. In this case return that dentry. + * caller must notice and act accordingly + */ + for (lp = child->d_inode->i_dentry.next; lp != &child->d_inode->i_dentry ; lp=lp->next) { + tmp = list_entry(lp,struct dentry, d_alias); + if (tmp->d_parent == parent) { + child = dget(tmp); goto out; } - if (inode->i_ino != u32_to_ino_t(fh->fh_ino)) - goto out; - if (inode->i_dev != u32_to_kdev_t(fh->fh_dev)) - goto out; - - fhe->dentry = NULL; - fhe->ino = 0; - fhe->dev = 0; - nfsd_nr_put++; - /* - * Make sure the parent is in the dir cache ... - */ - parent = dget(dentry->d_parent); - if (add_to_fhcache(parent, NFSD_DIR_CACHE)) - nfsd_nr_verified++; - else - dput(parent); - return dentry; } -out: -#endif - return NULL; -} - -/* - * Look for an entry in the parent directory with the specified - * inode number. - */ -static struct dentry *lookup_by_inode(struct dentry *parent, ino_t ino) -{ - struct dentry *dentry; - int error; - struct nfsd_dirent dirent; - - /* - * Search the directory for the inode number. 
+ /* well, if we can find a name for child in parent, it should be + * safe to splice it in */ - dirent.ino = ino; - error = get_parent_ino(parent, &dirent); - if (error) { -#ifdef NFSD_PARANOIA_EXTREME -printk("lookup_by_inode: ino %ld not found in %s\n", ino, parent->d_name.name); -#endif - goto no_entry; - } -#ifdef NFSD_PARANOIA_EXTREME -printk("lookup_by_inode: found %s\n", dirent.name); -#endif - - dentry = lookup_dentry(dirent.name, parent, 0); - if (!IS_ERR(dentry)) { - if (dentry->d_inode && dentry->d_inode->i_ino == ino) - goto out; -#ifdef NFSD_PARANOIA_EXTREME -printk("lookup_by_inode: %s/%s inode mismatch??\n", -parent->d_name.name, dentry->d_name.name); -#endif - dput(dentry); - } else { -#ifdef NFSD_PARANOIA_EXTREME -printk("lookup_by_inode: %s lookup failed, error=%ld\n", -dirent.name, PTR_ERR(dentry)); -#endif + err = get_ino_name(parent, &qs, child->d_inode->i_ino); + if (err) + goto out; + tmp = d_lookup(parent, &qs); + if (tmp) { + /* Now that IS odd. I wonder what it means... */ + err = -EEXIST; + printk("nfsd-fh: found a name that I didn't expect: %s/%s\n", parent->d_name.name, qs.name); + dput(tmp); + goto out; } - -no_entry: - dentry = NULL; -out: - return dentry; + err = d_splice(child, parent, &qs); + dprintk("nfsd_fh: found name %s for ino %ld\n", child->d_name.name, child->d_inode->i_ino); + out: + up(&parent->d_inode->i_sem); + if (err) + return ERR_PTR(err); + else + return child; } /* - * Search the fix-up list for a dentry from a prior lookup. + * This is the basic lookup mechanism for turning an NFS file handle + * into a dentry. + * We use nfsd_iget and if that doesn't return a suitably connected dentry, + * we try to find the parent, and the parent of that and so-on until a + * connection is made.
*/ -static ino_t nfsd_cached_lookup(struct knfs_fh *fh) -{ - struct nfsd_fixup *fp; - - fp = find_cached_lookup(u32_to_kdev_t(fh->fh_dev), - u32_to_ino_t(fh->fh_dirino), - u32_to_ino_t(fh->fh_ino)); - if (fp) - return fp->new_dirino; - return 0; -} - -void -expire_all(void) +static struct dentry * +find_fh_dentry(struct super_block *sb, struct knfs_fh *fh, int needpath) { - if (time_after_eq(jiffies, nfsd_next_expire)) { - expire_old(NFSD_FILE_CACHE, 5*HZ); - expire_old(NFSD_DIR_CACHE , 60*HZ); - nfsd_next_expire = jiffies + 5*HZ; - } -} + struct dentry *dentry, *result = NULL; + struct dentry *tmp; + int found =0; + int err; + /* the sb->s_nfsd_free_path_sem semaphore is needed to make sure that only one unconnected (free) + * dcache path ever exists, as otherwise two partial paths might get + * joined together, which would be very confusing. + * If there is ever an unconnected non-root directory, then this lock + * must be held. + */ -/* - * Free cache after unlink/rmdir. - */ -void -expire_by_dentry(struct dentry *dentry) -{ - struct fh_entry *fhe; - fhe = find_fhe(dentry, NFSD_FILE_CACHE, NULL); - if (fhe) { - expire_fhe(fhe, NFSD_FILE_CACHE); - } - fhe = find_fhe(dentry, NFSD_DIR_CACHE, NULL); - if (fhe) { - expire_fhe(fhe, NFSD_DIR_CACHE); + nfsdstats.fh_lookup++; + /* + * Attempt to find the inode. + */ + retry: + result = nfsd_iget(sb, fh->fh_ino, fh->fh_generation); + err = PTR_ERR(result); + if (IS_ERR(result)) + goto err_out; + err = -ESTALE; + if (!result) { + dprintk("find_fh_dentry: No inode found.\n"); + goto err_out; } -} + if (! (result->d_flags & DCACHE_NFSD_DISCONNECTED)) + return result; -/* - * The is the basic lookup mechanism for turning an NFS file handle - * into a dentry. There are several levels to the search: - * (1) Look for the dentry pointer the short-term fhcache, - * and verify that it has the correct inode number. - * - * (2) Try to validate the dentry pointer in the file handle, - * and verify that it has the correct inode number. 
If this - * fails, check for a cached lookup in the fix-up list and - * repeat step (2) using the new dentry pointer. - * - * (3) Look up the dentry by using the inode and parent inode numbers - * to build the name string. This should succeed for any Unix-like - * filesystem. - * - * (4) Search for the parent dentry in the dir cache, and then - * look for the name matching the inode number. - * - * (5) The most general case ... search the whole volume for the inode. - * - * If successful, we return a dentry with the use count incremented. - * - * Note: steps (4) and (5) above are probably unnecessary now that (3) - * is working. Remove the code once this is verified ... - */ -static struct dentry * -find_fh_dentry(struct knfs_fh *fh) -{ - struct super_block *sb; - struct dentry *dentry, *parent; - struct inode * inode; - struct list_head *lst; - int looked_up = 0, retry = 0; - ino_t dirino; + /* result is now an anonymous dentry, which may be adequate as it + * stands, or else will get spliced into the dcache tree */ - /* - * Stage 1: Look for the dentry in the short-term fhcache. - */ - dentry = find_dentry_in_fhcache(fh); - if (dentry) { - nfsdstats.fh_cached++; - goto out; + if (!S_ISDIR(result->d_inode->i_mode) && ! needpath) { + nfsdstats.fh_anon++; + return result; } - /* - * Stage 2: Attempt to find the inode. + + /* It's a directory, or we are required to confirm the file's + * location in the tree. 
*/ - sb = get_super(fh->fh_dev); - if (NULL == sb) { - printk("find_fh_dentry: No SuperBlock for device %s.", - kdevname(fh->fh_dev)); - dentry = NULL; - goto out; - } + dprintk("nfs_fh: need to look harder for %d/%d\n",sb->s_dev,fh->fh_ino); + down(&sb->s_nfsd_free_path_sem); - dirino = u32_to_ino_t(fh->fh_dirino); - inode = iget_in_use(sb, fh->fh_ino); - if (!inode) { - dprintk("find_fh_dentry: No inode found.\n"); - goto out_five; - } - goto check; -recheck: - if (!inode) { - dprintk("find_fh_dentry: No inode found.\n"); - goto out_three; + /* claiming the semaphore might have allowed things to get fixed up */ + if (! (result->d_flags & DCACHE_NFSD_DISCONNECTED)) { + up(&sb->s_nfsd_free_path_sem); + return result; } -check: - for (lst = inode->i_dentry.next; - lst != &inode->i_dentry; - lst = lst->next) { - dentry = list_entry(lst, struct dentry, d_alias); - -/* if we are looking up a directory then we don't need the parent! */ - if (!dentry || - !dentry->d_parent || - !dentry->d_parent->d_inode) { -printk("find_fh_dentry: Found a useless inode %lu\n", inode->i_ino); - continue; - } - if (dentry->d_parent->d_inode->i_ino != dirino) - continue; - dget(dentry); - iput(inode); -#ifdef NFSD_DEBUG_VERBOSE - printk("find_fh_dentry: Found%s %s/%s filehandle dirino = %lu, %lu\n", - retry ?
" Renamed" : "", - dentry->d_parent->d_name.name, - dentry->d_name.name, - dentry->d_parent->d_inode->i_ino, - dirino); -#endif - goto out; - } /* for inode->i_dentry */ - /* - * Before proceeding to a lookup, check for a rename - */ - if (!retry && (dirino = nfsd_cached_lookup(fh))) { - dprintk("find_fh_dentry: retry with %lu\n", dirino); - retry = 1; - goto recheck; + found = 0; + if (!S_ISDIR(result->d_inode->i_mode)) { + nfsdstats.fh_nocache_nondir++; + if (fh->fh_dirino == 0) + goto err_result; /* don't know how to find parent */ + else { + /* need to iget fh->fh_dirino and make sure this + * inode is in that directory + */ + dentry = nfsd_iget(sb, fh->fh_dirino, 0); + err = PTR_ERR(dentry); + if (IS_ERR(dentry)) + goto err_result; + err = -ESTALE; + if (!dentry->d_inode + || !S_ISDIR(dentry->d_inode->i_mode)) { + goto err_dentry; + } + if (!(dentry->d_flags & DCACHE_NFSD_DISCONNECTED)) + found = 1; + tmp = splice(result, dentry); + err = PTR_ERR(tmp); + if (IS_ERR(tmp)) + goto err_dentry; + if (tmp != result) { + /* it is safe to just use tmp instead, but + * we must discard result first + */ + d_drop(result); + dput(result); + result = tmp; + /* If !found, then this is really weird, + * but it shouldn't hurt */ + } + } + } else { + nfsdstats.fh_nocache_dir++; + dentry = dget(result); } - iput(inode); + while(!found) { + /* LOOP INVARIANT */ + /* haven't found a place in the tree yet, but we do have a + * free path from dentry down to result, and dentry is a + * directory.
Have a hold on dentry and result + */ + struct dentry *pdentry; + struct inode *parent; - dprintk("find_fh_dentry: dirino not found %lu\n", dirino); + pdentry = nfsd_findparent(dentry); + err = PTR_ERR(pdentry); + if (IS_ERR(pdentry)) + goto err_dentry; + parent = pdentry->d_inode; + err = -EACCES; + if (!parent) { + dput(pdentry); + goto err_dentry; + } -out_three: + if (!(dentry->d_flags & DCACHE_NFSD_DISCONNECTED)) + found = 1; - /* - * Stage 3: Look up the dentry based on the inode and parent inode - * numbers. This should work for all Unix-like filesystems. - */ - looked_up = 1; - dentry = lookup_inode(u32_to_kdev_t(fh->fh_dev), - u32_to_ino_t(fh->fh_dirino), - u32_to_ino_t(fh->fh_ino)); - if (!IS_ERR(dentry)) { - struct inode * inode = dentry->d_inode; -#ifdef NFSD_DEBUG_VERBOSE -printk("find_fh_dentry: looked up %s/%s\n", - dentry->d_parent->d_name.name, dentry->d_name.name); -#endif - if (inode && inode->i_ino == u32_to_ino_t(fh->fh_ino)) { - nfsdstats.fh_lookup++; - goto out; + tmp = splice(dentry, pdentry); + if (tmp != dentry) { + /* Something wrong. We need to drop the whole + * dentry->result path whatever it was + */ + struct dentry *d; + for (d=result ; d ; d=(d->d_parent == d)?NULL:d->d_parent) + d_drop(d); + } + if (IS_ERR(tmp)) { + err = PTR_ERR(tmp); + dput(pdentry); + goto err_dentry; + } + if (tmp != dentry) { + /* we lost a race, try again + */ + dput(tmp); + dput(dentry); + dput(result); /* this will discard the whole free path, so we can up the semaphore */ + up(&sb->s_nfsd_free_path_sem); + goto retry; } -#ifdef NFSD_PARANOIA -printk("find_fh_dentry: %s/%s lookup mismatch!\n", - dentry->d_parent->d_name.name, dentry->d_name.name); -#endif dput(dentry); + dentry = pdentry; } + dput(dentry); + up(&sb->s_nfsd_free_path_sem); + return result; - /* - * Stage 4: Look for the parent dentry in the fhcache ... - */ - parent = find_dentry_by_ino(u32_to_kdev_t(fh->fh_dev), - u32_to_ino_t(fh->fh_dirino)); - if (parent) { - /* - * ... 
then search for the inode in the parent directory. - */ - dget(parent); - dentry = lookup_by_inode(parent, u32_to_ino_t(fh->fh_ino)); - dput(parent); - if (dentry) - goto out; - } - -out_five: - - /* - * Stage 5: Search the whole volume, Yea Right. - */ -#ifdef NFSD_PARANOIA_EXTREME -printk("find_fh_dentry: %s/%u dir/%u not found!\n", - kdevname(u32_to_kdev_t(fh->fh_dev)), fh->fh_ino, fh->fh_dirino); -#endif - dentry = NULL; - nfsdstats.fh_stale++; - -out: - expire_all(); - return dentry; +err_dentry: + dput(dentry); +err_result: + dput(result); + up(&sb->s_nfsd_free_path_sem); +err_out: + if (err == -ESTALE) + nfsdstats.fh_stale++; + return ERR_PTR(err); } /* @@ -1117,6 +495,9 @@ * * Note that the file handle dentry may need to be freed even after * an error return. + * + * This is only called at the start of an nfsproc call, so fhp points to + * a svc_fh which is all 0 except for the over-the-wire file handle. */ u32 fh_verify(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, int access) @@ -1134,57 +515,79 @@ fh->fh_ino, fh->fh_dirino); - if (fhp->fh_dverified) - goto check_type; - /* - * Look up the export entry. - */ - error = nfserr_stale; - exp = exp_get(rqstp->rq_client, - u32_to_kdev_t(fh->fh_xdev), - u32_to_ino_t(fh->fh_xino)); - if (!exp) { - /* export entry revoked */ - nfsdstats.fh_stale++; - goto out; - } + if (!fhp->fh_dentry) { + /* + * Security: Check that the fh is internally consistent + * (from <gam3@acm.org>) + */ + if (fh->fh_dev != fh->fh_xdev) { + printk("fh_verify: Security: export on other device (%s, %s).\n", + kdevname(fh->fh_dev), kdevname(fh->fh_xdev)); + error = nfserr_stale; + nfsdstats.fh_stale++; + goto out; + } - /* Check if the request originated from a secure port.
*/ - error = nfserr_perm; - if (!rqstp->rq_secure && EX_SECURE(exp)) { - printk(KERN_WARNING - "nfsd: request from insecure port (%08x:%d)!\n", - ntohl(rqstp->rq_addr.sin_addr.s_addr), - ntohs(rqstp->rq_addr.sin_port)); - goto out; - } + /* + * Look up the export entry. + */ + error = nfserr_stale; + exp = exp_get(rqstp->rq_client, + u32_to_kdev_t(fh->fh_xdev), + u32_to_ino_t(fh->fh_xino)); + if (!exp) { + /* export entry revoked */ + nfsdstats.fh_stale++; + goto out; + } + + /* Check if the request originated from a secure port. */ + error = nfserr_perm; + if (!rqstp->rq_secure && EX_SECURE(exp)) { + printk(KERN_WARNING + "nfsd: request from insecure port (%08x:%d)!\n", + ntohl(rqstp->rq_addr.sin_addr.s_addr), + ntohs(rqstp->rq_addr.sin_port)); + goto out; + } - /* Set user creds if we haven't done so already. */ - nfsd_setuser(rqstp, exp); + /* Set user creds if we haven't done so already. */ + nfsd_setuser(rqstp, exp); - /* - * Look up the dentry using the NFS file handle. - */ - error = nfserr_noent; - dentry = find_fh_dentry(fh); - if (!dentry) { - goto out; - } - if (IS_ERR(dentry)) { - error = nfserrno(-PTR_ERR(dentry)); - goto out; + /* + * Look up the dentry using the NFS file handle. + */ + + dentry = find_fh_dentry(exp->ex_dentry->d_inode->i_sb, + fh, + !(exp->ex_flags & NFSEXP_NOSUBTREECHECK)); + + if (IS_ERR(dentry)) { + error = nfserrno(-PTR_ERR(dentry)); + goto out; + } +#ifdef NFSD_PARANOIA + if (S_ISDIR(dentry->d_inode->i_mode) && + (dentry->d_flags & DCACHE_NFSD_DISCONNECTED)) { + printk("nfsd: find_fh_dentry returned a DISCONNECTED directory: %s/%s\n", + dentry->d_parent->d_name.name, dentry->d_name.name); + } +#endif + + fhp->fh_dentry = dentry; + fhp->fh_export = exp; + nfsd_nr_verified++; + } else { + /* just rechecking permissions + * (e.g. 
nfsproc_create calls fh_verify, then nfsd_create + * does as well) + */ + dprintk("nfsd: fh_verify - just checking\n"); + dentry = fhp->fh_dentry; + exp = fhp->fh_export; } - /* - * Note: it's possible the returned dentry won't be the one in the - * file handle. We can correct the file handle for our use, but - * unfortunately the client will keep sending the broken one. Let's - * hope the lookup will keep patching things up. - */ - fhp->fh_dentry = dentry; - fhp->fh_export = exp; - fhp->fh_dverified = 1; - nfsd_nr_verified++; + inode = dentry->d_inode; /* Type check. The correct error return for type mismatches * does not seem to be generally agreed upon. SunOS seems to @@ -1192,39 +595,7 @@ * spec says this is incorrect (implementation notes for the * write call). */ -check_type: - dentry = fhp->fh_dentry; - inode = dentry->d_inode; - error = nfserr_stale; - /* On a heavily loaded SMP machine, more than one identical - requests may run at the same time on different processors. - One thread may get here with unfinished fh after another - thread just fetched the inode. It doesn't make any senses - to check fh->fh_generation here since it has not been set - yet. In that case, we shouldn't send back the stale - filehandle to the client. We use fh->fh_dcookie to indicate - if fh->fh_generation is set or not. If fh->fh_dcookie is - not set, don't return stale filehandle. */ - if (inode->i_generation != fh->fh_generation) { - if (fh->fh_dcookie) { - dprintk("fh_verify: Bad version %lu %u %u: 0x%x, 0x%x\n", - inode->i_ino, - inode->i_generation, - fh->fh_generation, - type, access); - nfsdstats.fh_stale++; - goto out; - } - else { - /* We get here when inode is fetched by other - threads. We just use what is in there. 
*/ - fh->fh_ino = ino_t_to_u32(inode->i_ino); - fh->fh_generation = inode->i_generation; - fh->fh_dcookie = (struct dentry *)0xfeebbaca; - nfsdstats.fh_concurrent++; - } - } - exp = fhp->fh_export; + if (type > 0 && (inode->i_mode & S_IFMT) != type) { error = (type == S_IFDIR)? nfserr_notdir : nfserr_isdir; goto out; @@ -1238,38 +609,37 @@ * Security: Check that the export is valid for dentry <gam3@acm.org> */ error = 0; - if (fh->fh_dev != fh->fh_xdev) { - printk("fh_verify: Security: export on other device (%s, %s).\n", - kdevname(fh->fh_dev), kdevname(fh->fh_xdev)); - error = nfserr_stale; - nfsdstats.fh_stale++; - } else if (exp->ex_dentry != dentry) { - struct dentry *tdentry = dentry; - do { - tdentry = tdentry->d_parent; - if (exp->ex_dentry == tdentry) - break; - /* executable only by root and we can't be root */ - if (current->fsuid - && !(tdentry->d_inode->i_uid - && (tdentry->d_inode->i_mode & S_IXUSR)) - && !(tdentry->d_inode->i_gid - && (tdentry->d_inode->i_mode & S_IXGRP)) - && !(tdentry->d_inode->i_mode & S_IXOTH) - && (exp->ex_flags & NFSEXP_ROOTSQUASH)) { + if (!(exp->ex_flags & NFSEXP_NOSUBTREECHECK)) { + if (exp->ex_dentry != dentry) { + struct dentry *tdentry = dentry; + + do { + tdentry = tdentry->d_parent; + if (exp->ex_dentry == tdentry) + break; + /* executable only by root and we can't be root */ + if (current->fsuid + && (exp->ex_flags & NFSEXP_ROOTSQUASH) + && !(tdentry->d_inode->i_uid + && (tdentry->d_inode->i_mode & S_IXUSR)) + && !(tdentry->d_inode->i_gid + && (tdentry->d_inode->i_mode & S_IXGRP)) + && !(tdentry->d_inode->i_mode & S_IXOTH) + ) { + error = nfserr_stale; + nfsdstats.fh_stale++; + dprintk("fh_verify: no root_squashed access.\n"); + } + } while ((tdentry != tdentry->d_parent)); + if (exp->ex_dentry != tdentry) { error = nfserr_stale; nfsdstats.fh_stale++; -dprintk("fh_verify: no root_squashed access.\n"); + printk("nfsd Security: %s/%s bad export.\n", + dentry->d_parent->d_name.name, + dentry->d_name.name); + goto out; } - 
} while ((tdentry != tdentry->d_parent)); - if (exp->ex_dentry != tdentry) { - error = nfserr_stale; - nfsdstats.fh_stale++; - printk("nfsd Security: %s/%s bad export.\n", - dentry->d_parent->d_name.name, - dentry->d_name.name); - goto out; } } @@ -1277,10 +647,11 @@ if (!error) { error = nfsd_permission(exp, dentry, access); } -#ifdef NFSD_PARANOIA -if (error) -printk("fh_verify: %s/%s permission failure, acc=%x, error=%d\n", -dentry->d_parent->d_name.name, dentry->d_name.name, access, (error >> 24)); +#ifdef NFSD_PARANOIA_EXTREME + if (error) { + printk("fh_verify: %s/%s permission failure, acc=%x, error=%d\n", + dentry->d_parent->d_name.name, dentry->d_name.name, access, (error >> 24)); + } #endif out: return error; @@ -1300,7 +671,7 @@ struct dentry *parent = dentry->d_parent; dprintk("nfsd: fh_compose(exp %x/%ld %s/%s, ino=%ld)\n", - exp->ex_dev, exp->ex_ino, + exp->ex_dev, (long) exp->ex_ino, parent->d_name.name, dentry->d_name.name, (inode ? inode->i_ino : 0)); @@ -1309,33 +680,33 @@ * may not be done on error paths, but the cleanup must call fh_put. * Fix this soon! 
*/ - if (fhp->fh_dverified || fhp->fh_locked || fhp->fh_dentry) { + if (fhp->fh_dentry || fhp->fh_locked) { printk(KERN_ERR "fh_compose: fh %s/%s not initialized!\n", parent->d_name.name, dentry->d_name.name); } fh_init(fhp); - fhp->fh_handle.fh_dcookie = dentry; + fhp->fh_handle.fh_dirino = ino_t_to_u32(parent->d_inode->i_ino); + fhp->fh_handle.fh_dev = kdev_t_to_u32(parent->d_inode->i_dev); + fhp->fh_handle.fh_xdev = kdev_t_to_u32(exp->ex_dev); + fhp->fh_handle.fh_xino = ino_t_to_u32(exp->ex_ino); + fhp->fh_handle.fh_dcookie = (struct dentry *)0xfeebbaca; if (inode) { fhp->fh_handle.fh_ino = ino_t_to_u32(inode->i_ino); fhp->fh_handle.fh_generation = inode->i_generation; - fhp->fh_handle.fh_dcookie = (struct dentry *)0xfeebbaca; + if (S_ISDIR(inode->i_mode) || (exp->ex_flags & NFSEXP_NOSUBTREECHECK)) + fhp->fh_handle.fh_dirino = 0; } - fhp->fh_handle.fh_dirino = ino_t_to_u32(parent->d_inode->i_ino); - fhp->fh_handle.fh_dev = kdev_t_to_u32(parent->d_inode->i_dev); - fhp->fh_handle.fh_xdev = kdev_t_to_u32(exp->ex_dev); - fhp->fh_handle.fh_xino = ino_t_to_u32(exp->ex_ino); fhp->fh_dentry = dentry; /* our internal copy */ fhp->fh_export = exp; - /* We stuck it there, we know it's good. */ - fhp->fh_dverified = 1; nfsd_nr_verified++; } /* * Update file handle information after changing a dentry. + * This is only called by nfsd_create */ void fh_update(struct svc_fh *fhp) @@ -1343,7 +714,7 @@ struct dentry *dentry; struct inode *inode; - if (!fhp->fh_dverified) + if (!fhp->fh_dentry) goto out_bad; dentry = fhp->fh_dentry; @@ -1352,7 +723,9 @@ goto out_negative; fhp->fh_handle.fh_ino = ino_t_to_u32(inode->i_ino); fhp->fh_handle.fh_generation = inode->i_generation; - fhp->fh_handle.fh_dcookie = (struct dentry *)0xfeebbaca; + if (S_ISDIR(inode->i_mode) || (fhp->fh_export->ex_flags & NFSEXP_NOSUBTREECHECK)) + fhp->fh_handle.fh_dirino = 0; + out: return; @@ -1366,22 +739,19 @@ } /* - * Release a file handle. 
If the file handle carries a dentry count, - * we add the dentry to the short-term cache rather than release it. + * Release a file handle. */ void fh_put(struct svc_fh *fhp) { struct dentry * dentry = fhp->fh_dentry; - if (fhp->fh_dverified) { + if (dentry) { fh_unlock(fhp); - fhp->fh_dverified = 0; + fhp->fh_dentry = NULL; if (!dentry->d_count) goto out_bad; - if (!dentry->d_inode || !add_to_fhcache(dentry, 0)) { - dput(dentry); - nfsd_nr_put++; - } + dput(dentry); + nfsd_nr_put++; } return; @@ -1389,118 +759,4 @@ printk(KERN_ERR "fh_put: %s/%s has d_count 0!\n", dentry->d_parent->d_name.name, dentry->d_name.name); return; -} - -/* - * Flush any cached dentries for the specified device - * or for all devices. - * - * This is called when revoking the last export for a - * device, so that it can be unmounted cleanly. - */ -void nfsd_fh_flush(kdev_t dev) -{ - struct fh_entry *fhe; - int i, pass = 2; - - fhe = &filetable[0]; - while (pass--) { - for (i = 0; i < NFSD_MAXFH; i++, fhe++) { - struct dentry *dentry = fhe->dentry; - if (!dentry) - continue; - if (dev && dentry->d_inode->i_dev != dev) - continue; - fhe->dentry = NULL; - dput(dentry); - nfsd_nr_put++; - } - fhe = &dirstable[0]; - } -} - -/* - * Free the rename and path caches. - */ -void nfsd_fh_free(void) -{ - struct list_head *tmp; - int i; - - /* Flush dentries for all devices */ - nfsd_fh_flush(0); - - /* - * N.B. write a destructor for these lists ... 
- */ - i = 0; - while ((tmp = fixup_head.next) != &fixup_head) { - struct nfsd_fixup *fp; - fp = list_entry(tmp, struct nfsd_fixup, lru); - free_fixup_entry(fp); - i++; - } - printk(KERN_DEBUG "nfsd_fh_free: %d fixups freed\n", i); - - i = 0; - while ((tmp = path_inuse.next) != &path_inuse) { - struct nfsd_path *pe; - pe = list_entry(tmp, struct nfsd_path, lru); - free_path_entry(pe); - i++; - } - printk(KERN_DEBUG "nfsd_fh_free: %d paths freed\n", i); - - printk(KERN_DEBUG "nfsd_fh_free: verified %d, put %d\n", - nfsd_nr_verified, nfsd_nr_put); -} - -void nfsd_fh_init(void) -{ - extern void __my_nfsfh_is_too_big(void); - - if (filetable) - return; - - /* Sanity check */ - if (sizeof(struct nfs_fhbase) > 32) - __my_nfsfh_is_too_big(); - - filetable = kmalloc(sizeof(struct fh_entry) * NFSD_MAXFH, - GFP_KERNEL); - dirstable = kmalloc(sizeof(struct fh_entry) * NFSD_MAXFH, - GFP_KERNEL); - - if (filetable == NULL || dirstable == NULL) { - printk(KERN_WARNING "nfsd_fh_init : Could not allocate fhcache\n"); - nfsd_nservers = 0; - return; - } - - memset(filetable, 0, NFSD_MAXFH*sizeof(struct fh_entry)); - memset(dirstable, 0, NFSD_MAXFH*sizeof(struct fh_entry)); - INIT_LIST_HEAD(&path_inuse); - INIT_LIST_HEAD(&fixup_head); - - printk(KERN_DEBUG - "nfsd_fh_init : initialized fhcache, entries=%lu\n", NFSD_MAXFH); - /* - * Display a warning if the ino_t is larger than 32 bits. - */ - if (sizeof(ino_t) > sizeof(__u32)) - printk(KERN_INFO - "NFSD: ino_t is %d bytes, using lower 4 bytes\n", - sizeof(ino_t)); -} - -void -nfsd_fh_shutdown(void) -{ - if (!filetable) - return; - printk(KERN_DEBUG - "nfsd_fh_shutdown : freeing %ld fhcache entries.\n", NFSD_MAXFH); - kfree(filetable); - kfree(dirstable); - filetable = dirstable = NULL; } diff -Naur pre9/linux/fs/nfsd/nfsproc.c test/linux/fs/nfsd/nfsproc.c --- pre9/linux/fs/nfsd/nfsproc.c Wed May 3 17:16:46 2000 +++ test/linux/fs/nfsd/nfsproc.c Mon Sep 18 14:35:50 2000 @@ -1,7 +1,10 @@ /* * nfsproc2.c Process version 2 NFS requests. 
+ * linux/fs/nfsd/nfs2proc.c + * + * Process version 2 NFS requests. * - * Copyright (C) 1995 Olaf Kirch <okir@monad.swb.de> + * Copyright (C) 1995-1997 Olaf Kirch <okir@monad.swb.de> */ #include <linux/linkage.h> @@ -27,6 +30,9 @@ #define NFSDDBG_FACILITY NFSDDBG_PROC +/* Check for dir entries '.' and '..' */ +#define isdotent(n, l) (l < 3 && n[0] == '.' && (l == 1 || n[1] == '.')) + #define RETURN(st) return st static void @@ -144,7 +150,7 @@ ntohl(rqstp->rq_addr.sin_addr.s_addr), ntohs(rqstp->rq_addr.sin_port), argp->count); - argp->count = avail; + argp->count = avail << 2; } resp->count = argp->count; @@ -211,6 +217,12 @@ } else if (nfserr) goto done; + nfserr = nfserr_acces; + if (!argp->len) + goto done; + nfserr = nfserr_exist; + if (isdotent(argp->name, argp->len)) + goto done; /* * Do a lookup to verify the new file handle. */ @@ -223,7 +235,7 @@ * whether the file exists or not. Time to bail ... */ nfserr = nfserr_acces; - if (!newfhp->fh_dverified) { + if (!newfhp->fh_dentry) { printk(KERN_WARNING "nfsd_proc_create: file handle not verified\n"); goto done; @@ -261,7 +273,7 @@ goto out_unlock; attr->ia_valid |= ATTR_MODE; - attr->ia_mode = type | mode; + attr->ia_mode = mode; /* Special treatment for non-regular files according to the * gospel of sun micro @@ -279,22 +291,21 @@ type = S_IFIFO; } else if (size != rdev) { /* dev got truncated because of 16bit Linux dev_t */ - nfserr = nfserr_io; /* or nfserr_inval? */ + nfserr = nfserr_inval; goto out_unlock; } else { /* Okay, char or block special */ is_borc = 1; } + /* we've used the SIZE information, so discard it */ + attr->ia_valid &= ~ATTR_SIZE; + /* Make sure the type and device matches */ nfserr = nfserr_exist; if (inode && (type != (inode->i_mode & S_IFMT) || (is_borc && inode->i_rdev != rdev))) goto out_unlock; - - /* invalidate size because only (type == S_IFREG) has - size. 
*/ - attr->ia_valid &= ~ATTR_SIZE; } nfserr = 0; @@ -388,11 +399,8 @@ */ nfserr = nfsd_symlink(rqstp, &argp->ffh, argp->fname, argp->flen, argp->tname, argp->tlen, - &newfh); - if (!nfserr) { - argp->attrs.ia_valid &= ~ATTR_SIZE; - nfserr = nfsd_setattr(rqstp, &newfh, &argp->attrs); - } + &newfh, &argp->attrs); + fh_put(&argp->ffh); fh_put(&newfh); @@ -411,7 +419,7 @@ dprintk("nfsd: MKDIR %p %s\n", SVCFH_DENTRY(&argp->fh), argp->name); - if (resp->fh.fh_dverified) { + if (resp->fh.fh_dentry) { printk(KERN_WARNING "nfsd_proc_mkdir: response already verified??\n"); } @@ -446,8 +454,8 @@ nfsd_proc_readdir(struct svc_rqst *rqstp, struct nfsd_readdirargs *argp, struct nfsd_readdirres *resp) { - u32 * buffer; - int nfserr, count; + u32 * buffer; + int nfserr, count; dprintk("nfsd: READDIR %d/%d %d bytes at %d\n", SVCFH_DEV(&argp->fh), SVCFH_INO(&argp->fh), @@ -467,7 +475,8 @@ /* Read directory and encode entries on the fly */ nfserr = nfsd_readdir(rqstp, &argp->fh, (loff_t) argp->cookie, - nfssvc_encode_entry, buffer, &count); + nfssvc_encode_entry, + buffer, &count, NULL); resp->count = count; fh_put(&argp->fh); @@ -547,6 +556,8 @@ { NFSERR_NXIO, ENXIO }, { NFSERR_ACCES, EACCES }, { NFSERR_EXIST, EEXIST }, + { NFSERR_XDEV, EXDEV }, + { NFSERR_MLINK, EMLINK }, { NFSERR_NODEV, ENODEV }, { NFSERR_NOTDIR, ENOTDIR }, { NFSERR_ISDIR, EISDIR }, @@ -554,39 +565,22 @@ { NFSERR_FBIG, EFBIG }, { NFSERR_NOSPC, ENOSPC }, { NFSERR_ROFS, EROFS }, + { NFSERR_MLINK, EMLINK }, { NFSERR_NAMETOOLONG, ENAMETOOLONG }, { NFSERR_NOTEMPTY, ENOTEMPTY }, #ifdef EDQUOT { NFSERR_DQUOT, EDQUOT }, #endif { NFSERR_STALE, ESTALE }, - { NFSERR_WFLUSH, EIO }, { -1, EIO } }; int i; for (i = 0; nfs_errtbl[i].nfserr != -1; i++) { if (nfs_errtbl[i].syserr == errno) - return htonl (nfs_errtbl[i].nfserr); + return htonl(nfs_errtbl[i].nfserr); } printk (KERN_INFO "nfsd: non-standard errno: %d\n", errno); return nfserr_io; } -#if 0 -static void -nfsd_dump(char *tag, u32 *buf, int len) -{ - int i; - - 
printk(KERN_NOTICE - "nfsd: %s (%d words)\n", tag, len); - - for (i = 0; i < len && i < 32; i += 8) - printk(KERN_NOTICE - " %08lx %08lx %08lx %08lx" - " %08lx %08lx %08lx %08lx\n", - buf[i], buf[i+1], buf[i+2], buf[i+3], - buf[i+4], buf[i+5], buf[i+6], buf[i+7]); -} -#endif diff -Naur pre9/linux/fs/nfsd/nfssvc.c test/linux/fs/nfsd/nfssvc.c --- pre9/linux/fs/nfsd/nfssvc.c Tue Jan 4 10:12:23 2000 +++ test/linux/fs/nfsd/nfssvc.c Mon Sep 18 14:35:50 2000 @@ -42,7 +42,7 @@ extern struct svc_program nfsd_program; static void nfsd(struct svc_rqst *rqstp); struct timeval nfssvc_boot = { 0, 0 }; -static int nfsd_active = 0; +static atomic_t nfsd_active = ATOMIC_INIT(0); /* * Maximum number of nfsd processes @@ -55,24 +55,19 @@ struct svc_serv * serv; int error; - dprintk("nfsd: creating service\n"); error = -EINVAL; + if (atomic_read(&nfsd_active)) + goto out; /* already running */ if (nrservs <= 0) goto out; if (nrservs > NFSD_MAXSERVS) nrservs = NFSD_MAXSERVS; + + dprintk("nfsd: creating service (%d)\n", nrservs); nfsd_nservers = nrservs; + nfssvc_boot = xtime; error = -ENOMEM; - nfsd_fh_init(); /* NFS dentry cache */ - if (nfsd_nservers == 0) - goto out; - - error = -ENOMEM; - nfsd_racache_init(); /* Readahead param cache */ - if (nfsd_nservers == 0) - goto out; - serv = svc_create(&nfsd_program, NFSD_BUFSIZE, NFSSVC_XDRSIZE); if (serv == NULL) goto out; @@ -87,6 +82,8 @@ goto failure; #endif + nfsd_racache_init(); /* Readahead param cache */ + while (nrservs--) { error = svc_create_thread(nfsd, serv); if (error < 0) @@ -106,27 +103,27 @@ nfsd(struct svc_rqst *rqstp) { struct svc_serv *serv = rqstp->rq_server; - int oldumask, err, first = 0; + int err; - /* Lock module and set up kernel thread */ + /* Lock module */ MOD_INC_USE_COUNT; + + /* Set up kernel thread */ lock_kernel(); exit_mm(current); + sprintf(current->comm, "nfsd"); current->session = 1; current->pgrp = 1; + current->fs->umask = 0; + + /* Count active threads */ + atomic_inc(&nfsd_active); + + /* Start 
lockd */ + lockd_up(); + /* Let svc_process check client's authentication. */ rqstp->rq_auth = 1; - sprintf(current->comm, "nfsd"); - - oldumask = current->fs->umask; /* Set umask to 0. */ - current->fs->umask = 0; - if (!nfsd_active++) { - nfssvc_boot = xtime; /* record boot time */ - first = 1; - } -#if 0 - lockd_up(); /* start lockd */ -#endif /* * The main request loop @@ -143,13 +140,8 @@ * recvfrom routine. */ while ((err = svc_recv(serv, rqstp, - first?5*HZ:MAX_SCHEDULE_TIMEOUT)) == -EAGAIN) { - if (first && 1) { - exp_readlock(); - expire_all(); - exp_unlock(); - } - } + MAX_SCHEDULE_TIMEOUT)) == -EAGAIN) + ; if (err < 0) break; @@ -184,23 +176,17 @@ printk(KERN_WARNING "nfsd: terminating on signal %d\n", signo); } -#if 0 + /* Count active threads */ + if (atomic_dec_and_test(&nfsd_active)) { + nfsd_export_shutdown(); /* revoke all exports */ + nfsd_racache_shutdown(); /* release read-ahead cache */ + } + /* Release lockd */ lockd_down(); -#endif - if (!--nfsd_active) { - printk("nfsd: last server exiting\n"); - /* revoke all exports */ - nfsd_export_shutdown(); - /* release fhcache */ - nfsd_fh_shutdown (); - /* release read-ahead cache */ - nfsd_racache_shutdown(); - } - /* Destroy the thread */ + /* Destroy the thread's resources */ svc_exit_thread(rqstp); - current->fs->umask = oldumask; /* Release module */ MOD_DEC_USE_COUNT; @@ -213,7 +199,8 @@ kxdrproc_t xdr; u32 nfserr; - dprintk("nfsd_dispatch: proc %d\n", rqstp->rq_proc); + dprintk("nfsd_dispatch: vers %d proc %d\n", + rqstp->rq_vers, rqstp->rq_proc); proc = rqstp->rq_procinfo; /* Check whether we have this call in the cache. */ @@ -242,8 +229,20 @@ svc_putlong(&rqstp->rq_resbuf, nfserr); /* Encode result. - * FIXME: Most NFSv3 calls return wcc data even when the call failed + * For NFSv2, additional info is never returned in case of an error. 
*/ +#ifdef CONFIG_NFSD_V3 + if (!(nfserr && rqstp->rq_vers == 2)) { + xdr = proc->pc_encode; + if (xdr && !xdr(rqstp, rqstp->rq_resbuf.buf, rqstp->rq_resp)) { + /* Failed to encode result. Release cache entry */ + dprintk("nfsd: failed to encode result!\n"); + nfsd_cache_update(rqstp, RC_NOCACHE, NULL); + *statp = rpc_system_err; + return 1; + } + } +#else xdr = proc->pc_encode; if (!nfserr && xdr && !xdr(rqstp, rqstp->rq_resbuf.buf, rqstp->rq_resp)) { @@ -253,6 +252,7 @@ *statp = rpc_system_err; return 1; } +#endif /* CONFIG_NFSD_V3 */ /* Store reply in cache. */ nfsd_cache_update(rqstp, proc->pc_cachetype, statp + 1); @@ -262,16 +262,16 @@ static struct svc_version nfsd_version2 = { 2, 18, nfsd_procedures2, nfsd_dispatch }; -#ifdef CONFIG_NFSD_NFS3 +#ifdef CONFIG_NFSD_V3 static struct svc_version nfsd_version3 = { - 3, 23, nfsd_procedures3, nfsd_dispatch + 3, 22, nfsd_procedures3, nfsd_dispatch }; #endif static struct svc_version * nfsd_version[] = { NULL, NULL, &nfsd_version2, -#ifdef CONFIG_NFSD_NFS3 +#ifdef CONFIG_NFSD_V3 &nfsd_version3, #endif }; diff -Naur pre9/linux/fs/nfsd/nfsxdr.c test/linux/fs/nfsd/nfsxdr.c --- pre9/linux/fs/nfsd/nfsxdr.c Wed Nov 26 13:08:38 1997 +++ test/linux/fs/nfsd/nfsxdr.c Mon Sep 18 14:35:50 2000 @@ -18,9 +18,14 @@ #define NFSDDBG_FACILITY NFSDDBG_XDR u32 nfs_ok, nfserr_perm, nfserr_noent, nfserr_io, nfserr_nxio, - nfserr_inval, nfserr_acces, nfserr_exist, nfserr_nodev, nfserr_notdir, - nfserr_isdir, nfserr_fbig, nfserr_nospc, nfserr_rofs, - nfserr_nametoolong, nfserr_dquot, nfserr_stale; + nfserr_acces, nfserr_exist, nfserr_xdev, nfserr_nodev, + nfserr_notdir, nfserr_isdir, nfserr_inval, nfserr_fbig, + nfserr_nospc, nfserr_rofs, nfserr_mlink, + nfserr_nametoolong, nfserr_notempty, nfserr_dquot, nfserr_stale, + nfserr_remote, nfserr_badhandle, nfserr_notsync, + nfserr_badcookie, nfserr_notsupp, nfserr_toosmall, + nfserr_serverfault, nfserr_badtype, nfserr_jukebox; + #ifdef NFSD_OPTIMIZE_SPACE # define inline @@ -52,18 +57,32 @@ 
nfserr_noent = htonl(NFSERR_NOENT); nfserr_io = htonl(NFSERR_IO); nfserr_inval = htonl(NFSERR_INVAL); - nfserr_nxio = htonl(NFSERR_NXIO); - nfserr_acces = htonl(NFSERR_ACCES); - nfserr_exist = htonl(NFSERR_EXIST); - nfserr_nodev = htonl(NFSERR_NODEV); - nfserr_notdir = htonl(NFSERR_NOTDIR); - nfserr_isdir = htonl(NFSERR_ISDIR); - nfserr_fbig = htonl(NFSERR_FBIG); - nfserr_nospc = htonl(NFSERR_NOSPC); - nfserr_rofs = htonl(NFSERR_ROFS); + nfserr_nxio = htonl(NFSERR_NXIO); + nfserr_acces = htonl(NFSERR_ACCES); + nfserr_exist = htonl(NFSERR_EXIST); + nfserr_xdev = htonl(NFSERR_XDEV); + nfserr_nodev = htonl(NFSERR_NODEV); + nfserr_notdir = htonl(NFSERR_NOTDIR); + nfserr_isdir = htonl(NFSERR_ISDIR); + nfserr_inval = htonl(NFSERR_INVAL); + nfserr_fbig = htonl(NFSERR_FBIG); + nfserr_nospc = htonl(NFSERR_NOSPC); + nfserr_rofs = htonl(NFSERR_ROFS); + nfserr_mlink = htonl(NFSERR_MLINK); nfserr_nametoolong = htonl(NFSERR_NAMETOOLONG); - nfserr_dquot = htonl(NFSERR_DQUOT); - nfserr_stale = htonl(NFSERR_STALE); + nfserr_notempty = htonl(NFSERR_NOTEMPTY); + nfserr_dquot = htonl(NFSERR_DQUOT); + nfserr_stale = htonl(NFSERR_STALE); + nfserr_remote = htonl(NFSERR_REMOTE); + nfserr_badhandle = htonl(NFSERR_BADHANDLE); + nfserr_notsync = htonl(NFSERR_NOT_SYNC); + nfserr_badcookie = htonl(NFSERR_BAD_COOKIE); + nfserr_notsupp = htonl(NFSERR_NOTSUPP); + nfserr_toosmall = htonl(NFSERR_TOOSMALL); + nfserr_serverfault = htonl(NFSERR_SERVERFAULT); + nfserr_badtype = htonl(NFSERR_BADTYPE); + nfserr_jukebox = htonl(NFSERR_JUKEBOX); + inited = 1; } diff -Naur pre9/linux/fs/nfsd/stats.c test/linux/fs/nfsd/stats.c --- pre9/linux/fs/nfsd/stats.c Tue Jan 4 10:12:23 2000 +++ test/linux/fs/nfsd/stats.c Mon Sep 18 14:35:50 2000 @@ -32,16 +32,15 @@ { int len; - len = sprintf(buffer, "rc %d %d %d %d %d %d %d %d %d\n", + len = sprintf(buffer, "rc %d %d %d %d %d %d %d %d\n", nfsdstats.rchits, nfsdstats.rcmisses, nfsdstats.rcnocache, - nfsdstats.fh_cached, - nfsdstats.fh_valid, - nfsdstats.fh_fixup, 
nfsdstats.fh_lookup, - nfsdstats.fh_stale, - nfsdstats.fh_concurrent); + nfsdstats.fh_anon, + nfsdstats.fh_nocache_nondir, + nfsdstats.fh_nocache_dir, + nfsdstats.fh_stale); /* Assume we haven't hit EOF yet. Will be set by svc_proc_read. */ *eof = 0; diff -Naur pre9/linux/fs/nfsd/vfs.c test/linux/fs/nfsd/vfs.c --- pre9/linux/fs/nfsd/vfs.c Tue Jan 4 10:12:23 2000 +++ test/linux/fs/nfsd/vfs.c Mon Sep 18 14:35:50 2000 @@ -31,6 +31,10 @@ #include <linux/sunrpc/svc.h> #include <linux/nfsd/nfsd.h> +#ifdef CONFIG_NFSD_V3 +#include <linux/nfs3.h> +#include <linux/nfsd/xdr3.h> +#endif /* CONFIG_NFSD_V3 */ #include <linux/nfsd/nfsfh.h> #include <linux/quotaops.h> @@ -41,16 +45,15 @@ #define NFSDDBG_FACILITY NFSDDBG_FILEOP #define NFSD_PARANOIA -/* Open mode for nfsd_open */ -#define OPEN_READ 0 -#define OPEN_WRITE 1 - -/* Hack until we have a macro check for mandatory locks. */ -#ifndef IS_ISMNDLK -#define IS_ISMNDLK(i) (((i)->i_mode & (S_ISGID|S_IXGRP|S_IFMT)) \ - == (S_ISGID|S_IFREG)) -#endif +/* We must ignore files (but only files) which might have mandatory + * locks on them because there is no way to know if the accesser has + * the lock. + */ +/* MANDATORY_LOCK taken from 2.3 */ +#define MANDATORY_LOCK(inode) \ + (IS_MANDLOCK(inode) && ((inode)->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID) +#define IS_ISMNDLK(i) (S_ISREG((i)->i_mode) && MANDATORY_LOCK(i)) /* Time difference margin in seconds for comparison. It is a dynamically-tunable parameter via /proc/fs/nfs/time-diff-margin. */ @@ -83,65 +86,58 @@ static struct raparms * raparml = NULL; static struct raparms * raparm_cache = NULL; + +/* + * We need to do a check-parent every time + * after we have locked the parent - to verify + * that the parent is still our parent and + * that we are still hashed onto it.. + * + * This is required in case two processes race + * on removing (or moving) the same entry: the + * parent lock will serialize them, but the + * other process will be too late.. 
+ * + * Note that this nfsd_check_parent is identical + * the check_parent in linux/fs/namei.c. + */ +#define nfsd_check_parent(dir, dentry) \ + ((dir) == (dentry)->d_parent && !list_empty(&dentry->d_hash)) + /* * Lock a parent directory following the VFS locking protocol. */ int fh_lock_parent(struct svc_fh *parent_fh, struct dentry *dchild) { - int nfserr = 0; - fh_lock(parent_fh); /* * Make sure the parent->child relationship still holds, * and that the child is still hashed. */ - if (dchild->d_parent != parent_fh->fh_dentry) - goto out_not_parent; - if (list_empty(&dchild->d_hash)) - goto out_not_hashed; -out: - return nfserr; + if (nfsd_check_parent(parent_fh->fh_dentry, dchild)) + return 0; -out_not_parent: printk(KERN_WARNING - "fh_lock_parent: %s/%s parent changed\n", + "fh_lock_parent: %s/%s parent changed or child unhashed\n", dchild->d_parent->d_name.name, dchild->d_name.name); - goto out_unlock; -out_not_hashed: - printk(KERN_WARNING - "fh_lock_parent: %s/%s unhashed\n", - dchild->d_parent->d_name.name, dchild->d_name.name); -out_unlock: - nfserr = nfserr_noent; - fh_unlock(parent_fh); - goto out; -} -/* - * Deny access to certain file systems - */ -static inline int -fs_off_limits(struct super_block *sb) -{ - return !sb || sb->s_magic == NFS_SUPER_MAGIC - || sb->s_magic == PROC_SUPER_MAGIC; + fh_unlock(parent_fh); + return nfserr_noent; } -/* - * Check whether directory is a mount point, but it is all right if - * this is precisely the local mount point being exported. - */ -static inline int -nfsd_iscovered(struct dentry *dentry, struct svc_export *exp) -{ - return (dentry != dentry->d_covers && - dentry != exp->ex_dentry); -} /* * Look up one component of a pathname. * N.B. After this call _both_ fhp and resfh need an fh_put + * + * If the lookup would cross a mountpoint, and the mounted filesystem + * is exported to the client with NFSEXP_CROSSMNT, then the lookup is + * accepted as it stands and the mounted directory is + * returned. 
Otherwise the covered directory is returned. + * NOTE: this mountpoint crossing is not supported properly by all + * clients and is explicitly disallowed for NFSv3 + * NeilBrown <neilb@cse.unsw.edu.au> */ int nfsd_lookup(struct svc_rqst *rqstp, struct svc_fh *fhp, const char *name, @@ -166,33 +162,62 @@ if (err) goto out; #endif - err = nfserr_noent; - if (fs_off_limits(dparent->d_sb)) - goto out; err = nfserr_acces; - if (nfsd_iscovered(dparent, exp)) - goto out; /* Lookup the name, but don't follow links */ - dchild = lookup_dentry(name, dget(dparent), 0); - if (IS_ERR(dchild)) - goto out_nfserr; - /* - * check if we have crossed a mount point ... - */ - if (dchild->d_sb != dparent->d_sb) { - struct dentry *tdentry; - tdentry = dchild->d_covers; - if (tdentry == dchild) - goto out_dput; - dput(dchild); - dchild = dget(tdentry); - if (dchild->d_sb != dparent->d_sb) { -printk("nfsd_lookup: %s/%s crossed mount point!\n", dparent->d_name.name, dchild->d_name.name); - goto out_dput; + if (strcmp(name, "..")==0) { + /* checking mountpoint crossing is very different when stepping up */ + if (dparent == exp->ex_dentry) { + if (!EX_CROSSMNT(exp)) + dchild = dget(dparent); /* .. == . just like at / */ + else + { + struct svc_export *exp2 = NULL; + struct dentry *dp; + dchild = dparent->d_covers->d_parent; + for (dp=dchild; + exp2 == NULL && dp->d_covers->d_parent != dp; + dp=dp->d_covers->d_parent) + exp2 = exp_get(exp->ex_client, dp->d_inode->i_dev, dp->d_inode->i_ino); + if (exp2==NULL || dchild->d_sb != exp2->ex_dentry->d_sb) { + dchild = dget(dparent); + } else { + dget(dchild); + exp = exp2; + } + } + } else + dchild = dget(dparent->d_parent); + } else { + dchild = lookup_dentry(name, dget(dparent), 0); + if (IS_ERR(dchild)) + goto out_nfserr; + /* + * check if we have crossed a mount point ... 
+ */ + if (dchild->d_sb != dparent->d_sb) { + struct svc_export *exp2 = NULL; + exp2 = exp_get(rqstp->rq_client, + dchild->d_inode->i_dev, + dchild->d_inode->i_ino); + if (exp2 && EX_CROSSMNT(exp2)) + /* successfully crossed mount point */ + exp = exp2; + else if (dchild->d_covers->d_sb == dparent->d_sb) { + /* stay in the original filesystem */ + struct dentry *tdentry = dget(dchild->d_covers); + dput(dchild); + dchild = tdentry; + } else { + /* This cannot possibly happen */ + printk("nfsd_lookup: %s/%s impossible mount point!\n", dparent->d_name.name, dchild->d_name.name); + dput(dchild); + err = nfserr_acces; + goto out; + + } } } - /* * Note: we compose the file handle now, but as the * dentry may be negative, it may need to be updated. @@ -207,10 +232,6 @@ out_nfserr: err = nfserrno(-PTR_ERR(dchild)); goto out; -out_dput: - dput(dchild); - err = nfserr_acces; - goto out; } /* @@ -226,6 +247,8 @@ int ftype = 0; int imode; int err; + kernel_cap_t saved_cap = 0; + int size_change = 0; if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE)) accmode |= MAY_WRITE; @@ -234,39 +257,43 @@ /* Get inode */ err = fh_verify(rqstp, fhp, ftype, accmode); - if (err) + if (err || !iap->ia_valid) goto out; dentry = fhp->fh_dentry; inode = dentry->d_inode; err = inode_change_ok(inode, iap); - if (err) { - /* It is very tricky. When you are not the file owner, - but have the write permission, you should be allowed - to set atime and mtime to the current time on the - server. However, the NFS V2 protocol doesn't support - it. It has been fixed in V3. Here we do this: if the - current server time and atime/mtime are close enough, - we use the current server time. 
*/ -#define CURRENT_TIME_SET (ATTR_ATIME_SET | ATTR_MTIME_SET) - if (iap->ia_mtime == iap->ia_atime - && ((iap->ia_valid & (CURRENT_TIME_SET)) - == CURRENT_TIME_SET)) { - time_t now = CURRENT_TIME; - time_t delta = iap->ia_atime - now; - if (delta < 0) delta = -delta; - if (delta <= nfsd_time_diff_margin) { - iap->ia_valid &= ~CURRENT_TIME_SET; - goto current_time_ok; - } + /* could be a "touch" (utimes) request where the user is not the owner but does + * have write permission. In this case the user should be allowed to set + * both times to the current time. We could just assume any such SETATTR + * is intended to set the times to "now", but we do a couple of simple tests + * to increase our confidence. + */ +#define BOTH_TIME_SET (ATTR_ATIME_SET | ATTR_MTIME_SET) +#define MAX_TOUCH_TIME_ERROR (30*60) + if (err + && (iap->ia_valid & BOTH_TIME_SET) == BOTH_TIME_SET + && iap->ia_mtime == iap->ia_ctime + ) { + /* looks good. now just make sure time is in the right ballpark. + * solaris, at least, doesn't seem to care what the time request is + */ + time_t delta = iap->ia_atime - CURRENT_TIME; + if (delta<0) delta = -delta; + if (delta < MAX_TOUCH_TIME_ERROR) { + /* turn off ATTR_[AM]TIME_SET but leave ATTR_[AM]TIME + * this will cause notify_change to setthese times to "now" + */ + iap->ia_valid &= ~BOTH_TIME_SET; + err = inode_change_ok(inode, iap); } - goto out_nfserr; } -current_time_ok: + if (err) + goto out_nfserr; - /* The size case is special... */ + /* The size case is special. It changes the file as well as the attributes. */ if (iap->ia_valid & ATTR_SIZE) { if (!S_ISREG(inode->i_mode)) printk("nfsd_setattr: size change??\n"); @@ -275,22 +302,12 @@ if (err) goto out; } - DQUOT_INIT(inode); err = get_write_access(inode); - if (err) { - DQUOT_DROP(inode); + if (err) goto out_nfserr; - } - /* N.B. Should we update the inode cache here? 
*/ - inode->i_size = iap->ia_size; - if (inode->i_op && inode->i_op->truncate) - inode->i_op->truncate(inode); - mark_inode_dirty(inode); - put_write_access(inode); - DQUOT_DROP(inode); - iap->ia_valid &= ~ATTR_SIZE; - iap->ia_valid |= ATTR_MTIME; - iap->ia_mtime = CURRENT_TIME; + size_change = 1; + + DQUOT_INIT(inode); } imode = inode->i_mode; @@ -312,24 +329,53 @@ } /* Change the attributes. */ - if (iap->ia_valid) { - kernel_cap_t saved_cap = 0; - iap->ia_valid |= ATTR_CTIME; - iap->ia_ctime = CURRENT_TIME; - if (current->fsuid != 0) { - saved_cap = current->cap_effective; - cap_clear(current->cap_effective); - } - err = notify_change(dentry, iap); - if (current->fsuid != 0) - current->cap_effective = saved_cap; - if (err) - goto out_nfserr; - if (EX_ISSYNC(fhp->fh_export)) - write_inode_now(inode); + + iap->ia_valid |= ATTR_CTIME; + if (current->fsuid != 0) { + saved_cap = current->cap_effective; + cap_clear(current->cap_effective); + } +#ifdef CONFIG_QUOTA + /* DQUOT_TRANSFER needs both ia_uid and ia_gid defined */ + if (iap->ia_valid & (ATTR_UID|ATTR_GID)) { + if (! (iap->ia_valid & ATTR_UID)) + iap->ia_uid = inode->i_uid; + if (! (iap->ia_valid & ATTR_GID)) + iap->ia_gid = inode->i_gid; + iap->ia_valid |= ATTR_UID|ATTR_GID; } +#endif /* CONFIG_QUOTA */ + + fh_lock(fhp); +#ifdef CONFIG_QUOTA + if (iap->ia_valid & (ATTR_UID|ATTR_GID)) + err = DQUOT_TRANSFER(dentry, iap); + else +#endif + err = notify_change(dentry, iap); + + if (size_change) { + if (!err) { + vmtruncate(inode,iap->ia_size); + if (inode->i_op && inode->i_op->truncate) + inode->i_op->truncate(inode); + } + fh_unlock(fhp); + put_write_access(inode); + } else + fh_unlock(fhp); + + if (current->fsuid != 0) + current->cap_effective = saved_cap; + if (err) + goto out_nfserr; + if (EX_ISSYNC(fhp->fh_export)) + write_inode_now(inode); err = 0; + + /* Don't unlock inode; the nfssvc_release functions are supposed + * to do this. 
*/ out: return err; @@ -338,20 +384,107 @@ goto out; } +#ifdef CONFIG_NFSD_V3 +/* + * Check server access rights to a file system object + */ +struct accessmap { + u32 access; + int how; +}; +static struct accessmap nfs3_regaccess[] = { + { NFS3_ACCESS_READ, MAY_READ }, + { NFS3_ACCESS_EXECUTE, MAY_EXEC }, + { NFS3_ACCESS_MODIFY, MAY_WRITE|MAY_TRUNC }, + { NFS3_ACCESS_EXTEND, MAY_WRITE }, + + { 0, 0 } +}; + +static struct accessmap nfs3_diraccess[] = { + { NFS3_ACCESS_READ, MAY_READ }, + { NFS3_ACCESS_LOOKUP, MAY_EXEC }, + { NFS3_ACCESS_MODIFY, MAY_EXEC|MAY_WRITE|MAY_TRUNC }, + { NFS3_ACCESS_EXTEND, MAY_EXEC|MAY_WRITE }, + { NFS3_ACCESS_DELETE, MAY_REMOVE }, + + { 0, 0 } +}; + +static struct accessmap nfs3_anyaccess[] = { + /* XXX: should we try to cover read/write here for clients that + * rely on us to do their access checking for special files? */ + + { 0, 0 } +}; + +int +nfsd_access(struct svc_rqst *rqstp, struct svc_fh *fhp, u32 *access) +{ + struct accessmap *map; + struct svc_export *export; + struct dentry *dentry; + u32 query, result = 0; + int error; + + error = fh_verify(rqstp, fhp, 0, MAY_NOP); + if (error) + goto out; + + export = fhp->fh_export; + dentry = fhp->fh_dentry; + + if (S_ISREG(dentry->d_inode->i_mode)) { + map = nfs3_regaccess; + } else if (S_ISDIR(dentry->d_inode->i_mode)) { + map = nfs3_diraccess; + } else { + map = nfs3_anyaccess; + } + + query = *access; + while (map->access) { + if (map->access & query) { + error = nfsd_permission(export, dentry, (map->how | NO_OWNER_OVERRIDE)); + if (error == 0) + result |= map->access; + else if ((error == nfserr_perm) || (error == nfserr_acces)) { + /* + * This access type is denyed; but the + * access query itself succeeds. + */ + error = 0; + } else { + /* + * Some fatal error. Fail the query. + */ + goto out; + } + } + map++; + } + *access = result; + +out: + return error; +} +#endif + + + /* * Open an existing file or directory. - * The wflag argument indicates write access. 
+ * The access argument indicates the type of open (read/write/lock) * N.B. After this call fhp needs an fh_put */ int nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, int type, - int wflag, struct file *filp) + int access, struct file *filp) { struct dentry *dentry; struct inode *inode; - int access, err; + int err; - access = wflag? MAY_WRITE : MAY_READ; err = fh_verify(rqstp, fhp, type, access); if (err) goto out; @@ -368,24 +501,27 @@ if (!inode->i_op || !inode->i_op->default_file_ops) goto out; - if (wflag && (err = get_write_access(inode)) != 0) + if ((access & MAY_WRITE) && (err = get_write_access(inode)) != 0) goto out_nfserr; memset(filp, 0, sizeof(*filp)); filp->f_op = inode->i_op->default_file_ops; filp->f_count = 1; - filp->f_flags = wflag? O_WRONLY : O_RDONLY; - filp->f_mode = wflag? FMODE_WRITE : FMODE_READ; filp->f_dentry = dentry; - - if (wflag) + if (access & MAY_WRITE) { + filp->f_flags = O_WRONLY; + filp->f_mode = FMODE_WRITE; DQUOT_INIT(inode); + } else { + filp->f_flags = O_RDONLY; + filp->f_mode = FMODE_READ; + } err = 0; if (filp->f_op && filp->f_op->open) { err = filp->f_op->open(inode, filp); if (err) { - if (wflag) + if (access & MAY_WRITE) put_write_access(inode); /* I nearly added put_filp() call here, but this filp @@ -419,17 +555,33 @@ filp->f_op->release(inode, filp); if (filp->f_mode & FMODE_WRITE) { put_write_access(inode); - DQUOT_DROP(inode); } } /* * Sync a file + * As this calls fsync (not fdatasync) there is no need for a write_inode + * after it. 
*/ void -nfsd_sync(struct inode *inode, struct file *filp) +nfsd_sync(struct file *filp) { + dprintk("nfsd: sync file %s\n", filp->f_dentry->d_name.name); + down(&filp->f_dentry->d_inode->i_sem); filp->f_op->fsync(filp, filp->f_dentry); + up(&filp->f_dentry->d_inode->i_sem); +} + +void +nfsd_sync_dir(struct dentry *dp) +{ + struct inode *inode = dp->d_inode; + int (*fsync) (struct file *, struct dentry *); + + if (inode->i_op->default_file_ops + && (fsync = inode->i_op->default_file_ops->fsync)) { + fsync(NULL, dp); + } } /* @@ -478,7 +630,7 @@ int err; struct file file; - err = nfsd_open(rqstp, fhp, S_IFREG, OPEN_READ, &file); + err = nfsd_open(rqstp, fhp, S_IFREG, MAY_READ, &file); if (err) goto out; err = nfserr_perm; @@ -543,11 +695,11 @@ uid_t saved_euid; #endif - if (!cnt) - goto out; - err = nfsd_open(rqstp, fhp, S_IFREG, OPEN_WRITE, &file); + err = nfsd_open(rqstp, fhp, S_IFREG, MAY_WRITE, &file); if (err) goto out; + if (!cnt) + goto out_close; err = nfserr_perm; if (!file.f_op->write) goto out_close; @@ -560,11 +712,21 @@ * Request sync writes if * - the sync export option has been set, or * - the client requested O_SYNC behavior (NFSv3 feature). + * - The file system doesn't support fsync(). * When gathered writes have been configured for this volume, * flushing the data to disk is handled separately below. 
*/ +#ifdef CONFIG_NFSD_V3 + if (rqstp->rq_vers == 2) + stable = EX_ISSYNC(exp); + else if (file.f_op->fsync == 0) + stable = 1; + if (stable && !EX_WGATHER(exp)) + file.f_flags |= O_SYNC; +#else if ((stable || (stable = EX_ISSYNC(exp))) && !EX_WGATHER(exp)) file.f_flags |= O_SYNC; +#endif /* CONFIG_NFSD_V3 */ fh_lock(fhp); /* lock inode */ file.f_pos = offset; /* set write offset */ @@ -618,26 +780,25 @@ */ if (EX_WGATHER(exp) && (inode->i_writecount > 1 || (last_ino == inode->i_ino && last_dev == inode->i_dev))) { -#if 0 - interruptible_sleep_on_timeout(&inode->i_wait, 10 * HZ / 1000); -#else dprintk("nfsd: write defer %d\n", current->pid); + current->state = TASK_UNINTERRUPTIBLE; schedule_timeout((HZ+99)/100); + current->state = TASK_RUNNING; dprintk("nfsd: write resume %d\n", current->pid); -#endif } if (inode->i_state & I_DIRTY) { dprintk("nfsd: write sync %d\n", current->pid); - nfsd_sync(inode, &file); - write_inode_now(inode); + nfsd_sync(&file); } +#if 0 wake_up(&inode->i_wait); +#endif last_ino = inode->i_ino; last_dev = inode->i_dev; } - dprintk("nfsd: write complete\n"); + dprintk("nfsd: write complete err=%d\n", err); if (err >= 0) err = 0; else @@ -648,6 +809,38 @@ return err; } + +#ifdef CONFIG_NFSD_V3 +/* + * Commit all pending writes to stable storage. + * Strictly speaking, we could sync just the indicated file region here, + * but there's currently no way we can ask the VFS to do so. + * + * We lock the file to make sure we return full WCC data to the client.
+ */ +int +nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, + off_t offset, unsigned long count) +{ + struct file file; + int err; + + if ((err = nfsd_open(rqstp, fhp, S_IFREG, MAY_WRITE, &file)) != 0) + return err; + + fh_lock(fhp); + if (file.f_op && file.f_op->fsync) { + file.f_op->fsync(&file, file.f_dentry); + } else { + err = nfserr_notsupp; + } + fh_unlock(fhp); + + nfsd_close(&file); + return err; +} +#endif /* CONFIG_NFSD_V3 */ + /* * Create a file (regular, directory, device, fifo); UNIX sockets * not yet implemented. @@ -669,6 +862,7 @@ err = nfserr_perm; if (!flen) goto out; + err = fh_verify(rqstp, fhp, S_IFDIR, MAY_CREATE); if (err) goto out; @@ -679,40 +873,42 @@ err = nfserr_notdir; if(!dirp->i_op || !dirp->i_op->lookup) goto out; + + err = nfserr_exist; + if (isdotent(fname, flen)) + goto out; /* * Check whether the response file handle has been verified yet. * If it has, the parent directory should already be locked. */ - if (!resfhp->fh_dverified) { + if (!resfhp->fh_dentry) { dchild = lookup_dentry(fname, dget(dentry), 0); err = PTR_ERR(dchild); if (IS_ERR(dchild)) goto out_nfserr; fh_compose(resfhp, fhp->fh_export, dchild); + /* Lock the parent and check for errors ... */ + err = fh_lock_parent(fhp, dchild); + if (err) + goto out; } else { dchild = resfhp->fh_dentry; - if (!fhp->fh_locked) + if (!fhp->fh_locked) { + /* not actually possible */ printk(KERN_ERR "nfsd_create: parent %s/%s not locked!\n", dentry->d_parent->d_name.name, dentry->d_name.name); - } - err = nfserr_exist; - if (dchild->d_inode) - goto out; - if (!fhp->fh_locked) { - /* Lock the parent and check for errors ... */ - err = fh_lock_parent(fhp, dchild); - if (err) + err = -EIO; goto out; } + } /* * Make sure the child dentry is still negative ... 
*/ err = nfserr_exist; if (dchild->d_inode) { - printk(KERN_WARNING - "nfsd_create: dentry %s/%s not negative!\n", + dprintk("nfsd_create: dentry %s/%s not negative!\n", dentry->d_name.name, dchild->d_name.name); goto out; } @@ -739,24 +935,29 @@ case S_IFSOCK: opfunc = dirp->i_op->mknod; break; + default: + printk("nfsd: bad file type %o in nfsd_create\n", type); + err = nfserr_inval; } if (!opfunc) goto out; if (!(iap->ia_valid & ATTR_MODE)) iap->ia_mode = 0; + iap->ia_mode = (iap->ia_mode & S_IALLUGO) | type; /* * Call the dir op function to create the object. */ DQUOT_INIT(dirp); err = opfunc(dirp, dchild, iap->ia_mode, rdev); - DQUOT_DROP(dirp); if (err < 0) goto out_nfserr; - if (EX_ISSYNC(fhp->fh_export)) - write_inode_now(dirp); + if (EX_ISSYNC(fhp->fh_export)) { + nfsd_sync_dir(dentry); + write_inode_now(dchild->d_inode); + } /* * Update the file handle to get the new inode info. @@ -779,6 +980,129 @@ goto out; } +#ifdef CONFIG_NFSD_V3 +/* + * NFSv3 version of nfsd_create + */ +int +nfsd_create_v3(struct svc_rqst *rqstp, struct svc_fh *fhp, + char *fname, int flen, struct iattr *iap, + struct svc_fh *resfhp, int createmode, u32 *verifier) +{ + struct dentry *dentry, *dchild; + struct inode *dirp; + int err; + + err = nfserr_perm; + if (!flen) + goto out; + if (!(iap->ia_valid & ATTR_MODE)) + iap->ia_mode = 0; + err = fh_verify(rqstp, fhp, S_IFDIR, MAY_CREATE); + if (err) + goto out; + + dentry = fhp->fh_dentry; + dirp = dentry->d_inode; + + /* Get all the sanity checks out of the way before + * we lock the parent. */ + err = nfserr_notdir; + if(!dirp->i_op || !dirp->i_op->lookup) + goto out; + err = nfserr_perm; + if(!dirp->i_op->create) + goto out; + + err = nfserr_exist; + if (isdotent(fname, flen)) + goto out; + /* + * Compose the response file handle. 
+ */ + dchild = lookup_dentry(fname, dget(dentry), 0); + err = PTR_ERR(dchild); + if(IS_ERR(dchild)) + goto out_nfserr; + fh_compose(resfhp, fhp->fh_export, dchild); + + /* + * We must lock the directory before we check for the inode. + */ + err = fh_lock_parent(fhp, dchild); + if (err) + goto out; + + if (dchild->d_inode) { + err = 0; + + if (resfhp->fh_handle.fh_ino == 0) + /* inode might have been instantiated while we slept */ + fh_update(resfhp); + + switch (createmode) { + case NFS3_CREATE_UNCHECKED: + if (! S_ISREG(dchild->d_inode->i_mode)) + err = nfserr_exist; + else { + iap->ia_valid &= ATTR_SIZE; + goto set_attr; + } + break; + case NFS3_CREATE_EXCLUSIVE: + if ( dchild->d_inode->i_mtime == verifier[0] + && dchild->d_inode->i_atime == verifier[1] + && dchild->d_inode->i_mode == S_IFREG + && dchild->d_inode->i_size == 0 ) + break; + /* fallthru */ + case NFS3_CREATE_GUARDED: + err = nfserr_exist; + } + goto out; + } + + err = dirp->i_op->create(dirp, dchild, iap->ia_mode); + if (err < 0) + goto out_nfserr; + + if (EX_ISSYNC(fhp->fh_export)) { + nfsd_sync_dir(dentry); + /* setattr will sync the child (or not) */ + } + + /* + * Update the filehandle to get the new inode info. + */ + fh_update(resfhp); + err = 0; + + if (createmode == NFS3_CREATE_EXCLUSIVE) { + /* Cram the verifier into atime/mtime */ + iap->ia_valid = ATTR_MTIME|ATTR_ATIME|ATTR_MTIME_SET|ATTR_ATIME_SET; + iap->ia_mtime = verifier[0]; + iap->ia_atime = verifier[1]; + } + + /* Set file attributes. Mode has already been set and + * setting uid/gid works only for root. Irix appears to + * send along the gid when it tries to implement setgid + * directories via NFS. Clear out all that cruft. + */ + set_attr: + if ((iap->ia_valid &= ~(ATTR_UID|ATTR_GID|ATTR_MODE)) != 0) + err = nfsd_setattr(rqstp, resfhp, iap); + + out: + fh_unlock(fhp); + return err; + + out_nfserr: + err = nfserrno(-err); + goto out; +} +#endif /* CONFIG_NFSD_V3 */ + /* * Truncate a file. 
* The calling routines must make sure to update the ctime @@ -817,15 +1141,16 @@ cap_clear(current->cap_effective); } err = notify_change(dentry, &newattrs); - if (current->fsuid != 0) - current->cap_effective = saved_cap; if (!err) { vmtruncate(inode, size); if (inode->i_op && inode->i_op->truncate) inode->i_op->truncate(inode); } + if (current->fsuid != 0) + current->cap_effective = saved_cap; put_write_access(inode); - DQUOT_DROP(inode); + if (EX_ISSYNC(fhp->fh_export)) + nfsd_sync_dir(dentry); fh_unlock(fhp); out_nfserr: if (err) @@ -859,7 +1184,10 @@ goto out; UPDATE_ATIME(inode); - /* N.B. Why does this call need a get_fs()?? */ + /* N.B. Why does this call need a get_fs()?? + * Remove the set_fs and watch the fireworks:-) --okir + */ + oldfs = get_fs(); set_fs(KERNEL_DS); err = inode->i_op->readlink(dentry, buf, *lenp); set_fs(oldfs); @@ -884,7 +1212,8 @@ nfsd_symlink(struct svc_rqst *rqstp, struct svc_fh *fhp, char *fname, int flen, char *path, int plen, - struct svc_fh *resfhp) + struct svc_fh *resfhp, + struct iattr *iap) { struct dentry *dentry, *dnew; struct inode *dirp; @@ -899,9 +1228,11 @@ goto out; dentry = fhp->fh_dentry; - err = nfserr_perm; - if (nfsd_iscovered(dentry, fhp->fh_export)) + err = nfserr_exist; + if (isdotent(fname, flen)) goto out; + + err = nfserr_perm; dirp = dentry->d_inode; if (!dirp->i_op || !dirp->i_op->symlink) goto out; @@ -922,10 +1253,20 @@ if (!dnew->d_inode) { DQUOT_INIT(dirp); err = dirp->i_op->symlink(dirp, dnew, path); - DQUOT_DROP(dirp); if (!err) { if (EX_ISSYNC(fhp->fh_export)) - write_inode_now(dirp); + nfsd_sync_dir(dentry); + if (iap) { + iap->ia_valid &= ATTR_MODE /* ~(ATTR_MODE|ATTR_UID|ATTR_GID)*/; + if (iap->ia_valid) { + iap->ia_valid |= ATTR_CTIME; + iap->ia_mode = (iap->ia_mode&S_IALLUGO) + | S_IFLNK; + err = notify_change(dnew, iap); + if (!err && EX_ISSYNC(fhp->fh_export)) + write_inode_now(dentry->d_inode); + } + } } else err = nfserrno(-err); } @@ -949,7 +1290,7 @@ */ int nfsd_link(struct svc_rqst 
*rqstp, struct svc_fh *ffhp, - char *fname, int len, struct svc_fh *tfhp) + char *fname, int flen, struct svc_fh *tfhp) { struct dentry *ddir, *dnew, *dold; struct inode *dirp, *dest; @@ -958,12 +1299,16 @@ err = fh_verify(rqstp, ffhp, S_IFDIR, MAY_CREATE); if (err) goto out; - err = fh_verify(rqstp, tfhp, S_IFREG, MAY_NOP); + err = fh_verify(rqstp, tfhp, -S_IFDIR, MAY_NOP); if (err) goto out; err = nfserr_perm; - if (!len) + if (!flen) + goto out; + + err = nfserr_exist; + if (isdotent(fname, flen)) goto out; ddir = ffhp->fh_dentry; @@ -987,10 +1332,7 @@ dold = tfhp->fh_dentry; dest = dold->d_inode; - err = nfserr_acces; - if (nfsd_iscovered(ddir, ffhp->fh_export)) - goto out_unlock; - /* FIXME: nxdev for NFSv3 */ + err = (rqstp->rq_vers == 2) ? nfserr_acces : nfserr_xdev; if (dirp->i_dev != dest->i_dev) goto out_unlock; @@ -1002,10 +1344,9 @@ DQUOT_INIT(dirp); err = dirp->i_op->link(dold, dirp, dnew); - DQUOT_DROP(dirp); if (!err) { if (EX_ISSYNC(ffhp->fh_export)) { - write_inode_now(dirp); + nfsd_sync_dir(ddir); write_inode_now(dest); } } else @@ -1024,26 +1365,12 @@ } /* - * We need to do a check-parent every time - * after we have locked the parent - to verify - * that the parent is still our parent and - * that we are still hashed onto it.. - * - * This is requied in case two processes race - * on removing (or moving) the same entry: the - * parent lock will serialize them, but the - * other process will be too late.. - */ -#define check_parent(dir, dentry) \ - ((dir) == (dentry)->d_parent->d_inode && !list_empty(&dentry->d_hash)) - -/* * This follows the model of double_lock() in the VFS. */ static inline void nfsd_double_down(struct semaphore *s1, struct semaphore *s2) { if (s1 != s2) { - if ((unsigned long) s1 > (unsigned long) s2) { + if ((unsigned long) s1 < (unsigned long) s2) { struct semaphore *tmp = s1; s1 = s2; s2 = tmp; @@ -1085,12 +1412,12 @@ tdentry = tfhp->fh_dentry; tdir = tdentry->d_inode; - /* N.B. We shouldn't need this ... 
dentry layer handles it */ + err = (rqstp->rq_vers == 2) ? nfserr_acces : nfserr_xdev; + if (fdir->i_dev != tdir->i_dev) + goto out; + err = nfserr_perm; - if (!flen || (fname[0] == '.' && - (flen == 1 || (flen == 2 && fname[1] == '.'))) || - !tlen || (tname[0] == '.' && - (tlen == 1 || (tlen == 2 && tname[1] == '.')))) + if (!flen || isdotent(fname, flen) || !tlen || isdotent(tname, tlen)) goto out; odentry = lookup_dentry(fname, dget(fdentry), 0); @@ -1111,31 +1438,36 @@ * Lock the parent directories. */ nfsd_double_down(&tdir->i_sem, &fdir->i_sem); + +#ifdef CONFIG_NFSD_V3 + /* Fill in the pre-op attr for the wcc data for both + * tdir and fdir + */ + fill_pre_wcc(ffhp); + fill_pre_wcc(tfhp); +#endif /* CONFIG_NFSD_V3 */ + err = -ENOENT; /* GAM3 check for parent changes after locking. */ - if (check_parent(fdir, odentry) && - check_parent(tdir, ndentry)) { + if (nfsd_check_parent(fdentry, odentry) && + nfsd_check_parent(tdentry, ndentry)) { err = vfs_rename(fdir, odentry, tdir, ndentry); if (!err && EX_ISSYNC(tfhp->fh_export)) { - write_inode_now(fdir); - write_inode_now(tdir); + nfsd_sync_dir(tdentry); + nfsd_sync_dir(fdentry); } } else dprintk("nfsd: Caught race in nfsd_rename"); - DQUOT_DROP(fdir); - DQUOT_DROP(tdir); +#ifdef CONFIG_NFSD_V3 + /* Fill in the post-op attr for the wcc data for both + * tdir and fdir + */ + fill_post_wcc(ffhp); + fill_post_wcc(tfhp); +#endif /* CONFIG_NFSD_V3 */ nfsd_double_up(&tdir->i_sem, &fdir->i_sem); - - if (!err && odentry->d_inode) { - add_to_rename_cache(tdir->i_ino, - odentry->d_inode->i_dev, - fdir->i_ino, - odentry->d_inode->i_ino); - } else { - printk(": no inode in rename or err: %d.\n", err); - } dput(ndentry); out_dput_old: @@ -1162,13 +1494,12 @@ struct inode *dirp; int err; - /* N.B. We shouldn't need this test ... 
handled by dentry layer */ - err = nfserr_acces; - if (!flen || isdotent(fname, flen)) - goto out; err = fh_verify(rqstp, fhp, S_IFDIR, MAY_REMOVE); if (err) goto out; + err = nfserr_acces; + if (!flen || isdotent(fname, flen)) + goto out; dentry = fhp->fh_dentry; dirp = dentry->d_inode; @@ -1177,13 +1508,13 @@ err = PTR_ERR(rdentry); if (IS_ERR(rdentry)) goto out_nfserr; + if (!rdentry->d_inode) { dput(rdentry); err = nfserr_noent; goto out; } - expire_by_dentry(rdentry); if (type != S_IFDIR) { /* It's UNLINK */ @@ -1194,30 +1525,33 @@ err = vfs_unlink(dirp, rdentry); - DQUOT_DROP(dirp); fh_unlock(fhp); dput(rdentry); - expire_by_dentry(rdentry); + } else { /* It's RMDIR */ /* See comments in fs/namei.c:do_rmdir */ rdentry->d_count++; nfsd_double_down(&dirp->i_sem, &rdentry->d_inode->i_sem); - if (!fhp->fh_pre_mtime) - fhp->fh_pre_mtime = dirp->i_mtime; + +#ifdef CONFIG_NFSD_V3 + fill_pre_wcc(fhp); +#else fhp->fh_locked = 1; +#endif /* CONFIG_NFSD_V3 */ err = -ENOENT; - if (check_parent(dirp, rdentry)) + if (nfsd_check_parent(dentry, rdentry)) err = vfs_rmdir(dirp, rdentry); rdentry->d_count--; - DQUOT_DROP(dirp); - if (!fhp->fh_post_version) - fhp->fh_post_version = dirp->i_version; +#ifdef CONFIG_NFSD_V3 + fill_post_wcc(fhp); +#else fhp->fh_locked = 0; +#endif /* CONFIG_NFSD_V3 */ nfsd_double_up(&dirp->i_sem, &rdentry->d_inode->i_sem); dput(rdentry); @@ -1225,9 +1559,11 @@ if (err) goto out_nfserr; - if (EX_ISSYNC(fhp->fh_export)) - write_inode_now(dirp); - + if (EX_ISSYNC(fhp->fh_export)) { + down(&dentry->d_inode->i_sem); + nfsd_sync_dir(dentry); + up(&dentry->d_inode->i_sem); + } out: return err; @@ -1238,10 +1574,11 @@ /* * Read entries from a directory. + * The verifier is an NFSv3 thing we ignore for now. 
*/ int nfsd_readdir(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset, - encode_dent_fn func, u32 *buffer, int *countp) + encode_dent_fn func, u32 *buffer, int *countp, u32 *verf) { struct inode *inode; u32 *p; @@ -1249,13 +1586,11 @@ struct file file; struct readdir_cd cd; - err = 0; - if (offset > ~(u32) 0) - goto out; - - err = nfsd_open(rqstp, fhp, S_IFDIR, OPEN_READ, &file); + err = nfsd_open(rqstp, fhp, S_IFDIR, MAY_READ, &file); if (err) goto out; + if (offset > ~(u32) 0) + goto out_close; err = nfserr_notdir; if (!file.f_op->readdir) @@ -1267,6 +1602,7 @@ cd.rqstp = rqstp; cd.buffer = buffer; cd.buflen = *countp; /* count of words */ + cd.dirfh = fhp; /* * Read the directory entries. This silly loop is necessary because @@ -1296,8 +1632,14 @@ /* If we didn't fill the buffer completely, we're at EOF */ eof = !cd.eob; - if (cd.offset) - *cd.offset = htonl(file.f_pos); + if (cd.offset) { +#ifdef CONFIG_NFSD_V3 + if (rqstp->rq_vers == 3) + (void)enc64(cd.offset, file.f_pos); + else +#endif /* CONFIG_NFSD_V3 */ + *cd.offset = htonl(file.f_pos); + } p = cd.buffer; *p++ = 0; /* no more entries */ @@ -1360,17 +1702,33 @@ struct inode *inode = dentry->d_inode; int err; kernel_cap_t saved_cap = 0; + int owneraccess; + + /* + * Check if we are to use "owner may always access" semantics, + * then clean out the flag bit which controls this. It might be + * clearer to reverse the logic of this flag, but I didn't + * want to change a lot of code in a stable kernel - dhiggen. + */ + + if (acc & NO_OWNER_OVERRIDE) { + owneraccess = 0; + acc &= ~NO_OWNER_OVERRIDE; + } else { + owneraccess = 1; + } if (acc == MAY_NOP) return 0; #if 0 - dprintk("nfsd: permission 0x%x%s%s%s%s%s mode 0%o%s%s%s\n", + dprintk("nfsd: permission 0x%x%s%s%s%s%s%s mode 0%o%s%s%s\n", acc, (acc & MAY_READ)? " read" : "", (acc & MAY_WRITE)? " write" : "", (acc & MAY_EXEC)? " exec" : "", (acc & MAY_SATTR)? " sattr" : "", (acc & MAY_TRUNC)? " trunc" : "", + (acc & MAY_LOCK)? 
" lock" : "", inode->i_mode, IS_IMMUTABLE(inode)? " immut" : "", IS_APPEND(inode)? " append" : "", @@ -1378,33 +1736,40 @@ dprintk(" owner %d/%d user %d/%d\n", inode->i_uid, inode->i_gid, current->fsuid, current->fsgid); #endif -#ifndef CONFIG_NFSD_SUN - if (dentry->d_mounts != dentry) { - return nfserr_perm; - } -#endif if (acc & (MAY_WRITE | MAY_SATTR | MAY_TRUNC)) { if (EX_RDONLY(exp) || IS_RDONLY(inode)) return nfserr_rofs; - if (S_ISDIR(inode->i_mode) && nfsd_iscovered(dentry, exp)) - return nfserr_perm; if (/* (acc & MAY_WRITE) && */ IS_IMMUTABLE(inode)) return nfserr_perm; } if ((acc & MAY_TRUNC) && IS_APPEND(inode)) return nfserr_perm; + if (acc & MAY_LOCK) { + /* If we cannot rely on authentication in NLM requests, + * just allow locks, others require read permission + */ + if (exp->ex_flags & NFSEXP_NOAUTHNLM) + return 0; + else + acc = MAY_READ; + } /* - * The file owner always gets access permission. This is to make - * file access work even when the client has done a fchmod(fd, 0). + * The file owner always gets access permission (except in the + * special case of a V3 ACCESS call, which is used for checking at + * open() time). This is to make file access work even when the + * client has done a fchmod(fd, 0). * * However, `cp foo bar' should fail nevertheless when bar is * readonly. A sensible way to do this might be to reject all * attempts to truncate a read-only file, because a creat() call * always implies file truncation. + * dhXXX we are not currently setting MAY_TRUNC from a SETATTR which + * changes the size, so this check is not enforced. It probably + * should be? 
*/ - if (inode->i_uid == current->fsuid /* && !(acc & MAY_TRUNC) */) + if (owneraccess && inode->i_uid == current->fsuid /* && !(acc & MAY_TRUNC) */) return 0; if (current->fsuid != 0) { diff -Naur pre9/linux/fs/super.c test/linux/fs/super.c --- pre9/linux/fs/super.c Mon Sep 18 13:58:34 2000 +++ test/linux/fs/super.c Mon Sep 18 14:35:50 2000 @@ -566,6 +566,7 @@ s->s_flags = flags; s->s_dirt = 0; sema_init(&s->s_vfs_rename_sem,1); + sema_init(&s->s_nfsd_free_path_sem,1); /* N.B. Should lock superblock now ... */ if (!type->read_super(s, data, silent)) goto out_fail; @@ -1154,6 +1155,7 @@ sb->s_dev = get_unnamed_dev(); sb->s_flags = root_mountflags; sema_init(&sb->s_vfs_rename_sem,1); + sema_init(&sb->s_nfsd_free_path_sem,1); vfsmnt = add_vfsmnt(sb, "/dev/root", "/"); if (vfsmnt) { if (nfs_root_mount(sb) >= 0) { diff -Naur pre9/linux/include/linux/dcache.h test/linux/include/linux/dcache.h --- pre9/linux/include/linux/dcache.h Mon Sep 18 13:58:34 2000 +++ test/linux/include/linux/dcache.h Mon Sep 18 14:35:50 2000 @@ -100,6 +100,12 @@ * renamed" and has to be * deleted on the last dput() */ +#define DCACHE_NFSD_DISCONNECTED 0x0004 /* This dentry is not currently connected to the + * dcache tree. Its parent will either be itself, + * or will have this flag as well. 
+ * If this dentry points to a directory, then + * s_nfsd_free_path semaphore will be down + */ /* * d_drop() unhashes the entry from the parent diff -Naur pre9/linux/include/linux/ext2_fs.h test/linux/include/linux/ext2_fs.h --- pre9/linux/include/linux/ext2_fs.h Mon Sep 18 13:58:34 2000 +++ test/linux/include/linux/ext2_fs.h Mon Sep 18 14:48:28 2000 @@ -238,7 +238,7 @@ } masix1; } osd1; /* OS dependent 1 */ __u32 i_block[EXT2_N_BLOCKS];/* Pointers to blocks */ - __u32 i_version; /* File version (for NFS) */ + __u32 i_generation; /* File version (for NFS) */ __u32 i_file_acl; /* File ACL */ __u32 i_dir_acl; /* Directory ACL */ __u32 i_faddr; /* Fragment address */ diff -Naur pre9/linux/include/linux/ext2_fs_i.h test/linux/include/linux/ext2_fs_i.h --- pre9/linux/include/linux/ext2_fs_i.h Thu Apr 2 13:39:51 1998 +++ test/linux/include/linux/ext2_fs_i.h Mon Sep 18 14:35:50 2000 @@ -29,7 +29,7 @@ __u32 i_file_acl; __u32 i_dir_acl; __u32 i_dtime; - __u32 i_version; + __u32 not_used_1; /* FIX: not used/ 2.2 placeholder */ __u32 i_block_group; __u32 i_next_alloc_block; __u32 i_next_alloc_goal; diff -Naur pre9/linux/include/linux/fs.h test/linux/include/linux/fs.h --- pre9/linux/include/linux/fs.h Mon Sep 18 13:58:34 2000 +++ test/linux/include/linux/fs.h Mon Sep 18 14:46:59 2000 @@ -571,6 +571,15 @@ * even looking at it. You had been warned. */ struct semaphore s_vfs_rename_sem; /* Kludge */ + + /* The next field is used by knfsd when converting a (inode number based) + * file handle into a dentry. As it builds a path in the dcache tree from + * the bottom up, there may for a time be a subpath of dentrys which is not + * connected to the main tree. This semaphore ensures that there is only ever + * one such free path per filesystem. Note that unconnected files (or other + * non-directories) are allowed, but not unconnected directories.
+ */ + struct semaphore s_nfsd_free_path_sem; }; /* diff -Naur pre9/linux/include/linux/lockd/debug.h test/linux/include/linux/lockd/debug.h --- pre9/linux/include/linux/lockd/debug.h Tue May 11 10:36:15 1999 +++ test/linux/include/linux/lockd/debug.h Mon Sep 18 15:04:07 2000 @@ -45,6 +45,7 @@ #define NLMDBG_CLNTSUBS 0x0020 #define NLMDBG_SVCSUBS 0x0040 #define NLMDBG_HOSTCACHE 0x0080 +#define NLMDBG_XDR 0x0100 #define NLMDBG_ALL 0x7fff diff -Naur pre9/linux/include/linux/lockd/lockd.h test/linux/include/linux/lockd/lockd.h --- pre9/linux/include/linux/lockd/lockd.h Mon Sep 18 13:58:34 2000 +++ test/linux/include/linux/lockd/lockd.h Mon Sep 18 15:04:07 2000 @@ -17,6 +17,7 @@ #include <linux/nfsd/nfsfh.h> #include <linux/lockd/bind.h> #include <linux/lockd/xdr.h> +#include <linux/lockd/xdr4.h> #include <linux/lockd/debug.h> /* @@ -112,6 +113,7 @@ */ extern struct rpc_program nlm_program; extern struct svc_procedure nlmsvc_procedures[]; +extern struct svc_procedure nlmsvc_procedures4[]; extern unsigned long nlmsvc_grace_period; extern unsigned long nlmsvc_timeout; diff -Naur pre9/linux/include/linux/lockd/xdr4.h test/linux/include/linux/lockd/xdr4.h --- pre9/linux/include/linux/lockd/xdr4.h Wed Dec 31 16:00:00 1969 +++ test/linux/include/linux/lockd/xdr4.h Mon Sep 18 15:04:07 2000 @@ -0,0 +1,43 @@ +/* + * linux/include/linux/lockd/xdr4.h + * + * XDR types for the NLM protocol + * + * Copyright (C) 1996 Olaf Kirch <okir@monad.swb.de> + */ + +#ifndef LOCKD_XDR4_H +#define LOCKD_XDR4_H + +#include <linux/fs.h> +#include <linux/nfs.h> +#include <linux/sunrpc/xdr.h> +#include <linux/lockd/xdr.h> + +extern u32 nlm4_granted, nlm4_lck_denied, nlm4_lck_denied_nolocks, + nlm4_lck_blocked, nlm4_lck_denied_grace_period, nlm4_deadlock, + nlm4_rofs, nlm4_stale_fh, nlm4_fbig, nlm4_failed; + +#define NLMSVC_XDRSIZE sizeof(struct nlm_args) + +int nlm4svc_decode_testargs(struct svc_rqst *, u32 *, struct nlm_args *); +int nlm4svc_encode_testres(struct svc_rqst *, u32 *, struct nlm_res
*); +int nlm4svc_decode_lockargs(struct svc_rqst *, u32 *, struct nlm_args *); +int nlm4svc_decode_cancargs(struct svc_rqst *, u32 *, struct nlm_args *); +int nlm4svc_decode_unlockargs(struct svc_rqst *, u32 *, struct nlm_args *); +int nlm4svc_encode_res(struct svc_rqst *, u32 *, struct nlm_res *); +int nlm4svc_decode_res(struct svc_rqst *, u32 *, struct nlm_res *); +int nlm4svc_encode_void(struct svc_rqst *, u32 *, void *); +int nlm4svc_decode_void(struct svc_rqst *, u32 *, void *); +int nlm4svc_decode_shareargs(struct svc_rqst *, u32 *, struct nlm_args *); +int nlm4svc_encode_shareres(struct svc_rqst *, u32 *, struct nlm_res *); +int nlm4svc_decode_notify(struct svc_rqst *, u32 *, struct nlm_args *); +int nlm4svc_decode_reboot(struct svc_rqst *, u32 *, struct nlm_reboot *); +/* +int nlmclt_encode_testargs(struct rpc_rqst *, u32 *, struct nlm_args *); +int nlmclt_encode_lockargs(struct rpc_rqst *, u32 *, struct nlm_args *); +int nlmclt_encode_cancargs(struct rpc_rqst *, u32 *, struct nlm_args *); +int nlmclt_encode_unlockargs(struct rpc_rqst *, u32 *, struct nlm_args *); + */ + +#endif /* LOCKD_XDR4_H */ diff -Naur pre9/linux/include/linux/nfsd/cache.h test/linux/include/linux/nfsd/cache.h --- pre9/linux/include/linux/nfsd/cache.h Mon Dec 28 14:09:59 1998 +++ test/linux/include/linux/nfsd/cache.h Mon Sep 18 14:49:01 2000 @@ -25,9 +25,11 @@ unsigned char c_state, /* unused, inprog, done */ c_type, /* status, buffer */ c_secure : 1; /* req came from port < 1024 */ - struct in_addr c_client; + struct sockaddr_in c_addr; u32 c_xid; + u32 c_prot; u32 c_proc; + u32 c_vers; unsigned long c_timestamp; union { struct svc_buf u_buffer; diff -Naur pre9/linux/include/linux/nfsd/const.h test/linux/include/linux/nfsd/const.h --- pre9/linux/include/linux/nfsd/const.h Mon Sep 18 13:58:34 2000 +++ test/linux/include/linux/nfsd/const.h Mon Sep 18 14:35:50 2000 @@ -1,19 +1,14 @@ /* - * include/linux/nfsd/nfsconst.h + * include/linux/nfsd/const.h * * Various constants related to NFS. 
* - * Copyright (C) 1995 Olaf Kirch <okir@monad.swb.de> + * Copyright (C) 1995-1997 Olaf Kirch <okir@monad.swb.de> */ -#ifndef __NFSCONST_H__ -#define __NFSCONST_H__ +#ifndef _LINUX_NFSD_CONST_H +#define _LINUX_NFSD_CONST_H -#include <linux/limits.h> -#include <linux/types.h> -#include <linux/unistd.h> -#include <linux/dirent.h> -#include <linux/fs.h> #include <linux/nfs.h> #define NFS_FHSIZE 32 @@ -32,12 +27,10 @@ #define NFS2_MAXPATHLEN 1024 #define NFS2_MAXNAMLEN 255 -#define NFS2_FHSIZE NFS_FHSIZE #define NFS2_COOKIESIZE 4 #define NFS3_MAXPATHLEN PATH_MAX #define NFS3_MAXNAMLEN NAME_MAX -#define NFS3_FHSIZE NFS_FHSIZE #define NFS3_COOKIEVERFSIZE 8 #define NFS3_CREATEVERFSIZE 8 #define NFS3_WRITEVERFSIZE 8 @@ -48,43 +41,6 @@ # define NFS_SUPER_MAGIC 0x6969 #endif -/* - * NFS stats. The good thing with these values is that NFSv3 errors are - * a superset of NFSv2 errors (with the exception of NFSERR_WFLUSH which - * no-one uses anyway), so we can happily mix code as long as we make sure - * no NFSv3 errors are returned to NFSv2 clients. 
- */ -#define NFS_OK 0 /* v2 v3 */ -#define NFSERR_PERM 1 /* v2 v3 */ -#define NFSERR_NOENT 2 /* v2 v3 */ -#define NFSERR_IO 5 /* v2 v3 */ -#define NFSERR_NXIO 6 /* v2 v3 */ -#define NFSERR_ACCES 13 /* v2 v3 */ -#define NFSERR_EXIST 17 /* v2 v3 */ -#define NFSERR_XDEV 18 /* v3 */ -#define NFSERR_NODEV 19 /* v2 v3 */ -#define NFSERR_NOTDIR 20 /* v2 v3 */ -#define NFSERR_ISDIR 21 /* v2 v3 */ -#define NFSERR_INVAL 22 /* v3 */ -#define NFSERR_FBIG 27 /* v2 v3 */ -#define NFSERR_NOSPC 28 /* v2 v3 */ -#define NFSERR_ROFS 30 /* v2 v3 */ -#define NFSERR_MLINK 31 /* v3 */ -#define NFSERR_NAMETOOLONG 63 /* v2 v3 */ -#define NFSERR_NOTEMPTY 66 /* v2 v3 */ -#define NFSERR_DQUOT 69 /* v2 v3 */ -#define NFSERR_STALE 70 /* v2 v3 */ -#define NFSERR_REMOTE 71 /* v3 */ -#define NFSERR_WFLUSH 99 /* v2 */ -#define NFSERR_BADHANDLE 10001 /* v3 */ -#define NFSERR_NOT_SYNC 10002 /* v3 */ -#define NFSERR_BAD_COOKIE 10003 /* v3 */ -#define NFSERR_NOTSUPP 10004 /* v3 */ -#define NFSERR_TOOSMALL 10005 /* v3 */ -#define NFSERR_SERVERFAULT 10006 /* v3 */ -#define NFSERR_BADTYPE 10007 /* v3 */ -#define NFSERR_JUKEBOX 10008 /* v3 */ - #endif /* __KERNEL__ */ -#endif /* __NFSCONST_H__ */ +#endif /* _LINUX_NFSD_CONST_H */ diff -Naur pre9/linux/include/linux/nfsd/export.h test/linux/include/linux/nfsd/export.h --- pre9/linux/include/linux/nfsd/export.h Tue May 11 10:36:17 1999 +++ test/linux/include/linux/nfsd/export.h Mon Sep 18 14:49:01 2000 @@ -4,16 +4,17 @@ * Public declarations for NFS exports. The definitions for the * syscall interface are in nfsctl.h * - * Copyright (C) 1995, 1996 Olaf Kirch <okir@monad.swb.de> + * Copyright (C) 1995-1997 Olaf Kirch <okir@monad.swb.de> */ #ifndef NFSD_EXPORT_H #define NFSD_EXPORT_H -#include <linux/types.h> -#include <linux/socket.h> -#include <linux/in.h> -#include <linux/fs.h> +#include <asm/types.h> +#ifdef __KERNEL__ +# include <linux/types.h> +# include <linux/in.h> +#endif /* * Important limits for the exports stuff. 
@@ -34,8 +35,10 @@ #define NFSEXP_UIDMAP 0x0040 #define NFSEXP_KERBEROS 0x0080 /* not available */ #define NFSEXP_SUNSECURE 0x0100 -#define NFSEXP_CROSSMNT 0x0200 /* not available */ -#define NFSEXP_ALLFLAGS 0x03FF +#define NFSEXP_CROSSMNT 0x0200 +#define NFSEXP_NOSUBTREECHECK 0x0400 +#define NFSEXP_NOAUTHNLM 0x0800 /* Don't authenticate NLM requests - just trust */ +#define NFSEXP_ALLFLAGS 0x0FFF #ifdef __KERNEL__ diff -Naur pre9/linux/include/linux/nfsd/nfsd.h test/linux/include/linux/nfsd/nfsd.h --- pre9/linux/include/linux/nfsd/nfsd.h Tue May 11 10:36:17 1999 +++ test/linux/include/linux/nfsd/nfsd.h Mon Sep 18 14:49:01 2000 @@ -4,7 +4,7 @@ * Hodge-podge collection of knfsd-related stuff. * I will sort this out later. * - * Copyright (C) 1995 Olaf Kirch <okir@monad.swb.de> + * Copyright (C) 1995-1997 Olaf Kirch <okir@monad.swb.de> */ #ifndef LINUX_NFSD_NFSD_H @@ -24,18 +24,21 @@ /* * nfsd version */ -#define NFSD_VERSION "0.4" +#define NFSD_VERSION "0.5" #ifdef __KERNEL__ /* * Special flags for nfsd_permission. These must be different from MAY_READ, * MAY_WRITE, and MAY_EXEC. */ -#define MAY_NOP 0 -#define MAY_SATTR 8 -#define MAY_TRUNC 16 -#if (MAY_SATTR | MAY_TRUNC) & (MAY_READ | MAY_WRITE | MAY_EXEC) -# error "please use a different value for MAY_SATTR or MAY_TRUNC." +#define MAY_NOP 0x00000000 +#define MAY_SATTR 0x00000008 +#define MAY_TRUNC 0x00000010 +#define MAY_LOCK 0x00000020 +#define NO_OWNER_OVERRIDE 0x00000040 + +#if (MAY_SATTR | MAY_TRUNC | MAY_LOCK | NO_OWNER_OVERRIDE) & (MAY_READ | MAY_WRITE | MAY_EXEC) +# error "please use a different value for MAY_SATTR or MAY_TRUNC or MAY_LOCK or NO_OWNER_OVERRIDE." 
#endif #define MAY_CREATE (MAY_EXEC|MAY_WRITE) #define MAY_REMOVE (MAY_EXEC|MAY_WRITE|MAY_TRUNC) @@ -61,6 +64,9 @@ * Procedure table for NFSv2 */ extern struct svc_procedure nfsd_procedures2[]; +#ifdef CONFIG_NFSD_V3 +extern struct svc_procedure nfsd_procedures3[]; +#endif /* CONFIG_NFSD_V3 */ extern struct svc_program nfsd_program; /* @@ -74,11 +80,20 @@ void nfsd_racache_shutdown(void); int nfsd_lookup(struct svc_rqst *, struct svc_fh *, const char *, int, struct svc_fh *); +#ifdef CONFIG_NFSD_V3 +int nfsd_access(struct svc_rqst *, struct svc_fh *, u32 *); +#endif /* CONFIG_NFSD_V3 */ int nfsd_setattr(struct svc_rqst *, struct svc_fh *, struct iattr *); int nfsd_create(struct svc_rqst *, struct svc_fh *, char *name, int len, struct iattr *attrs, int type, dev_t rdev, struct svc_fh *res); +#ifdef CONFIG_NFSD_V3 +int nfsd_create_v3(struct svc_rqst *, struct svc_fh *, + char *name, int len, struct iattr *attrs, + struct svc_fh *res, int createmode, + u32 *verifier); +#endif /* CONFIG_NFSD_V3 */ int nfsd_open(struct svc_rqst *, struct svc_fh *, int, int, struct file *); void nfsd_close(struct file *); @@ -90,7 +105,7 @@ char *, int *); int nfsd_symlink(struct svc_rqst *, struct svc_fh *, char *name, int len, char *path, int plen, - struct svc_fh *res); + struct svc_fh *res, struct iattr *); int nfsd_link(struct svc_rqst *, struct svc_fh *, char *, int, struct svc_fh *); int nfsd_rename(struct svc_rqst *, @@ -104,9 +119,13 @@ unsigned long size); int nfsd_readdir(struct svc_rqst *, struct svc_fh *, loff_t, encode_dent_fn, - u32 *buffer, int *countp); + u32 *buffer, int *countp, u32 *verf); int nfsd_statfs(struct svc_rqst *, struct svc_fh *, struct statfs *); +#ifdef CONFIG_NFSD_V3 +int nfsd_commit(struct svc_rqst *, struct svc_fh *, + off_t, unsigned long); +#endif /* CONFIG_NFSD_V3 */ int nfsd_notify_change(struct inode *, struct iattr *); int nfsd_permission(struct svc_export *, struct dentry *, int); @@ -146,6 +165,7 @@ nfserr_rofs, nfserr_mlink, 
 		nfserr_nametoolong,
+		nfserr_notempty,
 		nfserr_dquot,
 		nfserr_stale,
 		nfserr_remote,
diff -Naur pre9/linux/include/linux/nfsd/nfsfh.h test/linux/include/linux/nfsd/nfsfh.h
--- pre9/linux/include/linux/nfsd/nfsfh.h Tue Jan 4 10:12:25 2000
+++ test/linux/include/linux/nfsd/nfsfh.h Mon Sep 18 14:49:01 2000
@@ -11,12 +11,15 @@
  * Copyright (C) 1995-1999 Olaf Kirch <okir@monad.swb.de>
  */
 
-#ifndef NFSD_FH_H
-#define NFSD_FH_H
+#ifndef _LINUX_NFSD_FH_H
+#define _LINUX_NFSD_FH_H
 
-#include <linux/types.h>
-#include <linux/string.h>
-#include <linux/fs.h>
+#include <asm/types.h>
+#ifdef __KERNEL__
+# include <linux/types.h>
+# include <linux/string.h>
+# include <linux/fs.h>
+#endif
 #include <linux/nfsd/const.h>
 #include <linux/nfsd/debug.h>
 
@@ -83,12 +86,32 @@
 	struct knfs_fh fh_handle; /* FH data */
 	struct dentry * fh_dentry; /* validated dentry */
 	struct svc_export * fh_export; /* export pointer */
-	size_t fh_pre_size; /* size before operation */
+#ifdef CONFIG_NFSD_V3
+	unsigned char fh_post_saved; /* post-op attrs saved */
+	unsigned char fh_pre_saved; /* pre-op attrs saved */
+#endif /* CONFIG_NFSD_V3 */
+	unsigned char fh_locked; /* inode locked by us */
+
+#ifdef CONFIG_NFSD_V3
+	/* Pre-op attributes saved during fh_lock */
+	__u64 fh_pre_size; /* size before operation */
 	time_t fh_pre_mtime; /* mtime before oper */
 	time_t fh_pre_ctime; /* ctime before oper */
-	unsigned long fh_post_version;/* inode version after oper */
-	unsigned char fh_locked; /* inode locked by us */
-	unsigned char fh_dverified; /* dentry has been checked */
+
+	/* Post-op attributes saved in fh_unlock */
+	umode_t fh_post_mode; /* i_mode */
+	nlink_t fh_post_nlink; /* i_nlink */
+	uid_t fh_post_uid; /* i_uid */
+	gid_t fh_post_gid; /* i_gid */
+	__u64 fh_post_size; /* i_size */
+	unsigned long fh_post_blocks; /* i_blocks */
+	unsigned long fh_post_blksize;/* i_blksize */
+	kdev_t fh_post_rdev; /* i_rdev */
+	time_t fh_post_atime; /* i_atime */
+	time_t fh_post_mtime; /* i_mtime */
+	time_t fh_post_ctime; /* i_ctime */
+#endif /* CONFIG_NFSD_V3 */
+
 } svc_fh;
 
 /*
@@ -105,18 +128,11 @@
 void fh_compose(struct svc_fh *, struct svc_export *, struct dentry *);
 void fh_update(struct svc_fh *);
 void fh_put(struct svc_fh *);
-void nfsd_fh_flush(kdev_t);
-void nfsd_fh_init(void);
-void nfsd_fh_shutdown(void);
-void nfsd_fh_free(void);
-
-void expire_all(void);
-void expire_by_dentry(struct dentry *);
 
 static __inline__ struct svc_fh *
 fh_copy(struct svc_fh *dst, struct svc_fh *src)
 {
-	if (src->fh_dverified || src->fh_locked) {
+	if (src->fh_dentry || src->fh_locked) {
 		struct dentry *dentry = src->fh_dentry;
 		printk(KERN_ERR "fh_copy: copying %s/%s, already verified!\n",
 			dentry->d_parent->d_name.name, dentry->d_name.name);
@@ -133,6 +149,53 @@
 	return fhp;
 }
 
+#ifdef CONFIG_NFSD_V3
+/*
+ * Fill in the pre_op attr for the wcc data
+ */
+static inline void
+fill_pre_wcc(struct svc_fh *fhp)
+{
+	struct inode *inode;
+
+	inode = fhp->fh_dentry->d_inode;
+	if (!fhp->fh_pre_saved) {
+		fhp->fh_pre_mtime = inode->i_mtime;
+		fhp->fh_pre_ctime = inode->i_ctime;
+		fhp->fh_pre_size = inode->i_size;
+		fhp->fh_pre_saved = 1;
+	}
+	fhp->fh_locked = 1;
+}
+
+/*
+ * Fill in the post_op attr for the wcc data
+ */
+static inline void
+fill_post_wcc(struct svc_fh *fhp)
+{
+	struct inode *inode = fhp->fh_dentry->d_inode;
+
+	if (fhp->fh_post_saved)
+		printk("nfsd: inode locked twice during operation.\n");
+
+	fhp->fh_post_mode = inode->i_mode;
+	fhp->fh_post_nlink = inode->i_nlink;
+	fhp->fh_post_uid = inode->i_uid;
+	fhp->fh_post_gid = inode->i_gid;
+	fhp->fh_post_size = inode->i_size;
+	fhp->fh_post_blksize = inode->i_blksize;
+	fhp->fh_post_blocks = inode->i_blocks;
+	fhp->fh_post_rdev = inode->i_rdev;
+	fhp->fh_post_atime = inode->i_atime;
+	fhp->fh_post_mtime = inode->i_mtime;
+	fhp->fh_post_ctime = inode->i_ctime;
+	fhp->fh_post_saved = 1;
+	fhp->fh_locked = 0;
+}
+#endif /* CONFIG_NFSD_V3 */
+
+
 /*
  * Lock a file handle/inode
  */
@@ -142,11 +205,10 @@
 	struct dentry *dentry = fhp->fh_dentry;
 	struct inode *inode;
 
-	/*
 	dfprintk(FILEOP, "nfsd: fh_lock(%x/%ld) locked = %d\n",
-			SVCFH_DEV(fhp), SVCFH_INO(fhp), fhp->fh_locked);
-	*/
-	if (!fhp->fh_dverified) {
+			SVCFH_DEV(fhp), (long)SVCFH_INO(fhp), fhp->fh_locked);
+
+	if (!fhp->fh_dentry) {
 		printk(KERN_ERR "fh_lock: fh not verified!\n");
 		return;
 	}
@@ -158,9 +220,11 @@
 
 	inode = dentry->d_inode;
 	down(&inode->i_sem);
-	if (!fhp->fh_pre_mtime)
-		fhp->fh_pre_mtime = inode->i_mtime;
+#ifdef CONFIG_NFSD_V3
+	fill_pre_wcc(fhp);
+#else
 	fhp->fh_locked = 1;
+#endif /* CONFIG_NFSD_V3 */
 }
 
 /*
@@ -169,25 +233,23 @@
 static inline void
 fh_unlock(struct svc_fh *fhp)
 {
-	if (!fhp->fh_dverified)
+	if (!fhp->fh_dentry)
 		printk(KERN_ERR "fh_unlock: fh not verified!\n");
 
 	if (fhp->fh_locked) {
+#ifdef CONFIG_NFSD_V3
+		fill_post_wcc(fhp);
+		up(&fhp->fh_dentry->d_inode->i_sem);
+#else
 		struct dentry *dentry = fhp->fh_dentry;
 		struct inode *inode = dentry->d_inode;
 
-		if (!fhp->fh_post_version)
-			fhp->fh_post_version = inode->i_version;
 		fhp->fh_locked = 0;
 		up(&inode->i_sem);
+#endif /* CONFIG_NFSD_V3 */
 	}
 }
-
-/*
- * This is a long term cache to help find renamed files.
- */
-void add_to_rename_cache(ino_t new_dirino, kdev_t dev, ino_t dirino, ino_t ino);
-
 #endif /* __KERNEL__ */
-#endif /* NFSD_FH_H */
+
+#endif /* _LINUX_NFSD_FH_H */
diff -Naur pre9/linux/include/linux/nfsd/stats.h test/linux/include/linux/nfsd/stats.h
--- pre9/linux/include/linux/nfsd/stats.h Tue Jan 4 10:12:25 2000
+++ test/linux/include/linux/nfsd/stats.h Mon Sep 18 14:35:50 2000
@@ -13,12 +13,11 @@
 	unsigned int rchits; /* repcache hits */
 	unsigned int rcmisses; /* repcache hits */
 	unsigned int rcnocache; /* uncached reqs */
-	unsigned int fh_cached; /* dentry cached */
-	unsigned int fh_valid; /* dentry validated */
-	unsigned int fh_fixup; /* dentry fixup validated */
 	unsigned int fh_lookup; /* new lookup required */
+	unsigned int fh_anon;
+	unsigned int fh_nocache_nondir;
+	unsigned int fh_nocache_dir;
 	unsigned int fh_stale; /* FH stale error */
-	unsigned int fh_concurrent; /* concurrent request */
 };
 
 #ifdef __KERNEL__
diff -Naur pre9/linux/include/linux/nfsd/syscall.h test/linux/include/linux/nfsd/syscall.h
--- pre9/linux/include/linux/nfsd/syscall.h Tue Jan 4 10:12:25 2000
+++ test/linux/include/linux/nfsd/syscall.h Mon Sep 18 14:49:02 2000
@@ -3,15 +3,18 @@
  *
  * This file holds all declarations for the knfsd syscall interface.
  *
- * Copyright (C) 1995 Olaf Kirch <okir@monad.swb.de>
+ * Copyright (C) 1995-1997 Olaf Kirch <okir@monad.swb.de>
  */
 
 #ifndef NFSD_SYSCALL_H
 #define NFSD_SYSCALL_H
 
-#include <linux/config.h>
-#include <linux/types.h>
-#include <linux/socket.h>
+#include <asm/types.h>
+#ifdef __KERNEL__
+# include <linux/config.h>
+# include <linux/types.h>
+# include <linux/in.h>
+#endif
 #include <linux/posix_types.h>
 #include <linux/nfsd/const.h>
 #include <linux/nfsd/export.h>
diff -Naur pre9/linux/include/linux/nfsd/xdr3.h test/linux/include/linux/nfsd/xdr3.h
--- pre9/linux/include/linux/nfsd/xdr3.h Mon Apr 7 11:35:32 1997
+++ test/linux/include/linux/nfsd/xdr3.h Mon Sep 18 14:49:04 2000
@@ -3,17 +3,18 @@
  *
  * XDR types for NFSv3 in nfsd.
  *
- * Copyright (C) 1996, Olaf Kirch <okir@monad.swb.de>
+ * Copyright (C) 1996-1998, Olaf Kirch <okir@monad.swb.de>
  */
 
-#ifndef LINUX_NFSD_XDR3_H
-#define LINUX_NFSD_XDR3_H
+#ifndef _LINUX_NFSD_XDR3_H
+#define _LINUX_NFSD_XDR3_H
 
 #include <linux/nfsd/xdr.h>
 
 struct nfsd3_sattrargs {
 	struct svc_fh fh;
 	struct iattr attrs;
+	int check_guard;
 	time_t guardtime;
 };
 
@@ -88,7 +89,7 @@
 
 struct nfsd3_readdirargs {
 	struct svc_fh fh;
-	__u32 cookie;
+	__u64 cookie;
 	__u32 dircount;
 	__u32 count;
 	__u32 * verf;
@@ -97,7 +98,7 @@
 struct nfsd3_commitargs {
 	struct svc_fh fh;
 	__u64 offset;
-	__u64 count;
+	__u32 count;
 };
 
 struct nfsd3_attrstat {
@@ -105,7 +106,8 @@
 	struct svc_fh fh;
 };
 
-struct nfsd3_lookupres {
+/* LOOKUP, CREATE, MKDIR, SYMLINK, MKNOD */
+struct nfsd3_diropres {
 	__u32 status;
 	struct svc_fh dirfh;
 	struct svc_fh fh;
@@ -137,12 +139,6 @@
 	int committed;
 };
 
-struct nfsd3_createres {
-	__u32 status;
-	struct svc_fh dirfh;
-	struct svc_fh fh;
-};
-
 struct nfsd3_renameres {
 	__u32 status;
 	struct svc_fh ffh;
@@ -158,10 +154,11 @@
 struct nfsd3_readdirres {
 	__u32 status;
 	struct svc_fh fh;
-	__u32 * list_end;
+	int count;
+	__u32 verf[2];
 };
 
-struct nfsd3_statfsres {
+struct nfsd3_fsstatres {
 	__u32 status;
 	struct statfs stats;
 	__u32 invarsec;
@@ -184,6 +181,8 @@
 	__u32 status;
 	__u32 p_link_max;
 	__u32 p_name_max;
+	__u32 p_no_trunc;
+	__u32 p_chown_restricted;
 	__u32 p_case_insensitive;
 	__u32 p_case_preserving;
 };
@@ -194,7 +193,7 @@
 };
 
 /* dummy type for release */
-struct nfsd3_fhandle2 {
+struct nfsd3_fhandle_pair {
 	__u32 dummy;
 	struct svc_fh fh1;
 	struct svc_fh fh2;
@@ -213,16 +212,15 @@
 	struct nfsd3_linkargs linkargs;
 	struct nfsd3_symlinkargs symlinkargs;
 	struct nfsd3_readdirargs readdirargs;
-	struct nfsd3_lookupres lookupres;
+	struct nfsd3_diropres diropres;
 	struct nfsd3_accessres accessres;
 	struct nfsd3_readlinkres readlinkres;
 	struct nfsd3_readres readres;
 	struct nfsd3_writeres writeres;
-	struct nfsd3_createres createres;
 	struct nfsd3_renameres renameres;
 	struct nfsd3_linkres linkres;
 	struct nfsd3_readdirres readdirres;
-	struct nfsd3_statfsres statfsres;
+	struct nfsd3_fsstatres fsstatres;
 	struct nfsd3_fsinfores fsinfores;
 	struct nfsd3_pathconfres pathconfres;
 	struct nfsd3_commitres commitres;
@@ -230,39 +228,87 @@
 
 #define NFS3_SVC_XDRSIZE sizeof(union nfsd3_xdrstore)
 
-void nfsxdr_init(void);
-
 int nfs3svc_decode_fhandle(struct svc_rqst *, u32 *, struct svc_fh *);
-int nfs3svc_decode_sattr3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_sattrargs(struct svc_rqst *, u32 *,
 				struct nfsd3_sattrargs *);
-int nfs3svc_decode_dirop3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_diropargs(struct svc_rqst *, u32 *,
 				struct nfsd3_diropargs *);
-int nfs3svc_decode_read3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_accessargs(struct svc_rqst *, u32 *,
+				struct nfsd3_accessargs *);
+int nfs3svc_decode_readargs(struct svc_rqst *, u32 *,
 				struct nfsd3_readargs *);
-int nfs3svc_decode_write3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_writeargs(struct svc_rqst *, u32 *,
 				struct nfsd3_writeargs *);
-int nfs3svc_decode_create3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_createargs(struct svc_rqst *, u32 *,
+				struct nfsd3_createargs *);
+int nfs3svc_decode_mkdirargs(struct svc_rqst *, u32 *,
 				struct nfsd3_createargs *);
-int nfs3svc_decode_rename3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_mknodargs(struct svc_rqst *, u32 *,
+				struct nfsd3_mknodargs *);
+int nfs3svc_decode_renameargs(struct svc_rqst *, u32 *,
 				struct nfsd3_renameargs *);
-int nfs3svc_decode_link3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_linkargs(struct svc_rqst *, u32 *,
 				struct nfsd3_linkargs *);
-int nfs3svc_decode_symlink3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_symlinkargs(struct svc_rqst *, u32 *,
 				struct nfsd3_symlinkargs *);
-int nfs3svc_decode_readdir3args(struct svc_rqst *, u32 *,
+int nfs3svc_decode_readdirargs(struct svc_rqst *, u32 *,
 				struct nfsd3_readdirargs *);
+int nfs3svc_decode_readdirplusargs(struct svc_rqst *, u32 *,
+				struct nfsd3_readdirargs *);
+int nfs3svc_decode_commitargs(struct svc_rqst *, u32 *,
+				struct nfsd3_commitargs *);
+int nfs3svc_encode_voidres(struct svc_rqst *, u32 *, void *);
+int nfs3svc_encode_attrstat(struct svc_rqst *, u32 *,
+				struct nfsd3_attrstat *);
+int nfs3svc_encode_wccstat(struct svc_rqst *, u32 *,
+				struct nfsd3_attrstat *);
+int nfs3svc_encode_diropres(struct svc_rqst *, u32 *,
+				struct nfsd3_diropres *);
+int nfs3svc_encode_accessres(struct svc_rqst *, u32 *,
+				struct nfsd3_accessres *);
 int nfs3svc_encode_readlinkres(struct svc_rqst *, u32 *,
 				struct nfsd3_readlinkres *);
 int nfs3svc_encode_readres(struct svc_rqst *, u32 *, struct nfsd3_readres *);
-int nfs3svc_encode_statfsres(struct svc_rqst *, u32 *,
-				struct nfsd3_statfsres *);
+int nfs3svc_encode_writeres(struct svc_rqst *, u32 *, struct nfsd3_writeres *);
+int nfs3svc_encode_createres(struct svc_rqst *, u32 *,
+				struct nfsd3_diropres *);
+int nfs3svc_encode_renameres(struct svc_rqst *, u32 *,
+				struct nfsd3_renameres *);
+int nfs3svc_encode_linkres(struct svc_rqst *, u32 *,
+				struct nfsd3_linkres *);
 int nfs3svc_encode_readdirres(struct svc_rqst *, u32 *,
 				struct nfsd3_readdirres *);
+int nfs3svc_encode_fsstatres(struct svc_rqst *, u32 *,
+				struct nfsd3_fsstatres *);
+int nfs3svc_encode_fsinfores(struct svc_rqst *, u32 *,
+				struct nfsd3_fsinfores *);
+int nfs3svc_encode_pathconfres(struct svc_rqst *, u32 *,
+				struct nfsd3_pathconfres *);
+int nfs3svc_encode_commitres(struct svc_rqst *, u32 *,
+				struct nfsd3_commitres *);
+
 int nfs3svc_release_fhandle(struct svc_rqst *, u32 *,
-				struct nfsd_fhandle *);
+				struct nfsd3_attrstat *);
 int nfs3svc_release_fhandle2(struct svc_rqst *, u32 *,
-				struct nfsd3_fhandle2 *);
+				struct nfsd3_fhandle_pair *);
 int nfs3svc_encode_entry(struct readdir_cd *, const char *name,
-				int namlen, unsigned long offset, ino_t ino);
+				int namlen, off_t offset, ino_t ino);
+int nfs3svc_encode_entry_plus(struct readdir_cd *, const char *name,
+				int namlen, off_t offset, ino_t ino);
+
+#ifdef __KERNEL__
+
+/*
+ * This is needed in nfs_readdir for encoding NFS3 directory cookies.
+ */
+static inline u32 *
+enc64(u32 *p, u64 val)
+{
+	*p++ = htonl(val >> 32);
+	*p++ = htonl(val & 0xffffffff);
+	return p;
+}
+
+#endif /* __KERNEL__ */
 
-#endif /* LINUX_NFSD_XDR3_H */
+#endif /* _LINUX_NFSD_XDR3_H */
diff -Naur pre9/linux/net/sunrpc/svc.c test/linux/net/sunrpc/svc.c
--- pre9/linux/net/sunrpc/svc.c Tue Jan 4 10:12:27 2000
+++ test/linux/net/sunrpc/svc.c Mon Sep 18 14:35:50 2000
@@ -32,6 +32,9 @@
 	struct svc_serv *serv;
 
 	xdr_init();
+#ifdef RPC_DEBUG
+	rpc_register_sysctl();
+#endif
 
 	if (!(serv = (struct svc_serv *) kmalloc(sizeof(*serv), GFP_KERNEL)))
 		return NULL;
@@ -267,8 +270,8 @@
 	if (prog != progp->pg_prog)
 		goto err_bad_prog;
 
-	versp = progp->pg_vers[vers];
-	if (!versp || vers >= progp->pg_nvers)
+	if (vers >= progp->pg_nvers ||
+	    !(versp = progp->pg_vers[vers]))
 		goto err_bad_vers;
 
 	procp = versp->vs_proc + proc;

--------------500E2001E72FE3DE433B11A0--
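
A note on enc64() in the xdr3.h hunk: it writes a 64-bit value (here an NFS3 directory cookie) as two 32-bit words in network byte order, most significant word first, and returns the next free slot in the buffer. Below is a minimal userspace sketch of the same idea, assuming a POSIX system with <arpa/inet.h>. The names enc64_sketch and dec64_sketch are hypothetical; the decoder in particular is not part of the patch and is only there to show the round trip.

```c
#include <stdint.h>
#include <arpa/inet.h>	/* htonl, ntohl */

/*
 * Userspace analogue of the patch's enc64(): store val as two
 * big-endian 32-bit words, high word first, and return a pointer
 * to the slot following the encoded value.
 */
static inline uint32_t *
enc64_sketch(uint32_t *p, uint64_t val)
{
	*p++ = htonl((uint32_t)(val >> 32));
	*p++ = htonl((uint32_t)(val & 0xffffffff));
	return p;
}

/*
 * Hypothetical inverse (an assumption for illustration, not in the
 * patch): read two big-endian words back into a 64-bit value.
 */
static inline uint32_t *
dec64_sketch(uint32_t *p, uint64_t *valp)
{
	uint64_t hi = ntohl(*p++);
	uint64_t lo = ntohl(*p++);

	*valp = (hi << 32) | lo;
	return p;
}
```

Returning the advanced pointer lets a caller chain several encode steps over one reply buffer without tracking an offset separately.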