From 367a5eb4e4217249aef19ab8f21502f515c937b1 Mon Sep 17 00:00:00 2001
From: Avi Kivity <avi@cloudius-systems.com>
Date: Mon, 19 Aug 2013 14:58:16 +0300
Subject: [PATCH] vfs: improve response time of dup3() closing the original
 file

If dup3() is called with newfd pointing to an already-open file, it closes
that file for us.  37988879086 converted fd operations to RCU, which caused
this close to be deferred until after the RCU grace period (43df74e7062 fixed
this, but only for close(), not for dup3()).

Closing the file asynchronously in dup3() should be fine, except that it
triggers a bug in sys_rename(): if the reference count of the vnode for either
the source or the destination is elevated, rename fails with EBUSY.  This is
due to the coupling between vnodes and pathnames, and can be fixed by the move
to separate dentries.

The whole sequence looks like

0xffffc000de90c010  3   1376912988.418660 vfs_lseek            95 0x0 0
0xffffc000de90c010  3   1376912988.418661 vfs_lseek_ret        0x0
0xffffc000de90c010  3   1376912988.418689 vfs_dup3             93 95 0x0
0xffffc000de90c010  3   1376912988.418696 vfs_dup3_ret         95
0xffffc000de90c010  3   1376912988.418711 vfs_close            95
0xffffc000de90c010  3   1376912988.418711 vfs_close_ret
...
0xffffc000de90c010  3   1376912988.420573 vfs_close            95
0xffffc000de90c010  3   1376912988.420580 vfs_close_ret
0xffffc000de90c010  3   1376912988.420738 vfs_rename           "/usr/var/lib/cassandra/data/system/local/system-local-tmp-ic-1-Index.db" "/usr/var/lib/cassandra/data/system/local/system-local-ic-1-Index.db"
0xffffc000de90c010  3   1376912988.422302 vfs_pwritev_ret      0x56
0xffffc000de90c010  3   1376912988.422302 vfs_rename_err       16

fd 95 (as it was before dup3) is still open at the time of the rename.
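
The sequence above can be approximated from user space with the sketch
below.  It is only an illustration, not the Cassandra code path: the helper
name dup3_then_rename() and the file names are hypothetical, and only
standard POSIX/Linux calls are used.  It opens a temporary file, replaces its
descriptor with dup3(), closes everything, and then renames the temporary
file.  With the deferred fdrop() the rename would be expected to return EBUSY
(16); with an immediate fdrop() it should return 0.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          // dup3() is a GNU/Linux extension
#endif
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: returns 0 on success, or the errno of the first
// failing call.  'dir' must be a writable directory.
int dup3_then_rename(const std::string& dir)
{
    const std::string tmp   = dir + "/dup3-demo-tmp";    // file to be renamed
    const std::string dst   = dir + "/dup3-demo-final";  // rename destination
    const std::string other = dir + "/dup3-demo-other";  // file dup'ed over it

    // 'target' plays the role of fd 95 in the trace, 'source' that of fd 93.
    int target = open(tmp.c_str(), O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (target < 0)
        return errno;
    int source = open(other.c_str(), O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (source < 0) {
        int e = errno;
        close(target);
        return e;
    }
    assert(source != target);  // dup3() rejects identical descriptors

    // dup3() implicitly closes the file previously installed at 'target'.
    if (dup3(source, target, 0) < 0) {
        int e = errno;
        close(source);
        close(target);
        return e;
    }
    close(target);
    close(source);

    // No descriptor refers to 'tmp' any more, so the rename should succeed.
    int rc = 0;
    if (std::rename(tmp.c_str(), dst.c_str()) != 0)
        rc = errno;

    // Clean up whatever is left behind.
    unlink(tmp.c_str());
    unlink(dst.c_str());
    unlink(other.c_str());
    return rc;
}
```

Calling dup3_then_rename("/tmp") and checking for a zero return is enough to
tell the two behaviours apart.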

Fix by not deferring the fdrop() in fdset(); 43df74e70626 already made fdrop()
safe to use directly.
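
The difference between the two variants can be shown with a toy model.  This
is a sketch under stated assumptions, not OSv code: toy_file, toy_rcu,
toy_fdrop() and toy_rename_allowed() are hypothetical stand-ins for struct
file, the RCU deferral queue, fdrop() and the reference count check in
rename.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file with a single reference held by the fd table.
struct toy_file {
    int refs = 1;
};

// Hypothetical stand-in for RCU deferral: callbacks run at the grace period.
struct toy_rcu {
    std::vector<std::function<void()>> pending;

    void defer(std::function<void()> fn) { pending.push_back(std::move(fn)); }

    void grace_period()
    {
        for (auto& fn : pending)
            fn();
        pending.clear();
    }
};

// Drops one reference, like fdrop() in spirit only.
inline void toy_fdrop(toy_file* fp) { --fp->refs; }

// Models the check that makes rename fail with EBUSY while references remain.
inline bool toy_rename_allowed(const toy_file& f) { return f.refs == 0; }

// Replaces a descriptor, then attempts a rename before the grace period ends.
// Returns whether that rename would have been allowed.
bool rename_ok_after_replace(bool defer_drop)
{
    toy_file orig;
    toy_rcu rcu;

    if (defer_drop)
        rcu.defer([&orig] { toy_fdrop(&orig); });  // behaviour before this patch
    else
        toy_fdrop(&orig);                          // behaviour after this patch

    bool ok = toy_rename_allowed(orig);  // rename races with the grace period

    rcu.grace_period();
    assert(orig.refs == 0);  // either way the reference is eventually dropped
    return ok;
}
```

In this model rename_ok_after_replace(true) reports a refused rename, while
rename_ok_after_replace(false) reports an allowed one.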

Fixes failures with Cassandra.
---
 fs/vfs/kern_descrip.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/vfs/kern_descrip.cc b/fs/vfs/kern_descrip.cc
index 4476813db..a3dda9d82 100644
--- a/fs/vfs/kern_descrip.cc
+++ b/fs/vfs/kern_descrip.cc
@@ -102,7 +102,7 @@ int fdset(int fd, struct file *fp)
     }
 
     if (orig)
-        rcu_defer(fdrop, orig);
+        fdrop(orig);
 
     return 0;
 }
-- 
GitLab