- Jan 27, 2014
-
-
Nadav Har'El authored
Replace the old function condvar::wait(mutex*, uint64_t) with one taking a time point. The time point can use any clock the timer supports, namely osv::clock::uptime or osv::clock::wall (as usual, wall-clock timers are not recommended, and are converted to an uptime timer at the point of instantiation). Leave a C-only function condvar_wait(condvar*, mutex*, s64), but add a comment on what it takes. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
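As a rough illustration of the interface change described in this commit, the sketch below uses the C++ standard library rather than OSv's own condvar. The helper name wait_until_ready and its parameters are hypothetical, and std::chrono::steady_clock only stands in for a monotonic "uptime" clock. It shows a wait that takes an absolute time point instead of a raw integer timeout.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, not OSv code: blocks until `ready` becomes true or the
// absolute deadline passes. steady_clock stands in for a monotonic clock.
bool wait_until_ready(std::condition_variable& cv, std::mutex& m,
                      const bool& ready,
                      std::chrono::steady_clock::time_point deadline)
{
    std::unique_lock<std::mutex> lock(m);
    // wait_until with a predicate returns the predicate's final value:
    // true if `ready` was set, false if the deadline passed first.
    return cv.wait_until(lock, deadline, [&ready] { return ready; });
}
```

Passing a typed time point lets the compiler reject a mismatched clock or unit, which a bare uint64_t cannot do.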
-
Nadav Har'El authored
Fix sbwait implementation to use the new <osv/clock.hh> APIs and the monotonic clock. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
Reimplement the BSD functions getmicrotime(9) and getmicrouptime(9) and the variable "ticks" using the new clock APIs. getmicrotime() returns the system time ("wall clock"), while getmicrouptime() and ticks return the time since boot. I believe this is the correct implementation according to the FreeBSD documentation, but our previous implementation didn't quite do this and it also worked ;-) The previous implementation pretended, according to getmicrouptime() and get_ticks(), that the system had been up since 1970, and yet the variable "time_uptime" (which FreeBSD has) is never updated, and is fixed at 1 second :-) Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
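The distinction between the two time sources can be sketched with the standard library. This is not the OSv or FreeBSD implementation: to_timeval, wall_time_sketch and uptime_sketch are hypothetical names, and std::chrono::steady_clock is only assumed to behave like a time-since-boot clock (its epoch is unspecified by the standard).

```cpp
#include <chrono>
#include <sys/time.h>

// Hypothetical helper: split a duration into whole seconds and microseconds.
template <typename Duration>
timeval to_timeval(Duration d)
{
    using namespace std::chrono;
    auto secs = duration_cast<seconds>(d);
    auto usecs = duration_cast<microseconds>(d - secs);
    timeval tv;
    tv.tv_sec = secs.count();
    tv.tv_usec = usecs.count();
    return tv;
}

// Wall-clock time since the epoch, in the spirit of getmicrotime(9).
timeval wall_time_sketch()
{
    return to_timeval(std::chrono::system_clock::now().time_since_epoch());
}

// Monotonic time, in the spirit of getmicrouptime(9). Treating steady_clock's
// origin as boot time is an assumption of this sketch.
timeval uptime_sketch()
{
    return to_timeval(std::chrono::steady_clock::now().time_since_epoch());
}
```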
-
Nadav Har'El authored
Fix msleep implementation to use the new <osv/clock.hh> APIs and the monotonic clock. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Nadav Har'El authored
Change callout implementation to use the new <osv/clock.hh> APIs and the monotonic clock. Since _callout.h now uses the C++ type osv::clock::uptime::time_point, it can only be used from C++ code. All the relevant code is already C++. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 23, 2014
-
-
Zhi Yong Wu authored
BUILD SUCCESSFUL
Total time: 35.396 secs
make -r -C build/release/ all
make[1]: Entering directory `/home/zwu/osv/build/release'
  CXX loader.o
  CXX runtime.o
  CXX drivers/vga.o
  CXX bsd/net.o
  CXX bsd/porting/networking.o
/home/zwu/osv/bsd/porting/networking.cc: In function ‘int osv::if_set_mtu(std::string, u16)’:
/home/zwu/osv/bsd/porting/networking.cc:43:32: error: missing braces around initializer for ‘char [16]’ [-Werror=missing-braces]
cc1plus: all warnings being treated as errors
make[1]: *** [bsd/porting/networking.o] Error 1
make[1]: Leaving directory `/home/zwu/osv/build/release'
make: *** [all] Error 2
Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
This patch addresses a corner case in our zfs_inactive which can potentially leak a znode object.
*** Some background on znode/zfs_inactive ***
- Used to deallocate fs-specific data.
- Before destroying the znode, a DMU transaction is created to sync the znode to the backing store *if* its z_atime_dirty is set (not relevant to this patch, though).
- When removing a link, zfs_remove sets the field zp->z_unlinked of the underlying znode if the number of links reached 0 (simply put, it is not present in the fs anymore).
*** The problem ***
The actual problem shows up when zfs_inactive is used on znodes with the unlinked field set. The code wrapped by this patch was previously added to speed up the call to vrecycle, whose name partially explains itself: it first eliminates all activity associated with the vnode, then puts the vnode back into a list of free vnodes. The OSv VFS layer doesn't support vrecycle, but our zfs_inactive is acting as if it were supported. In addition, the vrecycle call itself was removed.
*** Solution ***
Let's fix this problem by simply wrapping the test which prevented zfs_inactive from working properly on unlinked znodes, thus leaking references to the underlying mount point afterwards. The comment added to zfs_inactive also explains why these changes are needed. It should also make things easier when people look at it in the future and try to understand why things are the way they are. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Raphael S. Carvalho authored
The zfs_remove() function calls zfs_dirent_lock(), which in turn calls zfs_zget(), which bumps the underlying znode's reference count once. However, neither zfs_remove() nor zfs_rmdir() releases that reference after using the znode. This prevents zfs_zinactive(), which is used to destroy the znode object, from working properly. Another consequence is that each znode holds a reference to the underlying mount point, keeping it busy for unmount. Fix the znode refcnt by calling zfs_zinactive() after znode usage. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
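The leak pattern described in this commit (a lookup takes a reference that no exit path drops) can be sketched as follows. This is not the ZFS code: node_sketch, hold, release, ref_guard and remove_entry_sketch are hypothetical names, and the sketch swaps in an RAII guard where the commit itself adds an explicit zfs_zinactive() call.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted object such as a znode.
struct node_sketch {
    int refcnt = 0;
};

void hold(node_sketch& n) { ++n.refcnt; }

void release(node_sketch& n)
{
    assert(n.refcnt > 0);
    --n.refcnt;
}

// Guard that takes a reference on construction and drops it on destruction,
// so every return path releases it.
class ref_guard {
public:
    explicit ref_guard(node_sketch& n) : _n(n) { hold(_n); }
    ~ref_guard() { release(_n); }
    ref_guard(const ref_guard&) = delete;
    ref_guard& operator=(const ref_guard&) = delete;
private:
    node_sketch& _n;
};

// Hypothetical operation that looks up a node, uses it, and returns.
int remove_entry_sketch(node_sketch& n, bool fail_early)
{
    ref_guard guard(n);          // the lookup takes a reference
    if (fail_early) {
        return -1;               // the guard still drops the reference here
    }
    return 0;                    // ... and here
}
```

With the count back at its starting value after each call, an object like this no longer pins whatever it refers to, which is the property the commit restores for the mount point.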
-
Dmitry Fleytman authored
There were two places with ioctl definitions, and the Xen netfront driver was compiled with the ioctl definitions from the wrong place. Fixed by changing the include and deleting the file with the improper definitions. Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 22, 2014
-
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Pekka Enberg authored
Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 21, 2014
-
-
Avi Kivity authored
Instead of acquiring sockbuf::sb_mtx inside sblock() and sbunlock(), rely on the caller to take the lock for us. Expand existing lock hold regions in callers to make it so. This reduces acquisitions of sb_mtx. As a side effect, copies to and from userspace are done under the lock. This can affect MSG_NOWAIT with demand paging major faults, but these are screwed anyway. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
sb_rwlock is used to serialize concurrent writers (or readers) to the same socket buffer, but is quite expensive as it requires 4 atomic operations per transaction, even if there is no contention. Replace it with a waitqueue, and use the sockbuf::sb_mtx for serialization. This still has exactly the same cost, but we can later move sblock() and sbunlock() into contexts where the sockbuf::sb_mtx is already acquired. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
sblock() takes sb_rwlock for reading, and is the only such locker, so it clearly has no effect. Change it to acquire the lock for writing, so it serializes access to the socket buffer as intended. Bug introduced in 6296cbab. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
This allows placing C++ objects in ifnet. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Include the system header and remove duplicate definitions. Change some solaris imports to use _GNU_SOURCE to make rlim64_t defined consistently. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Avi Kivity authored
Allows making ifnet a C++ class. Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 20, 2014
-
-
Avi Kivity authored
Add sockbuf::sb_cc_wq to use instead of msleep(&sb->sb_cc). This paves the way for lockless wakeups for net channels, as the wait primitive is now thread::wait_for() instead of msleep(). In addition, this reduces lock acquisitions and improves netperf bandwidth by about 10%. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
Avi Kivity authored
This was used to ensure all socket-using code was converted to C++, but not cleaned up later. Clean it up now. Reviewed-by:
Nadav Har'El <nyh@cloudius-systems.com> Signed-off-by:
Avi Kivity <avi@cloudius-systems.com>
-
- Jan 17, 2014
-
-
Dmitry Fleytman authored
This patch introduces support for the MTU option as described in RFC2132, chapter 5.1, "Interface MTU Option". Amazon EC2 networking uses this option in some cases, and it gives a throughput improvement of about 250% on big instances with 10G networking.
Netperf results for hi1.4xlarge instances, TCP_MAERTS test, OSv runs netserver:
Send buffer size   Throughput w/ patch (Mbps)   Throughput w/o patch (Mbps)   Improvement (%)
32                 4912.29                      1386.28                       254
64                 4832.01                      1385.99                       249
128                4835.09                      1401.46                       245
256                4746.41                      1382.28                       243
512                4849.04                      1375.23                       253
1024               4631.8                       1356.69                       241
2048               4859.59                      1371.92                       254
4096               4864.99                      1383.67                       252
8192               4627.07                      1364.05                       239
16384              4868.73                      1366.48                       256
32768              4822.69                      1366.63                       253
65536              4837.67                      1353.87                       257
Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Zhi Yong Wu authored
Total time: 30.77 secs
make -r -C build/release/ all
make[1]: Entering directory `/home/zwu/osv/build/release'
  CXX bsd/sys/xen/gnttab.o
In file included from /home/zwu/osv/bsd/sys/xen/hypervisor.h:40:0,
                 from /home/zwu/osv/bsd/sys/xen/gnttab.cc:29:
/home/zwu/osv/bsd/machine/xen/hypercall.h: In function ‘int HYPERVISOR_set_trap_table(const trap_info_t*)’:
/home/zwu/osv/bsd/machine/xen/hypercall.h:146:9: error: unable to find string literal operator ‘operator"" STR’
/home/zwu/osv/bsd/machine/xen/hypercall.h: In function ‘int HYPERVISOR_mmu_update(mmu_update_t*, unsigned int, unsigned int*, domid_t)’:
/home/zwu/osv/bsd/machine/xen/hypercall.h:154:9: error: unable to find string literal operator ‘operator"" STR’
/home/zwu/osv/bsd/machine/xen/hypercall.h: In function ‘int HYPERVISOR_mmuext_op(mmuext_op*, unsigned int, unsigned int*, domid_t)’:
......
make[1]: *** [bsd/sys/xen/gnttab.o] Error 1
make[1]: Leaving directory `/home/zwu/osv/build/release'
make: *** [all] Error 2
Reviewed-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Zhi Yong Wu <zwu.kernel@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
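The error in this build log is what GCC reports in C++11 mode when a macro name follows a string literal with no whitespace: the name is lexed as a user-defined literal suffix instead of being expanded. The commit's actual diff is not shown here, so the following is only a sketch of the usual remedy; the macro value and the function joined_name are hypothetical.

```cpp
#include <string>

// Hypothetical macro that expands to a string literal.
#define STR "_suffix"

// Writing "name"STR makes C++11 look for operator"" STR and fail.
// Separating the tokens with a space restores ordinary macro expansion and
// adjacent string literal concatenation.
inline std::string joined_name()
{
    return "name" STR;
}
```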
-
Pekka Enberg authored
Add 'struct file' to VOP_READ API. This is needed for procfs which generates file contents at open() time and read() must operate on it, not the vnode. Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 15, 2014
-
-
Avi Kivity authored
Microsecond values were passed as is instead of being scaled by hz. Signed-off-by:
Avi Kivity <avi@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
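The scaling this commit refers to can be sketched as a unit conversion from microseconds to timer ticks, where hz is the number of ticks per second. The helper name usec_to_ticks and the round-up policy are assumptions of this sketch, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert microseconds to ticks at `hz` ticks per
// second, rounding up so a short nonzero delay never becomes zero ticks.
// The multiplication can overflow for very large inputs; a real
// implementation would need to guard against that.
std::uint64_t usec_to_ticks(std::uint64_t usec, std::uint64_t hz)
{
    assert(hz > 0);
    const std::uint64_t usec_per_sec = 1000000;
    return (usec * hz + usec_per_sec - 1) / usec_per_sec;
}
```

Passing the microsecond count through unscaled, as the commit message describes, would be off by a factor of 1000000 / hz.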
-
Eduardo Piva authored
Change some printf calls in boot messages so that they call the appropriate debug function. This enables OSv to operate in silent mode. Add the debug.h header so we can link debug functions to C files. Fixes #118 Signed-off-by:
Eduardo Piva <efpiva@gmail.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 13, 2014
-
-
Glauber Costa authored
For every OSv-specific ifdef we remove in ZFS, God resurrects a kitten. After some recent additions and fixes, that piece of code can now be compiled out. It is the reclaimer code, so it is very welcome. But beware: that does not mean we are reclaiming yet. It only means that we wired up the ZFS ARC reclaiming process to the BSD notification system. We now need to somehow wire that notification system to the OSv shrinking infrastructure. That is the easy part. And after that, of course, balance the calls between ARC and balloon. That is the hard part. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
ZFS will perform some checks to determine if the current calling "process" is the reclaimer. Export the address of the reclaimer thread so that test can work. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
There is (currently commented out) code in ZFS that checks things like: if (curproc == pageproc) { /* Do something really great */ } The problem is that with our current implementation of curproc (designed for Xen) that will break, because we return a pointer to an on-stack variable that is created on demand and only contains the pid of the process. Returning the thread address will make those checks work, but we will be forced to give up on accessing fields inside it altogether. If we *really* must, we can have a structure that has the fields at the same offsets as our thread class. But our thread class is defined in a .hh file, so *good luck* calculating the offset of a field (say, id) at compile time so we can include it in this other .h file that contains exclusively C code. Since Xen is the only user of the PID test, and our resistance to changing the Xen code is quite low (if any), I'll just go ahead and change it: storing the address of the process itself should allow us to do comparison tests the way ZFS does and get everything working. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Glauber Costa authored
First of all, I am sorry. I am sorry Avi, Dor, Pekka, God, Dennis Ritchie, et caterva. I am so very sorry. This is probably one of the ugliest things ever written by a C programmer in the history of programming. The story is: ZFS defines its own mutex of type kmutex_t, which is basically just an OSv mutex in our implementation. In a piece of code currently commented out (not for long), it calls: msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0); The problem is that our msleep implementation expects a "struct mtx", which is our own wrapper around mutex (Maybe that should be changed? Does anybody remember why it was done this way?) Keep in mind that we are going to great lengths not to change ZFS (ifdefing code out is generally fine), so the casting solution could not be used. I tried to change the for-ZFS definitions of mutex in the BSD glue code, but after a couple of hours I was still resolving conflicts with all the other parts that would break because they were expecting a certain type that had now changed. I eventually settled for the current ugly but functional solution: code msleep in a way that it can accept any kind of mutex. That is really ugly because by "any kind of mutex" I really mean any kind of crap the user passes, and goodbye type safety altogether. But it works with minimal changes and, more importantly, with all the changes being in *our* glue code. If anybody has other ideas, I would be happy to try them out. But at this time, I believe this to be the best compromise. Signed-off-by:
Glauber Costa <glommer@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Dmitry Fleytman authored
Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
Dmitry Fleytman authored
synch_port::msleep: merge the time-out and non-time-out cases into one conditional branch to avoid code duplication. This both simplifies the code and makes future implementation of interruption handling for interruptible sleeps easier. Signed-off-by:
Dmitry Fleytman <dmitry@daynix.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 10, 2014
-
-
Pekka Enberg authored
Simplify networking boot initialization message as suggested by Tzach. Suggested-by:
Tzach Livyatan <tzach@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 09, 2014
-
-
Raphael S. Carvalho authored
This problem was found when running 'tests/tst-zfs-mount.so' multiple times. The first time, all tests succeed; however, a subsequent run would fail at the test 'mkdir /foo/bar', with the error message reporting that the target file already exists. The test basically creates a directory /foo/bar, renames it to /foo/bar2, then removes /foo/bar2. How could /foo/bar still be there? Quite simple. Our shutdown function calls unmount_rootfs(), which attempts to unmount zfs with the flag MNT_FORCE; however, the flag is not being passed to zfs_unmount(), nor does unmount_rootfs() itself test the return status (which was always a failure previously). So OSv is really being shut down while there is remaining data waiting to be synced with the backing store. The result is inconsistency. This problem was fixed by passing the flag to VFS_UNMOUNT, which will now unmount the fs properly on sudden shutdowns. Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-
- Jan 03, 2014
-
-
Raphael S. Carvalho authored
Currently, vflush is used in the unmount process to release remaining dentries. vflush in turn calls vevict, which releases dentries that it doesn't own. This behavior is neither correct nor good for the future of the VFS, so Avi suggested switching to a different approach: when unmounting, release only those dentries owned by the mount point, as there wouldn't be anything else in the dcache (given its functionality). The problem was fixed by doing the following steps:
- Drop vflush calls in sys_umount2, make vevict an empty function, and remove vevict.
- Create the function release_mp_dentries to release the dentries of a mount point, which will be called by VFS_UNMOUNT. It cannot be called before VFS_UNMOUNT as failures must be considered, nor after as the mount point would be considered busy. Ignore this "rule", and the previously seen ZFS replay transaction error would happen.
NOTE: vflush is currently duplicated in zfs unmount cases to address the problem above. This patch fixes this duplication as well. Signed-off-by:
Raphael S. Carvalho <raphaelsc@cloudius-systems.com> Signed-off-by:
Pekka Enberg <penberg@cloudius-systems.com>
-