  1. Jan 27, 2014
  2. Jan 23, 2014
    • Zhi Yong Wu's avatar
      network: fix compile error · 1d641f38
      Zhi Yong Wu authored
      
      BUILD SUCCESSFUL
      
      Total time: 35.396 secs
      make -r -C build/release/ all
      make[1]: Entering directory `/home/zwu/osv/build/release'
        CXX loader.o
        CXX runtime.o
        CXX drivers/vga.o
        CXX bsd/net.o
        CXX bsd/porting/networking.o
      /home/zwu/osv/bsd/porting/networking.cc: In function ‘int osv::if_set_mtu(std::string, u16)’:
      /home/zwu/osv/bsd/porting/networking.cc:43:32: error: missing braces around initializer for ‘char [16]’ [-Werror=missing-braces]
      cc1plus: all warnings being treated as errors
      make[1]: *** [bsd/porting/networking.o] Error 1
      make[1]: Leaving directory `/home/zwu/osv/build/release'
      make: *** [all] Error 2
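
      The warning above is the kind triggered when a struct whose first member is a fixed-size char array is initialized with a bare scalar. A minimal sketch, assuming a hypothetical `demo_ifreq` structure (not the actual definition used in networking.cc) and a hypothetical helper `make_mtu_request`:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for an ifreq-like structure whose first member
// is a fixed-size name buffer; not the actual OSv/BSD definition.
struct demo_ifreq {
    char name[16];
    int  mtu;
};

// With -Wmissing-braces promoted to an error, `demo_ifreq r = {0};` is
// rejected because the scalar 0 initializes the nested array without its
// own braces. Value-initialization zeroes every member and avoids that.
inline demo_ifreq make_mtu_request(const std::string& if_name, int mtu)
{
    demo_ifreq r = {};   // alternatively: demo_ifreq r = {{0}, 0};
    // Copy at most 15 bytes so the zeroed last byte keeps the name terminated.
    std::strncpy(r.name, if_name.c_str(), sizeof(r.name) - 1);
    r.mtu = mtu;
    return r;
}
```

      Either form of the initializer states explicitly which braces belong to the array member, which is what the diagnostic asks for.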
      
      Signed-off-by: Zhi Yong Wu <zwu.kernel@gmail.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      1d641f38
    • Raphael S. Carvalho's avatar
      zfs: Fix zfs_inactive on unlinked znode cases · 08290fd5
      Raphael S. Carvalho authored
      
      This patch addresses a corner-case in our zfs_inactive which can potentially
      leak a znode object.
      
      *** Some background on znode/zfs_inactive ***
      - Used to deallocate fs-specific data.
      
      - Before destroying the znode, a DMU transaction is created to sync the znode
      to the backing store *if* its z_atime_dirty is set (Not relevant to this
      patch though).
      
      - When removing a link, zfs_remove sets the field zp->z_unlinked of the
      underlying znode if the number of links reached 0 (Simply put, not present in
      the fs anymore).
      
      *** The problem ***
      The actual problem shows up when zfs_inactive is used on znodes with the
      unlinked field set.
      
      The code wrapped around by this patch was previously added to speed up the call
      to vrecycle, whose name partially explains itself. Its first functionality is
      to eliminate all activity associated to the vnode, then put the vnode back into
      a list of free vnodes.
      
      OSv VFS layer doesn't support vrecycle, but our zfs_inactive is acting as if it
      were supported. Another thing is that vrecycle call was also removed.
      
      *** Solution ***
      Let's fix this problem by simply wrapping around the test which prevented
      zfs_inactive from working properly on unlinked znodes, thus leaking references
      to the underlying mount point afterwards.
      
      The commentary added into zfs_inactive also explains why these changes are
      needed. It would also make things easier when people look at it in the future,
      and try to understand why things are the way they are.
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      08290fd5
    • Raphael S. Carvalho's avatar
      zfs: Fix znode reference count leaks · 76c0caa7
      Raphael S. Carvalho authored
      
      The zfs_remove() function calls zfs_dirent_lock, which in turn calls
      zfs_zget() which bumps up the underlying znode reference count once.
      
      However, neither zfs_remove() nor zfs_rmdir() releases the reference count
      after using it. This prevents zfs_zinactive() which is used to destroy
      the znode object from working properly. Another consequence is that each
      znode holds a reference to the underlying mount point, keeping it busy
      for unmount.
      
      Fix the znode refcnt by calling zfs_zinactive after znode usage.
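
      The leak pattern described above can be modelled in a few lines. This is only a sketch under simplifying assumptions: `node`, `node_get`, `node_put` and `remove_entry` are hypothetical names, and the real zfs_zget()/zfs_zinactive() pair does considerably more work.

```cpp
#include <cassert>

// Hypothetical, simplified model of a reference-counted node.
struct node {
    int  refcnt = 0;
    bool destroyed = false;
};

// Models a lookup that returns the node with one extra reference held.
inline node* node_get(node& n) { ++n.refcnt; return &n; }

// Models dropping a reference; the node is torn down when the count hits zero.
inline void node_put(node* n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        n->destroyed = true;
    }
}

// Every path that obtained a reference through the lookup must release it,
// otherwise the count never reaches zero and the node is never destroyed.
inline void remove_entry(node& n)
{
    node* held = node_get(n);   // reference taken by the lookup
    /* ... unlink work would go here ... */
    node_put(held);             // the release a leaking path would omit
}
```

      Without the final `node_put`, the count stays at one forever, which mirrors how a leaked znode keeps its mount point busy.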
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      76c0caa7
    • Dmitry Fleytman's avatar
      netfront: get IOCTL definitions from proper place · 75311f28
      Dmitry Fleytman authored
      
      There were two places with ioctl definitions, and the Xen netfront driver
      was compiled with the IOCTL definitions from the wrong place.
      
      Fixed by changing the include and deleting the file with the improper
      definitions.
      
      Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      75311f28
  3. Jan 22, 2014
  4. Jan 21, 2014
  5. Jan 20, 2014
  6. Jan 17, 2014
    • Dmitry Fleytman's avatar
      DHCP: Support MTU option · 69bf74a7
      Dmitry Fleytman authored
      
      This patch introduces support for the MTU option as described in
      RFC 2132, section 5.1 (Interface MTU Option).
      
      Amazon EC2 networking uses this option in some cases, and it gives a
      throughput improvement of about 250% on big instances with 10G networking.
      
      Netperf results for hi1.4xlarge instances, TCP_MAERTS test, OSv runs netserver:
      
      Send buffer size     Throughput w/ patch (Mbps)     Throughput w/o patch (Mbps)     Improvement (%)
      
      32                   4912.29                        1386.28                         254
      64                   4832.01                        1385.99                         249
      128                  4835.09                        1401.46                         245
      256                  4746.41                        1382.28                         243
      512                  4849.04                        1375.23                         253
      1024                 4631.8                         1356.69                         241
      2048                 4859.59                        1371.92                         254
      4096                 4864.99                        1383.67                         252
      8192                 4627.07                        1364.05                         239
      16384                4868.73                        1366.48                         256
      32768                4822.69                        1366.63                         253
      65536                4837.67                        1353.87                         257
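
      For readers unfamiliar with the option itself: RFC 2132 defines the Interface MTU option as code 26 with a two-byte payload holding a 16-bit value in network byte order, with a minimum legal value of 68. A minimal parsing sketch follows; the function name `find_mtu_option` and the flat byte-vector input are assumptions for illustration, not the code added by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scans a DHCP options area (the bytes after the magic cookie) for option 26.
// Returns true and stores the value in `mtu` if a well-formed option is found.
inline bool find_mtu_option(const std::vector<std::uint8_t>& opts, std::uint16_t& mtu)
{
    const std::uint8_t pad = 0, end = 255, interface_mtu = 26;
    std::size_t i = 0;
    while (i < opts.size()) {
        std::uint8_t code = opts[i++];
        if (code == pad) continue;          // single-byte padding, no length
        if (code == end) break;             // end of options
        if (i >= opts.size()) break;        // truncated: no length byte
        std::uint8_t len = opts[i++];
        if (i + len > opts.size()) break;   // truncated payload
        if (code == interface_mtu) {
            if (len != 2) return false;     // option 26 always carries 2 bytes
            std::uint16_t value =
                static_cast<std::uint16_t>((opts[i] << 8) | opts[i + 1]);
            if (value < 68) return false;   // below the RFC 2132 minimum
            mtu = value;
            return true;
        }
        i += len;                           // skip options we do not care about
    }
    return false;
}
```

      A caller would typically apply the returned value to the interface only when the function reports success, leaving the default MTU in place otherwise.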
      
      Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      69bf74a7
    • Zhi Yong Wu's avatar
      xen: Fix "error: unable to find string literal operator" · 6b5b6696
      Zhi Yong Wu authored
      
      Total time: 30.77 secs
      make -r -C build/release/ all
      make[1]: Entering directory `/home/zwu/osv/build/release'
        CXX bsd/sys/xen/gnttab.o
      In file included from /home/zwu/osv/bsd/sys/xen/hypervisor.h:40:0,
                       from /home/zwu/osv/bsd/sys/xen/gnttab.cc:29:
      /home/zwu/osv/bsd/machine/xen/hypercall.h: In function ‘int HYPERVISOR_set_trap_table(const trap_info_t*)’:
      /home/zwu/osv/bsd/machine/xen/hypercall.h:146:9: error: unable to find string literal operator ‘operator"" STR’
      /home/zwu/osv/bsd/machine/xen/hypercall.h: In function ‘int HYPERVISOR_mmu_update(mmu_update_t*, unsigned int, unsigned int*, domid_t)’:
      /home/zwu/osv/bsd/machine/xen/hypercall.h:154:9: error: unable to find string literal operator ‘operator"" STR’
      /home/zwu/osv/bsd/machine/xen/hypercall.h: In function ‘int HYPERVISOR_mmuext_op(mmuext_op*, unsigned int, unsigned int*, domid_t)’:
      ......
      make[1]: *** [bsd/sys/xen/gnttab.o] Error 1
      make[1]: Leaving directory `/home/zwu/osv/build/release'
      make: *** [all] Error 2
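
      This diagnostic usually appears because C++11 treats an identifier that directly follows a string literal as a user-defined literal suffix, so a macro glued to the closing quote is never expanded. A minimal sketch of the usual remedy, adding whitespace between the literal and the macro; the `DEMO_` macros and `hypercall_expr` are hypothetical and only imitate the shape of the failing lines:

```cpp
#include <cassert>
#include <string>

// Two-level stringification so the argument is macro-expanded first.
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)
#define DEMO_HYPERCALL_NR 7

// In C++11 the following would be lexed as a literal with suffix DEMO_STR
// and fail with "unable to find string literal operator":
//   "offset ("DEMO_STR(DEMO_HYPERCALL_NR)" * 32)"
// Separating the pieces with spaces restores plain literal concatenation.
inline std::string hypercall_expr()
{
    return "offset (" DEMO_STR(DEMO_HYPERCALL_NR) " * 32)";
}
```

      The spaced form is valid in both C and C++, which matters for headers shared between the two.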
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Zhi Yong Wu <zwu.kernel@gmail.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      6b5b6696
    • Pekka Enberg's avatar
      vfs: 'struct file' to VOP_READ · 9f68c2d9
      Pekka Enberg authored
      
      Add 'struct file' to VOP_READ API. This is needed for procfs which
      generates file contents at open() time and read() must operate on it,
      not the vnode.
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      9f68c2d9
  7. Jan 15, 2014
  8. Jan 13, 2014
    • Glauber Costa's avatar
      ZFS: remove one OSv ifdef · 622d8bba
      Glauber Costa authored
      
      For every OSv-specific ifdef we remove in ZFS, God resurrects a kitten.
      
      After some recent additions and fixes, that piece of code can now be
      compiled out. It is the reclaimer code, so it is very welcome.
      
      But beware: that does not mean we are reclaiming yet. That only means that we
      wired up the ZFS ARC reclaiming process to the BSD notification system.
      
      We now need to somehow wire that notification system with the OSv shrinking
      infrastructure. That is the easy part. And after that, of course, balance the
      calls between ARC and balloon. That is the hard part.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      622d8bba
    • Glauber Costa's avatar
      reclaim: export address of the OSV reclaimer · 5a60e13d
      Glauber Costa authored
      
      ZFS will perform some checks to determine if the current calling "process"
      is the reclaimer. Export the address of the reclaimer thread so that test
      can work.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      5a60e13d
    • Glauber Costa's avatar
      bsd glue: simplify curproc so it returns a pointer · 36c0cebc
      Glauber Costa authored
      
      There is (currently commented out) code in ZFS that checks things like:
      
          if (curproc == pageproc) {
              /* Do something really great */
          }
      
      The problem is that with our current implementation of curproc (designed for Xen)
      that will break, because we will return a pointer to an on-stack variable that is
      created on demand and only contains the pid of the process.
      
      Returning the thread address will make those checks work, but we will be forced
      to give up on accessing fields inside it altogether. If we *really* must, we can
      have a structure that has the fields at the same offsets as our thread class.
      
      But our thread class is defined in a .hh file, so *good luck* calculating the
      offset of a field (say, id) at compile time so we can include in this other .h
      file that contains exclusively C code. Since xen is the only user of the PID test,
      and our resistance to changing the xen code is quite low (if at all), I'll just go
      ahead and change it: storing the address of the process itself should allow us to
      do compare tests the way ZFS does and get everything working.
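
      The idea of identifying the current "process" purely by address can be sketched as follows. All names here (`demo_thread`, `proc_handle`, `demo_curproc`, `demo_pageproc`) are hypothetical stand-ins, not the OSv thread class or the BSD glue definitions.

```cpp
#include <cassert>

// Hypothetical stand-in for a thread object.
struct demo_thread { int id; };

static demo_thread reclaimer_thread{1};
static demo_thread worker_thread{2};

// Opaque handle: the "process" is identified only by the address of its
// thread, so no fields are reachable through it from C code.
using proc_handle = const void*;

// Returns a stable address for the given thread, unlike a pointer to a
// temporary that is rebuilt on every call.
inline proc_handle demo_curproc(const demo_thread& current) { return &current; }

static const proc_handle demo_pageproc = &reclaimer_thread;

// Identity test in the style of `curproc == pageproc`.
inline bool running_in_reclaimer(const demo_thread& current)
{
    return demo_curproc(current) == demo_pageproc;
}
```

      Because both sides of the comparison are addresses of long-lived objects, the equality test is meaningful, at the cost of losing field access through the handle.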
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      36c0cebc
    • Glauber Costa's avatar
      msleep: make it accept any kind of mutex · 72d4d8f4
      Glauber Costa authored
      
      First of all, I am sorry. I am sorry Avi, Dor, Pekka, God, Dennis Ritchie, et caterva.
      I am so very sorry. This is probably one of the ugliest things ever written by a C
      programmer in the history of programming.
      
      The story is: ZFS defines its own mutex of type kmutex_t, which is basically just an OSv
      mutex in our implementation. In a piece of code currently commented out (not for long), it calls:
      
          msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
      
      The problem is that our msleep implementation expects a "struct mtx" which is our own
      wrapper around mutex (Maybe that should be changed? Does anybody remember why it was
      done this way?)
      
      Keep in mind that we are going to great lengths not to change ZFS (ifdefing code out
      is generally fine), so the casting solution could not be used. I've tried to change the
      for-ZFS definitions of mutex in the BSD glue code, but then, after a couple of hours
      I was still resolving conflicts with all the other parts that would break because they
      were expecting a certain type that was now changed.
      
      I eventually settled for the current ugly but functional solution: code msleep in a
      way that it can accept any kind of mutex. That is really ugly because by "any
      kind of mutex" I really mean any kind of crap the user passes and good bye type
      safety altogether. But it works with minimal changes, and more importantly, with
      all the changes being in *our* glue code.
      
      If anybody has other ideas, I would be happy to try them out. But at this time,
      I believe that to be the best compromise.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      72d4d8f4
    • Dmitry Fleytman's avatar
      407301a5
    • Dmitry Fleytman's avatar
      msleep: prepare for interruptible sleep implementation · b03f081c
      Dmitry Fleytman authored
      
      synch_port::msleep: merge time-out and non-time-out cases
      into one conditional branch to avoid code duplication.
      
      This both simplifies the code and makes future implementation of
      interruption handling code for interruptible sleeps easier.
      
      Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      b03f081c
  9. Jan 10, 2014
  10. Jan 09, 2014
    • Raphael S. Carvalho's avatar
      zfs: Fix on-disk data inconsistency on shutdown · 2d93af3b
      Raphael S. Carvalho authored
      
      This problem was found when running 'tests/tst-zfs-mount.so' multiple times.
      On the first run all tests succeed; however, a subsequent run would
      fail at the test 'mkdir /foo/bar', with an error message reporting
      that the target file already exists.
      
      The test basically creates a directory /foo/bar, renames it to /foo/bar2,
      then removes /foo/bar2. How could /foo/bar still be there?
      
      Quite simple. Our shutdown function calls unmount_rootfs(), which
      attempts to unmount zfs with the flag MNT_FORCE. However, the flag is not
      passed on to zfs_unmount(), nor does unmount_rootfs() itself test the
      return status (which previously was always a failure).
      So OSv is really being shut down while there is remaining data waiting to
      be synced with the backing store. The result is inconsistency.
      
      This problem was fixed by passing the flag to VFS_UNMOUNT which will now
      unmount the fs properly on sudden shutdowns.
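
      The effect of dropping the force flag on the way down can be modelled as below. This is a sketch under stated assumptions: `demo_fs`, `demo_mnt_force`, `demo_fs_unmount` and `demo_unmount_rootfs` are hypothetical, and the assumption that a busy filesystem is the reason an unforced unmount fails is an illustration, not something the commit message states.

```cpp
#include <cassert>
#include <cerrno>

constexpr int demo_mnt_force = 0x1;   // hypothetical flag value

struct demo_fs {
    int  busy_refs = 0;     // outstanding references that keep the fs busy
    bool synced = false;    // set once pending data has been written back
};

// Models a filesystem unmount hook: without the force flag a busy
// filesystem is refused with EBUSY and nothing is synced.
inline int demo_fs_unmount(demo_fs& fs, int flags)
{
    if (fs.busy_refs > 0 && !(flags & demo_mnt_force)) {
        return EBUSY;
    }
    fs.synced = true;
    return 0;
}

// Models the shutdown path: forwards the caller's flags unchanged and
// reports whether the unmount actually succeeded.
inline bool demo_unmount_rootfs(demo_fs& fs, int flags)
{
    return demo_fs_unmount(fs, flags) == 0;
}
```

      Forwarding the flags and surfacing the return value are two separate safeguards: the first makes the forced unmount work, the second makes a failure visible instead of silent.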
      
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      2d93af3b
  11. Jan 03, 2014
    • Raphael S. Carvalho's avatar
      vfs: change the approach of releasing dentries during unmount · af466dbc
      Raphael S. Carvalho authored
      
      Currently, vflush is used in the unmount process to release remaining
      dentries. vflush in turn calls vevict that is releasing dentries that
      it doesn't own.
      This behavior is neither correct nor good for the future of VFS.
      
      So Avi suggested switching to a different approach. We could only
      release those dentries owned by the mountpoint when unmounting it as
      there wouldn't be anything else in the dcache (given its functionality).
      
      The problem was fixed by doing the following steps:
       - Drop vflush calls in sys_umount2, make vevict an empty function,
      and remove vevict.
      
       - Create the function release_mp_dentries to release the dentries of a mount
      point, which will be called by VFS_UNMOUNT. It cannot be called before
      VFS_UNMOUNT, as failures must be considered, nor after, as the mount point
      would be considered busy.
      If this "rule" is not respected, the previously seen ZFS replay transaction
      error will happen.
      
      NOTE: vflush is currently duplicated in zfs unmount cases to address the problem
      above. This patch fixes this duplication as well.
      
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      af466dbc