  1. Mar 27, 2014
    • drivers: Add zfs device to allow use of zfs commands · ff3534e2
      Raphael S. Carvalho authored
      
      Previously, the zfs device was provided only to allow the use of the
      commands needed to create the zpool and, with it, the file system.
      At the time that was enough. However, making the zfs device, i.e.
      /dev/zfs, part of every OSv instance allows us to use commands that
      help with analysing, debugging, and tuning the zpool and the file
      systems it contains.
      
      The basic explanation is that those commands use libzfs which in
      turn relies on /dev/zfs to communicate with the zfs code.
      
      Example commands:
      zpool, zfs, zdb. The last one has not been ported to OSv yet.
      This patch will also be helpful for the ongoing ztest porting.
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      ff3534e2
    • manifest: Add /etc/mnttab from the upload manifest process · 8078bc20
      Raphael S. Carvalho authored
      
      /etc/mnttab is required by libzfs to run properly, so let's
      create it as an empty file.
      
      ryao from zfsonlinux and openzfs told me that an empty /etc/mnttab is used
      on Linux. Also, reading the libzfs code shows that /etc/mnttab is mostly used
      for management of the file itself, nothing that would prevent some libzfs
      functionality from working.
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      8078bc20
    • scripts/test.py: Blacklist tst-dns-resolver.so · 7f6529e7
      Pekka Enberg authored
      
      The tst-dns-resolver.so test fails spuriously. Blacklist it until the problem
      is fixed to keep Jenkins builds running.
      
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      7f6529e7
    • cpiod.so: Unmount file systems mounted over the mkfs phase · 8e31549c
      Raphael S. Carvalho authored
      
      The root dataset and ZFS are mounted at the mkfs phase, but they aren't
      unmounted afterwards.
      
      Running mkfs with VERBOSE flag enabled shows the following:
      Running mkfs...
      VFS: mounting zfs at /zfs
      zfs: mounting osv from device osv
      VFS: mounting zfs at /zfs/zfs
      zfs: mounting osv/zfs from device osv/zfs
      
      The first mount happens when issuing:
      {"zpool", "create", "-f", "-R", "/zfs", "osv", "/dev/vblk0.1"}, &ret);
      It creates a pool called osv and mounts the root dataset at /zfs.
      
      The second mount happens when issuing:
      {"zfs", "create", "osv/zfs"}
      It creates a file system called zfs in the pool osv and automatically
      mounts it under the root dataset mountpoint.
      
      No data inconsistency problem has been seen so far because both mkfs.so and
      cpiod.so do an explicit sync() at the end, thus ensuring everything is
      correctly flushed out to stable storage.
      There is an expression in Dutch that says: prevention is better than cure.
      Thus, this patch changes cpiod.so to unmount both mount points when the
      /zfs/zfs prefix is passed. It cannot be done in mkfs.so itself because
      cpiod.so is called afterwards in the same OSv instance.
      
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      8e31549c
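
The cleanup described in this commit can be sketched roughly as follows. This is a Python model, not the cpiod.so code: the function names are hypothetical, and the deepest-first ordering is an assumption based on the general rule that a nested mount has to be released before the mount point containing it; the commit message itself does not state the order.

```python
def unmount_order(mount_points):
    # Sort by path depth, deepest first, so that nested mounts are
    # released before the mount points that contain them.
    return sorted(mount_points,
                  key=lambda p: p.rstrip("/").count("/"),
                  reverse=True)


def mounts_to_release(prefix):
    # Assumption: cleanup is only needed when files were extracted
    # under the /zfs/zfs prefix, i.e. right after the mkfs phase.
    if prefix.rstrip("/") == "/zfs/zfs":
        return unmount_order(["/zfs", "/zfs/zfs"])
    return []


release_after_mkfs = mounts_to_release("/zfs/zfs")
release_otherwise = mounts_to_release("/")
```

With the two mount points from the verbose mkfs output, the model releases /zfs/zfs first and /zfs second, and does nothing for any other prefix.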
    • scripts: make 'scripts/test.py' support Python3 · 1563c916
      Zifei Tong authored
      
      Python 3 no longer allows implicit conversion from bytes to string, so
      add an explicit decode() to convert input bytes.
      
      Tested with both Python2 and Python3.
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Zifei Tong <zifeitong@gmail.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      1563c916
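
A minimal sketch of the kind of change this commit describes, assuming the script reads a child process's output through `subprocess`. The command and variable names are illustrative and are not taken from scripts/test.py.

```python
import subprocess
import sys

# Run a child process and capture its standard output as raw bytes.
# The command is only a stand-in for whatever the test script launches.
raw = subprocess.check_output([sys.executable, "-c", "print('hello')"])

# On Python 3 `raw` is a bytes object, and mixing it with str raises
# TypeError. An explicit decode() works on both Python 2 and Python 3.
text = raw.decode("utf-8")

lines = text.splitlines()
```

Decoding once, right where the bytes enter the program, keeps the rest of the script working with ordinary strings on either interpreter.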
    • zfsbuffers: reference count the arc buffer · 063a8a56
      Glauber Costa authored
      
      Gleb has noticed that the ARC buffers can go unshared too early. This will
      happen because we call the UNMAP operation on every put(). That is certainly
      not what we want, since the buffer only has to be unshared when the last
      reference is gone.
      
      Design decisions:
      1) We obviously can't use the arc natural reference count for that, since
      bumping it would make the buffer unevictable.
      2) We could modify the arc_buf structure itself to add another refcnt (minimum
      4 bytes).  However, I am trying to keep core-ZFS modifications to a minimum,
      and only to places where it is totally unavoidable.
      
      Therefore, the solution is to add another hash, which will hash the whole
      buffer instead of the physaddr like the one we have currently. In terms of
      memory usage, it will add only 8 bytes per buffer (each buffer being around
      128k), which makes for a memory usage of 64k per mapped GB compared to the
      arc refcount solution. This is a good trade-off.
      
      I am also avoiding adding a new vop_map/unmap style operation just to query the
      buffer address from its file attributes (needed for the put side). Instead, I
      am adopting the convention that an empty iovec means query, and a filled iovec
      means unshare.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      063a8a56
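
The design above can be modelled in a few lines. This is a Python sketch, not the OSv implementation: `SharedBufferTable`, `get`, `put`, and the `unmap` callback are hypothetical names standing in for the side hash and the UNMAP operation. The last two lines reproduce the overhead estimate from the commit message, assuming 128 KiB buffers.

```python
class SharedBufferTable:
    """Toy model: per-buffer share counts kept outside the buffer itself."""

    def __init__(self, unmap):
        self._refs = {}      # buffer id -> number of outstanding mappings
        self._unmap = unmap  # callback standing in for the UNMAP operation

    def get(self, buf):
        # Each new mapping of the buffer takes one reference.
        self._refs[buf] = self._refs.get(buf, 0) + 1

    def put(self, buf):
        # Drop one reference; unshare only when the last one is gone.
        remaining = self._refs[buf] - 1
        if remaining == 0:
            del self._refs[buf]
            self._unmap(buf)
        else:
            self._refs[buf] = remaining


unmapped = []
table = SharedBufferTable(unmapped.append)
table.get("buf0")
table.get("buf0")
table.put("buf0")               # one mapping still alive, nothing unmapped
still_shared = list(unmapped)
table.put("buf0")               # last reference gone, buffer is unshared

# Overhead estimate: 8 bytes of bookkeeping per 128 KiB buffer.
buffers_per_gib = (1 << 30) // (128 << 10)
overhead_per_gib = 8 * buffers_per_gib
```

One GiB of mapped data holds 8192 buffers of 128 KiB, so 8 bytes each comes to 64 KiB, which matches the figure quoted in the message.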
    • core: fix misbehaving debugf() · 7c8d7415
      Zifei Tong authored
      
      debugf() used to write the log message according to the length of the format
      string. This caused messages to be wrongly truncated.
      
      Also fix confusing variable names by swapping 'fmt' and 'msg'.
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Zifei Tong <zifeitong@gmail.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
      7c8d7415
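
To illustrate the class of bug fixed here, the following Python model truncates the formatted message to the length of the format string. It is only a sketch of the behaviour described in the commit message: the real debugf() is not Python, and the function names below are hypothetical.

```python
def debugf_buggy(fmt, *args):
    msg = fmt % args
    # Bug: the output length is taken from the format string,
    # not from the formatted message.
    return msg[:len(fmt)]


def debugf_fixed(fmt, *args):
    # Fix: emit the whole formatted message.
    return fmt % args


buggy = debugf_buggy("value=%s", "0123456789")
fixed = debugf_fixed("value=%s", "0123456789")
```

Whenever the expanded arguments are longer than their format specifiers, the buggy variant drops the tail of the message, while short expansions happen to come out intact, which makes the problem easy to miss.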
    • zfsbuffers: kill bogus variable · fd233dd7
      Glauber Costa authored
      
      Spotted by code review. Gleb had spotted one improper use of "i", but
      there was another. In this case we iterate over nothing, and i is always 0.
      It is uninitialized to begin with, and the code works just because it
      happens to be set to 0 by luck.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      fd233dd7
    • tests: improve zfs shared buffers tests · 28186c29
      Glauber Costa authored
      
      We have seen bugs with mmap shared/file handling for small files. This patch tests
      some of the corner scenarios to find those problems.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      28186c29
    • zfsbuffers: Do not truncate files · d3ed5bef
      Glauber Costa authored
      
      There is a problem with the way ZFS currently handles its buffers, which is
      actually a limitation of our allocator: buffers smaller than a page won't be
      page aligned even if we ask for it. Therefore, if the buffer we are mapping
      falls into this category, we will map the wrong location.
      
      The way I solved this problem was so stupid that in retrospect I can't even
      believe I did it: when the file would run out of size, we would truncate the
      file. This is obviously wrong because reading a file is not expected to change
      its size under any circumstances, and if anybody relied on the actual size, we
      would be crashing something. This is the bug that plagued Cassandra.
      
      Not truncating, however, brings back the original problem. One solution I have
      considered is to always allocate at least a page for data allocations (leaving
      metadata alone), but that would deviate from ZFS and harm many-small-files
      workloads.
      
      However, during testing I noticed that ZFS will allocate small
      buffers only when the file itself is small. This means that we can just avoid
      using the special shared mapping for small files, which makes sense anyway.
      
      For instance, if we have a file that is 128k + 1 byte (remember 128k is ZFS's
      maximum buffer size), both buffers will be large enough to be aligned. And if
      that ever fails to hold, we will now see an assertion hit instead of a random
      bug. In time, we should fix our allocator to provide alignment guarantees.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      d3ed5bef
    • mmu: fix how mappings into ARC buffer are tracked · 054997f5
      Gleb Natapov authored
      
      Currently all mappings are keyed on the ARC buffer start when a mapping is
      added, but on removal a pointer into the ARC buffer is used, so a removal may
      leave no-longer-valid mappings in the database. This patch fixes it
      by using a pointer into the ARC buffer as the key, the same pointer that is
      used during removal.
      
      Signed-off-by: Gleb Natapov <gleb@cloudius-systems.com>
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      054997f5
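
The key mismatch described in this commit can be reproduced with a plain dictionary. This is a Python model under stated assumptions, not the mmu code: the addresses, the `"vma-A"` label, and the helper names are all made up for illustration.

```python
BUF_START = 0x1000   # hypothetical start address of an ARC buffer
OFFSET = 0x200       # hypothetical offset of the mapped data inside it


def add_mapping(db, key, mapping):
    db.setdefault(key, set()).add(mapping)


def remove_mapping(db, key, mapping):
    entries = db.get(key)
    if entries is None:
        return False         # nothing is registered under this key
    entries.discard(mapping)
    if not entries:
        del db[key]
    return True


# Before the fix: added under the buffer start, removed under the
# pointer into the buffer, so the lookup misses and the entry stays.
stale_db = {}
add_mapping(stale_db, BUF_START, "vma-A")
removed_before = remove_mapping(stale_db, BUF_START + OFFSET, "vma-A")

# After the fix: the same pointer is used as the key on both paths.
fixed_db = {}
add_mapping(fixed_db, BUF_START + OFFSET, "vma-A")
removed_after = remove_mapping(fixed_db, BUF_START + OFFSET, "vma-A")
```

In the first case the database still holds the mapping after the removal call, which is the stale entry the commit message warns about. In the second case the database ends up empty.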
  2. Mar 26, 2014
  3. Mar 25, 2014
  4. Mar 24, 2014