  1. Jul 17, 2013
  2. Jul 15, 2013
  3. Jul 12, 2013
  4. Jul 11, 2013
    • Merge branch 'v2p-debug-3' · b49f066c
      Avi Kivity authored
      Fix various issues with the debug allocator.
    • mmu: fix map_file() deadlock · 8919c66a
      Avi Kivity authored
      map_file() takes the vm lock, then calls read() to pre-fault the data.
      However read() may cause allocations, which then require the vm lock as well.
      
      Fix by faulting in the data after dropping the lock.
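      The fixed ordering can be sketched with standard primitives (vm_lock, map_region and the 4 KiB page size here are illustrative stand-ins, not OSv's actual API): take the lock only to set up the mapping, then drop it before touching the pages, so an allocation triggered by the pre-fault is free to take the lock itself.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

std::mutex vm_lock;               // stand-in for the vm lock
std::vector<char> backing(8192);  // stand-in for the mapped file data

char* map_region() {
    std::lock_guard<std::mutex> guard(vm_lock);
    // ... insert the mapping while holding the lock ...
    return backing.data();
}

char* map_file_fixed() {
    char* addr = map_region();    // lock taken and released inside
    // Pre-fault only after the lock is dropped: reads may allocate,
    // and allocations may need vm_lock themselves.
    for (std::size_t off = 0; off < backing.size(); off += 4096) {
        volatile char c = addr[off];
        (void)c;
    }
    return addr;
}
```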
    • memory: let the debug allocator mimic the standard allocator more closely · 1ea5672f
      Avi Kivity authored
      The standard allocator returns page-aligned addresses for large allocations.
      Some OSv code incorrectly relies on this.
      
      While we should fix the incorrect code, for now, adjust the debug allocator
      to return aligned addresses.
      
      The debug allocator now uses the following layout:
      
        [header page][guard page][user data][pattern tail][guard page]
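      A sketch of the arithmetic behind this layout, assuming 4 KiB pages (names are illustrative, not OSv's):

```cpp
#include <cstddef>

constexpr std::size_t page = 4096;

constexpr std::size_t round_up(std::size_t n) {
    return (n + page - 1) & ~(page - 1);
}

// User data starts after one header page and one guard page, so it
// always lands on a page boundary - which is exactly why large
// allocations now come back page-aligned like the standard
// allocator's.
constexpr std::size_t user_offset = 2 * page;

// Total mapping for an n-byte request: user data plus pattern tail
// rounded up to whole pages, followed by the trailing guard page.
constexpr std::size_t total_size(std::size_t n) {
    return user_offset + round_up(n) + page;
}
```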
    • virtio: explicitly request contiguous memory for the virtio ring · 79aa5d28
      Avi Kivity authored
      Required by the virtio spec.
    • memory: add alloc_phys_contiguous_aligned() API · b15db045
      Avi Kivity authored
      Virtio and other hardware needs physically contiguous memory, beyond one page.
      It also requires page-aligned memory.
      Add an explicit API for contiguous and aligned memory allocation.
      
      While our default allocator returns physically contiguous memory, the debug
      allocator does not, causing virtio devices to fail.
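      A hypothetical stand-in for the new API's contract (the real OSv function allocates physically contiguous memory, which userspace code cannot demonstrate; std::aligned_alloc only models the alignment half of the contract):

```cpp
#include <cstddef>
#include <cstdlib>

void* alloc_phys_contiguous_aligned(std::size_t size, std::size_t align) {
    // std::aligned_alloc requires size to be a multiple of align.
    std::size_t rounded = (size + align - 1) / align * align;
    return std::aligned_alloc(align, rounded);
}
```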
    • Move from a request array approach back to allocation. · 5bcb95d9
      Dor Laor authored
      virtio_blk pre-allocates requests into a cache to avoid re-allocation
      (possibly an unneeded optimization with the current allocator).  However,
      it doesn't take into account that requests can be completed out of order,
      and simply reuses requests in cyclic order. Noted by Avi; I had a version
      that worked around it by peeking into the index ring, but that solution
      was too complex. There is no performance degradation with SMP, thanks to
      the good allocator we have today.
    • Fix socket poll() deadlock · 1e65eb54
      Nadav Har'El authored
      In commit 7ecbf29f I added to the
      poll_install() stage of poll() a check of the current state of the file -
      to avoid the sleep if the file became ready before we managed to "install"
      its poll request.
      
      However, I wrongly believed it was necessary to put this check inside
      the FD_LOCK together with the request installation. In fact, it doesn't
      need to be in the same lock - all we need is for the check to happen
      *after* the installation. The call to fo_poll() doesn't need to be in
      the same FD_LOCK or even in an FD_LOCK at all.
      
      Moreover, as it turns out, it must NOT be in an FD_LOCK() because this
      results in a deadlock when polling sockets, caused by two different
      code paths taking locks in opposite order:
      
      1. Before this fix, poll() took FD_LOCK and called fo_poll() which
         called sopoll_generic() which took a SOCKBUF_LOCK
      
      2. In the wake path, SOCKBUF_LOCK was taken, then so_wake_poll()
         is called which calls poll_wake() which takes FD_LOCK.
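      The fixed ordering can be sketched like this (fd_lock and sockbuf_lock are stand-ins for FD_LOCK and SOCKBUF_LOCK): the poll request is installed under FD_LOCK, but the fo_poll() check runs after it is dropped, so poll() never holds FD_LOCK while going for SOCKBUF_LOCK and the lock-order inversion disappears.

```cpp
#include <mutex>

std::mutex fd_lock;       // stand-in for FD_LOCK
std::mutex sockbuf_lock;  // stand-in for SOCKBUF_LOCK

int fo_poll_check() {
    std::lock_guard<std::mutex> g(sockbuf_lock);
    return 1;  // pretend the file became ready
}

int poll_install_fixed() {
    {
        std::lock_guard<std::mutex> g(fd_lock);
        // ... install the poll request ...
    }
    // The check only has to happen *after* installation, not under the
    // same lock; the FD_LOCK -> SOCKBUF_LOCK ordering is gone.
    return fo_poll_check();
}
```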
    • Fix hang in virtio_driver::wait_for_queue · 8ebb1693
      Nadav Har'El authored
      virtio_driver::wait_for_queue() would often hang in a memcached and
      mc_benchmark workload, waiting forever for received packets although
      these *do* arrive.
      
      As part of the virtio protocol, we need to set the host notification
      flag (we call this, somewhat confusingly, queue->enable_interrupts())
      and then check if there's anything in the queue, and if not, wait
      for the interrupt.
      
      This order is important: If we check the queue and only then set the
      notification flag, and data came in between those, the check will be
      empty and an interrupt never sent - and we can wait indefinitely for
      data that has already arrived.
      
      We did this in the right order, but the host code, running on a
      different CPU, might see memory accesses in a different order!
      We need a memory fence to ensure that the same order is also seen
      on other processors.
      
      This patch adds a memory fence to the end of the enable_interrupts()
      function itself, so we can continue to use it as before in
      wait_for_queue(). Note that we do *not* add a memory fence to
      disable_interrupts() - because no current use (and no expected use)
      cares about the ordering of disable_interrupts() vs other memory
      accesses.
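      The fix can be sketched with C++ atomics (the flag and index names here are illustrative; in virtio they live in the ring shared with the host):

```cpp
#include <atomic>

std::atomic<unsigned short> avail_flags{0};  // notification flag word
std::atomic<unsigned short> used_idx{0};     // host-written used index

void enable_interrupts() {
    avail_flags.store(0, std::memory_order_relaxed);  // "notify me"
    // Fence so the flag write becomes globally visible before any
    // subsequent read of the ring; without it, the host, on another
    // CPU, may observe our queue check before our flag update and
    // never send the interrupt.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

bool queue_has_data(unsigned short last_seen_idx) {
    return used_idx.load(std::memory_order_relaxed) != last_seen_idx;
}
```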
    • Fix missed wakeups in so_wake_poll · fd28e12d
      Nadav Har'El authored
      This patch fixes two bugs in so_wake_poll(), which caused us to miss
      some poll wakeups, resulting in poll()s that never wake up. This can be
      seen as a hang in the following simple loop exercising memcached:
               (while :; do echo "stats" | nc 192.168.122.100 11211; done)
      
      The two fixes are:
      
      1. If so_wake_poll() decides *not* to call poll_wake() - because it sees
         zero data on this packet - it mustn't reset the SB_SEL flag on the
         socket, or we will ignore the next event even when it does have data.
      
      2. To see if the socket is readable, we need to call soreadable(), not
         soreadabledata() - the former adds the connection close event to the
         readability. See sopoll_generic(), which also sets a readability
         event in that case.
    • Revert 4c1dd505 · 8d48ef43
      Nadav Har'El authored
      I'm restoring Dor's original virtio_driver::wait_for_queue().
      
      The rewrite, with its slightly different timing and redundant second
      check before waiting, just masked the real bug - a missing memory
      barrier (see the separate patch fixing that).
      
      Dor's original code has the good feature that after waking up from a
      sleep - when presumably we already have something in the queue - we
      check the queue before pessimistically enabling the host notifications.
      So let's use Dor's original code.
    • add PAGE_SHIFT macro · 7ab8afa5
      Glauber Costa authored
      It is usually very useful, together with PAGE_SIZE.
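      The usual relationship between the two macros, assuming 4 KiB pages:

```cpp
#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)

// PAGE_SHIFT converts byte addresses to page numbers with a shift
// instead of a division:
inline unsigned long page_number(unsigned long addr) {
    return addr >> PAGE_SHIFT;
}
```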
    • stub mlock and mlockall · e0948398
      Glauber Costa authored
      No plans in sight to page out anonymous memory, so for now, stub this.
      We may have to revisit this once we support proper mmap semantics,
      especially with ranged mlock/munlock.
    • Allow NULL arguments for dlopen · d9aa9d63
      Glauber Costa authored
      From dlopen man page:
      "If filename  is  NULL,  then the returned handle is for the main program."
      
      That is exactly what this patch implements.
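      Per the man page quoted above, dlopen(NULL, ...) yields a handle for the main program; symbols the program exports can then be looked up through it with dlsym():

```cpp
#include <dlfcn.h>

void* self_handle() {
    // NULL filename: return a handle for the main program itself.
    return dlopen(nullptr, RTLD_NOW);
}
```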
    • xen: massage xen.cc · 81f42426
      Glauber Costa authored
      Since we will now have the xen interface files for BSD anyway, let's
      use their more readable definitions instead of hardcoded numbers and duplicated
      strings.
    • delete xen buggy hypercall · 5da2d022
      Glauber Costa authored
      This was badly adapted from Avi's 5 argument hypercall. It is not in use for now,
      so let's just delete it.
    • xen: include xen files · bf2061fb
      Glauber Costa authored
      Verbatim import of the header files, plus the code in build.mk to find them.
  5. Jul 10, 2013
    • rewrite virtio_driver::wait_for_queue · 4c1dd505
      Nadav Har'El authored
      In my memcached tests (with mc_benchmark as the driver), I saw
      virtio_driver::wait_for_queue appears to have some bug or race condition -
      in some cases it hangs on waiting for the rx queue - and simply never
      returns.
      
      I can't say I understand what the bug in this code is, however.
      Instead, I just rewrote it from scratch in a different way, which I
      think is much clearer - and the new code no longer exhibits the bug.

      I can't put my finger on why my new version is more correct than
      the old one - or even how it is different... Dor, maybe you can find
      a difference? It definitely behaves differently.
    • Add vhost invocation option · 0cdb5741
      Dor Laor authored
    • Allow parallel execution of {add|get}_buff, prevent fast path allocs · 350fa518
      Dor Laor authored
      virtio-vring and its users (net/blk) were changed so that no request
      header is allocated at run time, except during init. In order to
      do that, I had to change get_buf and break it into multiple parts:
      
              // Get the top item from the used ring
              void* get_buf_elem(u32 *len);
              // Let the host know we consumed the used entry
              // We separate that from get_buf_elem so no one
              // will re-cycle the request header location until
              // we're finished with it in the upper layer
              void get_buf_finalize();
              // GC the used items that were already read to be emptied
              // within the ring. Should be called by add_buf
              // It was separated from the get_buf flow to allow parallelism of the two
              void get_buf_gc();
      
      As a result, it was simple to get rid of the shared lock that
      previously protected the _avail_head variable. Today only the thread
      that calls add_buf updates this variable (add_buf calls get_buf_gc
      internally).
      
      There are two new locks instead:
        - the virtio-net tx_gc lock - rarely contended, taken either by the
          tx_gc thread or, normally, by the tx xmit thread
        - the virtio-blk make_requests lock - there are parallel requests
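      How the split-up flow might be driven from the receive path. The vring here is a stub with illustrative names and behaviour (the real methods live on OSv's virtio vring class):

```cpp
struct vring_stub {
    int pending = 2;  // pretend two buffers sit in the used ring
    char data[2];

    // Get the top item from the used ring, or nullptr if empty.
    void* get_buf_elem(unsigned* len) {
        if (pending == 0) return nullptr;
        *len = 1;
        return &data[pending - 1];
    }
    // Tell the host the entry is consumed; until this is called, the
    // request header slot cannot be recycled.
    void get_buf_finalize() { --pending; }
    // Reclaim finished entries; called from add_buf(), not here.
    void get_buf_gc() {}
};

int rx_drain(vring_stub& q) {
    int n = 0;
    unsigned len;
    while (void* p = q.get_buf_elem(&len)) {
        (void)p;               // ... process the buffer ...
        q.get_buf_finalize();  // only now may the slot be reused
        ++n;
    }
    // get_buf_gc() is deliberately not called on this path: add_buf()
    // calls it, so only the add_buf thread updates _avail_head and no
    // shared lock is needed for it.
    return n;
}
```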
    • Trivial: Move code above, preparation for preventing fast path allocations for the virtio request data · cc8cc19e
      Dor Laor authored
    • 9e35c5c0 · Christoph Hellwig
  6. Jul 09, 2013
    • Fix thread-exit issues in msleep/wakeup · 87c7f232
      Nadav Har'El authored
      wakeup() and wakeup_one() dropped the lock before waking the thread(s).
      But if a thread timed out at the same time we are planning to wake it,
      and if the thread immediately exited, our attempt to wake() it could crash.
      
      Therefore wakeup() needs to continue holding the lock until after the
      wake(). When msleep() times out, before it returns it reacquires the
      lock, ensuring it cannot return before a concurrent wakeup() does
      its wakes.
      
      Note that in the ordinary wakeup() case (not racing with a timeout),
      we are *not* negatively affected by holding the lock while wake()ing
      because the waking msleep() does not try to reacquire the lock in the
      non-timeout case.
      
      Additionally, we use wake_with() instead of separate set and wake
      instructions. Without wake_with() there is the (exceedingly rare)
      possibility that merely setting _awake=true causes msleep() to
      return and its caller then decides to exit the thread, so the
      wake() right after the _awake=true would crash.
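      The wake_with() idea can be sketched with standard primitives (names are illustrative, not OSv's scheduler API): the flag update and the wake happen under one lock, so the sleeper cannot observe the flag, return, and let its thread exit before the wake lands.

```cpp
#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
bool awake = false;

void wakeup_fixed() {
    std::lock_guard<std::mutex> g(m);  // held across set *and* wake
    awake = true;
    cv.notify_one();
}

void msleep_sketch() {
    std::unique_lock<std::mutex> g(m);
    cv.wait(g, [] { return awake; });
    // On a timeout, the real msleep() reacquires the lock before
    // returning, so it cannot race past a concurrent wakeup().
}
```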
    • bsd: stubs for sx_xlock · 29d05537
      Glauber Costa authored
      Aside from mutexes, BSD also implements sx locks. They are a form of
      rwlock, and according to BSD's sx.h, the differences from their own
      rwlock are an implementation detail. Let's just use them as rwlocks
      for now.
      
      The declarations are uglier than I wanted. But this file ends up being
      included from C and C++ code, and rwlock.h pulls in condvar.h - while
      sync_stub.h, and as a result condvar.h itself, are usually included in
      extern "C" blocks. Because we are not expected to use sx locks from C,
      it should be fine.