  1. Nov 25, 2013
    • Fix possible deadlock in condvar · 15a32ac8
      Nadav Har'El authored
      
      When a condvar's timeout and wakeup race, we wait for the concurrent
      wakeup to complete, so it won't crash. We did this wr.wait() with
      the condvar's internal mutex (m) locked, which was fine when this code
      was written. But now that we have wait morphing, wr.wait() waits not
      just for the wakeup to complete, but also for the user_mutex to become
      available. With m locked and us waiting for user_mutex, we are in
      deadlock territory, because the common idiom for using a condvar is to
      take the locks in the opposite order: lock user_mutex first and then
      use the condvar, which locks m.
      
      I can't think of an easy way to actually demonstrate this deadlock:
      it would require a locked condvar_wait timeout racing with a
      condvar_wake_one while an additional locked condvar operation comes in
      concurrently, and I don't have a test case reproducing that. I am
      hoping this fixes the lockups that Pekka is seeing in his Cassandra
      tests (which are the reason I looked for possible condvar deadlocks
      in the first place).
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Tested-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
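
      A hypothetical sketch of the shape of the fix described above. The names
      wait_record, user_mutex and m follow the commit text, but the surrounding
      structure is assumed for illustration and is not the actual OSv source:
      the internal mutex m is released before waiting for the concurrent wakeup,
      so the wait can no longer chain into user_mutex while m is held.

      #include <condition_variable>
      #include <mutex>

      // Stand-in for the condvar's per-waiter record; wr.wait() blocks
      // until the concurrent wake() has fully completed.
      struct wait_record {
          std::mutex mtx;
          std::condition_variable cv;
          bool woken = false;
          void wait() {
              std::unique_lock<std::mutex> lk(mtx);
              cv.wait(lk, [this] { return woken; });
          }
          void wake() {
              { std::lock_guard<std::mutex> lk(mtx); woken = true; }
              cv.notify_one();
          }
      };

      // Sketch of the timeout path. The key point from the commit: drop the
      // condvar's internal mutex (m) *before* wr.wait(), because with wait
      // morphing wr.wait() may also wait for user_mutex, and another thread
      // may hold user_mutex while trying to lock m -- a lock-order deadlock.
      void timeout_path(std::mutex& m, wait_record& wr, bool wakeup_in_progress) {
          std::unique_lock<std::mutex> internal(m);
          if (wakeup_in_progress) {
              internal.unlock();   // release m first ...
              wr.wait();           // ... then wait for the concurrent wakeup
              return;
          }
          // otherwise, remove ourselves from the wait queue while holding m, etc.
      }
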
    • sched: delay initialization of early threads · d91d7799
      Glauber Costa authored
      
      The problem with sleep is that we can initialize early threads before the
      cpu itself is initialized. Looking at what goes on in init_on_cpu makes
      this clear:
      
      void cpu::init_on_cpu()
      {
          arch.init_on_cpu();
          clock_event->setup_on_cpu();
      }
      
      By the time we finally initialize the clock_event, an event can get lost
      if there are already pending timers of any kind - which there may be, if
      early threads were start()ed before that. I have played with many potential
      solutions, but in the end I think the most sensible thing to do is to delay
      the initialization of early threads to the point when we are first idle.
      That is the best way to guarantee that everything is properly initialized
      and running.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
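
      A hypothetical illustration of the "delay early thread start until first
      idle" idea from the message above. The deferred list, start hook and idle
      hook are made up for this sketch; they are not the actual OSv scheduler API:

      #include <functional>
      #include <vector>

      // Hypothetical per-cpu list of threads that called start() before the cpu
      // (and its clock_event) finished initializing.
      static std::vector<std::function<void()>> deferred_starts;
      static bool cpu_fully_initialized = false;

      void thread_start(std::function<void()> do_start) {
          if (!cpu_fully_initialized) {
              // Too early: starting now could arm a timer before
              // clock_event->setup_on_cpu() has run, and the event would be lost.
              deferred_starts.push_back(std::move(do_start));
              return;
          }
          do_start();
      }

      // Called the first time the cpu goes idle, i.e. after init_on_cpu() has
      // set up the clock event; only now is it safe to start the early threads.
      void on_first_idle() {
          cpu_fully_initialized = true;
          for (auto& start : deferred_starts)
              start();
          deferred_starts.clear();
      }
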
  2. Nov 22, 2013
  3. Nov 21, 2013
  4. Nov 20, 2013
  5. Nov 19, 2013
    • Explicitly request alignment when allocating per-cpu area · e9549266
      Nadav Har'El authored
      
      Commit ed808267 used malloc() to allocate
      the per-cpu variables area. As Avi pointed out, we need this area to be
      aligned to the strictest alignment of any per-cpu variable. The strictest
      alignment we need is probably CACHELINE_ALIGNED (64 bytes), but it's easiest
      just to require 4096-byte alignment, which is what the code prior to the
      above patch did.

      The above commit worked only because, luckily, our malloc() happens to
      return page-aligned memory for large allocations, and it's possible that
      this will not be the case in the future. So this patch switches to
      aligned_alloc() instead, explicitly requesting a 4096-byte-aligned block
      of memory.
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
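
      The switch described above amounts to replacing the malloc() call with an
      explicit aligned_alloc(). A minimal sketch, assuming a generic allocation
      helper (the function name and rounding are illustrative, not the OSv code):

      #include <cstdlib>

      // Allocate one cpu's copy of the per-cpu area with an explicit 4096-byte
      // alignment, instead of relying on malloc() happening to return
      // page-aligned memory for large allocations.
      void* allocate_percpu_area(size_t size) {
          // aligned_alloc (C11/C++17) expects size to be a multiple of the
          // alignment, so round up to the next 4096-byte boundary.
          size_t aligned_size = (size + 4095) & ~size_t(4095);
          return std::aligned_alloc(4096, aligned_size);   // freed with free()
      }
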
    • Partial implementation of aligned_alloc() and posix_memalign(). · 7e06bd33
      Nadav Har'El authored
      
      This patch provides a trivial implementation of two similar functions for
      allocating aligned memory blocks: aligned_alloc() (from the C11 standard)
      and posix_memalign() (from POSIX). Memory returned by either function
      can be freed with the ordinary free().
      
      This trivial implementation just calls malloc(), and assert()s that it got
      the desired alignment, aborting if not. In many cases this is good enough
      because malloc() already returns 4096-byte-aligned blocks for large
      allocations. In particular we'll use these functions in the next patch for
      allocating the large page-aligned per-cpu areas.
      
      If we ever fail on this assertion, we can replace these functions by a
      full implementation (see issue #87).
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
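
      A sketch of what such a trivial implementation might look like, following
      the approach the message describes (malloc plus an alignment assertion).
      The _trivial suffixes mark these as illustrations, not the actual OSv code:

      #include <cassert>
      #include <cerrno>
      #include <cstdint>
      #include <cstdlib>

      // Trivial aligned_alloc(): just malloc() and assert that the result
      // happens to have the requested alignment (true for large allocations
      // in this malloc), aborting if not.
      void* aligned_alloc_trivial(size_t alignment, size_t size) {
          void* p = std::malloc(size);
          assert(reinterpret_cast<uintptr_t>(p) % alignment == 0);
          return p;   // can be freed with ordinary free()
      }

      // Trivial posix_memalign(): same idea, with the POSIX-style interface
      // that returns an error code and stores the pointer through memptr.
      int posix_memalign_trivial(void** memptr, size_t alignment, size_t size) {
          void* p = std::malloc(size);
          if (p == nullptr)
              return ENOMEM;
          assert(reinterpret_cast<uintptr_t>(p) % alignment == 0);
          *memptr = p;
          return 0;
      }
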
    • Add negotiation flag check for FLUSH · a0ce1b50
      Takuya ASADA authored
      Some older versions of qemu-nbd cause nbd_client.py to exit with an error.
      (See: https://groups.google.com/d/msg/osv-dev/EW5BtNFNfzs/I33BeFXg2f0J)

      This is because nbd_client.py sends the FLUSH command unconditionally, but
      FLUSH is an extended feature; an NBD client should first check that the NBD
      server is able to accept FLUSH. The NBD server sends capability flags during
      the negotiation stage; it sends HAS_FLAGS (0x1) and SEND_FLUSH (0x4) when it
      supports FLUSH.

      This patch adds this capability check, and skips sending FLUSH if the server
      doesn't support it.
      
      Signed-off-by: Takuya ASADA <syuu@dokukino.com>
      Reviewed-by: Benoît Canet <benoit.canet@irqsave.net>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
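
      The capability check itself is a bitmask test on the flags received during
      negotiation. The actual change is in the Python nbd_client.py, so this small
      C++ rendering is purely illustrative, using the flag values from the message:

      #include <cstdint>

      // Capability flags sent by the NBD server during negotiation, per the
      // commit message: HAS_FLAGS = 0x1, SEND_FLUSH = 0x4.
      constexpr uint32_t NBD_FLAG_HAS_FLAGS  = 0x1;
      constexpr uint32_t NBD_FLAG_SEND_FLUSH = 0x4;

      // FLUSH may only be sent when the server advertised both flags;
      // otherwise the client should simply skip the FLUSH command.
      bool server_supports_flush(uint32_t flags) {
          return (flags & NBD_FLAG_HAS_FLAGS) && (flags & NBD_FLAG_SEND_FLUSH);
      }
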
    • percpu: Reduce size of .percpu section · ed808267
      Nadav Har'El authored
      
      This patch reduces the size of the .percpu section 64-fold from about
      5 MB to 70 KB, and solves issue #95.
      
      The ".percpu" section is part of the .data section of our executable
      (loader-stripped.elf). In our 15 MB executable, roughly 7 MB is text
      (code), and 7 MB is data, and out of that, a whopping 5 MB is the
      ".percpu" section. The executable is read in real mode, and this is
      especially slow on Amazon EC2, hence our wish to make the executable
      as small as possible.
      
      The percpu section starts with all the PERCPU variables defined in the
      program. We have about 70 KB of those, and believe it or not, most of
      this 70 KB is just a single variable, the 65K dynamic_percpu_buffer
      (see percpu.cc).
      
      But then, we need a copy of these variables for each CPU. The unpatched
      code duplicated this 70 KB section 64 times in the executable file (!),
      and then used these memory locations for up to 64 CPUs. But there is
      no reason to duplicate this data in the executable! All we need to do
      is dynamically allocate a copy of this section for each CPU, and that
      is what this patch does.
      
      This patch removes about 5 MB from our executable: After this patch,
      our loader-stripped.elf is just 9.7 MB, and its data section's size is
      just 2.8 MB.
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
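
      A sketch of the "allocate a copy of the section per CPU" idea described
      above. The section-boundary symbols and the setup function are assumed
      names for illustration; they are not necessarily the ones used in OSv:

      #include <cstdlib>
      #include <cstring>

      // Hypothetical linker-provided bounds of the single .percpu template
      // kept in the executable (instead of 64 copies).
      extern char _percpu_start[], _percpu_end[];

      // Allocate and initialize one cpu's private copy of the per-cpu area.
      void* setup_percpu_area() {
          size_t size = _percpu_end - _percpu_start;
          // 4096-byte alignment, covering the strictest per-cpu alignment needs.
          void* area = std::aligned_alloc(4096, (size + 4095) & ~size_t(4095));
          std::memcpy(area, _percpu_start, size);   // copy the initial values
          return area;                              // becomes this cpu's base
      }
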
    • vfs: Introduce vop_eperm · f1ee72ed
      Raphael S. Carvalho authored
      
      vop_eperm allows more code reuse (suggested by Glauber Costa)
      
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
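
      A hypothetical illustration of the kind of reuse a vop_eperm helper enables:
      one shared operation that just returns EPERM, plugged into every vnode
      operation a filesystem does not permit. The struct and slot names here are
      made up for the sketch and do not mirror the actual OSv vfs definitions:

      #include <cerrno>

      // Hypothetical vnode-operations table for the sketch.
      struct vnops {
          int (*vop_create)(void*);
          int (*vop_remove)(void*);
          int (*vop_rename)(void*);
      };

      // Shared stub: any operation the filesystem does not permit simply
      // returns EPERM, instead of one separate stub per operation.
      static int vop_eperm(void*) {
          return EPERM;
      }

      // A read-only filesystem reuses the same stub for several slots.
      static const vnops readonly_vnops = {
          vop_eperm,   // create
          vop_eperm,   // remove
          vop_eperm,   // rename
      };
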
  6. Nov 18, 2013
  7. Nov 15, 2013
  8. Nov 14, 2013
  9. Nov 13, 2013