  1. Feb 12, 2014
  2. Feb 11, 2014
    • epoll: Support epoll()'s EPOLLET · d41d748f
      Nadav Har'El authored
      
      This patch adds support for epoll()'s edge-triggered mode, EPOLLET.
      Fixes #188.
      
      As explained in issue #188, Boost's asio uses EPOLLET heavily, and we use
      that library in our management http server, and also in our image creation
      tool (cpiod.so). By ignoring EPOLLET, like we did until now, the code worked,
      but unnecessarily wasted CPU when epoll_wait() always returned immediately
      instead of waiting until a new event.
      
      This patch works within the confines of our existing poll mechanism,
      where epoll() calls poll(). We do not change this in this patch, but it
      should be changed in the future (see issue #17).
      
      In this patch we add to each struct file a field "poll_wake_count", which,
      as its name suggests, counts the number of poll_wake()s done on this
      file. Additionally, epoll remembers the last value it saw of this counter,
      so that in poll_scan(), if we see that an fp (polled with EPOLLET) has
      an unchanged counter from last time, we do not return readiness on this fp
      regardless of whether or not it has readable data.
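
      The counter comparison described above can be sketched roughly as
      follows. This is only an illustration, not the code from the patch:
      file_sketch, epoll_entry_sketch, poll_wake_sketch() and scan_entry()
      are hypothetical names, and the real struct file and poll_scan() carry
      much more state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for struct file; only the fields needed here.
struct file_sketch {
    uint64_t poll_wake_count = 0;  // bumped on every wake-up of this file
    bool readable = false;         // level state: is there unread data?
};

// Hypothetical per-registration state kept by epoll.
struct epoll_entry_sketch {
    file_sketch* fp = nullptr;
    bool edge_triggered = false;   // true if registered with EPOLLET
    uint64_t last_seen_count = 0;  // counter value when we last reported
};

// Producer side: something happened on the file.
inline void poll_wake_sketch(file_sketch& f) {
    f.readable = true;
    ++f.poll_wake_count;
}

// Scan side: should this entry be reported as ready?
inline bool scan_entry(epoll_entry_sketch& e) {
    if (!e.fp->readable) {
        return false;              // nothing to report in either mode
    }
    if (!e.edge_triggered) {
        return true;               // level-triggered: report while readable
    }
    if (e.fp->poll_wake_count == e.last_seen_count) {
        return false;              // EPOLLET: no wake-up since last report
    }
    e.last_seen_count = e.fp->poll_wake_count;
    return true;                   // EPOLLET: a new wake-up was observed
}
```

      In this sketch, a second wake-up on a file whose data was never read
      is reported again, because only the counter is compared and not the
      empty/non-empty transition.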
      
      We have a complication with EPOLLET on sockets. These have an "SB_SEL"
      optimization, which avoids calling poll_wake() when it thinks the new
      data is not interesting because the old data was not yet consumed, and
      also avoids calling poll_wake() if fp->poll() was not previously done.
      This optimization is counter-productive for EPOLLET (and causes missed
      wakeups) so we need to work around it in the EPOLLET case.
      
      This patch also adds a test for the EPOLLET case in tst-epoll.cc. The test
      runs on both OSv and Linux, and confirms that in the tested scenarios
      Linux and OSv behave the same, including one shared false positive:
      When epoll_wait() tells us there is data in a pipe, and we don't read it,
      but then more data arrives on the pipe, epoll_wait() will again return a
      new event, even though this is not really an edge event (the pipe didn't
      change from empty to not-empty, as it was previously not-empty as well).
      
      Concluding remarks:
      
      The primary goal of this implementation is to stop EPOLLET epoll_wait()
      from returning immediately even though nothing has happened on the file.
      That was what caused the 100% CPU use before this patch. That being said,
      the goal of this patch is NOT to avoid all false positives or unnecessary
      wakeups: when events do occur on the file, we may be doing a few more
      wakeups than strictly necessary. I think this is acceptable (our epoll()
      has worse problems) but for posterity, I want to explain:
      
      I already mentioned above one false-positive that also happens on Linux.
      Another false-positive wakeup that remains is in one of EPOLLET's classic
      use cases: Consider several threads sleeping on epoll() on the same socket
      (e.g., TCP listening socket, or UDP socket). When one packet arrives, normal
      level-triggered epoll() will wake all the threads, but only one will read
      the packet and the rest will find they have nothing to read. With edge-
      triggered epoll, only one thread should be woken and the rest would not.
      But in our implementation, poll_wake() wakes up *all* the pollers on this
      file, so we cannot currently support this optimization.
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
    • msix: thread affinity · b4e8d47d
      Vlad Zolotarov authored
      
      Instead of binding all msix interrupts to cpu 0, have them chase the
      interrupt service routine thread and pin themselves to the same cpu.
      
      This patch is based on a patch from Avi Kivity <avi@cloudius-systems.com>
      and uses some ideas from Nadav Har'El <nyh@cloudius-systems.com>.
      
      It improves the performance of the single thread Rx netperf test by 16%:
      before - 25694 Mbps
      after  - 29875 Mbps
      
      New in V2:
       - Dropped the functor class - use lambda instead.
       - Fixed the race in a waking flow.
       - Added some comments.
       - Added the performance numbers to the patch description.
      
      Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • gdb: fix "osv mmap" and friends · ce177a50
      Nadav Har'El authored
      
      It appears that in GDB, (mmu::vma*)0 does not work, and one needs to enclose
      the type's name in single quotes: ('mmu::vma'*)0. This broke the vma_list
      function in scripts/loader.py, and caused an exception in "osv mmap" and
      other commands using the vma_list function.
      
      This patch adds the missing single-quotes.
      
      I don't understand how this code ever worked for anybody...
      I'm using gdb-7.6.1 from Fedora 19, if it matters.
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • loader: move x64-specific stuff from premain · a06c22d7
      Claudio Fontana authored
      
      Move the arch-specific code in premain to
      arch/x64/arch-setup.cc.
      
      Introduce arch_init_premain() and arch_setup_tls().
      
      arch_init_premain() is supposed to perform arch-specific
      initialization before the common premain code is run.
      
      arch_setup_tls() is run _after_ the common setup_tls code.
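
      The ordering of the two hooks can be pictured with the sketch below.
      Only the names arch_init_premain() and arch_setup_tls() come from this
      commit; their empty parameter lists, the boot_trace log and the
      common_*/*_sketch functions are assumptions made for illustration and
      do not reflect the real signatures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the call order so the sequencing can be inspected.
static std::vector<std::string> boot_trace;

// Hook names as in the commit; bodies and signatures are placeholders.
void arch_init_premain() { boot_trace.push_back("arch_init_premain"); }
void arch_setup_tls()    { boot_trace.push_back("arch_setup_tls"); }

// Hypothetical stand-ins for the architecture-independent code.
void common_premain_work()   { boot_trace.push_back("common_premain"); }
void common_setup_tls_work() { boot_trace.push_back("common_setup_tls"); }

// Arch-specific initialization runs before the common premain code.
void premain_sketch() {
    arch_init_premain();
    common_premain_work();
}

// Arch-specific TLS setup runs after the common TLS setup.
void setup_tls_sketch() {
    common_setup_tls_work();
    arch_setup_tls();
}
```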
      
      Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
      Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • net: fix deadlock in net channel poll support · 16086d77
      Avi Kivity authored
      
      Path 1:
      
        poll()
         take file lock
         file::poll_install
           take socket lock
      
      Path 2:
      
        sowakep() (holding socket lock)
          so_wake_poll()
            take file lock
      
      Fix by running poll_install() outside the file lock (which isn't really
      needed).
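
      A minimal sketch of the lock-order fix, assuming plain std::mutex
      locks. net_file_sketch, poll_fixed() and wake_path() are hypothetical
      names; the real code uses OSv's own file and socket locks. The point
      is only that, after the fix, no path acquires the socket lock while
      holding the file lock, so the two paths can no longer wait on each
      other in a cycle.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical object owning both locks from the two paths above.
struct net_file_sketch {
    std::mutex file_lock;
    std::mutex socket_lock;
    bool polled = false;          // state protected by file_lock
    bool poll_installed = false;  // state protected by socket_lock
    int wake_count = 0;           // state protected by file_lock
};

// Path 1 after the fix: the install step runs outside the file lock.
void poll_fixed(net_file_sketch& f) {
    {
        std::lock_guard<std::mutex> guard(f.file_lock);
        f.polled = true;          // work that needs the file lock
    }                             // file lock released here
    // Stand-in for poll_install(): takes only the socket lock.
    std::lock_guard<std::mutex> guard(f.socket_lock);
    f.poll_installed = true;
}

// Path 2 (unchanged order): socket lock first, then file lock.
void wake_path(net_file_sketch& f) {
    std::lock_guard<std::mutex> so_guard(f.socket_lock);
    std::lock_guard<std::mutex> file_guard(f.file_lock);
    ++f.wake_count;
}
```

      With this shape the only nested acquisition left is socket lock then
      file lock, so the two functions can be called concurrently from
      different threads without the inversion shown in Path 1 and Path 2.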
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • elf: Fix program::lookup · c9b32230
      Raphael S. Carvalho authored
      
      Found the problem while running tst-resolve.so; its output follows:
      Success: nonexistant = 0
      Failed: debug = -443987883
      Failed: condvar_wait = -443987883
      The time: 1392070630
      2 failures.
      
      Bisect pointed to the commit 1dc81fe5.
      After understanding the actual purpose of the changes introduced by this
      commit, I figured out that program::lookup simply lacks a return when the
      target symbol is found in the underlying module.
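
      The class of bug can be illustrated with the sketch below. It is not
      the actual elf code: module_sketch, program_sketch and the use of a
      std::map as a symbol table are assumptions. It only shows a lookup
      loop that returns as soon as an underlying module reports a match,
      which is the kind of return the description says was missing.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical module with a trivial symbol table.
struct module_sketch {
    std::map<std::string, long> symbols;

    // Returns a pointer to the symbol's value, or nullptr if absent.
    const long* lookup(const std::string& name) const {
        auto it = symbols.find(name);
        return it == symbols.end() ? nullptr : &it->second;
    }
};

// Hypothetical program holding several modules.
struct program_sketch {
    std::vector<module_sketch> modules;

    const long* lookup(const std::string& name) const {
        for (const auto& m : modules) {
            if (const long* sym = m.lookup(name)) {
                return sym;  // without this return, the hit would be lost
            }
        }
        return nullptr;      // not found in any module
    }
};
```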
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
  3. Feb 10, 2014