  1. Jun 23, 2013
  2. Jun 21, 2013
  3. Jun 20, 2013
    • Dor Laor's avatar
      Limit the usage of indirect buffers · 5b751612
      Dor Laor authored
      Indirect buffers are good for very large SG lists but aren't required
      when there is enough room on the ring or the SG list is tiny.
      For the time being they are barely used, so I turned them off
      by default.
      5b751612
    • Dor Laor's avatar
      Add mergeable buffers support for virtio-net · d487ffd1
      Dor Laor authored
      The feature allows the hypervisor to batch several packets together
      as one large SG list. Once such a header is received, the guest rx
      routine iterates over the list and assembles a mega mbuf.
      
      The patch also simplifies the rx path by using a single buffer for
      the virtio data and its header. This shrinks the SG list from two
      entries to one.
      
      The issue is that at the moment I haven't seen packets with more than
      one mbuf being received. A Linux guest does receive such packets here
      and there. It may be due to the use of offload features that enlarge
      the packet size.
      d487ffd1
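      A minimal sketch of the rx-side assembly described above, assuming the
      mergeable-buffer header layout from the virtio specification (the
      num_buffers field tells the guest how many buffers make up one packet).
      The buffer alias, the deque standing in for the rx ring, and
      assemble_packet() are hypothetical names for illustration, not the OSv
      driver code, and a byte vector stands in for the mbuf chain.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <deque>
#include <vector>

// Header layout with mergeable rx buffers enabled (12 bytes).
struct virtio_net_hdr_mrg_rxbuf {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;   // number of buffers that make up this packet
};

using buffer = std::vector<uint8_t>;   // hypothetical stand-in for an mbuf

// Consume one packet from the front of rx. The first buffer carries the
// header followed by data (a single buffer, as in the commit); the remaining
// num_buffers - 1 buffers are pure data.
buffer assemble_packet(std::deque<buffer>& rx) {
    assert(!rx.empty() && rx.front().size() >= sizeof(virtio_net_hdr_mrg_rxbuf));
    virtio_net_hdr_mrg_rxbuf hdr;
    std::memcpy(&hdr, rx.front().data(), sizeof(hdr));
    assert(hdr.num_buffers >= 1 && rx.size() >= hdr.num_buffers);

    // Start with the payload that follows the header in the first buffer.
    buffer packet(rx.front().begin() + sizeof(hdr), rx.front().end());
    rx.pop_front();

    // Append the remaining buffers that belong to the same packet.
    for (uint16_t i = 1; i < hdr.num_buffers; ++i) {
        packet.insert(packet.end(), rx.front().begin(), rx.front().end());
        rx.pop_front();
    }
    return packet;
}
```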
  4. Jun 19, 2013
    • Guy Zana's avatar
      rwlock: initial implementation for a rwlock · 6af9f16d
      Guy Zana authored
      this rwlock gives precedence to writers, it relies on a mutex and 2 condvars
      for its implementation.
      
      it also supports taking the lock recursively for both readers and writers.
      
      this implementation is not fully tested, yet the TCP stack already uses it
      extensively, so far without any observed races (tested TCPDownload and netperf).
      6af9f16d
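      A minimal sketch of a writer-preferring lock built from one mutex and
      two condition variables, as the message describes. It uses the standard
      library rather than OSv's own mutex and condvar, omits recursive
      locking, and every name in it (writer_pref_rwlock, rlock, wlock, ...)
      is hypothetical rather than taken from the patch.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

class writer_pref_rwlock {
public:
    void rlock() {
        std::unique_lock<std::mutex> g(_mtx);
        // Readers yield to any active or waiting writer.
        _readers_cv.wait(g, [this] { return !_writer_active && _writers_waiting == 0; });
        ++_readers;
    }
    bool try_rlock() {
        std::lock_guard<std::mutex> g(_mtx);
        if (_writer_active || _writers_waiting > 0) return false;
        ++_readers;
        return true;
    }
    void runlock() {
        std::lock_guard<std::mutex> g(_mtx);
        assert(_readers > 0);
        if (--_readers == 0) _writers_cv.notify_one();
    }
    void wlock() {
        std::unique_lock<std::mutex> g(_mtx);
        ++_writers_waiting;   // announce intent so new readers hold back
        _writers_cv.wait(g, [this] { return !_writer_active && _readers == 0; });
        --_writers_waiting;
        _writer_active = true;
    }
    bool try_wlock() {
        std::lock_guard<std::mutex> g(_mtx);
        if (_writer_active || _readers > 0) return false;
        _writer_active = true;
        return true;
    }
    void wunlock() {
        std::lock_guard<std::mutex> g(_mtx);
        assert(_writer_active);
        _writer_active = false;
        // Hand over to the next writer first; readers run only when none wait.
        if (_writers_waiting > 0) _writers_cv.notify_one();
        else _readers_cv.notify_all();
    }
private:
    std::mutex _mtx;
    std::condition_variable _readers_cv;
    std::condition_variable _writers_cv;
    unsigned _readers = 0;
    unsigned _writers_waiting = 0;
    bool _writer_active = false;
};
```

      Counting waiting writers is what gives them precedence: a reader that
      arrives while a writer is queued blocks even if other readers still
      hold the lock.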
    • Guy Zana's avatar
      bsd: avoid using extern "C" in c++ files · 8536721b
      Guy Zana authored
      1. it is much cleaner that the header files perform extern "C" themselves,
         so they can be included both from C and C++ code.
      
      2. when doing extern "C" from a C++ file then __cplusplus is also defined,
         and compilation can break in some situations.
      
      3. as a bonus, this patch reduces compilation time.
      8536721b
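      The pattern from point 1 can be sketched as follows. The header name,
      the include guard and bsd_example_add() are hypothetical; only the
      __cplusplus guard itself is the standard idiom.

```cpp
#include <cassert>

// ---- bsd_example.h (hypothetical header) ----
#ifndef BSD_EXAMPLE_H
#define BSD_EXAMPLE_H

#ifdef __cplusplus
extern "C" {            // only seen by C++ compilers; C compilers skip it
#endif

int bsd_example_add(int a, int b);

#ifdef __cplusplus
}
#endif

#endif /* BSD_EXAMPLE_H */

// ---- a C++ source file would simply #include "bsd_example.h" ----
// The definition inherits C linkage from the declaration above, so the
// symbol is not name-mangled and can be called from C objects as well.
int bsd_example_add(int a, int b) {
    return a + b;
}
```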
    • Nadav Har'El's avatar
      bsd netport: rename log() to bsd_log() · bdd2b656
      Nadav Har'El authored
      netport.h defines a log() macro, which is an unfortunate choice of name
      because log is also a pretty-well-known mathematical function, and this
      
      So rename this macro bsd_log(), and change the dozen files which used
      log() to use bsd_log().
      bdd2b656
    • Nadav Har'El's avatar
      console: implement FIONREAD · ec2e5ebc
      Nadav Har'El authored
      Java's os::available() requires the FIONREAD ioctl on fds which do not
      implement seek. So we need to support this ioctl for the console.
      ec2e5ebc
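      For reference, this is roughly how a caller queries FIONREAD, shown
      here against a POSIX/Linux system. bytes_available() is a hypothetical
      helper for illustration, not a function from the patch or from the JVM.

```cpp
#include <cassert>
#include <sys/ioctl.h>
#include <unistd.h>

// Returns the number of bytes that can be read from fd without blocking,
// or -1 if the descriptor does not support FIONREAD.
int bytes_available(int fd) {
    assert(fd >= 0);
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) < 0) {
        return -1;
    }
    return n;
}
```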
    • Nadav Har'El's avatar
      lock-free mutex: use wake_with() · b4670b9c
      Nadav Har'El authored
      Use the new wake_with() in lock-free mutex
      b4670b9c
    • Nadav Har'El's avatar
      condvar: improve wake performance while maintaining correctness · 43e34bbf
      Nadav Har'El authored
      Before commit 1b53ec56,
      condvar_wake_all() had a crash which could be seen in tst-pipe.so (which
      apparently tests some condvar code paths that weren't tested by
      tst-condvar.so).
      
      The fix in that commit was for condvar_wait() to regain the condvar
      internal lock after the wait, even if not needed (it's only really
      needed in case of timeout). This masked the bug (see details below)
      but also deteriorated performance: the woken-up thread will now often
      go back to sleep to wait for the lock which is still held by
      condvar_wake().
      
      This patch reverts that commit, i.e., condvar_wait() does not retake
      the lock when woken. Instead, we fix the real bug: The bug was in
      condvar_wake_all() which did:
      
              sched::thread *t = wr->t;
              wr->t = nullptr;
              t->wake();
              wr = wr->newer;
      
      but after the wake(), wr is no longer valid (the waiter, being woken,
      would quickly exit the condvar_wait() function which held wr on the stack).
      
      However, by not taking the lock after the wait we also have another
      potential bug - in rare cases merely doing wr->t = nullptr can
      cause the thread t to start running, and if it not only stops waiting
      but also exits - the call to t->wake() will refer to an invalid thread
      and may crash. So we need to use the new wake_with() thread method
      introduced in a previous patch.
      43e34bbf
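      A sketch of the traversal-order part of the fix: the next pointer is
      read before anything that can release the waiter, because the wait
      record lives on the waiter's stack. fake_thread and wait_record are
      hypothetical stand-ins, and the plain wake() call only models the
      ordering; it does not address the thread-lifetime problem described in
      the last paragraph of the message.

```cpp
#include <cassert>

struct wait_record;

// Stand-in for sched::thread. Waking it mimics the waiter returning from
// condvar_wait(), after which its stack-allocated record is garbage.
struct fake_thread {
    wait_record* rec = nullptr;
    int wakes = 0;
    void wake();
};

struct wait_record {
    fake_thread* t;
    wait_record* newer;
};

void fake_thread::wake() {
    ++wakes;
    rec->newer = nullptr;   // simulate the record being invalidated
}

// Wake every waiter, oldest first, and return how many were woken.
int wake_all(wait_record* oldest) {
    int woken = 0;
    wait_record* wr = oldest;
    while (wr) {
        wait_record* next = wr->newer;   // read while wr is still valid
        fake_thread* t = wr->t;
        wr->t = nullptr;                 // from here on, wr may disappear
        t->wake();
        wr = next;                       // never touch the old wr again
        ++woken;
    }
    return woken;
}
```

      With the original order (reading wr->newer after the wake), this model
      would stop after the first waiter, which is the benign version of the
      crash seen in tst-pipe.so.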
    • Nadav Har'El's avatar
      tst-wake: test for wake_with() · d0b56169
      Nadav Har'El authored
      Added a test for wake_with(). It tries to ensure that the problematic
      case solved by wake_with() actually happens quickly, by:
       1. Spin a long time between the setting of the flag and t->wake()
       2. Do a spurious wake() to ensure that the waiting thread is woken
          up right after setting the flag, before the intended wake.
       3. Use mprotect() to ensure that working with an already join()ed
          thread crashes immediately, instead of just maybe crashing.
      
      This test fails when wake_with() doesn't use ref()/unref(), and succeeds
      with the full wake_with().
      
      tst-wake contains a second test, which does the same thing but without
      the additional measures we used to show the bug (spinning, spurious
      wake and mprotect). Without these additional measures the test iteration
      is much faster, which allows us to stress wake/join much more.
      d0b56169
    • Nadav Har'El's avatar
      sched::thread: wake_with() to wake a wait_until() · f99c5ccc
      Nadav Har'El authored
      When we use wait_until(), e.g.,
      
              wait_until([&] { return *x == 0; })
      
      We used (in a bunch of places in the code, including condvar) the
      following "obvious" idiom to wake it up:
      
              *x = 0;
              t->wake();
      
      This does the right thing in *almost* all situations. But there's still
      one rare (but very possible) scenario where this is wrong. The problem is
      that the first line (*x = 0) may already cause the wait_until to return.
      This can happen when wait_until didn't yet check the condition, or if it
      was sleeping and by rare coincidence, got woken up by a spurious interrupt
      at the same time we did *x = 0. Now, consider the case that the waiting
      thread decides to exit after the wait_until... So the "*x = 0" causes the
      thread to exit, and when we want to do "t->wake()" the thread no longer
      exists, and the statement crashes.
      
      This patch adds two new thread methods: t->ref() increments a counter
      preventing a thread's destruction, until a matching t->unref().
      With these methods, the correct way to wake the above wait_until() is:
      
              t->ref();
              *x = 0;
              t->wake();
              t->unref();
      
      This patch also adds a one-line shortcut to the above 4 lines, with syntax
      mirroring that of wait_until:
      
              t->wake_with([&] { *x = 0; });
      
      The ref()/unref() methods are marked private, to encourage the use of
      wake_with(), and also to allow wake_with() in the future to be optimized
      to avoid calling ref()/unref() when not needed. For example, when the thread
      is on the same CPU as the current thread, merely disabling preemption (a
      very fast operation) prevents the thread from running - and exiting - and
      ref()/unref() are not necessary.
      
      Unfortunately, while this patch solves one bug, it does not solve two
      additional bugs that existed before, and continue to exist after this
      patch:
      
      1. When a thread completes (see thread::complete()) it wakes a thread
         waiting on join() (if there is one) and this join() deletes the thread
         and its stack. The problem is that if the timing is right (or wrong ;-)),
         the joiner thread may delete the stack while complete() is still
         running on this stack, and can cause a crash.
      
      2. If join() races with the thread's completion, it is possible that
         the thread thinks nobody is waiting for it so notifies nobody, but
         at the same time join() starts to wait, and will never be woken up.
      
      Added two "FIXME" about these remaining bugs.
      f99c5ccc
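      The shape of wake_with() can be sketched as below. fake_thread is a
      hypothetical stand-in for sched::thread: its reference count is a
      plain atomic counter and wake() only records the call, whereas the
      real methods interact with the scheduler and with thread destruction.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

class fake_thread {
public:
    // Run `action` (which makes the waiter's condition true) and then wake
    // the thread, holding a reference across both steps so the thread
    // object cannot be destroyed in between.
    template <typename Action>
    void wake_with(Action&& action) {
        ref();
        std::forward<Action>(action)();
        wake();
        unref();
    }
    int wakes() const { return _wakes; }
    int refs() const { return _refs.load(); }

private:
    // Private, as in the commit message, so callers go through wake_with().
    void ref() { ++_refs; }
    void unref() { --_refs; }
    void wake() {
        assert(_refs.load() > 0);   // must only be reached under a reference
        ++_wakes;
    }
    std::atomic<int> _refs{0};
    int _wakes = 0;
};
```

      A caller would then write t.wake_with([&] { x = 0; }); instead of the
      two-statement idiom.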
    • Nadav Har'El's avatar
      Don't crash on lseek() of non-regular file · 0c8ef37c
      Nadav Har'El authored
      lseek() crashes when used on pipes, sockets, and now also fd 0, 1 or 2
      (the console), because they don't have an underlying vnode. No reason
      to assert() in this case, should just return ESPIPE (like Linux does
      for pipes, sockets and ttys).
      
      Similarly, fsync, readdir and friends, fchdir and fstatfs shouldn't
      crash if given a fd without a vnode, and rather should return the
      expected error.
      0c8ef37c
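      The expected behavior can be checked from user space with a few lines,
      shown here against a POSIX/Linux system. lseek_errno() is a
      hypothetical helper written for this illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Returns 0 if lseek(fd, 0, SEEK_CUR) succeeds, otherwise the errno it set.
int lseek_errno(int fd) {
    assert(fd >= 0);
    errno = 0;
    if (lseek(fd, 0, SEEK_CUR) == static_cast<off_t>(-1)) {
        return errno;
    }
    return 0;
}
```

      On a pipe this is expected to report ESPIPE rather than abort.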
    • Nadav Har'El's avatar
      Fix concurrent console read and write bug · 907e6336
      Nadav Har'El authored
      We had a bug where a read() on the console (fd 0) would block writes to
      the console (fd 1 or 2). This was most noticeable when background threads
      in the CLI tried to write output, and were blocked until the next keypress
      because the blocking read() would lock the writes out.
      
      The bug happens because we opened the console using open("/dev/console")
      and dup()'ed the resulting fd. In the current code, this makes every read
      and write on these file descriptors pass through vfs_read()/vfs_write(),
      which take a single vnode lock shared by all three file descriptors,
      so a write on fd 1 blocks while a read is ongoing on fd 0.
      
      This patch doesn't fix this vnode lock issue, which remains - and should
      be fixed when the devfs or vfs layers are rewritten. Instead, this patch
      adds a *second API* for opening a console which doesn't go through the
      vnode or devfs layers:
      
      A new console::open() function returns a file descriptor which implements
      the correct file operations, and is not associated with any vnode.
      
      The new implementation works well with write() while read() is ongoing.
      
      Note that poll() support was missing from the old implementation (it
      seems it can't be done with the vnode abstraction?) and is still missing
      in the new implementation, although it now shouldn't be hard to add
      (need to implement the poll fileops, and to use poll_wake() in the
      line-discipline function console_poll).
      907e6336
    • Avi Kivity's avatar
      cli: fix tab completion · 79a89f95
      Avi Kivity authored
      tab completion relies on a global 'ls' object, re-add it.
      
      Broken by 4bfe157b.
      79a89f95
    • Nadav Har'El's avatar
      Add unsupported_poll · 29652e61
      Nadav Har'El authored
      Sorry, missing unsupported_poll broke compilation after the previous patch
      29652e61
    • Nadav Har'El's avatar
      Temporary, inefficient, epoll implementation · ad26fb4b
      Nadav Har'El authored
      This is an epoll_*() implementation which calls poll() to do the real work.
      This is of course a terrible implementation, which makes epoll() less
      efficient, instead of more efficient, than poll(). However, it allows me
      to progress with running Jetty in parallel with perfecting epoll.
      ad26fb4b
    • Nadav Har'El's avatar
      Add todo/dns · 15386921
      Nadav Har'El authored
      It's not clear if our DNS resolver works or not - need to test and fix
      if needed.
      15386921
    • Nadav Har'El's avatar
      BSD porting: implement mtx_assert() · 1b9a3b5b
      Nadav Har'El authored
      Trivially implement mtx_assert(). This would catch the "ifconfig" bug
      fixed in the previous patch - where ifconfig called sofree() without
      the accept lock.
      1b9a3b5b
    • Nadav Har'El's avatar
      Fix "ifconfig" corrupting accept_mtx · 073d9ea7
      Nadav Har'El authored
      ifconfig used to call sofree(), which assumes accept_mtx is held. It
      wasn't, resulting in either an assertion failure (if we implement
      mtx_assert() - see next patch) or a mutex corruption (if mtx_assert()
      does nothing).
      
      Instead, we should call soclose(). This wasn't very hard to figure out,
      given the comment in socreate() saying "The socket should be closed with
      soclose()." :-)
      073d9ea7
  5. Jun 18, 2013