  1. Dec 24, 2013
    • sched: Overhaul sched::thread::attr construction · eb48b150
      Nadav Har'El authored
      
      We use sched::thread::attr to pass parameters to sched::thread creation,
      i.e., create a thread with non-default stack parameters, pinned to a
      particular CPU, or a detached thread.
      
      Previously we had constructors taking many combinations of stack size
      (integer), pinned cpu (cpu*) and detached (boolean), and doing "the
      right thing". However, this makes the code hard to read (what does
      attr(4096) specify?) and the constructors hard to expand with new
      parameters.
      
      Replace the attr() constructors with the so-called "named parameter"
      idiom: attr now has only a default constructor, attr(), and one modifies
      it with calls to pin(cpu*), detach(), or stack(size).
      
      For example,
          attr()                                  // default attributes
          attr().pin(sched::cpus[0])              // pin to cpu 0
          attr().stack(4096).pin(sched::cpus[0])  // pin and non-default stack
      and so on.
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
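The chaining described in the commit above can be sketched in a few lines of C++. This is a minimal illustration, not the OSv implementation: the `cpu` struct, the member names, the accessors, and the default values are all assumptions made for the example. Only the method names pin(), detach() and stack() come from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for sched::cpu; only its identity matters here.
struct cpu {
    unsigned id;
};

// Illustrative attribute bundle using the named parameter idiom.
// Member names and defaults are assumptions, not OSv's definitions.
class attr {
public:
    attr() = default;

    // Each setter records one parameter and returns *this so calls chain.
    attr& stack(std::size_t size) { _stack_size = size; return *this; }
    attr& pin(cpu* c) { _pinned_cpu = c; return *this; }
    attr& detach(bool on = true) { _detached = on; return *this; }

    // Read-only accessors, added here only so the result can be inspected.
    std::size_t stack_size() const { return _stack_size; }
    cpu* pinned_cpu() const { return _pinned_cpu; }
    bool detached() const { return _detached; }

private:
    std::size_t _stack_size = 0;  // 0 stands for "use the default stack"
    cpu* _pinned_cpu = nullptr;   // nullptr stands for "not pinned"
    bool _detached = false;
};
```

With this shape, `attr().stack(4096).pin(&some_cpu)` reads unambiguously at the call site, and a new parameter only needs one more setter rather than another constructor overload.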
  2. Dec 19, 2013
  3. Dec 16, 2013
  4. Nov 11, 2013
  5. Nov 08, 2013
  6. Oct 16, 2013
  7. Oct 13, 2013
  8. Oct 10, 2013
    • build: define _KERNEL everywhere · 95ce17e3
      Avi Kivity authored
      We have _KERNEL defines scattered throughout the code, which makes
      understanding it difficult.
      
      Define it just once, and adjust the source to build.
      
      We define it in an overridable variable, so that non-kernel imported code
      can undo it.
  9. Oct 03, 2013
  10. Sep 20, 2013
  11. Sep 17, 2013
  12. Sep 16, 2013
    • routing: provide a valid MTU in route messages · 295d80ff
      Glauber Costa authored
      Right now we send route messages with MTUs zeroed out. This can lead
      to the following assert in ip_output.c (~line 308) triggering:
      
          KASSERT(mtu > 0, ("%s: mtu %d <= 0, rte=%p (rt_flags=0x%08x) ifp=%p",
              __func__, mtu, rte, (rte != NULL) ? rte->rt_flags : 0, ifp));
      
      This happens because the code assumes that a valid route always carries a
      valid MTU, and in that case it always uses the route MTU instead of the
      interface one.
      
      When we allocate the route it has a valid MTU. But when we send the route
      message, we will overwrite it with the value we see in the route message.  This
      is done in rtsock.c:rt_setmetrics.
      
      With this patch, the assertion no longer triggers.
      
      A note: this was not seen in local installations, only on EC2. Looking at
      it, there is nothing Xen-specific. The reason it did not happen locally is
      that local traffic does not go through the default route, but rather through
      the local 192.168.100.0/24 route. That one seems to take a different
      configuration path, and thus sets the MTU correctly.
  13. Sep 15, 2013
    • Add copyright statements in bsd/ · fe9e6a82
      Nadav Har'El authored
      Added our copyright statements to some of the files in the top bsd/
      directory, and in bsd/porting.
      
      I only added our copyright to files which were written entirely by us. I did
      not attempt to hunt down which bsd or solaris files we modified in order to
      add our copyright to them; I don't think this is important (or we can do
      this later).
      
      I also found one header file (uma_stub.h) that had large chunks copied from
      freebsd, so I added both the freebsd copyright and ours.
  14. Sep 14, 2013
    • Do not overwrite the buffer on writes. · 0e62d585
      Glauber Costa authored
      Even Lords make brown paper bag mistakes. This is leftover code from my
      initial testing, where the buffers were set with pre-existing values to make
      sure they were going through.  I forgot to remove it. As a result reads were
      fine, but writes would just wipe the previous data from the buffer.
      Incidentally, the "write-then-read-the-data-back" test I was doing would also
      obviously pass, so I had not noticed this so far.
      
      Fix is to just leave the buffer alone.
    • Change "hz" to fix poll() premature timeout · 26a30376
      Nadav Har'El authored
      msleep() measures time in units of 1/hz seconds. We had hz = 1,000,000,
      which gives excellent resolution (microsecond) but a terrible range
      (limits msleep()'s timeout to about 35 minutes).
      
      We had a program (Cassandra) doing poll() with a timeout of 2 hours,
      which caused msleep to think we gave a negative timeout.
      
      This patch reduces hz to 1,000, i.e., makes msleep() operate in the same units
      as poll(). Looking at the code, I don't believe this change will have any
      ill effects - we don't need higher resolution (freebsd code is used to
      hz=1,000, which is the default there), and the code converts time units to
      hz's correctly, always using the hz macro. The allowed range for timeouts will
      grow to over 24 days - and match poll()'s allowed range.
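The 35-minute and 24-day figures in the commit above follow from simple arithmetic, sketched here in C++. The sketch assumes the tick count is held in a signed 32-bit integer, which is consistent with the limits quoted but is not stated in the commit. The function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Longest timeout, in whole seconds, that fits in a signed 32-bit tick
// counter when one tick lasts 1/hz seconds. The 32-bit width is an
// assumption made for this sketch.
constexpr std::int64_t max_timeout_seconds(std::int64_t hz) {
    return std::numeric_limits<std::int32_t>::max() / hz;
}

// Convert a millisecond timeout (the unit poll() uses) into ticks, then
// narrow it to 32 bits the way a plain int parameter would. Out-of-range
// values wrap on common two's-complement platforms (guaranteed from C++20).
inline std::int32_t ticks_from_ms(std::int64_t ms, std::int64_t hz) {
    std::int64_t ticks = ms * hz / 1000;
    return static_cast<std::int32_t>(ticks);
}
```

Under that assumption, `max_timeout_seconds(1000000)` is 2147 seconds (just under 36 minutes), `max_timeout_seconds(1000)` is 2147483 seconds (a little over 24 days), and a two-hour timeout at hz = 1,000,000 needs 7,200,000,000 ticks, which does not fit and comes out negative after narrowing.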
  15. Aug 28, 2013
    • mbufs: use an entire page for jumbop zone allocations · 0d466fab
      Glauber Costa authored
      Xen has hard requirements on page transfers, and how to feed the grant tables.
      Addresses need to be page aligned, since pfns rather than addresses are used,
      and we need to provide at least a full page per buffer, since the hypervisor is
      free to fill any data within the page.
      
      To achieve that, the netfront driver will use m_cljget to attach an extended
      buffer to the mbuf, from the jumbop zone, since they are page-sized. However,
      two problems arise from this:
      
      1) Allocating a page goes through malloc_large. Our implementation of malloc_large
      is currently terribly inefficient, and that creates a very heavy contention site.
      
      What I am doing with this patch is to switch our uma implementation to
      alloc_page / free_page instead of malloc if the caller of zcreate so specified
      (and then, of course, specify it for the jumbop cache).
      
      2) The refcount that is attached at the end of the buffer would either extend the
      buffer to 4100 bytes, defeating our purpose, or the buffer would have to be
      PAGE_SIZE - 4 to accommodate the refcount. But since the hypervisor will write
      to the whole page, it will eventually overwrite the refcount.
      
      To address that, I am allocating an external reference counter. BSD already
      has some infrastructure to do that, and I am taking advantage of it.
      However, I have found no way of implementing this such that the
      reference count can be easily deduced from the address of the extended
      buffer, without having the supporting mbuf to start from. Any external data
      structure such as a hash table would probably make freeing way too slow. Thankfully,
      uma_find_refcnt and UMA_ZONE_REFCNT seem to be used mostly in the
      setup/destruction phase (the mbuf refcount is used directly, open coded). So my
      proposal here is to remove UMA_ZONE_REFCNT for that zone.
  16. Aug 18, 2013
    • osv_start_if: set address instead of adding a new one · 91cd1e4c
      Avi Kivity authored
      SIOCAIFADDR appends an address to the interface's address list instead of
      replacing it.  This causes 'ifconfig' to display 0.0.0.0 (the first address
      configured) instead of the correct address obtained by dhcp.
      
      Fix by also deleting the existing address, if it exists.
  17. Aug 14, 2013
  18. Aug 13, 2013
  19. Aug 12, 2013
  20. Jul 31, 2013
    • uma: cache initialized objects · 776221bd
      Avi Kivity authored
      The init and fini functions are fairly expensive (for networking).  Cache
      initialized objects in percpu pools to save this cost.
      
      The implementation is imperfect since if we're allocating on one cpu and
      freeing on another, reuse is low.  This can be improved in the future, or
      made unnecessary with VJ rings.
      
      Increases netperf from ~14.6Gbps to ~17.8Gbps on my machine.
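The idea of caching initialized objects can be sketched as follows. This is an illustration only, with a single cache instead of per-cpu pools, and a hypothetical `zone` class whose names and interface do not mirror the actual uma code. The point it shows is that a freed object keeps its initialized state, so the next allocation skips the expensive init call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical zone: init/fini model the expensive per-object setup and
// teardown. All names here are assumptions made for the sketch.
class zone {
public:
    zone(std::size_t size, std::function<void(void*)> init,
         std::function<void(void*)> fini, std::size_t max_cached)
        : _size(size), _init(std::move(init)), _fini(std::move(fini)),
          _max_cached(max_cached) {}

    zone(const zone&) = delete;
    zone& operator=(const zone&) = delete;

    // Tear down whatever is still cached when the zone goes away.
    ~zone() {
        for (void* obj : _cache) {
            _fini(obj);
            std::free(obj);
        }
    }

    // Reuse a cached, already initialized object when one is available;
    // otherwise allocate fresh memory and pay for init once.
    void* alloc() {
        if (!_cache.empty()) {
            void* obj = _cache.back();
            _cache.pop_back();
            return obj;
        }
        void* obj = std::malloc(_size);
        if (obj) {
            _init(obj);
        }
        return obj;
    }

    // Keep the object initialized in the cache if there is room;
    // only run fini and release memory when the cache is full.
    void free(void* obj) {
        if (_cache.size() < _max_cached) {
            _cache.push_back(obj);
            return;
        }
        _fini(obj);
        std::free(obj);
    }

private:
    std::size_t _size;
    std::function<void(void*)> _init;
    std::function<void(void*)> _fini;
    std::size_t _max_cached;
    std::vector<void*> _cache;  // a per-cpu design would keep one per cpu
};
```

A per-cpu variant would hold one such cache per cpu, which is also where the limitation mentioned in the commit comes from: an object freed on a different cpu than the one allocating lands in a cache the allocator never looks at.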
    • uma: honor M_ZERO · 88e3d43e
      Avi Kivity authored
      M_ZERO requests zeroing of the object regardless of any constructor; honor it.
      
      It works now because we bzero() all objects unconditionally, but we soon
      won't.
    • uma: dynamically allocate zones · 5a6ca26e
      Avi Kivity authored
      We're going to make zones more expensive to allocate, so allocate only
      as many as we need.
    • bsd: switch uma to C++ · 13d28a21
      Avi Kivity authored
      Allows integrating with our mempools.
    • dhcp: remove M_ZERO from mbuf allocation · 4ac239de
      Avi Kivity authored
      M_ZERO requests zeroing of the entire mbuf, which clears the fields initialized
      by the init function.  It only works now because we don't honor M_ZERO.
      
      Remove M_ZERO and replace with bzero() for the packet data only.
    • msleep: respect BSD's PDROP · 322bd4fb
      Glauber Costa authored
      So after all the BSD code is not buggy, it is just that its semantics are
      slightly different from our version of mlock (Thanks Christoph). Whether or not
      we will drop the lock will be controlled by the value of the PDROP flag.
      
      Our msleep queues are protected by an internal lock which is not the same lock
      the user passed to the msleep call. Therefore, we can just use a version of
      wait_until that does not take a lock argument, and do the locking manually
      ourselves. We may then lock it back or not, depending on the presence of the
      PDROP flag.
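The manual locking described in the commit above can be sketched like this. It is a simplified model, not the OSv msleep code: `pdrop_flag`, `tracked_lock` and `sleep_dropping` are hypothetical names, the flag value is illustrative, and the wait itself is passed in as a callable so the sketch stays independent of any scheduler.

```cpp
#include <cassert>

// Illustrative flag value; stands in for BSD's PDROP in this sketch.
constexpr int pdrop_flag = 0x200;

// Minimal lock stand-in that only tracks whether it is currently held.
struct tracked_lock {
    bool held = false;
    void lock() { assert(!held); held = true; }
    void unlock() { assert(held); held = false; }
};

// The caller holds user_lock on entry. The lock is released before
// waiting, and re-acquired afterwards only if pdrop_flag is not set.
// wait_until_woken models a wait that takes no lock argument.
template <typename Lock, typename Wait>
void sleep_dropping(Lock& user_lock, int flags, Wait wait_until_woken) {
    user_lock.unlock();
    wait_until_woken();
    if (!(flags & pdrop_flag)) {
        user_lock.lock();
    }
}
```

Because the wait does not know about the caller's lock, the decision to take it back lives in one place, right after the wait returns.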
  21. Jul 30, 2013
  22. Jul 29, 2013
    • bsd: pcpu stubs · 05f014f2
      Glauber Costa authored
      BSD registers a structure with its per-cpu data. We can do the same, using
      just the fields we need.
    • kthread: also allow for process creation · 4d2f9bfe
      Glauber Costa authored
      Xen BSD code will attempt to create processes to serve as daemons for xenstore.
      We can basically emulate this by creating threads.
      
      The only thing I am doing differently from our existing thread creation
      layer is that callers will expect a struct proc to be returned. We
      had this as a stub; now I am creating a small struct with just the PID to serve
      as a return placeholder. The listener processes in xenstore never die, so I am
      not implementing a deallocate routine for them.