  1. Jul 28, 2013
  2. Jul 27, 2013
    • bsd: add fls implementation · 7445c1e0
      Glauber Costa authored
      Because this is arch-specific, I am adding it to a newly created file in
      arch/x64. I am making it available to BSD through netport, for lack of a
      better place.
      7445c1e0
  3. Jul 18, 2013
  4. Jul 11, 2013
    • xen: massage xen.cc · 81f42426
      Glauber Costa authored
      Since we will now have the xen interface files for BSD anyway, let's use
      their more readable definitions instead of hardcoded numbers and
      duplicated strings.
      81f42426
    • delete xen buggy hypercall · 5da2d022
      Glauber Costa authored
      This was badly adapted from Avi's 5-argument hypercall. It is not in use
      for now, so let's just delete it.
      5da2d022
  5. Jul 08, 2013
    • percpu: speed up percpu base address calculations · 571d39dd
      Avi Kivity authored
      Currently, we look up the current thread, then the current cpu, then the cpu
      id, then the percpu base (through a vector).  This is slow.
      
      Speed this up by storing the percpu base in a thread-local variable; this
      variable is updated when the thread is started or migrated.
      571d39dd
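A minimal sketch of the idea in this commit, with illustrative names (`cpu`, `on_start_or_migrate`, `percpu_ptr` are assumptions, not OSv's actual API): the per-cpu area base is cached in a thread-local pointer that the scheduler refreshes on thread start or migration, so a per-cpu access becomes a single addition instead of a thread -> cpu -> id -> vector chain of lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: each cpu owns a per-cpu data area.
struct cpu {
    std::vector<char> percpu_area;
    explicit cpu(std::size_t size) : percpu_area(size) {}
};

// Cached base of the current cpu's per-cpu area; refreshed by the
// scheduler when a thread starts or migrates (assumed hook below).
thread_local char* percpu_base = nullptr;

void on_start_or_migrate(cpu& c) {
    percpu_base = c.percpu_area.data();
}

// A per-cpu variable access is now one addition from the cached base.
template <typename T>
T* percpu_ptr(std::size_t offset) {
    return reinterpret_cast<T*>(percpu_base + offset);
}
```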
    • mempool: convert the memory allocator to be per-cpu · 3d4653e7
      Guy Zana authored
      The new code partitions the free list of pages of each pool to be
      per-cpu; allocations and deallocations are done locklessly.
      
      It uses worker items to handle the case where free() for a buffer is
      called from a cpu different from the one that allocated it: we use N^2
      rings for communicating between the threads and the worker items, and
      the worker item then performs the free() on the same cpu the buffer
      was allocated on.
      3d4653e7
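The N^2-rings scheme can be sketched as follows. This is a simplified illustration, not the commit's code: each (producer cpu, home cpu) pair gets its own single-producer/single-consumer ring, so pushing a buffer freed on the "wrong" cpu back to its home cpu needs no locks; `remote_free` and the ring layout are assumptions for the sketch.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

constexpr int ncpus = 2;
constexpr std::size_t ring_size = 64;

// Lock-free single-producer/single-consumer ring of buffer pointers.
struct spsc_ring {
    std::array<void*, ring_size> slots{};
    std::atomic<std::size_t> head{0}, tail{0};

    bool push(void* p) {
        auto t = tail.load(std::memory_order_relaxed);
        if (t - head.load(std::memory_order_acquire) == ring_size) {
            return false;                       // ring full
        }
        slots[t % ring_size] = p;
        tail.store(t + 1, std::memory_order_release);
        return true;
    }

    void* pop() {
        auto h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire)) {
            return nullptr;                     // ring empty
        }
        void* p = slots[h % ring_size];
        head.store(h + 1, std::memory_order_release);
        return p;
    }
};

// rings[from][to]: buffers freed on cpu 'from' that belong to cpu 'to'.
spsc_ring rings[ncpus][ncpus];

// A worker item on home_cpu later drains the ring and does the real free().
void remote_free(int from_cpu, int home_cpu, void* buf) {
    rings[from_cpu][home_cpu].push(buf);
}
```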
    • percpu: map percpu variables to the same static elf section · f13e49c3
      Guy Zana authored
      We don't want to use malloc() in the percpu framework, since the
      allocator itself will have a percpu design.
      f13e49c3
    • mmu: handle static variable addresses in virt_to_phys() and phys_to_virt() · 96ee87ba
      Guy Zana authored
      The ELF is mapped 1:1, so this patch allows addresses of static
      variables to be translated as well (needed for the next patch).
      96ee87ba
    • pcpu-worker: add a per cpu worker thread that can execute work items · 45e40421
      Guy Zana authored
      Simply allows setting up and executing a handler in the context of
      a specified CPU; the handler is defined statically at compile time and
      is invoked when the worker_item is signaled for a specified CPU.
      
      It doesn't use locks, to avoid unnecessary contention.
      
      This is needed for the per-cpu memory allocator: instead of creating
      n additional threads (one per cpu), the plan is to define and register
      a simple handler (a lambda function).
      
      example of usage:
      
      void say_hello()
      {
          debug("Hello, world!");
      }
      
      // define hello_tester as a worker_item
      PCPU_WORKERITEM(hello_tester, [] { say_hello(); });
      
      .
      .
      .
      
      // anywhere in the code:
      hello_tester.signal(sched::cpus[1]);
      
      // will invoke say_hello() in the context of cpu 1
      
      Thanks to Avi for adding code that I was able to copy & paste :)
      45e40421
    • arch: add CACHELINE_ALIGNED macro · 2f6cf02f
      Guy Zana authored
      2f6cf02f
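A plausible definition of such a macro (the commit's exact form may differ; 64 bytes is the common x86 cache line size). Aligning hot per-cpu structures to a cache line keeps two cpus from false-sharing adjacent objects.

```cpp
#include <cassert>

// Assumed definition for illustration; real OSv code may differ.
#define CACHELINE_ALIGNED __attribute__((aligned(64)))

// One counter per cache line: no false sharing between adjacent cpus.
struct CACHELINE_ALIGNED counter {
    long value;
};

static_assert(alignof(counter) == 64, "one object per cache line");
static_assert(sizeof(counter) == 64, "padded to a full line");
```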
  6. Jul 01, 2013
    • sched: set up thread::current() for new threads earlier · 667e1c10
      Avi Kivity authored
      Currently, current() is set during the thread initialization sequence,
      which means that preemption prior to that point will see the wrong current().
      There's an irq_enable() there, but it's not very effective since interrupts
      are only disabled in that place during early smp bringup, and it's not trivial
      to disable interrupts for all new threads (we?
      667e1c10
    • sched: late initialize idle thread · 802d3633
      Avi Kivity authored
      Currently we initialize the idle thread in the cpu's constructor, which leads
      to a cycle, since a thread's initialization needs the cpu.  This works out
      somehow now, but is fragile and will break with succeeding patches.
      
      Defer idle thread initialization to a later stage.
      802d3633
    • sched: start with preemption disabled · 695375f6
      Avi Kivity authored
      Usually we don't care if threads are started with preemption enabled or
      disabled, since interrupts are disabled and no preemption can take place
      during thread startup.  However during system bringup we want to avoid
      calls into the scheduler while it is being initialized due to stray
      preempt_enable() calls.
      
      Do this by initializing preempt_counter to 1, and dropping it during
      thread start-up.
      695375f6
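The counter trick can be sketched in a few lines. This is an illustrative toy, not OSv's actual thread class, and `finish_startup` is an assumed hook name: the thread is born with `preempt_counter == 1` (preemption disabled) and drops the count itself once initialization completes, so stray preempt_enable() calls during bringup never reach the scheduler.

```cpp
#include <cassert>

// Illustrative sketch of a thread born with preemption disabled.
struct thread_sketch {
    unsigned preempt_counter = 1;      // starts at 1: preempt-disabled
    bool preemptable() const { return preempt_counter == 0; }
    void preempt_disable() { ++preempt_counter; }
    void preempt_enable() { --preempt_counter; }
    void finish_startup() { preempt_enable(); }  // assumed hook name
};
```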
    • x64: add a separate interrupt stack · b81a47a5
      Avi Kivity authored
      This makes it easier to debug faults that happen in interrupts (e.g. in the
      scheduler)
      
      Note that this means we cannot schedule within an exception (e.g. a page
      fault) for now, since the exception stack is per-cpu while the interrupt
      stack is per-thread (which allows scheduling).  If/when we implement
      scheduling within the page fault handler, we'll either need to trampoline
      to the interrupt stack, or have a per-thread exception stack.
      b81a47a5
  7. Jun 26, 2013
    • xen: implement paravirtual clock driver for xen · 70583a58
      Glauber Costa authored
      Unlike KVM, we won't use percpu variables because Xen already lays down
      statically the shared info structure, that includes the vcpu info pointer
      for each cpu.
      
      We could in theory use percpu variables to store pointers to the current cpu
      vcpu info, but I ended up giving up on this route.  Since our pcpu
      implementation has the overhead of computing addresses anyway, we may as
      well pay the price and compute it directly from the xen shared info.
      
      One of the things that comes with this is that we can compute precise timings
      using xenclock very early. Since we don't have *that* much to do early, it is
      unclear if KVM needs to be improved in this regard (probably not), so this
      becomes just a slight bonus.
      70583a58
  8. Jun 25, 2013
    • xen: negotiate usage of xen pci · 88d39cb7
      Glauber Costa authored
      Xen defines a protocol for determining whether or not PV drivers are
      available in an HVM guest. Upon successful negotiation, the documentation
      states that:
      
      "The relevant emulated devices then disappear from the relevant buses.  For
      most guest operating systems, you want to do this before device enumeration
      happens."
      
      This patch basically follows this protocol and stores the result for future usage.
      
      See more at: docs/misc/hvm-emulated-unplug.markdown
      88d39cb7
    • buildfix: pvclock xen definitions · a8b8cf06
      Glauber Costa authored
      a8b8cf06
    • xen: lay down detection basic infrastructure · 0c8f8f89
      Glauber Costa authored
      
      Xen's information can be in a variety of MSRs. We need to test them all
      and figure out which of them holds the information we want.
      
      Once we determine that, the xen initialization code is ready to be executed.
      This needs to run as early as possible, because all xen drivers will
      make use of it one way or another.
      
      The hypercall code is heavily inspired (aka mostly copied) from Avi's
      xen tentative patch, with the 5-argument hypercall removed (delayed until
      we need it)
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      0c8f8f89
    • loader: skip reading disk on errors · 9fe8d677
      Glauber Costa authored
      
      This is not a very serious issue, but goes like this: the very simple read
      method we are attempting right now in the loader will keep reading from the
      disk until we reach a pre-determined max size. However, the disk is usually
      smaller than this. If this is the case, XEN dmesg logs are filled with messages
      indicating that we are trying to read from invalid LBAs, to the point of making
      the log useless for me.
      
      So although the annoyance is minor, the patch itself is minor too. If nobody
      opposes, I can apply it.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
      9fe8d677
  9. Jun 17, 2013
    • percpu: per-cpu variables · 63ab89b6
      Avi Kivity authored
      Per-cpu variables can be used from contexts where preemption is disabled
      (such as interrupts) or when migration is impossible (pinned threads) for
      managing data that is replicated for each cpu.
      
      The API is a smart pointer to a variable which resolves to the object's
      location on the current cpu.
      
      Define:
      
         #include <osv/percpu.hh>
      
         PERCPU(int, my_counter);
         PERCPU(foo, my_foo);
      
      Use:
      
         ++*my_counter;
         my_foo->member = 7;
      63ab89b6
  10. Jun 16, 2013
  11. Jun 12, 2013
  12. Jun 11, 2013
    • x64: prevent nested exceptions from corrupting the stack · 1dbddc44
      Avi Kivity authored
      Due to the need to handle the x64 red zone, we use a separate stack for
      exceptions via the IST mechanism.  This means that a nested exception will
      reuse the parent exception's stack, corrupting it.  It is usually very hard
      to figure out the root cause when this happens.
      
      Prevent this by setting up a separate stack for nested exceptions, and
      aborting immediately if a nested exception happens.
      1dbddc44
  13. Jun 10, 2013
    • x64: switch ifunc resolvers to processor::features() · 48323f23
      Avi Kivity authored
      Now that processor::features() is initialized early enough, we can use it
      in ifunc dispatchers.
      48323f23
    • x64: make processor::features usable early on · 7fb119b0
      Avi Kivity authored
      cpuid is useful for ifunc-dispatched functions (like memcpy), so we can
      select the correct function based on available processor features.  Make
      processor::features available early to support this.
      
      We use a static function-local variable to ensure it is initialized early
      enough.
      7fb119b0
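The function-local-static idiom the message describes can be sketched as below. `features_t` and `detect_features()` are stand-ins (the real code would run cpuid here): a function-local static is constructed on first use, so even very early callers such as ifunc resolvers see an initialized value.

```cpp
#include <cassert>

struct features_t { bool erms; };

// Stand-in for the real cpuid probing (assumption for the sketch).
static features_t detect_features() {
    return features_t{true};
}

// The static is initialized on the first call, not at some unspecified
// point during global construction, so early callers are safe.
const features_t& features() {
    static features_t f = detect_features();
    return f;
}
```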
    • libc: optimized memcpy() · 06dd5386
      Avi Kivity authored
      If the cpu supports "Enhanced REP MOVS / STOS" (ERMS), use an rep movsb
      instruction to implement memcpy.  This speeds up copies significantly,
      especially large misaligned ones.
      06dd5386
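The ERMS fast path amounts to a single string instruction. This is a sketch, not the commit's code: real code would dispatch to it via ifunc only when cpuid reports ERMS, and the non-x86 branch here is just a portable fallback so the sketch runs anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// "rep movsb" copies rcx bytes from [rsi] to [rdi]; on ERMS-capable
// cpus the microcode makes this fast even for large/misaligned copies.
void* memcpy_rep_movsb(void* dest, const void* src, std::size_t n) {
    void* ret = dest;
#if defined(__x86_64__)
    asm volatile("rep movsb"
                 : "+D"(dest), "+S"(src), "+c"(n)
                 :
                 : "memory");
#else
    std::memcpy(dest, src, n);   // portable fallback for the sketch
#endif
    return ret;
}
```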
  14. Jun 09, 2013
    • Add, and use, new abort(msg) function · e6208f1e
      Nadav Har'El authored
      Recently Guy fixed abort() so it will *really* not infinitely recurse trying
      to print a message, using a lock, causing a new abort, ad infinitum.
      
      Unfortunately, that didn't fix one remaining case: DUMMY_HANDLER (see
      exceptions.cc) used the idiom
      
              debug(....); abort();
      
      which can again cause infinite recursion - a #GP calls debug() which causes a
      new #GP, which again calls debug, etc.
      
      Instead of the above broken idiom, I created a new function abort(msg), which is
      just like the familiar abort(), just changes the "Aborted" message to some
      other message (a constant string). Like abort(), the new variant abort(msg) will
      only print the message once even if called recursively - and uses a lockless
      version of debug().
      
      Note that the new abort(msg) is a C++-only API. C will only see the abort(void)
      which is extern "C". At first I wanted to call the new function panic(msg) and
      export it to C, but gave up when I saw the name panic() was already in use in a
      bunch of BSD code.
      e6208f1e
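The print-at-most-once guard can be sketched with a single atomic flag. Names here are illustrative (`abort_with_msg` stands in for the C++-only abort(msg) overload, and fputs for the lockless debug()); the point is that a recursive re-entry sees the flag already set and stays silent instead of recursing.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Set once, locklessly; a recursive abort finds it already true.
static std::atomic<bool> aborting{false};

// Returns true only for the first caller, which prints the message.
bool print_abort_message_once(const char* msg) {
    if (aborting.exchange(true)) {
        return false;            // recursive or repeated entry: stay silent
    }
    std::fputs(msg, stderr);     // stand-in for the lockless debug()
    return true;
}

// Sketch of the C++-only abort(msg) variant the commit describes.
[[noreturn]] void abort_with_msg(const char* msg) {
    print_abort_message_once(msg);
    std::abort();
}
```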
  15. Jun 05, 2013
    • trace: improve fast path · b03979d9
      Avi Kivity authored
      When a tracepoint is disabled, we want it to have no impact on running code.
      
      This patch changes the fast path to be a single 5-byte nop instruction.  When
      a tracepoint is enabled, the nop is patched to a jump instruction to the
      out-of-line slow path.
      b03979d9
  16. May 27, 2013
    • Add "memory clobber" to STI and CLI instructions · a200bb7a
      Nadav Har'El authored
      When some code section happens to be called from both thread context and
      interrupt context, and we need mutual exclusion (we don't want the interrupt
      context to start while the critical section is in the middle of running in
      thread context), we surround the critical code section with CLI and STI.
      
      But we need the compiler to assure us that writes to memory done between
      the calls to CLI and STI stay between them. For example, if we have
      
          thread context:                 interrupt handler:
      
            CLI;                          a--;
            a++;
            STI;
      
      We don't want the a++ to be moved by the compiler before the CLI. We also
      don't want the compiler to save a's value in a register and only actually
      write it back to the memory location 'a' after the STI (when an interrupt
      handler might be concurrently writing). We also don't want the compiler
      to remember a's last value in a register and use it again after the next
      CLI.
      
      To ensure these things, we need the "memory clobber" option on both the CLI
      and STI instructions. The "volatile" keyword is not enough - it guarantees
      that the instruction isn't deleted or moved, but not that stuff that
      should have been in memory isn't just in registers.
      
      Note that Linux also has these memory clobbers on sti() and cli().
      Linus Torvalds explains in a post from 1996 why these were necessary:
      http://lkml.indiana.edu/hypermail/linux/kernel/9605/0214.html
      
      All that being said, we never noticed a bug caused by the missing
      "memory" clobbers. But better safe than sorry....
      a200bb7a
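The fix boils down to one token in the asm statement. A sketch (cli/sti are privileged and only assemble on x86, so they are guarded and shown for illustration only; the empty-asm `barrier()` exercises the same "memory" clobber safely in user space): "volatile" keeps the instruction in place, but only the "memory" clobber forces the compiler to flush registers to memory before the instruction and reload afterwards.

```cpp
#include <cassert>

#if defined(__x86_64__) || defined(__i386__)
// The interrupt-flag accessors with the added "memory" clobber.
inline void cli() { asm volatile("cli" ::: "memory"); }
inline void sti() { asm volatile("sti" ::: "memory"); }
#endif

// The same clobber on an empty asm is a plain compiler barrier:
// stores before it must hit memory, values after it must be reloaded.
inline void barrier() { asm volatile("" ::: "memory"); }
```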
  17. May 26, 2013
    • x64: use wrfsbase for faster context switching, when available · 3c9ba28d
      Avi Kivity authored
      Drops context switch time by ~80ns.
      3c9ba28d
    • x64: add wrfsbase accessor · bb33c998
      Avi Kivity authored
      Faster way to write fsbase on newer processors.
      bb33c998
    • signal handling: fix FPU clobbering bug · 94a7015e
      Nadav Har'El authored
      This patch adds missing FPU-state saving when calling signal handlers.
      The state is saved on the stack, to allow nesting of signal handling
      (delivery of a second signal while a first signal's handler is running).
      
      In Linux calling conventions, the FPU state is caller-saved, i.e., a
      called function can use FPU at will because the caller is assumed to have
      saved it if needed. However, signal handlers are called asynchronously,
      possibly in the middle of some FPU computation without that computation
      getting a chance to save its state. So we must save this state before calling
      the signal handling function.
      
      Without this fix, we had problems even if the signal handlers themselves
      did not use the FPU. A typical scenario - which we encountered in the
      "sunflow" benchmark - is that the signal handler does something which uses
      a mutex (e.g., malloc()) and causes a reschedule. The reschedule, unlike
      a preempt(), thinks it does not need to save the FPU state, and the
      thread we switch to clobbers this state.
      94a7015e
  18. May 18, 2013
  19. May 07, 2013
  20. May 06, 2013
  21. May 01, 2013
    • Unify "mutex_t" and "mutex" types · 3c692eaa
      Nadav Har'El authored
      Previously we had two different mutex types - "mutex_t" defined by
      <osv/mutex.h> for use in C code, and "mutex" defined by <mutex.hh>
      for use in C++ code. This difference is unnecessary, and causes a mess
      for functions that need to accept either type so they work for both C++
      and C code (e.g., consider condvar_wait()).
      
      So after this commit, we have just one include file, <osv/mutex.h>
      which works both in C and C++ code. This results in the same type
      and same functions being defined, plus some additional conveniences
      when in C++, such as method variants of the functions (e.g.,
      m.lock() in addition to mutex_lock(m)), and the "with_lock" function.
      
      The mutex type is now called either "mutex_t" or "struct mutex" in
      C code, or can also be called just "mutex" in C++ code (all three
      names refer to an identical type - there's no longer a different
      mutex_t and mutex type).
      
      This commit also modifies all the includers of <mutex.hh> to use
      <osv/mutex.h>, and fixes a few miscelleneous compilation issues
      that were discovered in the process.
      3c692eaa
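The single-header trick can be sketched as below. This is a toy (the lock is just a flag, and the real <osv/mutex.h> differs): one struct compiles in both C and C++, the free functions work everywhere, and the method conveniences are layered on only under __cplusplus.

```cpp
#include <cassert>

// One type, three names: mutex_t, struct mutex, and (in C++) mutex.
typedef struct mutex {
    int locked;                 // toy state, not a real lock
#ifdef __cplusplus
    void lock();                // C++-only method conveniences
    void unlock();
#endif
} mutex_t;

// C-visible API, available in both languages.
static inline void mutex_lock(mutex_t* m)   { m->locked = 1; }
static inline void mutex_unlock(mutex_t* m) { m->locked = 0; }

#ifdef __cplusplus
// Methods forward to the shared C functions.
inline void mutex::lock()   { mutex_lock(this); }
inline void mutex::unlock() { mutex_unlock(this); }
#endif
```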