  1. Jun 17, 2013
    • java: add tracepoint interface class · 29a4b2dd
      Avi Kivity authored
      Exposes tracepoints and counters
    • trace: add probe functions · e58366ac
      Avi Kivity authored
      Add a facility to run functions when a tracepoint is hit.  This is independent
      of logging; you can add a probe function with logging disabled or enabled.
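
      A minimal sketch of the idea, assuming a tracepoint that keeps a list of
      callbacks separate from its logging switch. The names tracepoint,
      add_probe, enable_logging and hit are hypothetical and are not taken from
      the patch:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not OSv's actual tracepoint API.
class tracepoint {
public:
    using probe = std::function<void()>;

    void add_probe(probe p) {
        assert(p);  // calling an empty std::function would throw
        _probes.push_back(std::move(p));
    }

    void enable_logging(bool on) { _logging = on; }

    // Called at the instrumented site.
    void hit() {
        if (_logging) {
            ++_logged;  // stand-in for writing a trace record
        }
        for (auto& p : _probes) {
            p();        // probes run whether or not logging is enabled
        }
    }

    unsigned logged() const { return _logged; }

private:
    bool _logging = false;
    unsigned _logged = 0;
    std::vector<probe> _probes;
};
```

      With this shape, a probe that bumps a counter keeps firing on every hit()
      even while logging is switched off.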
    • percpu: add allocatable per-cpu counters · 43ded08d
      Avi Kivity authored
    • kvmclock: make per-cpu · d0c2b805
      Avi Kivity authored
      The kvmclock ABI requires it to calculate system time using values for the cpu
      it is running on.
      
      Do this by:
        - changing the system time structure to be per-cpu
        - adding a cpu notifier so that per-cpu MSRs are initialized for each cpu
        - hacking around initialization order issues
    • sched: add cpu notifiers · 26a9fd2f
      Avi Kivity authored
      cpu notifiers are called whenever a cpu is brought up (and one day, down), so
      that drivers that manage the cpu (for example, kvmclock) can initialize
      themselves.
      
      The callback is called on the cpu that is being brought up.
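
      The mechanism can be pictured roughly as follows. This is only a sketch:
      cpu_notifier_list, add and cpu_up are hypothetical names, and the cpu is
      identified here by a plain integer id instead of a scheduler object:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch; names are illustrative, not OSv's sched API.
class cpu_notifier_list {
public:
    using callback = std::function<void(unsigned cpu_id)>;

    // Drivers register a callback once, typically at initialization.
    void add(callback cb) {
        assert(cb);
        _callbacks.push_back(std::move(cb));
    }

    // In a real kernel this would run on the cpu being brought up;
    // in this sketch the caller simply passes that cpu's id.
    void cpu_up(unsigned cpu_id) const {
        for (const auto& cb : _callbacks) {
            cb(cpu_id);
        }
    }

private:
    std::vector<callback> _callbacks;
};
```

      A driver such as kvmclock would register one callback and have it invoked
      once per cpu as each comes up.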
    • percpu: per-cpu variables · 63ab89b6
      Avi Kivity authored
      Per-cpu variables can be used from contexts where preemption is disabled
      (such as interrupts) or when migration is impossible (pinned threads) for
      managing data that is replicated for each cpu.
      
      The API is a smart pointer to a variable which resolves to the object's
      location on the current cpu.
      
      Define:
      
         #include <osv/percpu.hh>
      
         PERCPU(int, my_counter);
         PERCPU(foo, my_foo);
      
      Use:
      
         ++*my_counter;
         my_foo->member = 7;
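
      To illustrate what "a smart pointer that resolves to the current cpu's
      copy" can mean, here is a simplified sketch. It is not how PERCPU() is
      implemented in the patch: it assumes a fixed cpu count (max_cpus), keeps
      the copies in an array, and uses a global variable (current_cpu_id) as a
      stand-in for the real notion of the current cpu. All three names, and
      percpu_sketch itself, are hypothetical:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t max_cpus = 4;  // assumption made for the sketch

// Stand-in for "which cpu am I running on"; a kernel would derive this
// from cpu-local state rather than from a global variable.
static std::size_t current_cpu_id = 0;

template <typename T>
class percpu_sketch {
public:
    // Both operators resolve to the copy that belongs to the current cpu.
    T& operator*() { return _copies[index()]; }
    T* operator->() { return &_copies[index()]; }

private:
    static std::size_t index() {
        assert(current_cpu_id < max_cpus);
        return current_cpu_id;
    }

    std::array<T, max_cpus> _copies{};  // one value-initialized copy per cpu
};
```

      The caller is still responsible for staying on one cpu (preemption
      disabled or thread pinned) between resolving the pointer and using it.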
  2. Jun 16, 2013
  3. Jun 14, 2013
    • acpi: map bios into our linear mapping · da8d3daf
      Glauber Costa authored
      
      The algorithm we follow for memory discovery is quite simple: iterate over the
      E820h map, and for every type 1 (== RAM) memory, we increment total size, and
      map it linearly to our address space mappings.
      
      That breaks on xen, however. I have no idea what seabios is doing for KVM, but
      xen's hvmloader will put most of the ACPI tables at a reserved region around
      physical address 0xfc000000. When we try to parse the ACPI tables, we will reach
      an unmapped portion of the address space and fault. (BTW, those faults are really
      hard to debug: we're triple faulting directly, at least in my setup.)
      
      Luckily, the acpi driver code is prepared for such scenarios, and before using
      any of that memory it will call map and unmap functions - we just don't implement
      them.
      
      This patch implements the necessary map function - and while we are at it, its
      unmap counterpart. This is all far from being performance critical, so I am
      being as dumb as possible and just servicing the request without tracking any
      previous state.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
  4. Jun 13, 2013
  5. Jun 12, 2013
    • trace: prevent recursion in function tracing · a7f920f2
      Avi Kivity authored
      The functions that are used in function tracing must not themselves be
      traced, lest we recurse endlessly.  Rather than marking them all with
      no_instrument_function, keep a nesting counter and check if we're nested.
      This way only the functions used for the test must not be traced.
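
      The nesting check can be sketched like this. The names trace_nesting,
      records, on_function_entry and helper_used_by_tracer are hypothetical, and
      a thread-local counter is an assumption of the sketch; the patch may keep
      its counter elsewhere:

```cpp
#include <cassert>

// Hypothetical sketch of a nesting guard for a function-entry hook.
static thread_local unsigned trace_nesting = 0;
static thread_local unsigned records = 0;  // stand-in for the trace buffer

void helper_used_by_tracer();

// Stand-in for the hook that the compiler calls on every function entry.
void on_function_entry() {
    if (trace_nesting++ == 0) {
        // Only the outermost entry does any work. If something called from
        // here is itself instrumented, it re-enters this hook, sees a
        // non-zero nesting count, and returns without recursing further.
        helper_used_by_tracer();
        ++records;
    }
    assert(trace_nesting > 0);
    --trace_nesting;
}

// Simulates a helper that was compiled with instrumentation enabled.
void helper_used_by_tracer() {
    on_function_entry();
}
```

      Only the counter test itself has to stay uninstrumented; helpers called
      after the test may be traced normally elsewhere.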
    • trace: disable interrupts during tracing · f057d76b
      Avi Kivity authored
      Seeing a trace from an interrupt incurred while tracing can be confusing, so
      disable them.
    • x64: provide some uninstrumented versions of irq flag manipulation functions · f7af76ee
      Avi Kivity authored
      In the tracer, we don't want interrupt manipulation to cause recursion, so
      provide uninstrumented versions of select functions.
    • Optionally enable (disabled by default) lock-free mutex · a2cb99d5
      Nadav Har'El authored
      This patch optionally enables, at compile-time, OSV to use the lock-free
      mutex instead of the spin-lock-based mutex. To use the lock-free mutex,
      change the line "#undef LOCKFREE_MUTEX" in include/osv/mutex.h to
      "#define LOCKFREE_MUTEX".
      
      LOCKFREE_MUTEX is currently disabled by default, awaiting a few more
      tests, but at this point I'm happy to say that beyond one known
      unrelated bug (see details below), it seems the lock-free mutex is
      fairly stable, and survives all tests and benchmarks I threw at it.
      
      The remaining known bug involves a thread destruction race between
      complete() and join(): complete wake()s the joiner thread, which in
      rare cases can really quickly delete the thread's stack, before wake()
      returns, causing a crash on return from wake(). This bug is really
      unrelated to the lock-free mutex, but for some unknown reason I can
      only reproduce it with the lock-free mutex on the SPECjvm2008 "sunflow"
      benchmark.
      
      To make lockfree::mutex our default mutex, this patch does the following
      when LOCKFREE_MUTEX is defined:
      
      1. In core/mutex.cc, #ifndef out the old mutex code, leaving the
         spinlock code in case someone wants to use it directly.
      
      2. In include/osv/mutex.h, do different things in C++ and C (remember that
         lockfree::mutex is a C++ class, and cannot be used directly from C):
      
         * In C++, simply make mutex and mutex_t aliases for lockfree::mutex.
      
         * In C, make struct mutex and mutex_t an opaque 40-byte structure (in
           C++ compilation, we verify that this 40 is indeed the C++ class's
           length), and define the operations on it.
      
      3. In libc/pthread.cc, if LOCKFREE_MUTEX, unfortunately the new mutex
         will not fit into pthread_mutex_t, and neither will condvar fit now
         into pthread_cond_t. So use a lazily allocated mutex or condvar, using
         the lazy_indirect<> template.
    • run.py: allow command line select of alternative hypervisors · 3c354af1
      Glauber Costa authored
      
      I have been commenting lines in and out of this script to choose the right
      underlying hypervisor to run, so here is the automated version of it. I haven't
      chosen the letters h or y because they usually denote help and yes,
      respectively. I also avoided a kvm/no-kvm boolean because very soon we would
      like to include xen.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
    • update loader Copyright. · f6e4bfb7
      Glauber Costa authored
      
      Now that we can actually see the debug message, print our name on it.
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
    • console: dump early messages to the serial port · 4fd29712
      Glauber Costa authored
      
      We can use a very simple outb instruction to write data to the serial
      port in case we don't have a console implementation yet. We don't need
      to be fancy, and even limited functionality will already allow us to
      print messages early (especially debug messages).
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
    • run console earlier · 4b5afd0f
      Glauber Costa authored
      
      We could benefit from the console being ready a bit earlier. The only
      dependency that I see is that interrupts need to be working.  So as
      soon as we initialize the ioapic, we should be able to initialize the console.
      
      This is not the end of the story: we still need an even earlier console to debug the
      driver initialization functions, and I was inclined to just leave console_init
      where it is, for now.
      
      But additionally, I felt that loader is really a more appropriate place for
      that than vfs_init... So I propose we switch. In the meantime, it might help
      debug things that happen between ioapic init and the old vfs_init (mem
      initialization, smp bring up, etc).
      
      Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
  6. Jun 11, 2013
    • Add missing change to lfmutex.cc · d96020e3
      Nadav Har'El authored
      Sorry, forgot one hunk in "git add -p" :(
    • opensolaris: fix cv_timedwait() · da5939f9
      Avi Kivity authored
      cv_timedwait() has a relative timeout expressed in ticks (microseconds),
      while condvar_wait() has an absolute timeout expressed in nanoseconds.
      
      Replace the 1:1 macro with a function that does the correct translation.
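
      The translation amounts to scaling the relative tick count to nanoseconds
      and adding it to the current time. A sketch, assuming one tick is one
      microsecond as stated above; ticks_to_abs_ns and ns_per_tick are
      hypothetical names, and the current time is passed in explicitly instead
      of being read from a clock:

```cpp
#include <cassert>
#include <cstdint>

// Assumption from the commit message: one tick is one microsecond.
constexpr std::int64_t ns_per_tick = 1000;

// Convert a relative timeout in ticks into an absolute deadline in
// nanoseconds, given the current time in nanoseconds.
constexpr std::int64_t ticks_to_abs_ns(std::int64_t now_ns,
                                       std::int64_t rel_ticks) {
    return now_ns + rel_ticks * ns_per_tick;
}

// 250 ticks after t = 5,000,000 ns is t = 5,250,000 ns.
static_assert(ticks_to_abs_ns(5000000, 250) == 5250000,
              "relative ticks must be scaled and offset by the current time");
```

      A 1:1 macro would instead have passed 250 through unchanged, which an
      absolute-nanosecond interface would read as a deadline far in the past.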
    • lock-free queue: update test · 954bd855
      Nadav Har'El authored
      Updated test with the new API. Sorry about forgetting to commit it earlier.
    • Glauber Costa's avatar
      ccad2d9b
    • lock-free queue: change pop() API · df217cef
      Nadav Har'El authored
      Changed lockfree::queue_mpsc (lock-free multiple-producer single-consumer
      queue) pop() API. Instead of returning separately the popped value (type
      T) and a boolean success/failed, now return a pointer to the
      linked_item<T> originally pushed(), or nullptr on failure.
      
      The new pop() API is slightly more awkward (instead of using the returned
      value directly, you need to take its field "value") but has an important
      new feature: It gives you not just the value, but also the address where
      this value is stored. So it is now possible to change value in its original
      structure. This allows us to implement our (by now) traditional waitqueue
      technique: The values on the queue are thread pointers, and the popper,
      before waking up a thread, sets the thread pointer to zero - this way the
      woken up thread knows it isn't a spurious wakeup.
      
      A followup patch will use this capability to clean up lockfree::mutex not
      to abuse the "owner" field as a notifier of non-spurious wakeups. After
      that patch, "owner" will be used only for implementing recursive mutex,
      and will not be part of the wakeup protocol.
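
      The shape of the new pop() can be sketched with a simplified
      multiple-producer single-consumer queue. This is an illustration written
      for this note, not the code from the patch: queue_mpsc_sketch is a
      hypothetical name, and the two-list design (an atomic push list that the
      consumer reverses into a private pop list) is an assumption:

```cpp
#include <atomic>
#include <cassert>

template <typename T>
struct linked_item {
    T value;
    linked_item* next = nullptr;
    explicit linked_item(T v) : value(v) {}
};

template <typename T>
class queue_mpsc_sketch {
public:
    // May be called concurrently from any number of producer threads.
    void push(linked_item<T>* item) {
        linked_item<T>* old = _pushed.load(std::memory_order_relaxed);
        do {
            item->next = old;
        } while (!_pushed.compare_exchange_weak(old, item,
                                                std::memory_order_release,
                                                std::memory_order_relaxed));
    }

    // Single consumer only. Returns the item originally pushed, so the
    // caller can modify item->value in place, or nullptr if empty.
    linked_item<T>* pop() {
        if (!_popping) {
            // Take everything pushed so far and reverse it into FIFO order.
            linked_item<T>* list =
                _pushed.exchange(nullptr, std::memory_order_acquire);
            while (list) {
                linked_item<T>* next = list->next;
                list->next = _popping;
                _popping = list;
                list = next;
            }
        }
        linked_item<T>* item = _popping;
        if (item) {
            _popping = item->next;
        }
        return item;
    }

private:
    std::atomic<linked_item<T>*> _pushed{nullptr};
    linked_item<T>* _popping = nullptr;  // touched by the consumer only
};
```

      Because pop() hands back the original linked_item, the consumer can zero
      the stored thread pointer before waking that thread, which is the
      waitqueue technique described above.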
    • lock-free mutex: change and clarify the role of depth and owner · 99b477dc
      Nadav Har'El authored
      The way "owner" and "depth" were used in lockfree::mutex was messy.
      Ideally, neither should be needed if we implemented a non-recursive
      mutex, but following the design of ::mutex, we (re)used "owner" also as
      a marker that a thread was woken to have the lock (and it's not a
      spurious wake).
      
      After this patch, owner and depth are used in lockfree::mutex *only*
      for implementing a recursive mutex, and building a non-recursive
      mutex should be as simple as dropping these two variables.
      
      In more detail:
      
      1. "owner" is no longer used to tell a woken up thread that the wake
         wasn't spurious. Instead, zero the thread in the wait-record. This
         is a familiar idiom, which we already used a few times before.
      
      2. "depth" isn't an atomic variable, so it should only be read by the
         same thread which set it, and this wasn't the case previously. Now,
         depth is only ever written (set to 1, incremented or decremented)
         or read by the lock-holding thread - and not the lock releasing thread.
      
      3. "owner" needs to be an atomic variable - a non-lock-holding thread
         needs to read it and recognize it isn't holding the lock - but it
         doesn't need any special memory ordering with other variables, so
         should always be accessed with "relaxed" memory ordering.
    • sched::thread - fix very rare join() hang · cf4c46c4
      Nadav Har'El authored
      Fixed a very rare hang in sched::thread::join():
      
      thread::complete() included the following code:
      
          _status.store(status::terminated);
          if (_joiner) {
              _joiner->wake();
          }
      
      If we are preempted right after setting status to "terminated", but
      before calling wake(), this thread will never be scheduled again (it will
      remain in the terminated status forever), and will never call wake() -
      so the join()ing thread may just wait forever.
      
      I saw this happening in a test case that started and joined millions of
      threads, and eventually the join() hangs.
      
      The solution is to enclose the above lines with preempt_disable()/
      preempt_enable().
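
      In sketch form, the fixed sequence looks as follows. The scheduler is
      replaced by single-threaded stand-ins (preempt_counter, joiner_stub,
      thread_sketch are hypothetical names), so this only shows the ordering of
      the calls, not real preemption:

```cpp
#include <atomic>
#include <cassert>

// Stand-ins for the scheduler; a single-threaded simulation only.
static int preempt_counter = 0;
void preempt_disable() { ++preempt_counter; }
void preempt_enable() {
    assert(preempt_counter > 0);
    --preempt_counter;
}
bool preemptable() { return preempt_counter == 0; }

enum class status { running, terminated };

struct joiner_stub {
    bool woken = false;
    bool woken_with_preemption_off = false;
    void wake() {
        woken = true;
        woken_with_preemption_off = !preemptable();
    }
};

struct thread_sketch {
    std::atomic<status> _status{status::running};
    joiner_stub* _joiner = nullptr;

    void complete() {
        // With preemption disabled, nothing can take the cpu away between
        // marking the thread terminated and waking the joiner.
        preempt_disable();
        _status.store(status::terminated);
        if (_joiner) {
            _joiner->wake();
        }
        preempt_enable();
    }
};
```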
    • wake(): Don't miss a preemption opportunity · aee17ba4
      Nadav Har'El authored
      wake() normally calls schedule(), but doesn't do so if preemption is
      disabled. So we should mark need_reschedule = true, to suggest that
      schedule() can be called when preemption is later enabled.
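
      A sketch of the intended behaviour on a single simulated cpu. cpu_sketch
      and its members are hypothetical, and having preempt_enable() act on the
      flag is an assumption about how the deferred reschedule gets picked up:

```cpp
#include <cassert>

// Hypothetical single-cpu simulation; not OSv's scheduler code.
struct cpu_sketch {
    int preempt_counter = 0;
    bool need_reschedule = false;
    int schedule_calls = 0;

    void schedule() {
        need_reschedule = false;
        ++schedule_calls;
    }

    void wake() {
        if (preempt_counter == 0) {
            schedule();
        } else {
            // Cannot switch now; remember that a reschedule is pending.
            need_reschedule = true;
        }
    }

    void preempt_disable() { ++preempt_counter; }

    void preempt_enable() {
        assert(preempt_counter > 0);
        if (--preempt_counter == 0 && need_reschedule) {
            schedule();  // assumption: the pending request is honoured here
        }
    }
};
```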
    • x64: prevent nested exceptions from corrupting the stack · 1dbddc44
      Avi Kivity authored
      Due to the need to handle the x64 red zone, we use a separate stack for
      exceptions via the IST mechanism.  This means that a nested exception will
      reuse the parent exception's stack, corrupting it.  It is usually very hard
      to figure out the root cause when this happens.
      
      Prevent this by setting up a separate stack for nested exceptions, and
      aborting immediately if a nested exception happens.
  7. Jun 10, 2013