  1. Apr 20, 2014
    • virtio: fix virtio-blk under debug allocator · a888df1a
      Avi Kivity authored
      
      The debug allocator can allocate non-contiguous memory for large requests,
      but since b7de9871 it uses only one sg entry for the entire buffer.
      
      One possible fix is to allocate contiguous memory even under the debug
      allocator, but in the future we may wish to allow discontiguous allocation
      when not enough contiguous space is available.  So instead we implement
      a virt_to_phys() variant that takes a range and outputs the physical
      segments that make it up, and use it to construct a minimal sg list
      for the input.
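
      A minimal sketch of that idea (virt_to_phys_range() and the callback
      shape are illustrative, not the actual OSv API): walk the range page
      by page, translate each page, and merge physically contiguous pages
      into one segment, emitting one sg entry per contiguous run.

          #include <algorithm>
          #include <cstddef>
          #include <cstdint>
          #include <functional>

          constexpr uintptr_t page_size = 4096;

          // stand-in translation for the sketch; the real one queries
          // the page tables
          uintptr_t virt_to_phys(void* va)
          {
              return reinterpret_cast<uintptr_t>(va);
          }

          void virt_to_phys_range(void* va, size_t len,
                                  std::function<void(uintptr_t, size_t)> emit)
          {
              char* p = static_cast<char*>(va);
              uintptr_t seg_pa = 0;
              size_t seg_len = 0;
              while (len) {
                  size_t off = reinterpret_cast<uintptr_t>(p) & (page_size - 1);
                  size_t chunk = std::min(len, page_size - off);
                  uintptr_t pa = virt_to_phys(p);
                  if (seg_len && pa == seg_pa + seg_len) {
                      seg_len += chunk;            // physically contiguous: grow
                  } else {
                      if (seg_len) {
                          emit(seg_pa, seg_len);   // flush finished segment
                      }
                      seg_pa = pa;                 // start a new segment
                      seg_len = chunk;
                  }
                  p += chunk;
                  len -= chunk;
              }
              if (seg_len) {
                  emit(seg_pa, seg_len);           // flush the last segment
              }
          }

      A contiguous buffer then still yields a single sg entry, while the
      debug allocator's scattered pages yield one entry per physical run.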
      
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
  2. Apr 08, 2014
    • sched: fix waitqueue race causing failure to wake up · 4ef65eb6
      Avi Kivity authored
      
      When waitqueue::wake_all() wakes up waiting threads, it calls
      sched::thread::wake_lock() to enqueue those waiting threads on the mutex
      protecting the waitqueue, thus avoiding needless contention on the mutex.
      However, if a thread is already waking, we let it wake naturally and acquire
      the mutex itself.
      
      The problem is that the waitqueue code (wait_object<waitqueue>::poll())
      examines the wait_record it sleeps on to see if it has woken, and if not,
      goes back to sleep.  Since nothing in that thread-already-awake path clears
      the wait_record, that is exactly what happens, and the thread stalls until
      a timeout occurs.
      
      Fix by clearing the wait record.  As it is protected by the mutex, no
      extra synchronization is needed.
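
      A minimal sketch of the fix, with hypothetical helper names
      (pop_waiter() and this wake_lock() shape only approximate OSv's
      internals): when the waiter is already waking, wake_lock() cannot
      enqueue it on the mutex, so the record is cleared right here, under
      the mutex; otherwise poll() would see it still armed and sleep again.

          void waitqueue::wake_all(mutex& mtx)
          {
              wait_record* wr;
              while ((wr = pop_waiter())) {       // hypothetical queue helper
                  if (!wr->wake_lock(&mtx)) {     // thread already waking:
                      wr->clear();                // the fix - mark it woken
                  }
              }
          }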
      
      Observed with iperf -P 64 against the guest.  Likely triggered by a net
      channel waking up the thread; before it has a chance to run, a FIN
      packet arrives and is processed in the driver thread, so when the packets
      are consumed the thread is still in the waking state.
      
      Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
    • dhcp: remove lookup_opcode() · abbfc557
      Tomasz Grabiec authored
      
      The lookup_opcode() function is incorrect: it mishandles
      DHCP_OPTION_PAD, which is not followed by a length byte.
      
      Also, the while condition reads the 'op' value, which never
      changes.  This may result in reads beyond the packet size.
      
      Since this function is unused the best fix is to remove it.
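
      For reference, a hedged sketch of what a correct option walk would
      have to do (find_option() is a hypothetical name, not code from this
      tree): treat PAD as a lone byte with no length field, re-read the
      opcode on every iteration, and bound every read by the packet end.

          #include <cstdint>

          enum { DHCP_OPTION_PAD = 0, DHCP_OPTION_END = 255 };

          const uint8_t* find_option(const uint8_t* p, const uint8_t* end,
                                     uint8_t wanted)
          {
              while (p < end && *p != DHCP_OPTION_END) {
                  uint8_t op = *p;
                  if (op == DHCP_OPTION_PAD) {  // PAD: one byte, no length
                      ++p;
                      continue;
                  }
                  if (p + 2 > end) {
                      break;                    // need opcode + length byte
                  }
                  uint8_t len = p[1];
                  if (p + 2 + len > end) {
                      break;                    // value would overrun the packet
                  }
                  if (op == wanted) {
                      return p;
                  }
                  p += 2 + len;                 // skip opcode, length and value
              }
              return nullptr;                   // not found
          }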
      
      Reviewed-by: Vlad Zolotarov <vladz@cloudius-systems.com>
      Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
      Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
  3. Apr 03, 2014
    • sched: fix rare crashes caused by reschedule running on the wrong CPU · ee92f736
      Nadav Har'El authored
      
      For a long time we've had the bug summarized in issue #178, where very
      rarely but consistently, in various runs such as Cassandra, Netperf and
      tst-queue-mpsc.so, we saw OSv crashing because of some corruption in the
      timer list, such as arming an already armed timer, or canceling an already
      canceled timer.
      
      It turns out the problem was the schedule() function, which basically did
      cpu::current()->schedule().  The trouble is that if we're unlucky enough,
      the thread can be migrated right after calling cpu::current(), but before
      the irq disable in schedule(), which causes us to run one CPU's
      rescheduling on a different CPU - a big faux pas.  This can cause us,
      for example, to mess with one CPU's preemption_timer from a different CPU,
      causing the timer-related races and crashes we've seen in issue #178.
      
      Clearly, we shouldn't at all have a *method* cpu->schedule() which can
      operate on any cpu. Rather, we should have only a *function* (class-static)
      cpu::schedule() which operates on the current cpu - and makes sure we find
      that current CPU within the IRQ lock to ensure (among other things) the
      thread cannot get migrated.
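
      A sketch of the fixed shape (the names approximate OSv's): the
      current CPU is looked up only after interrupts are disabled, so the
      thread cannot migrate between the lookup and the reschedule.

          void cpu::schedule()   // class-static: always the current cpu
          {
              WITH_LOCK(irq_lock) {
                  // safe: migration is impossible while irqs are off
                  current()->reschedule_from_interrupt();
              }
          }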
      
      Another benefit of this patch is that it actually simplifies the code,
      with one less function called "schedule".
      
      Fixes #178.
      
      Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
      Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
  4. Apr 01, 2014
    • core: introduce serial_timer_task · bd179712
      Tomasz Grabiec authored
      This is a wrapper around timer_task which should be used when atomicity
      of callback tasks and timer operations is required.  The class accepts an
      external lock to serialize all operations.  It provides sufficient
      abstraction to replace callouts in the network stack.
      
      Unfortunately, it requires some cooperation from the callback code
      (see try_fire()).  That's because I couldn't extract the in_pcb lock
      acquisition out of the callback code in the TCP stack: there are
      other locks taken before it, and doing so _could_ result in lock order
      inversion problems and hence deadlocks.  If we can prove these to be
      safe then the API could be simplified.
      
      It may be also worthwhile to propagate the lock passed to
      serial_timer_task down to timer_task to save extra CAS.
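
      A hedged usage sketch (following the description above; the exact
      constructor and method signatures may differ from the final API):
      the callback re-checks try_fire() under the shared lock, so a
      concurrent cancel() or reschedule() atomically wins the race.

          mutex conn_lock;   // the external lock, e.g. the in_pcb lock

          serial_timer_task retransmit(conn_lock,
              [&](serial_timer_task& t) {
                  WITH_LOCK(conn_lock) {
                      if (!t.try_fire()) {
                          return;    // canceled/rescheduled concurrently
                      }
                      // ... timer work, serialized by conn_lock ...
                  }
              });

          WITH_LOCK(conn_lock) {
              retransmit.reschedule(std::chrono::milliseconds(200));
          }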
    • core: introduce deferred work framework · 34620ff0
      Tomasz Grabiec authored
      The design behind timer_task
      
      timer_task was designed to make cancel() and reschedule() scale well
      with the number of threads and CPUs in the system.  These methods may
      be called frequently and from different CPUs.  A task scheduled on one
      CPU may be rescheduled later from another CPU.  To avoid expensive
      coordination between CPUs, a lockfree per-CPU worker was implemented.
      
      Every CPU has a worker (async_worker) which has task registry and a
      thread to execute them. Most of the worker's state may only be changed
      from the CPU on which it runs.
      
      When a timer_task is rescheduled it registers its percpu part in the
      current CPU's worker.  When it is then rescheduled from another CPU, the
      previous registration is marked as invalid and a new percpu part is
      registered.  When a percpu task fires it checks whether it is the last
      registration - only then may it fire.
      
      Because a timer_task's state is scattered across CPUs, some extra
      housekeeping needs to be done before it can be destroyed.  We need to
      make sure that no percpu task will try to access the timer_task object
      after it is destroyed.  To ensure that, we walk the list of
      registrations of the given timer_task and atomically flip their state
      from ACTIVE to RELEASED.  If that succeeds, the task is now revoked
      and the worker will not try to execute it.  If it fails, the task is
      in the middle of firing and we need to wait for it to finish.  When a
      per-CPU task is moved to the RELEASED state it is appended to the
      worker's queue of released percpu tasks using a lockfree mpsc queue.
      These objects may later be reused for new registrations.
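
      A sketch of that revocation step (the enum and field names here are
      illustrative, not the exact code): a single CAS from ACTIVE to
      RELEASED either revokes the registration, or fails because the worker
      already moved it to FIRING, in which case the destructor must wait
      for the firing to complete.

          #include <atomic>

          enum class task_state { ACTIVE, FIRING, RELEASED };

          struct percpu_task {
              std::atomic<task_state> state { task_state::ACTIVE };
          };

          // Returns true if the registration was revoked; false if the
          // worker is firing it right now and the caller must wait.
          bool try_release(percpu_task& t)
          {
              auto expected = task_state::ACTIVE;
              return t.state.compare_exchange_strong(
                  expected, task_state::RELEASED);
          }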